Grammaticalization Scenarios: Volume 1 Grammaticalization Scenarios from Europe and Asia 9783110563146, 9783110559378

This volume intends to fill the gap in the grammaticalization studies setting as its goal the systematic description of


English Pages 670 Year 2020


Table of contents :
Acknowledgements
List of authors
Contents
1 Position paper: Universal and areal patterns in grammaticalization
2 Measuring Grammaticalization: A questionnaire
3 Grammaticalization in the Germanic languages
4 Mechanisms and paths of grammaticalization and reanalysis in Romance
5 Grammaticalization in Slavic
6 Grammaticalization in Lezgic (East Caucasian)
7 Grammaticalization in Uralic as viewed from a general Eurasian perspective
8 Grammaticalization in Ewen (North-Tungusic) in a comparative perspective
9 Areal features in Yeniseian grammaticalization
10 Grammaticalization and reanalysis in Iranian
11 Grammaticalization in standard Hindi/Urdu and Hindi dialects
12 Grammaticalization in Japhug
13 Grammaticalization in Korean
14 Grammaticalization changes in Chinese

Walter Bisang and Andrej Malchukov (Eds.) Grammaticalization Scenarios: Cross-linguistic Variation and Universal Tendencies Vol. 1

Comparative Handbooks of Linguistics

Edited by Edith Moravcsik and Andrej Malchukov

Volume 4.1

Grammaticalization Scenarios: Cross-linguistic Variation and Universal Tendencies

Volume 1: Grammaticalization Scenarios from Europe and Asia Edited by Walter Bisang and Andrej Malchukov

ISBN 978-3-11-055937-8 (Vol. 4.1)
e-ISBN (PDF) 978-3-11-056314-6 (Vol. 4.1)
e-ISBN (EPUB) 978-3-11-056044-2 (Vol. 4.1)
ISBN 978-3-11-071268-1 (Set volume 4.1 & 4.2)
ISSN 2364-4354
Library of Congress Control Number: 2020939094
Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.dnb.de.
© 2020 Walter de Gruyter GmbH, Berlin/Boston
Cover image: Jupiterimages/PHOTOS.com/thinkstock
Typesetting: Meta Systems Publishing & Printservices GmbH, Wustermark
Printing and binding: CPI books GmbH, Leck
www.degruyter.com

Acknowledgements

The present volume is the outcome of a large-scale collaborative project on universal and areal patterns of grammaticalization, based at the University of Mainz. The financial help from the German Research Foundation for our project “Cross-linguistic variation in grammaticalization processes and areal patterns of grammaticalization” (Bi 591/12-1) is gratefully acknowledged. We are very much indebted to our authors for their contributions and for their useful discussions in the course of our internal reviewing. In particular, we would like to thank Christian Lehmann, Martin Haspelmath, Marianne Mithun and Denis Creissels for their advice and constructive critique. We would also like to thank our excellent MAGRAM team (https://www.linguistik.fb05.uni-mainz.de/magram/) for their engagement: Linlin Sun, Iris Rieder, Marvin Martiny, Eduard Schröder and Svenja Luell. Our thanks also go to Arne Nagels and Laura Becker for advice on statistical matters, and to Uta Reinöhl for stimulating discussions. Finally, we would like to express a big thank you to Barbara Karlson and Birgit Sievert at Mouton for their encouragement and patience, as well as to Dagmar Hanzlíková for excellent copy-editing and to Andreas Brandmair for his help at the last stages of the production of the volume.

Mainz, 10 March 2020

https://doi.org/10.1515/9783110563146-202

Walter Bisang and Andrej Malchukov

List of authors

Willem Adelaar, University Leiden, E-mail: [email protected]
Walter Bisang, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Michela Cennamo, University of Naples Federico II, E-mail: [email protected]
Denis Creissels, Université Lyon 2, E-mail: [email protected]
Francis O. Egbokhare, University of Ibadan, Nigeria, E-mail: fo.egbokhare@mail.ui.edu.ng
Zarina Estrada-Fernández, Universidad de Sonora, E-mail: [email protected]
Sebastian Fedden, Université Sorbonne Nouvelle – Paris 3, E-mail: sebastian. [email protected]
Martin Haspelmath, Max Planck Institute for Evolutionary Anthropology, E-mail: [email protected]
Johannes Helmbrecht, University of Regensburg, E-mail: Johannes.Helmbrecht@sprachlit.uni-regensburg.de
Nikolaus P. Himmelmann, Universität zu Köln, E-mail: [email protected]
Guillaume Jacques, CNRS, Paris, E-mail: [email protected]
Juha Janhunen, University of Helsinki, E-mail: [email protected]
Luise Kempf, Universität Bern, Bern, E-mail: [email protected]
Agnes Korn, CNRS, Paris, E-mail: [email protected]
Christian Lehmann, Universität Erfurt, E-mail: [email protected]
Svenja Luell, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Timur Maisak, Institute of Linguistics of the Russian Academy of Sciences & HSE University, Moscow, Russian Federation, E-mail: [email protected]
Andrej Malchukov, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Marvin Martiny, Johannes Gutenberg University, Mainz, E-mail: mmartiny@students.uni-mainz.de
William B. McGregor, Aarhus University, E-mail: [email protected]
Susanne Michaelis, Leipzig University, E-mail: [email protected]
Marianne Mithun, University of California, Santa Barbara, E-mail: mithun@linguistics.ucsb.edu
Annie Montaut, INALCO, Paris, E-mail: [email protected]
Damaris Nübling, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Seongha Rhee, Hankuk University of Foreign Studies, E-mail: [email protected]
Iris Rieder, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Ronald Schaefer, Southern Illinois University Edwardsville, USA, E-mail: rschaef@siue.edu
Linlin Sun, Johannes Gutenberg University, Mainz, E-mail: [email protected]
Edward Vajda, Western Washington University, E-mail: [email protected]
Martine Vanhove, CNRS, Paris, E-mail: [email protected]
Björn Wiemer, Johannes Gutenberg University, Mainz, E-mail: [email protected]

https://doi.org/10.1515/9783110563146-203

Contents

Volume 1

Acknowledgements
List of authors

Walter Bisang, Andrej Malchukov, and the Mainz Grammaticalization Project team (Iris Rieder, Linlin Sun, Marvin Martiny, Svenja Luell)
1 Position paper: Universal and areal patterns in grammaticalization

Walter Bisang, Andrej Malchukov, Iris Rieder, and Linlin Sun
2 Measuring Grammaticalization: A questionnaire

Damaris Nübling and Luise Kempf
3 Grammaticalization in the Germanic languages

Michela Cennamo
4 Mechanisms and paths of grammaticalization and reanalysis in Romance

Björn Wiemer
5 Grammaticalization in Slavic

Timur Maisak
6 Grammaticalization in Lezgic (East Caucasian)

Juha Janhunen
7 Grammaticalization in Uralic as viewed from a general Eurasian perspective

Andrej L. Malchukov
8 Grammaticalization in Ewen (North-Tungusic) in a comparative perspective

Edward Vajda
9 Areal features in Yeniseian grammaticalization

Agnes Korn
10 Grammaticalization and reanalysis in Iranian

Annie Montaut
11 Grammaticalization in standard Hindi/Urdu and Hindi dialects

Guillaume Jacques
12 Grammaticalization in Japhug

Seongha Rhee
13 Grammaticalization in Korean

Linlin Sun and Walter Bisang
14 Grammaticalization changes in Chinese

Volume 2

Walter Bisang, Andrej Malchukov, Iris Rieder, and Linlin Sun
Measuring Grammaticalization: A questionnaire

Martine Vanhove
15 Grammaticalization in Cushitic, with special reference to Beja

Denis Creissels
16 Grammaticalization in Manding languages

Ronald P. Schaefer and Francis O. Egbokhare
17 Grammaticalization in Emai

Denis Creissels
18 Grammaticalization in Tswana

Christian Lehmann
19 Grammaticalization in Yucatec Maya

Zarina Estrada-Fernández
20 Grammaticalization in Uto-Aztecan languages from northwestern Mexico

Johannes Helmbrecht
21 Grammaticalizations in Hoocąk

Marianne Mithun
22 Grammaticalization and polysynthesis: Iroquoian

Willem F. H. Adelaar
23 Grammaticalization in the Quechuan and Aymaran languages of the Central Andes

Sebastian Fedden
24 Grammaticalization in Mountain Ok (Papua New Guinea)

Nikolaus P. Himmelmann
25 Grammaticisation processes and reanalyses in Sulawesi languages

William B. McGregor
26 Grammaticalization patterns in Nyulnyulan languages

Susanne Maria Michaelis and Martin Haspelmath
27 Grammaticalization in creole languages: Accelerated functionalization and semantic imitation

Language index
Subject index

Walter Bisang, Andrej Malchukov, and the Mainz Grammaticalization Project team (Iris Rieder, Linlin Sun, Marvin Martiny, Svenja Luell)

1 Position paper: Universal and areal patterns in grammaticalization

1 Introduction

1.1 Theoretical preliminaries: accomplishments and open questions in grammaticalization research

Skipping early forerunners like A. W. von Schlegel (1818), studies on grammaticalization started out from Meillet (1912) and Kuryłowicz (1965) and were later associated with the work of prominent researchers like Joan Bybee, Talmy Givón, Bernd Heine and Christian Lehmann. The definition of grammaticalization as the development of a lexical item into a grammatical morpheme, or of a less grammatical marker into a more grammatical one, can by now be seen as the classical approach to the phenomenon. In the course of time, research on grammaticalization has become one of the most successful research paradigms introduced in 20th-century linguistics. Milestones of grammaticalization research include, among others, Lehmann’s (1995) Thoughts on grammaticalization, Heine, Claudi, and Hünnemeyer’s (1991) Grammaticalization. A conceptual framework, Bybee, Pagliuca, and Perkins’s (1994) The evolution of grammar, and Heine and Kuteva’s (2002) World lexicon of grammaticalization. Even critical voices like Newmeyer (1998, 2001), Campbell and Janda (2001) and others did not discourage research in this field,¹ which currently numbers in the thousands of publications (cf. the monumental Handbook of Grammaticalization by Narrog and Heine [2011] for a state-of-the-art survey of research on grammaticalization).

¹ Also cf. Lehmann’s (2004) reaction to these criticisms.
https://doi.org/10.1515/9783110563146-001

Yet, in spite of its obvious success, some important aspects remain controversial and are in need of further study. Most importantly, it still remains unclear to what extent grammaticalization is subject to cross-linguistic and areal variation. Our project approaches these issues through a systematic quantitative analysis on the basis of data collected from 29 leading experts on different languages and language families across the world. For that purpose, it will focus on source concepts and paths of grammaticalization, on the one hand, and scenarios of grammaticalization as they are defined by the interaction of different parameters and aspects of areality, on the other hand (on the notion of ‘scenario of grammaticalization’, also cf. Bisang and Malchukov [2017]). As will become clear in the course of this introductory chapter, we understand our project as a pilot project whose further development will depend on a considerable extension of our database and most likely also on various aspects of methodology (refinement of the questionnaire and, with increasing availability of data, statistical methods).

The work of Lehmann, Heine and colleagues and Bybee and colleagues is a starting point of prime importance for our research. According to Lehmann, “grammaticalization of a linguistic sign is a process in which it loses in autonomy by becoming subject to constraints of the linguistic system” (Lehmann 2004: 155). Lehmann (1995) further introduced grammaticalization parameters for assessing the autonomy of the linguistic sign, which will be crucial for the eight parameters used in our project (cf. section 3.1 and chapter 2). The original version of Lehmann’s grammaticalization parameters is reproduced in Table 1:

Tab. 1: Lehmann’s grammaticalization parameters (Lehmann 2002b: 132).

Axis/Parameter   Paradigmatic               Syntagmatic
Weight           Integrity                  Structural Scope
Cohesion         Paradigmaticity            Bondedness
Variability      Paradigmatic Variability   Syntagmatic Variability

As is clear from Table 1, Lehmann conceives grammaticalization as a reductive process which manifests itself in the reduction of Weight (semantic and phonetic integrity), increased Cohesion and decreased Variability in paradigmatic and syntagmatic dimensions. Even though this approach, which is sometimes labelled ‘reduction-based approach to grammaticalization’ (Cuyckens 2018), has not remained unchallenged² and other authors have suggested somewhat different parameters later,³ Lehmann has developed one of the conceptually most coherent models of how grammaticalization is reflected in linguistic structure. Moreover, he also provided important suggestions – both in published work (Lehmann 2015: 171–173) and in personal communication with our group – of how to operationalize various parameters in a way which makes quantification and cross-linguistic comparison possible (cf. questionnaire in chapter 2).

² Cf. Traugott (2011) and others.
³ Cf. Heine, Claudi, and Hünnemeyer (1991); Bybee, Pagliuca, and Perkins (1994); Haspelmath (1998); Hopper and Traugott (2003).

The work of Heine and colleagues is of particular relevance for our project because of its findings on grammaticalization paths as they are compiled in the groundbreaking World lexicon of grammaticalization (Heine and Kuteva 2002). This work can be seen as a follow-up and radical extension of Heine and Reh (1984), which documents phenomena of reanalysis and paths of grammaticalization in African languages. We chose a sample of 30 source concepts of grammaticalization from Heine and Kuteva (2002) for the assessment of typologically widespread and rare paths and their potential areality. Another field of interest that our project shares with Heine and colleagues is areality. While their work is mainly concerned with the way in which grammaticalization operates in situations of language contact (contact-induced grammaticalization; cf. Heine and Kuteva [2005, 2006]), our interest in areality is more focused on how contact and geographic diffusion affect the development of specific, non-universal patterns of grammaticalization, be it in terms of individual paths or in terms of specific interactions between parameters within scenarios of grammaticalization.

The work of Bybee, Pagliuca, and Perkins (1994) is of particular importance for our project because it was the first cross-linguistic quantitative study of grammaticalization. One of its major results was that meaning and form develop in parallel, i.e., ‘that the development of grammatical material is characterized by the dynamic coevolution of meaning and form’ (Bybee, Pagliuca, and Perkins 1994: 20). We will have more to say about that when we discuss the Parallel Reduction Hypothesis in § 3.2 below. As will be clear from the subsequent discussion, our results generally do not support Bybee’s Parallel Reduction Hypothesis, at least in its strong form, which predicts a change of form on a particular grammaticalization path whenever there is a change in meaning. Still, at a more general level our project owes much to the work of Bybee and her associates, as it represents the second large-scale typological project trying to quantify grammaticalization phenomena.

More recently, research on grammaticalization has drawn new inspiration from Construction Grammar.
The advantage of this new perspective is its significant contribution to our understanding of constructionalization, i.e., the emergence of new constructions as new pairings of meaning and form in the context of grammar (Traugott and Trousdale 2013). The role of constructions as the framework within which individual linguistic items change into grammatical markers has long been recognized in the literature on grammaticalization (cf. Bybee, Pagliuca, and Perkins [1994]; Harris and Campbell [1995]; Hopper and Traugott [2003] and many others; also cf. section 4.4 on reanalysis and context sensitivity). What is new in research on constructionalization is its focus on the emergence of new constructions within the overall network of constructions that makes up the grammar of a language. As was pointed out by Gisborne and Patten (2011), grammaticalization and constructionalization share a number of similarities, among them universality and unidirectionality (constructions universally develop from less schematic to more schematic types of constructions) and gradualness in their historical development. In spite of this, it is an open question to what extent research on grammaticalization and constructionalization actually overlap.⁴ Moreover, given the view (assumed, for example, in Radical Construction Grammar; Croft [2001]) that constructions are language-specific and that the meaning of components is derived from the construction rather than the other way around, it is completely unclear how to make typological generalizations of the type formulated by Heine and colleagues (the list of constructions may well risk being open-ended). To conclude, while we find Construction Grammar approaches highly insightful when dealing with individual languages, applying this approach for capturing typological generalizations remains a major challenge.⁵

⁴ See Noël (2007) for a good overview of different views on this topic.
⁵ To appreciate this challenge, it is instructive to compare a constructionalization analysis proposed by Traugott (2015: 71) for the development of going to to ‘future’ in English with the account of the development of go to Future found in the grammaticalization literature (in the work by Bybee, also discussed in this context by Traugott). Clearly, the level of detail of the constructionalization analysis presented by Traugott (2015) is an advantage when intended as a fine-grained description of a change in an individual language (English), but it turns into a disadvantage if one attempts to generalize it across languages.

The study of grammaticalization from the perspective of cross-linguistic variation is relatively new (Bisang 2004, 2008, 2011; Narrog and Ohori 2011). It was first brought to the attention of the linguistic community through the work of Bisang, who demonstrated that South East Asian languages (EMSEA languages) present a challenge to proposed grammaticalization universals, in particular, to the assumption of the coevolution of meaning and form (Bybee, Pagliuca, and Perkins 1994; also cf. the Parallel Reduction Hypothesis below). The net result is that grammaticalization in EMSEA languages involves functional evolution (semantic reduction) and also reduction of Syntagmatic Variability (fixed word order) but not necessarily other parameters (Paradigmaticity, Obligatoriness, and most importantly Bondedness and Phonetic Reduction). Bisang framed this discussion in terms of areal variation in grammaticalization scenarios, which also includes the typological profile of isolating languages (cf. section 4.3.2 below for further discussion).

Most recently, Narrog and Heine (2018) edited a volume on Grammaticalization from a typological perspective. The focus of the introductory chapter (Narrog and Heine 2018) is on two different hypotheses concerning scenarios of grammaticalization, or, more specifically, the interaction of change in meaning and change in form (also cf. section 4.1). The Parallel Reduction Hypothesis predicts that change in form parallels change in meaning and reflects Bybee, Pagliuca, and Perkins’s (1994) approach of the coevolution of meaning and form. In contrast, the Meaning First Hypothesis claims that change in meaning is primary and precedes form change in time. This hypothesis is held by various approaches, among them by the Context Model of Grammaticalization (Heine, Claudi, and Hünnemeyer 1991) and by Traugott and Dasher’s (2002) Invited Inference Theory of Semantic Change. Narrog and Heine (2018) discuss various results from their own research and from the contributions to their volume which clearly support the Meaning First Hypothesis, but they do not provide any quantitative analysis. Based on their examples, they go on arguing that processes of grammaticalization can be divided into four ‘basic aspects’ (Narrog and Heine 2018: 1), i.e., (a) extension (or context generalization), (b) desemanticization (loss or generalization in meaning), (c) decategorialization and (d) erosion (phonetic reduction). Given the Meaning First Hypothesis, aspects (a) and (b) are primary, while (c) and (d) may or may not follow in time. If one applies this sequence of aspects to typological variation in the outcome of processes of grammaticalization, the prediction is that there is limited variation in (a) and (b) and more variation in (c) and (d). The extent to which there is cross-linguistic variation in grammaticalization follows the reverse order. Thus, Narrog and Heine (2018: 2) suggest the following hierarchy of increasing variation in grammaticalization processes (cf. our Parameter Hierarchy in [24] below):

(1) Extension > Desemanticization > Decategorialization > Erosion

Lehmann (1995, 2015) in his early classic Thoughts on grammaticalization represents the covariation between his grammaticalization parameters in the following way (cf. Figure 1).

parameter: weak grammaticalization → process → strong grammaticalization
integrity: bundle of semantic features; possibly polysyllabic → attrition → few semantic features; oligo- or monosegmental
paradigmaticity: item participates loosely in semantic field → paradigmaticization → small, tightly integrated paradigm
paradigmatic variability: free choice of items according to communicative intentions → obligatorification → choice systematically constrained, use largely obligatory
structural scope: item relates to constituent of arbitrary complexity → condensation → item modifies word or stem
bondedness: item is independently juxtaposed → coalescence → item is affix or even phonological feature of carrier
syntagmatic variability: item can be shifted around freely → fixation → item occupies fixed slot

Fig. 1: Correlation between grammaticalization parameters (Lehmann 2002b: 146).

This representation implies covariation between parameters at least in the default case,⁶ although later Lehmann (2015: 179–181) provides some discussion concerning the absence of covariation, admitting that there is no explanation for that:

Let me hasten to state that we have at present no explanation for a lack of correlation among the grammaticalization parameters. For one thing, we have no theoretical basis which would lead us to predict a 100 % correlation, or a correlation of whatever percentage, for that matter. (Lehmann 2015: 180).

⁶ Cf. Lehmann (2015: 174): “It is our contention that a normal grammaticalization process obeys the following condition: an item which is grammaticalized in a construction will occupy a point on each of the six parameters in such a way that the six points are roughly on a vertical line.”

We take this statement as an incentive for our own work to approach the question of covariation from the empirical perspective in order to provide estimates for the degree of covariation between individual parameters.

While these approaches try to account for grammaticalization in terms of some universal tendencies, sceptics regard grammaticalization as an epiphenomenon of mechanisms of language change that can be observed in other areas of linguistic change as well. Newmeyer (2001) represents his vision of grammaticalization in the form of the diagram in Figure 2. As is clear from this representation, grammaticalization is viewed as the mere cooccurrence of the general mechanisms of semantic change and syntactic reanalysis with phonetic reduction.

Fig. 2: Grammaticalization as an epiphenomenon (Newmeyer 2001: 202). [Diagram labels: Downgrading Reanalysis; Appropriate Semantic Change; Phonetic Reduction; GRAMMATICALIZATION]

These controversies are partially related to theoretical stances, which cannot be easily resolved. Thus, Lehmann (1995) proposes a new framework (‘Grammaticalization Theory’) starting from the assumption of grammaticalization as a reductive process affecting the autonomy of a linguistic sign at different levels and in both paradigmatic and syntagmatic aspects. Newmeyer (1998, 2001) tries to reduce this notion to the (accidental) cooccurrence of familiar processes known from historical linguistics. Given this difference in underlying assumptions, one may ask the difficult question (posed by Edith Moravcsik, p.c.) of what kind of empirical evidence is needed to resolve this controversy. In this context, we expect that in Lehmann’s approach (in spite of his caveat given in the above quotation) grammaticalization can be seen as a more coherent phenomenon consisting of a natural class of diachronic processes which would manifest itself in more interdependencies between individual parameters.⁷ Newmeyer’s approach is non-committal in that respect, as he views grammaticalization as epiphenomenal and the cooccurrence of different processes as accidental.

In what follows, we aim to contribute to resolving these controversies through an empirical typological study, by focusing in particular on a better understanding of how different parameters of grammaticalization relate to each other. As will be seen in the course of this chapter, the network graph in Figure 11 as well as the heatmaps in Figures 9–10 show dependencies between individual parameters, as they emerge from our data, and thus provide the most direct answer to this question. As is to be expected from empirically based research, the patterns that result from our data do not clearly side with either of the two approaches. They rather resonate in one way or another with both sides. Overall, the results of our analysis of grammaticalization scenarios are more in line with the Meaning First Hypothesis than the Parallel Reduction Hypothesis, but they also provide evidence that more fine-grained analyses will be needed which go beyond Narrog and Heine’s (2018) four basic aspects. In other respects, the results of our project basically confirm the findings in the literature. For example, we found that it is necessary to redefine a number of grammaticalization paths (cf. section 2.2 for our discussion of granularity) but we did not come up with any results that challenge unidirectionality. In fact, the scarcity of degrammaticalization as described by Norde (2009) is also reflected in our data. We found 1003 paths of grammaticalization which follow unidirectionality and only eight examples which run against it, some of them even allowing different analyses (for the details, cf. section 4.2).
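To put the two counts just mentioned into proportion, the following minimal Python sketch computes the share of counter-directional examples. It assumes that the 1003 unidirectional paths and the eight counterexamples are disjoint and together make up the relevant total; this is an assumption made for illustration, and the variable names are hypothetical.

```python
# Counts as reported in the text above.
unidirectional_paths = 1003
counterexamples = 8

# Assumption (not stated in the text): the two groups are disjoint and
# together form the full set of paths under consideration.
total = unidirectional_paths + counterexamples
share = counterexamples / total

print(f"{counterexamples} of {total} paths ({share:.2%}) run against unidirectionality")
```

Under this assumption, fewer than one in a hundred of the documented paths runs against unidirectionality.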

⁷ This is in line with the following conclusion of Diewald (2010: 20) in her review of different definitions of grammaticalization in the literature: “Consequently, the distinctive and unique feature of grammaticalization is generally seen in its particular combination and serialization of several processes and stages, which – among other things – find their repercussion in grammaticalization scales and paths, and complex scenarios of successive contexts and constructions.”

1.2 Research agenda of the Mainz Grammaticalization Project (MAGRAM)

As is clear from the above discussion, our Mainz Grammaticalization Project (MAGRAM) addresses the issues of universality and variation in grammaticalization processes. The term ‘universal’ is not used in terms of absolute universals, given that grammaticalization processes are not deterministic. It rather refers to typologically widespread patterns arising under different structural, areal and genealogical conditions. In that sense, ‘universality’ is opposed to ‘areality’, even though the latter is used occasionally (e.g., in § 2.3) also in a more specific sense and is contrasted to genealogical factors when we are discussing areality in the context of grammaticalization paths.

More specifically, we will investigate cross-linguistically recurrent tendencies and typological variation in grammaticalization processes from two perspectives. On the one hand, we look for typological trends and variation with respect to grammaticalization paths (such as those listed in Heine and Kuteva’s [2002] World lexicon of grammaticalization). On the other hand, we explore typological tendencies and typological variation in scenarios of grammaticalization. Grammaticalization scenarios are defined through a (somewhat modified) list of Lehmann’s (1995) parameters (see § 3.1). That is, a scenario is defined as a set of values for the respective parameters, assigned either in terms of binary features (indicating whether a change has occurred or not), or associated with absolute values for particular parameters (as explained in § 3.1 and Chapter 2).

Even though we are generally interested in universal trends and cross-linguistic variation, our focus will be different for the two perspectives. In the case of grammaticalization paths, we are mostly interested in cross-linguistically common and rare patterns. The latter may be areally restricted even though it is often hard to distinguish areal effects from effects of genealogical influence and ‘drift’ (see section 2.3). In contrast, scenarios are less likely to be the result of lexical diffusion. Since they are more likely to be conditioned by the typological profile of a language (cf. section 4.3), what is difficult to distinguish in this case are rather effects of areal and typological factors (see section 3.5).

It was also necessary to make a number of simplifying assumptions in order to make the research project more manageable and to bring it in line with the literature we are following up on.
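As a purely illustrative sketch of the scenario definition given earlier in this section (a set of values for the respective parameters, here in the binary variant), the following Python fragment stores one scenario as a mapping from parameter names to a changed/unchanged flag. The class name `Scenario`, its fields, and the sample values are assumptions made for this illustration. The parameter names are taken from those mentioned in this chapter and are not the project's final list of eight parameters, and the values are invented rather than drawn from the MAGRAM data.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Scenario:
    """One grammaticalization path with binary parameter values (hypothetical layout)."""
    source: str                      # source concept
    target: str                      # target concept
    changed: Dict[str, bool] = field(default_factory=dict)  # parameter -> change occurred?

    def changed_parameters(self) -> List[str]:
        """Return the names of all parameters for which a change is recorded."""
        return sorted(name for name, value in self.changed.items() if value)


# Invented example values, for illustration only.
example = Scenario(
    source="GO",
    target="FUTURE",
    changed={
        "Semantic Integrity": True,
        "Syntagmatic Variability": True,
        "Bondedness": False,
        "Phonetic Reduction": False,
    },
)

print(example.changed_parameters())
```

A representation with absolute values, the second option mentioned in the definition, would replace the boolean flags with numeric scores per parameter.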
As mentioned in section 1.1, our major sources of inspiration for our study are the works of Heine and colleagues on the documentation of grammaticalization paths, Lehmann’s conception of grammaticalization in terms of the reduction of the autonomy of the linguistic sign as it can be captured by parameters and the pioneering typological work of Bybee and colleagues on the quantitative assessment of form-function covariation. Moreover, our interest in the question of areal variation in grammaticalization processes is based on work by Bisang (2004, 2008, 2011). Even though these four strands of work all deal with fundamental properties of grammaticalization, they otherwise differ to some degree in their focus and methodology. In order to integrate these different approaches and relate our research questions and findings to them, we had to make a number of assumptions leading to certain modifications of the ground-breaking work we are following up on. Thus, we modified the list of Lehmann’s parameters in order to distinguish between form-related and function-related parameters more clearly. In Lehmann’s framework this distinction is of minor importance (both Semantic Integrity and Phonetic Reduction are subsumed under Weight, cf. section 3.1 and chapter 2) but it is of basic importance for us, since it looms large in the literature on grammaticalization, first and foremost in Bybee’s work on the coevolution of meaning and form.

Position paper: Universal and areal patterns in grammaticalization

Moreover, it is of central importance in the assessment of the two competing hypotheses, i.e., the Meaning First Hypothesis and the Parallel Reduction Hypothesis (cf. section 1.1). Another assumption, which is maybe more controversial, is that we pursue an item-based and function-based perspective in our analysis of grammaticalization paths that follows the work of Heine and colleagues. Thus, our research is based on a list of source concepts and the target concepts into which they grammaticalize, in line with Heine’s work. Even though we acknowledge that grammaticalization is construction-specific, with constructions providing the framework within which individual linguistic items change into grammatical markers (cf. section 1.1), looking for individual source constructions would be impractical because the list of constructions may easily become too long for cross-linguistic comparison. We will address this issue in some more detail in section 4.4 on the relation between reanalysis and grammaticalization.

The data of MAGRAM are based on the contributions from 29 leading experts in individual languages, language families or geographic areas, among them Christian Lehmann, one of the founding fathers of grammaticalization research. The list below includes the names of the contributors, as well as the names of the language varieties addressed in the respective contributions. The list of language varieties is somewhat heterogeneous, as the editors solicited contributions that deal either with ‘Grammaticalization in language X in a comparative perspective’ or with ‘Grammaticalization in language family Y’. To avoid using the relatively long expression ‘individual languages or language families’, we use the term ‘genus’, which we define as the taxonomically lowest-ranking group of languages which still includes all the related languages (in each chapter).
Using the most basic cover term gives an overview of the languages used in the sample while remaining as precise as possible (by not including languages or varieties of branches higher in the taxonomy). If the grammaticalization paths come from only one language, the name of that language is used instead of a cover term:

Beja (Cushitic, Afroasiatic): Martine Vanhove
Chinese (Sinitic, Sino-Tibetan): Linlin Sun and Walter Bisang
Creoles and Pidgins: Susanne Michaelis and Martin Haspelmath
Emai (Edoid, Niger-Congo): Ronald P. Schaefer and Francis O. Egbokhare
Germanic (Indo-European): Damaris Nübling and Luise Kempf
Hindi/Urdu (Indo-Aryan, Indo-European): Annie Montaut
Hoocąk (Core Siouan): Johannes Helmbrecht
Iranian (Indo-European): Agnes Korn
Iroquoian: Marianne Mithun
Japhug (Rgyalrong, Sino-Tibetan): Guillaume Jacques
Korean: Seongha Rhee
Lezgic (Northeast Caucasian): Timur Maisak
Manding (Mande, Niger-Congo): Denis Creissels

Walter Bisang, Andrej Malchukov, et al.

Mountain Ok (Papua New Guinea): Sebastian Fedden
Nyulnyulan (Non-Pamanyungan, Australian): William B. McGregor
Quechua and Aymara: Willem F. H. Adelaar
Romance (Indo-European): Michela Cennamo
Slavic (Indo-European): Björn Wiemer
Sulawesi (Western Malayo-Polynesian, Austronesian): Nikolaus P. Himmelmann
Tswana (Bantu, Niger-Congo): Denis Creissels
Tungusic (Manchu-Tungusic, Transeurasian): Andrej Malchukov
Uralic: Juha Janhunen
Uto-Aztecan: Zarina Estrada-Fernández
Yeniseian: Edward Vajda
Yucatec (Mayan): Christian Lehmann

Each of the contributors to the MAGRAM project was asked to provide a survey of the phenomena and the grammaticalization paths attested in the language(s) of her/his expertise. In addition, we collected data on the list of 30 source concepts presented by Heine and Kuteva (2002). The data on these paths cover 29 genera:

Tab. 2: Genera in the database.

Aymaran           Japhug              Romance
Beja              Khmer               Slavic
Chinese           Korean              Southern Uto-Aztecan
Creoles/Pidgins   Lezgic              Thai
Emai              Malayo-Polynesian   Tswana
German            Manding             Tungusic
Hindi             Mori                Uralic
Hoocąk            Nyulnyulan          Yeniseian
Iranian           Ok                  Yucatec
Iroquoian         Quechua

The general data from the chapters (432 paths) plus the data from the list of 30 source concepts (571 paths) together amount to 1003 grammaticalization paths,⁸ which form the basis of our statistical analysis. Based on a detailed questionnaire (cf. chapter 2), we checked each path for eight parameters, assigning each parameter a value of 1, 2, 3 or 4 (ordered along increasing degree of grammaticalization) plus a value which indicates whether the value remained the same from source to target [–] or increased [+] (cf. section 3.1 for details). The few cases in which the value

8 It should be noted that for certain research questions (such as issues of areality discussed in section 2.3), we restricted our dataset to the paths from the 30-source list (571 paths altogether), for purposes of better comparability.

from source to target decreased are collected under the heading of degrammaticalization (cf. section 4.2). The values were assigned in close exchange with the contributors. We assigned the values to the paths attested in the individual contributions and the contributors checked them carefully on the basis of our questionnaire. The project results in the present volume as well as an accompanying database recording grammaticalization paths extracted from individual languages. The present volume is called a Comparative Handbook (of the type represented by the Malchukov and Comrie [2015] volumes on Valency Classes), as it includes a position paper by the editorial team as well as 25 chapters on individual languages guided by a questionnaire. In this sense, it differs radically, in terms of coherence, from conventional edited volumes on the topic of grammaticalization. The present position paper provides an introduction to the topic, as well as a summary of the most important qualitative and quantitative results as they emerged from the individual contributions in the second part. As is clear from the above, the MAGRAM project crucially relies on the expertise of our contributors. Indeed, our team of experts not only contributed the data and analyses for individual languages (in response to a questionnaire), but also provided a diachronic interpretation of the data, which is especially important for languages lacking historical documentation. Apart from the present volume, another outcome of the MAGRAM project is the database, which is planned to be published in the CLLD (Cross-Linguistic Linked Data) series (Haspelmath and Forkel 2015).⁹ It will include the grammaticalization paths extracted from the individual contributions, as well as the additional data from the list of 30 grammaticalization paths. The structure of this introductory chapter is as follows.
Our results on universal and rare grammaticalization paths in the context of areality will be discussed in Section 2. The statistical analysis concerning scenarios of grammaticalization will be given in Section 3, which will start with a short summary of the questionnaire as far as it is needed for the purpose of this chapter. Sections 3.2 to 3.4 present our results on the coevolution of meaning and form (cf. the two hypotheses in section 1.1) with additional data from our chapters. Section 3.5 is on areality. Section 4 presents the factors which interact in various ways with our parameters and which are needed for modeling scenarios of grammaticalization, i.e., word order, phonological factors, competing motivations, morphological typology and areality. The chapter ends with a short conclusion in Section 5.
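The coding scheme summarized earlier in this section (eight parameters per path, each with a value from 1 to 4 and a marker [+] or [–]) can be made concrete with a small sketch. The following Python fragment is an illustration under stated assumptions only and does not reproduce the format of the MAGRAM database: the class names, the placeholder parameter labels (param_1 to param_8) and the example record are hypothetical. It represents a scenario as a set of values for the respective parameters and checks that the two sub-datasets add up to the 1003 paths reported above.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical placeholder labels: the questionnaire (chapter 2) defines the
# eight parameters; their actual names are not reproduced in this sketch.
PARAMETERS: Tuple[str, ...] = tuple(f"param_{i}" for i in range(1, 9))


@dataclass(frozen=True)
class ParameterValue:
    degree: int      # 1, 2, 3 or 4 (increasing degree of grammaticalization)
    increased: bool  # True = [+] (value rose from source to target), False = [-]

    def __post_init__(self) -> None:
        if self.degree not in (1, 2, 3, 4):
            raise ValueError("degree must be 1, 2, 3 or 4")


@dataclass(frozen=True)
class Path:
    genus: str
    source: str
    target: str
    values: Dict[str, ParameterValue]

    def scenario(self) -> Tuple[Tuple[int, bool], ...]:
        """Return the scenario: one (degree, increased) pair per parameter."""
        return tuple(
            (self.values[name].degree, self.values[name].increased)
            for name in PARAMETERS
        )


# Counts reported in the text.
CHAPTER_PATHS = 432      # general data from the chapters
SOURCE_LIST_PATHS = 571  # data from the list of 30 source concepts
TOTAL_PATHS = CHAPTER_PATHS + SOURCE_LIST_PATHS

# Invented record, for illustration only (not taken from the database).
example = Path(
    genus="GenusX",
    source="SOURCE",
    target="TARGET",
    values={name: ParameterValue(degree=2, increased=True) for name in PARAMETERS},
)

print(TOTAL_PATHS)
print(len(example.scenario()))
```

Keeping the degree and the change marker as separate fields mirrors the distinction drawn in the text between an absolute value and the indication of whether that value changed from source to target.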

9 The Cross-Linguistic Linked Data project is concerned with developing and curating interoperable data publication structures using Linked Data principles as integration mechanisms for distributed resources (Haspelmath and Forkel 2015; cf. https://clld.org).


2 Grammaticalization paths – common patterns, rare paths and areality

In this section we address cross-linguistic variation in grammaticalization paths, starting with a selection of grammaticalization paths (571 in total) elicited on the basis of the list of 30 source concepts. As it turns out, even this restricted sample reveals that the attested source concepts and paths vary widely in terms of frequencies and cross-linguistic distribution. In Section 2.1, we will focus on source concepts and paths which are widespread (typologically recurrent) in processes of grammaticalization, while Section 2.2 will be on potentially rare source concepts. Finally, Section 2.3 addresses issues of areality in grammaticalization paths. As discussed in Section 2.1, reliable conclusions concerning the universality of a source concept depend not only on its overall number of occurrences in processes of grammaticalization but also on the number of different targets it can produce and on its areal distribution. Thus, it seems possible to point out certain source concepts with a high potential of being universally relevant. The case of assessing rare phenomena as discussed in Section 2.2 shows that there are even more factors which need to be considered, among them semantic granularity and non-compositionality (nucleus mismatch), as well as a number of other methodological issues. This discussion uncovers some methodological challenges when comparing grammaticalization paths but also shows that one would need more data than we collected in our sample to arrive at definite statements on common (“universal”) and rare phenomena. Our results on areality presented in Section 2.3 must remain vague for the same reason. We cannot see areal effects at the level of macroareas in our data even though certain effects on smaller scales can be observed (cf. the NeighborNet plots in Section 2.3, Figures 5A, B), which can be explained through attested cases of language contact for languages with documented histories. The chapters reporting on specific cases of areal diffusion of grammaticalization paths through contact situations thus contribute to the literature on contact-induced grammaticalization, which is still limited to certain linguistic areas (cf. Heine and Kuteva [2006] on European languages; Bisang [1992, 1996] and Enfield [2003] on South East Asian languages; and Aikhenvald [2002] on Amazonian languages).

2.1 Common source concepts and paths

The study of grammaticalization paths is an important field for identifying cross-linguistically common patterns of change. For that purpose, we started out from a sample of grammaticalization paths presented in Heine and Kuteva’s (2002) World lexicon of grammaticalization. While establishing our sample, we concentrated on source concepts which are especially prone to polygrammaticalization, ending up

Fig. 3: Frequencies of source concepts (from a 30-source list).¹⁰ (Total number of paths for the 30-source list: 571. Number of different pathways: 288).

in various different target concepts. The list of 30 source concepts with their different targets resulting from this process is given in Chapter 2. It has been checked systematically by the contributors to the present volume. This led to a database consisting of 571 individual paths in individual languages. The number of different pathways (in the sense of different source-target combinations) in the dataset is, however, much lower: 288.¹¹ This is because many pathways occur multiple times in the dataset, although with different frequencies, which in turn is related to the different frequencies of the 30 core concepts involved in grammaticalization paths in our dataset. The chart in Figure 3 shows the frequencies of the individual source concepts in our data, arranged from the source concept of  with the highest frequency down to the source concepts of  and  with the lowest frequencies (the frequency in terms of number of paths/tokens is indicated above each column). The figure shows that the frequencies with which individual source concepts are selected vary considerably across the 30 concepts. The first seven source concepts (, , , , , , ) cover just about half of the whole set of grammaticalization paths (i.e., 285 paths out of 571 paths, 49.9 %). It is interesting to note, as pointed out to us by Lehmann (p.c.),

10 Bar charts were produced using the R package ggplot2 (Wickham 2016).
11 For clarity we refer in this section to individual instances (tokens) of grammaticalization processes as ‘paths’, and to types as ‘pathways’ (also used in this sense in Heine and Kuteva [2002]). Thus, a pathway might be represented in our dataset through multiple paths (tokens), as also illustrated in Figure 4 below for more frequent pathways. When no ambiguity arises, we refer to both tokens and types of grammaticalization as ‘paths’.


that the frequency rates reported in Figure 3 correlate with increasing semanticity (with the exception of ‘here’, the very last item). At least the most frequent concepts serving as input for grammaticalization (like  and ) are already highly desemanticized. Absolute frequencies as presented in Figure 3 may provide important clues about some universal preferences for selecting certain source concepts for processes of grammaticalization. However, this will not be enough for assessing universal preferences in general, and universal preferences for paths in particular. For that reason, we also looked at two other factors, i.e., (i) the number of different targets associated with a given source concept as given in Heine and Kuteva (2002) (extracted from the list in chapter 2) and (ii) the number of genera in which the source concept is used. For the purpose of illustrating how absolute frequency together with these factors may provide better indicators of the universal relevance of individual source concepts, we will first look at the seven concepts with the highest token frequencies in Figure 3, followed by the concept of  as an instance of   which represents the average of occurrences per item (19 individual paths, with the average being 19.03 individual paths/source concept) and the source concept of  as an instance of   with its low token frequency of four paths. In addition to these factors, we will show that there is an additional caveat against looking at source concepts alone because their constructional environment has its impact as well. At the end of this section, we will present some examples from the chapters in this volume. Looking at the seven most frequent concepts, one can see that they all end up in a relatively high number of different target concepts (for the individual figures, cf. the list in chapter 2):

[List of the seven concepts, each with its number of targets; for the individual figures, see the list in chapter 2.]
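The token and type figures reported in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names and the toy list of source-target pairs are hypothetical and do not come from the database, whereas the constants are the figures given in the text (571 paths, 288 pathways, 30 source concepts, 285 paths for the seven most frequent source concepts).

```python
from collections import Counter
from typing import Iterable, Tuple

Pathway = Tuple[str, str]  # (source concept, target concept)

# Figures reported in the text.
TOTAL_PATHS = 571        # tokens from the 30-source list
DISTINCT_PATHWAYS = 288  # types (different source-target combinations)
SOURCE_CONCEPTS = 30
TOP_SEVEN_PATHS = 285    # paths covered by the seven most frequent sources


def percentage(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent, rounded to one decimal."""
    return round(100 * part / whole, 1)


def tokens_and_types(paths: Iterable[Pathway]) -> Tuple[int, int]:
    """Count paths (tokens) and distinct pathways (types)."""
    counts = Counter(paths)
    return sum(counts.values()), len(counts)


top_seven_share = percentage(TOP_SEVEN_PATHS, TOTAL_PATHS)
mean_paths_per_source = round(TOTAL_PATHS / SOURCE_CONCEPTS, 2)

# Toy data with invented labels: four paths realizing three pathways.
toy_paths = [("A", "x"), ("A", "x"), ("A", "y"), ("B", "x")]
toy_tokens, toy_types = tokens_and_types(toy_paths)

print(top_seven_share)        # 49.9
print(mean_paths_per_source)  # 19.03
print(toy_tokens, toy_types)  # 4 3
```

Counting with a `Counter` keeps the token/type distinction explicit: the sum of the counts gives the number of paths, the number of keys the number of pathways.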

Cases in which the same sources give rise to different targets have been discussed in the literature under the rubric of polygrammaticalization (Craig 1991; Hopper and Traugott 2003: 114–115), a term which we will adopt here. Polygrammaticalization is related to the fact that grammaticalization paths are not only item-specific but also context-specific or context-sensitive, i.e., the same source concept develops into different target categories in different constructions. The most spectacular instances of this type of grammatical-context sensitivity are discussed in section 2.2 under the rubric of ‘nucleus mismatch’. They refer to cases in which it is not a lexical (free

Fig. 4: Frequencies of most frequent pathways (from a 30-source list; Total number of paths (tokens) for the 30-source list: 571. Number of different pathways: 288).

element or stem) which makes the most important semantic contribution but rather an element of its constructional context. The clearest cases of nucleus mismatches can be observed with the above source concept of COPULA, which is involved in the development of a multitude of target concepts depending on its constructional environment. To cite some examples from contributions to the present volume, the copula is involved in irrealis marking in the context of the infinitive (Agul), in future marking in the context of the future participle (Balochi), in past marking in the context of the (past) participle (Hindi, Jamaican Creole), in the development of a conjunction with the meaning of ‘since’ if combined with the conditional persistive form (Tswana) and in the formation of a possession verb if combined with the Benefactive applicative suffix (Yawuru). In all these cases it is clear that the decisive contribution to the semantics of the target category does not come from the COPULA itself, but from other elements of the grammatical context. We will return to the discussion of polygrammaticalization and grammatical-context sensitivity in section 4.4. Let us next consider the frequencies of specific pathways in our dataset. Figure 4 shows the most frequent pathways (source-target combinations), namely those attested by at least five tokens/paths per source-target combination, yielding a total of 172 paths distributed over the 23 most frequent pathways. Body parts are well known to develop into spatial adpositions and ultimately into case markers. They are represented by  in our set of 30 source concepts. If one looks at the semantic domain of body parts as a whole, this development turns out to be even more common. A typical example of such an evolution from body-part terms can be illustrated by the data from Ewen (Tungusic; Malchukov, this volume, ex. [1]):

(2)

Ewen
d’uu-du             <   d’uu do-n (< do ‘guts’)
house-                  house inside-3.
‘in (his) house’        ‘inside the house’ (lit. inside-its)

Similar cases illustrating the whole pathway or its parts (spatial adposition > locative case) are reported from languages of widely different typology: Iranian (bottom > down, etc.), Beja (back > behind), Japhug (side > ), Hoocąk (top > on > ), Ket (ʌqad ‘back, spine’ > =ʌʌd= ‘on the surface of ’; cf. [5]), Hindi (side >  > ), Manding, Quechua (č̣awpi ‘center’ > ), Mayan (as in [4]), as well as Emai (as in [3] below). Consider the following paths in Emai (Schaefer and Egbokhare, this volume, § 3.2), where body-part terms develop into relational nouns for spatial orientation, including ùòkhò ‘back’ > ‘behind’ (úókhó úgbà ‘behind the fence’) and ékéìn ‘inside’ from ékéìn ‘belly.’ (3)

Emai:
égbè ‘body’     >  ‘beside’
ídámà ‘chest’   >  ‘middle of’
àgbàn ‘jaw’     >  ‘edge of’

In Yucatec Maya (Lehmann, this volume, Table 2), one can observe similar changes (in the glosses below, X is the possessive clitic agreeing in person and number with the complement Z):

(4) Modern Yucatec denominal local prepositions

noun      meaning        complex preposition             meaning
iknal     ‘proximity’    ti’ X iknal Z                   ‘by, near Z’
táan      ‘front’        ti’ X táan Z ~ táanil ti’ Z     ‘in front of Z’
paach     ‘back’         ti’ X paach Z ~ paachil ti’ Z   ‘in back of Z’
óok’ol    ‘top’          ti’ X óok’ol Z                  ‘on, above Z’
áanal     ‘bottom’       ti’ X áanal Z                   ‘under Z’
tséel     ‘side’         ti’ X tséel Z                   ‘beside Z’

For Yeniseian, Vajda (this volume, Table 4) reports the following examples:

(5) Ket

Ket postposition            Modern Ket noun           Proto-Yeniseian noun
()-=bal- ‘between’          ba’l ‘space’, ‘gap’       *ba’r ‘space’, ‘gap’
()-=kūp- ‘in front of’      kūp ‘beak’, ‘tip’         *kūb ‘beak’, ‘tip’
()-=hɯj- ‘inside of’        hɯ̄j ‘stomach’, ‘room’     *pɯ̄j ‘stomach’
()-=ɯ̄n- ‘beneath’           ɯ’n ‘sled runner’         *ki’n ‘belly’, ‘underside’

In Mandarin Chinese, we find so-called relational nouns like lĭ ‘inside of, in’ (Sun and Bisang, this volume, Table 2):

(6) Mandarin Chinese: lĭ ‘inside’:
jiàoshì-lĭ
classroom-in
‘in the classroom’

As is clear from the examples above, the constructions in question are rather different: some items undergoing grammaticalization are preposed (prepositional) as in Emai or Mayan, some others are postposed (postpositional) as in Tungusic, Yeniseian and Mandarin. Some involve head-marking (of possessive relations) as in Tungusic and Mayan, some are dependent-marking as in Iranian, some are double-marking as in Ket, and some are no-marking as in Mandarin. In spite of these differences, the general paths are similar. Given that these languages differ widely in terms of their typological characteristics as well as in their genealogical affiliation and geographic distribution, this is strong evidence for the universality of the paths involved, and, more generally, for Heine, Kuteva, and Narrog’s (2017) proposal to the effect that universality of grammaticalization processes can be captured in semantic terms. Another frequent pathway in the domain of the noun is  > . In fact, this is the most frequent path (cf. Figure 4), which is attested in Pima Bajo, Yucatec, and Lezgic, among others. For Uto-Aztecan, Estrada-Fernández (this volume, ex. [53]) reports the following data:

(7) Uto-Aztecan languages

Guarijio (Warihio)    piipi ‘one’     >  pii ‘a’ (indef.)
Tarahumara            bilé ‘one’      >  bilé ‘a’ (indef.)
Pima Bajo             himak ‘one’     >  hímak ‘a’ (indef.)
Northern Tepehuan     imóóko ‘one’    >  imó, umó ‘a’ (indef.)
Tohono O’odham        hemako ‘one’    >  hema ‘a’ (indef.)
Névome (†)            maco ‘one’      >  maco ‘a’ (indef.)


In some cases, indefinite articles undergo the full cycle of grammaticalization by further developing into affixes. Thus, in Sorani (and some other Iranian languages; Korn, this volume, § 2.5), the enclitic indefinite article (=êk) derives from the numeral ‘one’ (cf. Persian yak). The verbal domain can be illustrated by the path of  >  as a textbook example of grammaticalization that is also known from English. As noted in the literature (Bybee, Pagliuca, and Perkins 1994; Hopper and Traugott 2003), going to first develops intentional meaning and, at a later stage, future meaning. In its latter function, it can be further reduced to gonna. This is a nice example of form-function coevolution, as well as of the fact that functional reanalysis precedes formal reanalysis in processes of grammaticalization. This path as well as the related path starting from  have been found to be frequent cross-linguistically (Bybee, Pagliuca, and Perkins 1994), an observation that is also supported by our data. In Mayan (Lehmann, this volume, ex. [58] and ex. [61]), the future developed from a motion-cum-purpose construction, as can be seen from comparing the Colonial Yucatec form in (8) and the Modern Yucatec form in (9):

(8)

Colonial Yucatec
t bin-en in cim-ez uacax.
[ go()-.1.] .1. die-() cow
‘I went to kill cows’

(9) Modern Yucatec
Bíin suu-nak yéetel bíin in wil-eh.
 return-(.3) and fut .1. see-(.3)
‘He will come back and I will see him.’

Overall, the cases of  and  reveal a tendency towards polygrammaticalization with target categories like ,  and . Similarly, developments from posture verbs into aspectual markers are fairly common. One example is  > , which is found in many language families, including Turkic (‘stand’, ‘sit’ > resultatives, progressives; Johanson and Csató [2018]), Iranian (stand > continuous), Hindi (stay > imperfective), Yaqui (‘lie’ > stative > nominalizer; cf. Estrada-Fernández, this volume, § 3.3.2 [78]) and Siouan:

(10) Hoocąk (Helmbrecht 2017: 144)
wee=nąk=šąną:
talk=sit(3)=
‘he is talking’

In Iranian idioms (Korn, this volume, ex. [24]),  develops first into an imperfective marker (as in Sogdian) and ends up as a present-tense marker in some idioms such as Yaghnobi (Korn, this volume, ex. [25]):

(11) Sogdian (Buddhist)
wyn-ʾm ʾštn
see-.1 stand.
‘we are seeing’

(12) Yaghnobi
wēn-om=išt
see-1=
‘I see’

The Iranian examples illustrate well the coevolution of meaning and form as investigated in Bybee’s work (Bybee, Pagliuca, and Perkins 1994). In Papuan languages like Kalam (Fedden, this volume), there are less strongly grammaticalized serial-verb constructions involving , , or  for the expression of continuous aspect. The source concept of  is another well-known example of polygrammaticalization from the verbal domain. In many languages,  undergoes polygrammaticalization with its two most common target categories of  and . In Lezgic, the verb  grammaticalizes into the target functions of ,  and . Cushitic languages are champions in this respect. Vanhove (this volume, Table 5) lists the following targets for the speech verb:

Say  >  Quotative
        Light verb construction
        Verbal flectional morphemes
        Relative verbs
        Intention
        Volition
        Future
        Purpose clause marker
        Discourse marker
        Benefactive

Even though the above list looks like a case of particularly extensive polygrammaticalization, some of its targets can probably be “chained” on semantic or formal grounds ( > Volition/Intention > Future;  > Light verb construction > Verbal flectional morphemes). However, some other targets should be seen as the result of independent development, since they involve different constructions and different host categories (in terms of Himmelmann [2004]). At its most advanced stage of grammaticalization in Cushitic,  gives rise to a new conjugation paradigm when combined with verbal nouns as in the following converbal (CVB) form in Beja (see also Table 2 in the questionnaire):


(13) Beja (Vanhove, this volume, ex. [22])
ʔas-ti far-iːni
be_up-. jump-.3.
‘He jumps upwards.’

Overall, these data provide additional evidence for the universality of grammaticalization paths, which are observed under different structural conditions. Note, however, that some cross-linguistic parallels between the paths may be deceptive. For example, Montaut (this volume) notes that in Hindi the seemingly familiar path  >  is actually based on the past participle of ‘go’ plus a subjunctive. Hence, the future meaning derives rather from a subjunctive. Lehmann (this volume) describes complexities of the path from  >  (illustrated in [8]–[9] above), which do not have parallels in European languages because of the peculiarities of the underlying structure. In a nutshell, the development proceeds as follows (Lehmann, this volume, § 3.5.3): “First, on the basis of the verb ‘go’ in focus, a focused progressive is formed. Second, this strategy applies to the ‘go’ verb of the motion-cum-purpose construction to form the immediate future of its purpose component …” As is shown by Nübling and Kempf (this volume) for Germanic, following up on the work by Hilpert (2008) and others, the trajectories from  >  are somewhat different in terms of collocational preferences and semantic restrictions, but the net result is similar. This can be seen as evidence for the function-based explanation of the commonalities of grammaticalization paths as argued by Heine, Kuteva, and Narrog (2017) in their comparison of Germanic and Khoisan. Alternatively, it can be construed as evidence for convergent developments, in which differences of constructions (including semantic differences) are obliterated as constructions proceed along a given path of grammaticalization.
This is in line with Bisang and Malchukov’s (2017) explanation for the fact that grammaticalization produces universal (common) outputs, starting from language-specific and construction-specific inputs. We will return to the question of variation in grammaticalization paths in section 2.2 below, when we address issues of granularity and non-compositionality arising from ‘nucleus mismatches’. Generally, however, it is clear that certain pathways are more common, while others show areal distribution, and still others are restricted to a single language. In this context, it is also worth pointing out that certain items in individual languages are particularly prominent as grammaticalization sources. Above, we cited the case of polygrammaticalization of the  verb in Beja (Vanhove, this volume). In Siouan (Helmbrecht, this volume), determiners were repeatedly renewed from demonstratives, which were polygrammaticalized (> , , etc.). Moreover, the Siouan position verbs (, etc.) gave rise to a variety of verbal and nominal categories ( , as well as  ; for an illustration, cf. [21] below). In Tungusic, the verb ŋene- ‘go’ developed into a marker of ,  and  (Malchukov, this volume; also cf. de

la Fuente [2011]). Ultimately, polygrammaticalization can also be explained by the high frequency of the source items (both in the sense of token frequency, and in terms of type frequency – occurrence in different syntactic environments), which makes them especially eligible for grammaticalization.

2.2 Some rare and “exotic” paths of grammaticalization

Apart from common paths, the contributions to this volume also show that some paths are language-particular, rare or even cross-linguistically unique. The following list offers examples from the individual chapters:

(14) Some rare grammaticalization paths
a. Hindi: ‘touch’ > dative, future
b. Hindi: ‘ear’ > locative > dative > ergative, instrumental
c. Iranian: ‘beat’ > pro-verb
d. Romance: reflexives > indefinite markers (of non-specific agents in a non-active diathesis)
e. Lezgic: ‘see’ > verificative
f. Japhug: locative (approximate location) > plural
g. Japhug: ‘east’ > egophoric
h. Mian: ‘sleep’ > (hodiernal) Past
i. Sulawesi: ‘fruit’ > classifier
j. Tungusic: ‘sick’ > cannot, afraid > cannot, won’t; lazy > ‘won’t’
k. Iroquoian: hither > future; thither > past
l. Hoocąk: demonstrative pronoun (distal) > proper name marker
m. Hoocąk: posture verbs > determiner (new demonstratives)
n. Pima Bajo: Posture verb > stative marker (participle) > nominalizer
o. Guarijio: ‘cattle’ >
p. Mayan: ‘counterpart, replica’ > reflexive
q. Beja: Possessive pronouns on verbs > conditional
r. Beja: ‘stand’ + ‘after’ > immediate future
s. Manding: ‘owner’ > emphatic 3rd person pronoun
t. Manding: ‘be equal’ > obligative auxiliary
u. Tswana: demonstrative > noun-modifier linker
v. Tswana: ‘build’ > auxiliary ‘do something continually’
w. Emai: dee óbò ‘lower hand’ > make mistake > inadvertently
x. Emai: daa ábò ‘raise hands’ > ‘prepare self’ > ‘deliberately’
y. Mandarin: ‘blanket’ > passive
z. Mandarin: ‘waist’ > modal auxiliary

These data, however, need certain qualifications. In fact, the assessment of paths as common/attested or rare/new is not always straightforward. This is even true if


‘common’ (attested cross-linguistically) is defined for our purposes as ‘present in Heine and Kuteva’s Lexicon’ (Heine and Kuteva 2002). This qualification is necessary since we specifically asked our contributors to highlight which of the attested grammaticalization paths are missing in Heine and Kuteva (2002). To begin with, some pathways that are qualified as rare in our sample on the basis of Heine and Kuteva (2002)¹² need not be typologically infrequent (C. Lehmann, p.c., notes this specifically in relation to [14i], [14u]). Moreover, some of these paths show remarkable parallels in our data. A good example is the development of past markers from the verb  in Mian and Tswana – a path which is not mentioned in Heine and Kuteva’s (2002) World lexicon of grammaticalization. In Mian, the ‘hesternal past’ in -so (15a) is most likely derived from a (reconstructed) serial verb construction (15b) with the verb s- ‘sleep’ (Fedden 2011: 453):

(15) Mian (Fedden, this volume, ex. [16])
a. (sintalo) fu-n-eb-so=be
   (yesterday) cook--2.-=
   ‘you cooked yesterday’
b. *fu-n-eb s-o
   cook--2. sleep-3..
   ‘you cooked and then there was sleeping’

A remarkably similar development occurs in Tswana, where létsɩ́, which is the perfect form of lálá ‘spend the night’, develops the meaning of ‘having done something the day before’:

(16) Tswana (Creissels, this volume, ex. [34])
kɩ̀-létsɩ́ kɩ́-kwàlá dì-tɬhátɬhʊ̂:bɔ̀.
.1-spend_the_night:: .1-write:: 8/10-examination
‘I took an examination yesterday.’

Apart from the fact that some paths may turn out to be more frequent once more data are available, the following factors may influence our view on the rarity of a grammaticalization path:

(i) Lexicalization/Semigrammaticalization
(ii) Semantic granularity
(iii) Directionality
(iv) Non-compositionality (nucleus mismatch)
(v) Specific pragmatic inferences (arising in culture-specific contexts)

12 It should be mentioned in this context that there is a revised version of Heine and Kuteva’s (2002) lexicon, published as Kuteva et al. (2020). This version was not available when we conceptualised our research project.

Position paper: Universal and areal patterns in grammaticalization


The first factor is related to the notorious problem of distinguishing grammaticalization from lexicalization (cf. Lehmann [2004]; Himmelmann [2004]; Boye and Harder [2012] for some distinctive criteria). The next two factors are significant methodologically, as they have a direct bearing on recognizing a path as (cross-linguistically) common. The last two factors have to do with language-specific properties of grammar. Thus, non-compositionality and ‘nucleus mismatches’ relate to construction-specific properties, while factor (v) is concerned with highly specific pragmatic inferences which cannot be directly recovered from the meaning of the components of the constructions involved and depend on the sociocultural setting.

First, in some cases, such as those cited for Emai (Schaefer and Egbokhare, this volume), we are arguably dealing with lexicalization rather than grammaticalization, even though such modal items as ‘deliberately’ or ‘inadvertently’ may be integrated later into the system of actional modifiers (pre- or postverbal) and can be regarded as “semigrammaticalized”.¹³ As observed by Schaefer and Egbokhare (this volume, § 6.2) on Emai, “grammaticalization has resulted in four preverbs ascribing an attribute or quality to the grammatical subject.” The grammaticalization and the lexicalization of the two ‘intentional preverbs’ dábo ‘deliberately’ (from the verb phrase daa ábò ‘prop hands’) and dóbo ‘mistakenly’ (from dee óbò ‘lower hand’) are illustrated in (17) and (18) below:

(17) Emai (Schaefer and Egbokhare, this volume, ex. [69a] and ex. [70a])
a. ójé dá'bò fí ólì èkpà fí à.
   Oje .deliberately threw the bag extend 
   ‘Oje deliberately threw the bag away.’
b. ójé dáá' ábò.
   Oje .prop Arms
   ‘Oje propped / raised his arms.’ (to do something)

(18) Emai (Schaefer and Egbokhare, this volume, ex. [69b] and ex. [70b])
a. ójé dó'bò é ólí émàè.
   Oje .mistakenly eat the food
   ‘Oje mistakenly/accidentally ate the food.’
b. ójé déé' óbò.
   Oje .lower Hand
   ‘Oje committed a mistake.’ lit. ‘Oje lowered his hand.’

13 The relation of lexicalization to grammaticalization is a complex issue. We side here with Lehmann (2004), according to whom these notions are not mutually exclusive but show a considerable overlap (in particular, in the domain of derivational morphology). This obviates the question whether the use of “classificatory verbs” (with aspectual function) in Nyulnyulan is a case of grammaticalization or lexicalization (see McGregor, this volume, for discussion). It is arguably both, just as in the case of the aspectual opposition in Slavic (as discussed by Wiemer, this volume).


Such cases of lexicalization represent relatively early stages of grammaticalization, at which more variation is to be expected. Second, it is important to recognize that a given path may be defined with more or less granularity and thus be described as more common or rarer. As Mithun (this volume) notes, the development from  to future is an instantiation of the familiar path  >  (Heine, Claudi, and Hünnemeyer 1991; Haspelmath 1997). If seen from that perspective, a seemingly rare path fits into a very common pattern. The following examples show the same prefix in the function of a  marker and a  marker in Cherokee: (19) Cherokee (Mithun, this volume, ex. [51a] and ex. [52a]) a. nihi t-hi-hwahtvvh-i 2 -2.-find- ‘You will find it.’ b. ta-kinii-atansiinoo-heéli -1.-crawl-- ‘He’s crawling to us.’ While this is true, this conclusion only holds as long as we look at it from a semantically abstract perspective. Taking the more specific perspective, the path ‘hither’ >  (also ‘thither’ > ) is not reported in the literature. In fact, it might actually be an areal feature of Amerindian languages, as it is also found in Mayan, with its path ‘there’ >   (Lehmann, this volume). An opposite case is found in Japhug (Jacques 2017, this volume). Here, the development of  from  has been described as unattested (cf. kɯɤɣɯrɟɯrɟit ‘with many children’ < NMZR-have.many.children). However, there is a semantically more general path from  to , which is familiar from grammaticalization research (also cf. Stolz [2001] for the opposite direction of  <  in (Indo)-European). 
As for granularity and the identification of grammaticalization paths, it is also instructive to cite the following passage from McGregor (this volume, § 5): Of the grammaticalization targets of indexical items mentioned in this chapter, only two are mentioned in Heine and Kuteva (2002): demonstrative > third person pronoun (which perhaps occurred in pre-pNN) and either of these > copula (Heine and Kuteva 2002: 108–109, 235). However, what they refer to is an identifying or attributing copula, not a possessive one, and the other person forms of the pronouns are not involved, as they are in Nyulnyulan.

As is clear from this quotation, qualifying a path as novel depends crucially on the level of detail, i.e., on granularity. Similar issues of granularity and terminology are addressed in other contributions (cf. the discussion of rare grammaticalization paths in Tarahumara and other Uto-Aztecan languages in Estrada-Fernández, this volume). The issue of granularity becomes particularly pressing if we consider lexical variation at the starting point of grammaticalization paths (see below).


Third, the assessment of the commonality of a grammaticalization path may be related to (assumed) directionality. Thus, Japhug (Jacques, this volume) shows a peculiar language-specific development from ku- ‘towards east’ to   and from ɲɯ- ‘towards west’ to  , as illustrated below.

(20) Japhug (Jacques, this volume, [42])
kuabao ɯ-spa ci ku-taʁ-a.
bag 3.-material  :-weave-1
‘I am weaving (cloth to make) a bag.’

However, as noted by the author himself (Jacques, this volume), this unidirectional analysis is partially misleading, because the original meaning of these forms was most likely directional (“centrifugal” vs. “centripetal” distinctions). If so, we are dealing with polygrammaticalization here: east < hither > egophoric; west < thither > sensory. Based on this analysis, the path ‘away/thither’ > perfectivizer looks typologically more familiar. Similarly, it is possible that some of the rare paths suggested for Pima Bajo (Estrada-Fernández, this volume), such as   >  > , can also be brought in line with the grammaticalization literature by postulating polygrammaticalization.

Fourth, “quirky” grammaticalization paths may be the result of what we call a nucleus mismatch. As explained in the questionnaire, in the case of a complex construction, we take the more lexical part of the source as a nucleus which also represents the initial phase of the path as a whole (notice that this notion is similar to the ‘construction marker’ in the sense of Himmelmann [2005]).¹⁴ Thus, the change from going to > gonna in English will be represented as ‘go’ >  (also cf. Heine and Kuteva 2002) with the lexical verb representing the source item (rather than the infinitive marker to). In this and most other cases of grammaticalization (what we refer to as ‘endocentric grammaticalization’ in section 4.4), this provides intuitively acceptable results, but in some cases it does not.
Consider the case of the third person pronoun pi.da in Nenets (Samoyedic; Janhunen this volume), which derives from the combination of a dummy noun and a 3rd person possessive agreement marker (< *pid°-da ~ *pud°-da < *pix°də-da), with the dummy noun being based on the lexical noun puxəd° ‘body’. In cases like these, it is the possessive affix which provides the main semantic contribution rather than the stem. The Tungusic 3rd person pronoun noŋa.n similarly derives from a petrified nominal stem in combination with a possessive ending (Malchukov, this volume). This probably also holds for the personal pronouns in Aymara, which appear to

14 Cf. “Grammaticisation pertains to an element in its constructional context or, put in a slightly different way, to constructions which are identifiable by a construction marker (in the sense that an accusative construction involves an accusative case marker and a future construction is identifiable by its future marker, etc.).” (Himmelmann 2005: 80).


contain a shared pronominal root element *hu- in combination with the possessive affixes (cf. Aymara huma ‘you’, uta-ma ‘your house’; Adelaar, this volume, Table 4). Finally, Creissels (this volume, § 2.8.3) describes a similar pattern in Bambara. Depending on context, the construction of à tìgî can be interpreted either with its literal meaning of ‘its owner’ or as an emphatic 3rd person pronoun (‘the person in question’). In all these cases, a pronoun actually derives its meaning from the corresponding possessive marker (affix) rather than the lexical stem (nucleus). The same phenomenon of nucleus mismatch can also be illustrated by the development from   to  (“new demonstrative”) in Siouan (Helmbrecht, this volume) – a path which seems to be unusual and which is illustrated in (21) with the example of the demonstrative =jaane ‘this.standing’: (21) Hoocąk (Helmbrecht, this volume, ex. [10]) hegų caaxšep=jaane coowe=ra ho-gi-'ųų that.way eagle=.: in.front= .-.-do/make wąąk wašoše 'anąga man be.brave and ‘This eagle, he was hanging around in front of the brave man, and …’ (lit. trans.: ‘It was (standing) the eagle, he was in front of him, the brave man, and …’) The development of the demonstrative from a posture verb is puzzling, but the paradox is resolved once one recognizes that the semantics of the target are not derived from the posture verb but from an “old determiner” which is combined with that posture verb (see Helmbrecht, this volume, for details). Nucleus mismatches are also attested in more familiar languages, as in the case of the emergence of the German new use of würde as a past subjunctive form of werden ‘become’. If stated as a path  > , it turns out to be counterintuitive and rare. Again, the problem can be solved under the assumption that the modal meaning does not come from the lexical meaning of the verb root but rather from its grammatical form with its subjunctive inflection. 
Similarly, the path  >  in Iranian can be explained by the syntactic context of the copula verb (in a pattern of the type “for me X is to do”). In fact, this holds for most patterns involving , as it is rarely the copula itself which contributes the main semantic information. In such cases, the functional approach to the formulation of grammaticalization paths as it is advocated by Heine reaches its limits. We will return to the discussion of nucleus mismatches in section 4.4. Fifth, some of the source concepts discussed in the individual chapters are semantically highly specific. Thus, their grammaticalization into a grammatical marker must have started out from highly specific pragmatic inferences arising under specific contextual conditions. A good case in point is the source concept  in Hindi, which ultimately develops into an ergative marker. As noted by Montaut (this vol-


ume, § 2.6.3), “the ergative most current in Hindi and Indo-Aryan is ne, nẽ, na, ni, and is derived from the Sanskrit noun karṇa ‘ear’, in the locative case karṇe”. As explained by Montaut, this is actually a result of a longer path with several intermediate steps (which are otherwise mostly familiar from other languages, apart from the first step that should arise in very specific collocations): ‘ear(+locative)’ (karṇe) > locative > dative > ergative, instrumental The case of Mandarin yào as a marker of deontic or epistemic ‘must’, desire and future may be another case in point. This path arguably starts from the concept of yāo ‘waist’ and its verbal use of ‘tie with a waistband’, which is then metaphorically expanded to ‘keep within bounds, constrain’. This metaphorical extension may then serve as a bridge to the domain of deontic modality and other types of modality up to its use as a future marker via its meaning of ‘desire, want’ (on different approaches to the grammaticalization of yào, cf. Sun and Bisang, this volume). While the beginning of this development is highly language-specific and partially also depends on the word-class flexibility of lexical items in their use as nouns (‘waist’) and verbs (‘tie with a waistband’) (Bisang 2013; Sun 2015), the later development seems to follow paths which are well-known in the contexts of modality and of future tense (/ > ). On the whole, the development of yào as a modal verb and a future marker is the result of a long and entangled chain with partial loss of intermediate meanings, which still needs further research. 
For now, we can represent the general path of development in the following way:

‘waist’ > ‘tie with a waistband’ > ‘keep within bounds, constrain’ > ‘force, coerce’ > ‘desire, seek, pursue’ > ‘must, want’

These examples seem to suggest a general tendency for the initial stages of grammaticalization (the source) to display more cross-linguistic variation than later developments, which show more convergence with general paths of grammaticalization. This observation finds further evidence in the literature, although it is rarely stated explicitly in this general form. Consider the spectacular path “from wood to future” in Hup (Epps 2008). While the overall development ‘from wood to future’ is certainly exotic, a closer look at the details of the scale (wood > instrument nominalizer > causal/purpose nominalizer > infinitive/supine > future) shows that the individual steps can be replicated for different languages with the exception of the first step. The first step in the chain from lexical to grammatical item may be based on frequency, given that ‘wood’ was the most common material in Hup culture and was thus frequently used as the head of nominal compounds. Within this context, it became a nominalizer referring to means and instruments, starting a further chain of developments. The impact of culture-specific factors on the selection of source concepts is also reported from Sulawesi (Himmelmann, this volume) where the noun meaning ‘fruit’ develops into a general noun classifier – a development which is obviously also due


to cultural prominence and resulting frequency. For similar reasons, the noun ‘cattle’ developed into a possessive classifier in Guarijio and other Uto-Aztecan languages (Estrada-Fernández, this volume). As was pointed out by Bisang (2017a), classifiers are an ideal domain for attracting culture-specific concepts into the grammatical system of a language. This can be illustrated with a short look at section 2.1 on numeral classifiers in the chapter on Chinese (Sun and Bisang, this volume). As in many other classifier systems, there is a rich inventory of source concepts referring to parts of trees or plants like kē ‘stalk, stem’, tiáo ‘branch’, zhī ‘twig’, gēn ‘root’ and duŏ ‘blossom’. Even the general classifier ge is derived from gè ‘bamboo tree’, similarly to its predecessor méi, whose meaning as a full noun was ‘stalk, stem’. Other culture-specific concepts are found in the source concepts represented by bă ‘handle’ (for objects having a handle), fēng ‘seal’ (for sealed objects like letters) or liàng ‘chariot with two wheels’ (for vehicles in general). This is only a very small selection of the examples one can find cross-linguistically, but it may be sufficient as an illustration within this chapter.

Areality in grammaticalization paths and sources

This section addresses areality from the perspective of grammaticalization paths. It focuses on the extent to which these paths and source concepts are geographically distributed. For that purpose, it starts with a general look at how the source concepts from the list of 30 sources that occur in the individual genera are distributed across genera. Each of the 571 paths with one of the 30 source concepts was assigned to its genus.¹⁵ Based on these data, we checked for potential areal clusterings by using NeighborNet analysis, which is frequently employed in language typology to detect areal/genealogical effects (Dunn et al. 2008; Cysouw 2014; Wichmann 2015). The result is shown in Figure 5A, while Figure 5B visualizes clusterings of grammaticalization paths.¹⁶ For both plots, the numbers in brackets indicate the number of paths of the respective genus for which a source or a source-target combination is attested in the data. The first NeighborNet plot in Figure 5A is based on the quantification of the difference between genera with regard to their patterns of occurrence of the source concepts. For each of the 30 source concepts, the genera are evaluated as to whether the source concept is present in the genus or absent. The more two genera differ with respect to the presence and absence of the 30 source concepts, the further apart they are located in the NeighborNet. The distance between two genera corresponds to the length of the shortest path which connects them. The

15 Notice that our total number of paths is 1003. Out of these 1003 paths, 571 are based on a source concept from our list of 30 concepts (cf. chapter 2).
16 NeighborNet graphs were made using the program SplitsTree (Huson and Bryant 2006).


[NeighborNet plot; genera shown, with the number of paths in square brackets: Emai [13], Uralic [11], Slavic [11], Hoocąk [14], Beja [16], Creoles and Pidgins [19], Japhug [14], Tungusic [11], Romance [29], Iroquoian [4], Ok [10], Tswana [42], Malayo-Polynesian [1], Yeniseian [3], Hindi [24], German [9], Yucatec [9], Southern Uto-Aztecan [11], Nyulnyulan [7], Iranian [45], Lezgic [42], Quechua [17], Chinese [25], Khmer [50], Thai [45], Manding [45], Korean [44]]

Fig. 5A: NeighborNet plot showing clusterings of languages for the 30 source concepts.

second NeighborNet plot in Figure 5B visualizes the differences between genera with regard to grammaticalization paths (source-target combinations). As it turns out, this visualization is of limited use for detecting areal effects for various reasons. One of them is the relatively small number of source concepts and paths in general. Another one is the observation that in certain genera, visible on the right side of the NeighborNet plots in Figures 5A and 5B, the number of attested paths is very small (the number of paths in a genus is indicated in square brackets). Since the genera are basically arranged according to decreasing number of paths from left to right, the probability that proximity of two or more genera in the network reflects areal correlations decreases from left to right as well. Based on this information, there are several clusterings on the left side which can be expected from our knowledge of potential contact between speakers. One of them is (i) Khmer, Thai, Koreanic and Chinese (but notice that Manding also sides with these genera in Figure 5A). Other clusterings of this type are (ii) Iranian and Lezgic (Figure 5A) and (iii) Iranian and Hindi (Figures 5A and 5B). Some clusterings reflecting plausible contact scenarios can even be found among genera with comparatively small numbers of paths in at least one of the two figures, i.e., (iv) Uralic and Slavic (Figures 5A and 5B), (v) Tswana and Manding (Figure 5B), and (vi) German and Slavic (Figure 5B).

[NeighborNet plot; genera shown, with the number of paths in square brackets: Korean [44], Romance [29], Hoocąk [14], Japhug [14], Uralic [11], Slavic [11], Creoles and Pidgins [19], Thai [45], German [9], Khmer [50], Nyulnyulan [7], Southern Uto-Aztecan [11], Tungusic [11], Yeniseian [3], Malayo-Polynesian [1], Iroquoian [4], Chinese [25], Hindi [24], Beja [16], Yucatec [9], Ok [10], Emai [13], Quechua [17], Iranian [45], Lezgic [42], Manding [45], Tswana [42]]

Fig. 5B: NeighborNet plot for grammaticalization paths (source-target-combinations).

If we take into consideration that Figure 5A is based on the occurrence of the source concepts in the respective languages, and Figure 5B reflects the occurrence of the grammaticalization paths, one may take the above results as indicators that a larger number of paths has the potential to produce more reliable results. This is at least confirmed by the exclusion of Manding from the cluster of Khmer, Thai, Koreanic and Chinese in Figure 5B and from the additional clusters in (iv), (v) and (vi). From that, one may conclude that weak signals of areal effects at the level of macro-areas are due to our limited set of data. But then, adding information about the targets, as in Figure 5B representing paths, only partially improves the results. In principle, shared paths (source-target combinations) are clearly more informative than shared source concepts for the detection of areal signals, yet the number of shared paths is even more restricted in our dataset than the number of shared sources and is therefore often unable to reveal areal clusterings either, at least on a larger scale. To summarize, weak signals of areal effects at the level of macro-areas may be partially due to our limited set of data, but it may well be the case that there is a more substantial reason for that. If similarities at the level of macro-areas are due to prehistoric population dispersals (Nichols 1997; Bickel 2017), they may not be expected for the grammaticalization paths under discussion, which are chronologically fairly recent. Clearly, more research is needed on that question.

Even though larger, continent-size areal effects at the level of macro-areas are not always detectable from our data, it is clear that there are areal phenomena on smaller scales that can be observed (and are also partially revealed by the plots) for which there are historical explanations. Thus, Heine and Kuteva (2007) as well as Haspelmath (1998) have identified a number of paths common for Standard Average European. In their discussion of the grammaticalization of Germanic against the background of European languages, Nübling and Kempf (this volume, § 6) make the following comment:

A number of phenomena constitute major Standard Average European features, but are rare in the European peripheries and/or in other languages of the world. This concerns the grammaticalization of a definite as well as an indefinite article (found in less than 8 % of the languages of Dryer’s 1989: 85 sample, cf. Haspelmath 2008: 1494), relative pronouns (“relative clauses formed using the relative pronoun strategy are quite exceptional outside Europe, except as a recent result of the influence of European languages” Comrie 1998: 61), ‘have’-perfect (“almost exclusively found in Europe”, Haspelmath [2008: 1495], referencing Dahl [1995]).

Similar areal features have been discussed for other linguistic areas. Bisang (1992, 1996) and Enfield (2003) discuss the evolution of verbs with the meaning of ‘get, come to have’ in East and mainland Southeast Asian languages (e.g., 得 dé in Chinese, được in Vietnamese, dâj in Thai, ba:n in Khmer or tau in Hmong). The main functions of markers derived from these verbs in the preverbal position are epistemic (possibility), deontic (obligation), factual (against wrong presuppositions) or even past action. The concrete functional range of individual ‘come to have’-verbs in individual languages is subject to some variation. In the postverbal position, ‘come to have’-verbs basically denote the successful achievement of an event. More specifically, the comparison of Chinese, Thai and Khmer in our database reveals the following parallel paths of grammaticalization:

(22) Some common paths in EMSEA languages:
A.  (Chinese: gěi; Thai: hây; Khmer: Ɂaoy) as markers of benefactive. In addition, Chinese gěi can mark causative and passive in specific situations. In Khmer and Thai, the ‘give’-verbs additionally function as markers of causativity, adverbial subordination (purpose), manner adverbs and as complementizers.
B.  (Chinese: dào; Thai: thy̌ŋ; Khmer: dɔl) as prepositions indicating the goal of an event.
C. : There are various verbs meaning ‘follow, go along’ with various functions. Chinese: 沿着 yán-zhe [go.along-DUR] ‘along’, 顺着 shùn-zhe [go.in.the.same.direction-DUR] ‘go along’; 跟 gēn ‘follow’ as a preposition marking the notion of ‘in the wake of’, comitative and the standard in comparative constructions. Thai: taam ‘follow’ / Khmer ta:m ‘follow’: as a preposition marking path ‘go along’ and abstract relations of the type of ‘according to’.
D. /  (Chinese: zài; Thai: yùu; Khmer: nɤ̀u): In all three languages: locative preposition; in Chinese and Khmer: durative aspect marker.
E. -verbs are very common. They mark completion of an action in various ways. Thai: sèt ‘finish, stop doing for the time being’, còp ‘finish for good’. Khmer: sräc ‘finish, stop doing for the time being’, cɔp ‘finish for good’ and rù:ǝc ‘finish, get out of, escape from’. In Chinese, the verb 了 liăo ‘finish’ developed into a marker of perfective aspect. The same root is also used in the perfective marker lέεw in Thai, in which its use as a full verb is extremely marginal. In Khmer, the perfective marker haǝy only occurs in perfective function but its root is still active in the causative form of bɔŋhaǝy ‘get something done’.
F. Motion verbs with the meaning of ‘come’ and ‘go’ as well as ‘move out’, ‘move in’, ‘move up’ and ‘move down’ are systematically used in the function of directional verbs in all three languages.
G. : Nouns with the meaning of ‘side, edge’ (Chinese: 边 biān; Thai: khâŋ; Khmer: kha:ŋ) are consistently used as components of relational nouns specifying location in space.
H. -verbs are grammaticalized into quotatives and complementizers in Thai (wâa) and in Khmer (tha:). Even though this is not the case in Mandarin, -verbs with these functions are found in various other Sinitic languages (Southern Min, Beijing Mandarin, Taiwanese Mandarin, and Hakka/Kejia).
The distribution of these shared paths is due to language contact as well as typological profile. The first factor is uncontroversial (cf. also Enfield’s [2003] suggestive term ‘linguistic epidemiology’ for the spread of certain paths in EMSEA languages) and it is also amply illustrated in the contributions to this volume. Thus, the creation of articles in Germanic and Romance has been clearly motivated by language contact. On a smaller scale, the development of determiners in Germanic demonstrates a clear areal pattern that is shown in Nübling and Kempf (this volume). As they point out with reference to Dahl (2004), double determination (“over-determination”) is the result of two competing grammaticalization areas, i.e., the older suffixation in North-East Scandinavia and the “West-Germanic” model of prenominal determiners in the south. A number of common paths have been observed for the domain of Altaic/Transeurasian languages, which form one of the typical spread zones in terms of Nichols


(1998). These similarities are largely due to areality and typological profile (Malchukov and Czerwinski forthcoming a, b). Thus, certain grammaticalization phenomena common to Uralic and Tungusic such as the development of converbs on the basis of case-marked participles are likely to be largely due to typological profile in the sense that the presence of mechanisms of nominalization is a precondition for their emergence. Insubordination as found in North-Asian languages (cf. section 4.2 for an illustration) may be due to typological profile and/or contact-induced convergence (Malchukov 2013; Malchukov and Czerwinski forthcoming b), or even to ‘drift’ of genealogically related languages (Robbeets 2009, 2017). More generally, Malchukov and Czerwinski (forthcoming a) discuss the following paths for ‘Transeurasian’ (Macro-Altaic):

(23) Some common paths in ‘Transeurasian’ languages:
a.  → progressive (Turkic -(V)r + er-; Mong. bai- ‘to be’ [copula-existential] > ‘to be engaged in doing something’; Korean -ko iss-, Japanese -te ir-).
b.  → progressive (Turkish =yor=; Korean -a/e ka-, Japanese -te ik-).
c. ,  → continuous (Turkic tur-; Mong. soo- ‘to sit, to dwell, to live’ > ‘to do something continuously’).
d.  → perfective/completive/intensive (Mongolian oryx- ‘to throw’ > ‘to do something completely’, xay- ‘to throw (away)’ > ‘to do something rapidly and completely’; Korean -a/e peli- from ‘to throw away’).
e.  → ‘do in advance’ (Korean -a/e tu-; Japanese -te ok-).
f.  → benefactive (Mongolian eug- ‘to give’ > ‘to do something for someone’; Korean -a/e cwu-; Japanese -te kure-, -te age-).
g.  → conative (Turkic kör- ‘see, try’, cf. Uzbek ye-b kör- ‘try to eat’ [Johanson 2011: 760]; Mong. udz- ‘try’; Manchu tuwa-; Korean -a/e po-; Japanese -te mi-).
h. / → potential (Turkish bol- ‘be’, ‘become’; Mongolian bol-; Japanese -i na-re-).
The paths in (23) are likely to be due to areal factors, supported by language contact and similarities in typological profile (Malchukov and Czerwinski forthcoming a). Remarkably enough, some of these paths are also attested in geographically adjacent areas of Asia, in particular, in Iranian (Korn, this volume) and Indo-Aryan (Montaut, this volume). In Siouan (Helmbrecht, this volume), repeated cycles of grammaticalization of articles in individual languages may be due to ‘drift’, i.e., the parallel development of genealogically related languages which is not inherited from a protolanguage. Many processes of development in various branches of Uralic are also the result of parallel language-internal factors. As observed by Janhunen (this volume, § 14) in this connection: “Ultimately, we have to recognize that all stages in the evolution of the Uralic languages belong to a typological cycle, which successively erases and


rebuilds patterns and distinctions (Janhunen 2000)”. Similarities between Iranian and Indic languages in the development of parallel sets of aspectual markers and markers of valency (be/do) operators (discussed by Korn and Montaut) allow for multiple explanations, including drift, contact and typological profile. In Romance (Cennamo, this volume), the rise/proliferation of clitics might be another issue of general drift. Here it is important to notice in addition that the rise/proliferation of clitic pronouns is a major Romance innovation that extended beyond Romance languages due to contact (cf. Cennamo, this volume, for further discussion). Concerning the issues of areality, the following quote from Vanhove (this volume, § 7) is instructive, whatever its ultimate explanation for Cushitic will be: Among the 70 grammaticalizations in Cushitic languages studied in this chapter, 18 occur only in NC (NC = North Cushitic), and a further 12 cannot be assessed for lack of information on these domains in the grammars consulted. Among the remaining 40, Beja shares 21 of them with at least one language in any of the other three Cushitic branches. Eight concern all four branches …

Clearly, in this case multiple causal factors are conceivable, including inherited grammaticalization patterns, drift, language contact and general typological profile. For other cases there is evidence for a particular scenario, such as contact influence. Thus, Nübling and Kempf (this volume, § 6) note various instances of areal influence on Germanic due to language contact: A contact induced feature that has spread very widely across Germanic languages is the grammaticalization of W-relatives out of interrogative pronouns or adverbs. This path is likely to have transferred from various Romance languages to West Germanic languages at different times.

In fact, the role of language contact is undeniable in many cases and it is explicitly recognized by the authors. Thus, Germanic developed subordinators in complex constructions under the influence of Romance. Western Slavic developed articles under German influence. Mandinka developed a specific relative clause marker under the influence of Atlantic languages. The Indic language Maithili developed double agreement under the influence of Mundari. Other Indic languages borrowed the complementizer ki from Persian and restructured their relative clauses under Dravidian influence. Lehmann (this volume) notes changes in Mayan conditioned by Spanish. Korean developed a special case of noun-based modal construction (“mermaid construction”) on the basis of the Chinese model. For better documented language families (such as Romance), one can often identify a specific source of contact influence. One clear case discussed by Cennamo is the so-called ‘aoristic drift’ of the perfect, which affected most Romance varieties, albeit to different degrees. This can be shown to be contact-induced, radiating from twelfth-century Parisian French (see Cennamo, this volume, for references and further discussion).

It should also be noted that different types of contact situations should be taken into account. Thus, Michaelis and Haspelmath (this volume) highlight the role of substrate languages in processes of semantic change in creole languages. As the authors observe, the change manifests itself in functionalization/auxiliation and does not result in coalescence/bondedness. In this respect the situation is similar to EMSEA languages, discussed by Bisang (cf. section 4.3.2). Overall, as our discussion shows, the areal effects are better detected and more readily interpretable at the level of micro-areas, while they become weaker at a larger scale.

3 Scenarios of grammaticalization: A quantitative study

In this section, we present some first statistical results on grammaticalization scenarios defined by the values of our eight parameters for the 1003 grammaticalization paths investigated in our study. Our interest in scenarios of grammaticalization will focus on two issues: The first is coevolution of meaning and form, understood broadly, that is extended to the discussion of meaning vs. form-related parameters. This issue will be taken up later, when evaluating the competing predictions of the Parallel Reduction Hypothesis and the Meaning First Hypothesis. The second issue is areality, i.e., the question of the extent to which grammaticalization differs across different areas. The section will start with some methodological preliminaries related to our definition/quantification of parameters adopted from Lehmann’s work (for details, cf. chapter 2). After that, we present the statistical analyses concerning the meaning/form coevolution (sections 3.2–3.4) and areality (section 3.5).

3.1 Methodological preliminaries: Summary of the questionnaire

As explained in more detail in Chapter 2, our questionnaire consists of eight parameters for measuring the degree of autonomy of a linguistic sign, i.e., its degree of grammaticalization. The starting point for selecting these parameters is the set of six parameters suggested by Lehmann (1995). As already mentioned in Section 1.1, Lehmann starts out from the three parameters of weight, cohesion and variability, each of which he further splits into a paradigmatic and a syntagmatic type of parameter. To illustrate in what way our first six parameters deviate from Lehmann (1995), compare Table 3 with Lehmann’s original version in Table 1 above. One major difference between the modified list of parameters in Table 3 and the original Table 1 adopted from Lehmann’s work is that we split the parameter of

Tab. 3: Modified list of Lehmann’s parameters.

| | Paradigmatic | Syntagmatic |
|---|---|---|
| Weight | 1. Semantic Integrity; 2. Phonetic Reduction (Phonetic Integrity) | Structural Scope |
| Cohesion | 3. Paradigmaticity | 4. Bondedness |
| Variability | 5. Paradigmatic Variability | 6. Syntagmatic Variability |

Weight into Phonetic Integrity (which we call Phonetic Reduction) and Semantic Integrity. As explained in section 1.2, this is related to the fact that we are interested in the covariation of meaning and form for evaluating the Parallel Reduction Hypothesis and the Meaning First Hypothesis. In the same vein, we characterize Lehmann’s parameters as either meaning oriented (Semantic Integrity), or form oriented (Phonetic Reduction, but also Allomorphy and Bondedness). It is more difficult to characterize the other parameters in these terms, since some of them (Paradigmaticity, Decategorization) pertain both to meaning and form, while some others (Paradigmatic Variability, Syntagmatic Variability) are rather orthogonal to this distinction. In any case, this group of parameters takes an intermediate ground on the scale from meaning-related to form-related parameters. Apart from splitting paradigmatic weight into two parameters of Semantic Integrity and Phonetic Reduction, we introduced a few changes into Lehmann’s original framework. While Paradigmaticity (PM), Bondedness (BD), Paradigmatic Variability (PV) and Syntagmatic Variability (SV) are identical to Lehmann’s (1995) parameters, we did not include Structural Scope because it proved to be theoretically and empirically most challenging (see Tabor and Traugott [1998]; Lehmann [2004]; Diewald [2010] and Norde [2012] for different views). Moreover, we added the parameters of Decategorization (DC) and Allomorphy (AM), which are frequently discussed in the literature on grammaticalization (see Hopper and Traugott 2003: 106–110).17 It has been noticed in the literature on grammaticalization (Himmelmann 2005; Norde 2012), that Paradigmaticity and Paradigmatic Variability (Obligatoriness) have a somewhat different status than other parameters since they relate to properties of grammatical categories rather than individual items. 
While this observation is valid, we take these parameters into consideration, since, as noted by Diewald (2010), they are crucial for purposes of distinguishing grammaticalization from related processes (of lexicalization).

17 In Lehmann’s framework, Decategorization (“morphological degeneration”) is actually subsumed under Integrity (parameter 1), and Allomorphy under Paradigmaticity (parameter 3).

The above eight parameters are logically independent. Even if all of them instantiate loss of autonomy in Lehmann’s (1995) framework, there is no covariation in the sense that the change of one parameter automatically entails the change of all other parameters. In fact, this project expects interesting cross-linguistic variation. Thus, the extent to which there are correlations and the extent to which these correlations are subject to cross-linguistic variation will be one of the results of our analysis. To make quantification of the parameters possible on a comparative level, we defined four values for each parameter. The values were numbered 1, 2, 3 and 4, indicating increasing loss of autonomy. To give an example, we distinguish the following four values of Bondedness (for more details on quantification of other parameters, cf. chapter 2):
1. The linguistic sign is a free morpheme or is the lexical root of a word.
2. The linguistic sign is a clitic (its use is not limited to a single word class).
3. The linguistic sign is an agglutinative affix (affixed to an individual word which is a member of a single word class).
4. The linguistic sign is part of a portmanteau morpheme or is a suprasegmental (e.g., tonal marker) or a process morpheme (Ablaut, …), or a zero morpheme.
Needless to say, some of the choices we made are clearly stipulative or may be defined in another way. Stipulations of this kind are inevitable in quantitative studies. In some cases, the stipulations are also due to the need to work with a consistent set of four values for each parameter. It should be further noted that while the parameters as such are logically independent, this does not necessarily hold for all possible combinations of values associated with individual parameters (for example, Bondedness at values ‘3’ and ‘4’ entails highest values for (reduction of) Syntagmatic Variability).
This is again unavoidable if one evaluates the parameters in their own right (independently) without reference to other parameters. Furthermore, as is clear from the following discussion, certain value (co)dependencies can be partially identified through statistical analysis as very strong correlations (see the discussion of the heatmap in Figure 10 in § 3.3).

In addition to these four values, each grammaticalization path was evaluated for the presence or absence of a change of value between the source and the target. If the value of the source was lower than the value of the target, the value [+] was given to that parameter. If the target value was identical to the source value, no change has taken place, and the parameter was given the value [–]. Each of the 1003 paths can be characterized by patterns of change in terms of a sequence of its eight parameter values. In the case of the [+]/[–] values, we get patterns of the type [+ – + – – + + –], as occasionally used below. These patterns should be read as a shorthand for a scenario with the following parameter settings: [+SI, –PR, +PM, –BD, –PV, +SV, +DC, –AM]. SI stands for (loss of) Semantic Integrity, PR for Phonetic Reduction, PM for Paradigmaticity, BD for Bondedness, PV for Paradigmatic Variability (Obligatoriness), SV for Syntagmatic Variability, DC for Decategorization, and AM for Allomorphy. To follow up on the discussion of the parameter of Bondedness above, a change from clitic to (agglutinating) affix (in the example above) will be represented as [2 > 3] in numerical terms, and as [+] in binary terms, while retaining its clitic status would count as [–2] for Bondedness or simply [–] in binary terms. The following screenshot from the Excel table shows the (simplified) layout of our database and illustrates how individual paths are annotated (Tab. 4).

Another decision which we had to make concerned the items listed as sources and targets. If the source consisted of more than one morpheme, it was necessary to single out one specific morpheme for determining the parameter values. In such cases, our measurement was focused only on the nucleus of that construction, which we defined as the most “lexical” component of the source construction (cf. the notion of ‘construction marker’ in Himmelmann [2005: 80]; see FN 15 above). This issue is of particular importance in cases of nucleus mismatches as they were discussed in section 2.2 (also cf. section 4.4).

We would like to conclude this section by pointing out that our design shows clear similarities to the design of Bybee, Pagliuca, and Perkins (1994) but is broader in scope, as it is not confined to the coevolution of meaning and form and includes other parameters.18 Still, our design has been informed by the pioneering work of Bybee, Pagliuca, and Perkins (1994) in many ways, in particular with regard to the idea of measuring values for particular parameters.

18 In fairness, it should be noted that in connection to formal reduction, Bybee, Pagliuca, and Perkins (1994) also make use of some other parameters, among them allomorphy, which they see as one of the correlates of increased Bondedness.
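The coding scheme described in this section can be illustrated with a minimal sketch. It assumes a hypothetical data layout in which each path stores one value from 1 to 4 per parameter for the source and for the target; the function name, the dictionaries and the example values are invented for illustration and do not reproduce the authors' Excel database. The sketch writes the [–] value as an ASCII hyphen.

```python
# Parameter order as used in the chapter's pattern notation.
PARAMETERS = ("SI", "PR", "PM", "BD", "PV", "SV", "DC", "AM")


def change_pattern(source, target):
    """Return the binary change pattern of one path as a list of "+" / "-".

    `source` and `target` are dicts mapping each parameter name to a value
    from 1 to 4 (hypothetical data layout, not the authors' Excel format).
    """
    pattern = []
    for name in PARAMETERS:
        s, t = source[name], target[name]
        if s not in (1, 2, 3, 4) or t not in (1, 2, 3, 4):
            raise ValueError(f"{name}: values must be 1, 2, 3 or 4")
        if t < s:
            # Assumption: the text only describes "higher" and "identical",
            # so a lower target value is rejected rather than coded.
            raise ValueError(f"{name}: target value lower than source value")
        pattern.append("+" if t > s else "-")
    return pattern


# Invented values for illustration only (the item stays a clitic: BD 2 > 2).
example_source = {"SI": 1, "PR": 1, "PM": 1, "BD": 2, "PV": 1, "SV": 1, "DC": 1, "AM": 1}
example_target = {"SI": 3, "PR": 1, "PM": 2, "BD": 2, "PV": 1, "SV": 2, "DC": 2, "AM": 1}
print("[" + " ".join(change_pattern(example_source, example_target)) + "]")
```

With these invented values the result corresponds to the pattern type [+ – + – – + + –] quoted above; changing only the Bondedness value of the target from 2 to 3 would model the clitic-to-affix change [2 > 3].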

Tab. 4: Grammaticalization database design illustrated for a selection of paths.

| Genus | Source | Source meaning | Target | Target meaning | Semantic Integrity | Phonetic Reduction | Paradigmaticity | Bondedness | Paradigmatic Variability | Syntagmatic Variability | Decategorization | Allomorphy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Manding | | ‘see, look’ (IMP) | | COPULA | + | − | + | − | − | − | + | − |
| Manding | nǎa jǎŋ | ‘come here’ | nǎŋ | centripetal marker | + | + | + | − | − | + | + | − |
| Manding | mɛ̂n / mîn | DEMONSTRATIVE | mɛ̂n / mîn | RELATIVIZER | + | − | − | − | − | − | − | − |
| Manding | | ‘say’ | | COMPLEMENTIZER | + | − | + | − | − | + | − | − |
| Manding | | ‘come’ | | FUTURE | + | − | + | − | + | + | + | − |
| Manding | kánàa (Bm kànâ, Koy ká) | ‘come’ (INCOMPLETIVE) | kánàa (Bm kànâ, Koy ká) | PROHIBITIVE marker | + | − | + | − | − | + | − | − |
| Manding | bǐn | ‘fall’ | bǐn | INCHOATIVE AUX | + | − | + | − | − | + | − | − |
| Manding | bán | ‘finish’ | bán | ‘already’ (particle in postverbal position) | + | − | + | − | − | + | + | − |
| Manding | bán-tá | ‘finish’ (+ COMPLETIVE) | bárá ~ bádá (Maninka-mori) | PERFECT predicative marker | + | − | + | − | + | + | + | − |

3.2 Coevolution of meaning and form: Frequencies of change of values for the eight individual parameters

In this section, we check the percentage of paths (N = 1003) for which there is a change in the grammaticalization value from source to target for each individual parameter. For that purpose, we check whether the value at the target is higher than the value at the source. If so, a given path is assigned a [+] value for a given parameter. If there is no change, the value is [–]. Based on the number of [+] values and [–] values, we calculate the percentage of change for each parameter separately. The results are presented in Figure 6.

Fig. 6: Frequency of parameter change.

Since Semantic Integrity (SI; change rate: 84 %) is the parameter that relates (exclusively) to meaning change, the hypothesis of the coevolution of meaning and form, or the Parallel Reduction Hypothesis, can be analyzed by comparing the extent to which the other parameters covary with that parameter. A look at Figure 6 shows that there are only two other parameters with comparable percentages of change frequency, i.e., Paradigmaticity (PM; change rate: 86 %) and Syntagmatic Variability (SV; change rate: 82 %). In contrast, the purely form-related parameters of Bondedness (BD; change rate: 24 %), Phonetic Reduction (PR; change rate: 17 %) and Allomorphy (AM; change rate: 9 %) show much lower percentages. The change rates for Decategorization (DC; 61 %) and Paradigmatic Variability (PV; 37 %) lie somewhere between these values. Note that the above results are more in line with the Meaning First Hypothesis than with the Parallel Reduction Hypothesis, inasmuch as they show that function-related parameters (Semantic Integrity, but also Paradigmaticity, which also relates to meaning) are first involved in grammaticalization processes, while form-related parameters (Bondedness, Phonetic Reduction) lag behind.

In the following sections we look in more detail at how the parameters covary for understanding processes of grammaticalization. Since the relation between (the loss of) Semantic Integrity and Phonetic Reduction is often seen as the cornerstone of meaning/form coevolution, we zoom into the change values of [+]/[–] at the level of individual values [1, 2, 3, 4] found in the combination of Semantic Integrity (SI) and Phonetic Reduction (PR). As one can see from Figure 7, the highest percentage of [+] changes in Semantic Integrity [SI+] is observed at the column representing value 3 [66 %], which is characterized by a relatively high degree of semantic abstractness (the linguistic sign has an abstract non-denotational/relational meaning). If one further checks the individual values of Phonetic Reduction within that column, including values with change from source to target ([+2, +3, +4]) and values with no change ([–1, –2, –3, –4]), one can see that the percentage of [+] values for Phonetic Reduction within value 3 of Semantic Integrity is small.

Fig. 7: Correlation between Semantic Integrity and Phonetic Reduction measured through binary features and parameter values.

These results show that in the majority of cases Semantic Integrity is not correlated with Phonetic Reduction. Overall, this result is rather in line with Meaning First Hypotheses like the Context Model of Grammaticalization (Heine, Claudi, and Hünnemeyer 1991; Narrog and Heine 2018) and the Invited Inference Theory of Semantic Change (Traugott and Dasher 2002), but it also suggests that the interaction of form and function is more complex and variegated. We will return to this point when discussing the patterns of correlations visualized by the heatmap plots in Figures 9 and 10 and the Network Graph in Figure 11.

The overall dependencies can be roughly captured by the following sequence starting from semantic factors on the left side and ending with formal factors on the right side, with Decategorization and Obligatoriness (Paradigmatic Variability) placed in between. The ranking (‘>’) represents relative frequency of individual parameters involved in grammaticalization and, more tentatively, implicational relations, where involvement of parameters lower on the hierarchy implies involvement of the parameters higher on the hierarchy. The bracketed parameters are likely to “tie” on this Parameter Hierarchy, i.e., they show a strong correlation which defies strict ranking:

(24) [SI, PM, SV] > DC > PV > [BD, PR, AM]

This hierarchy is partially in line with Narrog and Heine’s (2018) hierarchy in (1), as far as the same categories (parameters) are concerned; note in particular the sequence Semantic Integrity/Desemanticization > Decategorization > Phonetic Reduction/Erosion. Yet, the above hierarchy is more complex as it involves a number of additional parameters (and it also lacks a direct correspondence to Heine’s ‘extension’). Even so, the more complex hierarchy in (24) is still an oversimplified representation, as will be clear from the results of the correlational study in section 3.4 below.
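The frequency side of the Parameter Hierarchy in (24) can be retraced from the change rates reported in this section. The following sketch sorts the eight parameters by their reported change rate and groups neighbours whose rates lie close together. The grouping tolerance of 8 percentage points is an assumption chosen purely for illustration; the bracketing in (24) is based on the authors' assessment of correlations, not on such a cut-off, and the function names are hypothetical.

```python
# Change rates in percent of the 1003 paths, as reported in the text.
CHANGE_RATES = {
    "SI": 84, "PR": 17, "PM": 86, "BD": 24,
    "PV": 37, "SV": 82, "DC": 61, "AM": 9,
}


def rank_parameters(rates):
    """Order parameter names from the highest to the lowest change rate."""
    return sorted(rates, key=rates.get, reverse=True)


def group_ties(rates, tolerance):
    """Group neighbouring parameters in the ranking.

    A parameter joins the current group if its rate is at most `tolerance`
    percentage points below the previous parameter in the ranking.
    The tolerance is an illustrative assumption, not part of the study.
    """
    ranked = rank_parameters(rates)
    groups = [[ranked[0]]]
    for name in ranked[1:]:
        previous = groups[-1][-1]
        if rates[previous] - rates[name] <= tolerance:
            groups[-1].append(name)
        else:
            groups.append([name])
    return groups


hierarchy = " > ".join("[" + ", ".join(g) + "]" for g in group_ties(CHANGE_RATES, 8))
print(hierarchy)
```

Under this assumed tolerance the grouping matches the bracketing of (24), apart from the order of parameters inside the brackets, which (24) does not rank.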

3.3 Coevolution of meaning and form: Patterns of change

Each grammaticalization path can be characterized by the list of its [+] and [–] values for each of the eight parameters, which are arranged in the order of Semantic Integrity (SI), Phonetic Reduction (PR), Paradigmaticity (PM), Bondedness (BD), Paradigmatic Variability (PV), Syntagmatic Variability (SV), Decategorization (DC) and Allomorphy (AM) (cf. section 3.1). This yields patterns of change of the type [+ – + – – + + –], in which the first [+] value stands for Semantic Integrity, etc. (see also Table 5 below for a more explicit format of presentation). The total number of possible patterns equals 2^8 = 256. A look at the overall frequencies of individual patterns in Figure 8 reveals the striking result that there is only a small number of very frequent patterns. In fact, the five most frequent patterns cover almost half of the 1003 paths of our sample (i.e., 49.36 %). A relatively small number of only 17 patterns constitute about three quarters of our paths (74.98 %). The overall number of attested change patterns in our data is 100, which corresponds to 39.06 % of the set of possible combinations of parameter values.

Fig. 8: Frequency of parameter change patterns.

One particular result which stands out if one looks at the first five patterns in Figure 8 is the consistency of the [+] values for Semantic Integrity (SI), Paradigmaticity (PM) and Syntagmatic Variability (SV). Moreover, the values for Phonetic Reduction (PR) and Allomorphy (AM) are consistently [–]. For better readability, these patterns are represented again in Table 5. This representation points out again the covariation of Semantic Integrity with Paradigmaticity and Syntagmatic Variability.

Tab. 5: Five most frequent change patterns: binary values.

| Pattern | SI | PR | PM | BD | PV | SV | DC | AM |
|---|---|---|---|---|---|---|---|---|
| 1 | + | – | + | – | – | + | + | – |
| 2 | + | – | + | – | – | + | – | – |
| 3 | + | – | + | – | + | + | + | – |
| 4 | + | – | + | – | + | + | – | – |
| 5 | + | – | + | + | + | + | + | – |

The pattern with [+] values for all parameters, which would ideally be expected to be prominent in the case of the general coevolution of meaning and form (recall “the normal grammaticalization process” mentioned in the quote from Lehmann in FN 7 above), takes position 17 in Figure 8 and shows the low frequency of 1.3 %. In addition to this pattern, there are two more patterns with seven [+] values among the first 17 patterns in Figure 8 (patterns 9 and 11 with frequencies of 2.89 % and 1.69 %, respectively). These patterns with large numbers of [+] values are counterbalanced by the pattern on rank 7 (3.59 %), which consists exclusively of [–] values (see Section 4.2.2 for further discussion of patterns of “covert grammaticalization”). These results call for a more detailed calculation of correlations presented in the next section.
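The counts behind these figures can be sketched as follows. The code enumerates the space of binary patterns over eight parameters and computes, for any list of patterns, the share of the pattern space that is attested and the share of paths covered by the k most frequent patterns. The function names and the small pattern list are invented for illustration; only the figures 256 and 100 come from the text.

```python
from collections import Counter
from itertools import product

N_PARAMETERS = 8


def possible_patterns():
    """Enumerate every binary change pattern over the eight parameters."""
    return ["".join(p) for p in product("+-", repeat=N_PARAMETERS)]


def attested_share(patterns):
    """Percentage of the possible patterns that occur at least once."""
    return 100 * len(set(patterns)) / 2 ** N_PARAMETERS


def coverage(patterns, k):
    """Percentage of paths covered by the k most frequent patterns."""
    counts = Counter(patterns)
    covered = sum(n for _, n in counts.most_common(k))
    return 100 * covered / len(patterns)


# 100 attested patterns out of 256 possible ones, as stated in the text.
print(round(100 * 100 / len(possible_patterns()), 2))

# Invented toy sample of six paths, for illustration only.
toy_paths = ["+-+--++-"] * 3 + ["+-+--+--"] * 2 + ["--------"]
print(coverage(toy_paths, 1), attested_share(toy_paths))
```

Applied to the full set of 1003 annotated paths, `coverage` with k = 5 and k = 17 would correspond to the kind of figures quoted above, although the sketch itself does not reproduce them.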

3.4 Correlations between parameters

In this section we address the covariation of individual parameters, which will be visualized in the heatmap below. More specifically, we used R (R Core Team, 2018; version 3.5.2) to compute correlations between parameters in terms of Kendall’s τ (Kendall’s tau-b), treating coefficients with τ > 0.30 as high. The high correlation coefficients with τ > 0.30, representing covariation between [+] and [–] values across parameters, can be seen from the following heatmap:

Fig. 9: Heatmap of correlations between parameters based on [+]/[–] value changes.19 The heatmap shows the correlation coefficients (Kendall’s tau) for the correlations between the parameters Semantic Integrity (SI), Phonetic Reduction (PR), Paradigmaticity (PM), Bondedness (BD), Paradigmatic Variability (PV), Syntagmatic Variability (SV), Decategorization (DC) and Allomorphy (AM) with regard to whether a change along this parameter took place [+] or not [–]. Background colour corresponds to correlation coefficients. Relationships with coloured background are statistically significant (significance level: p < 0.05); non-significant relationships with background left blank.

19 The heatmaps were created by using the R package corrplot (Wei and Simko 2017). Parameter change is treated as an ordinal variable, with value [–] coded as 0 and value [+] coded as 1.
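For readers who wish to retrace the type of coefficient reported here, the following is a minimal sketch of Kendall's tau-b for two parameters coded as 0 ([–]) and 1 ([+]). It is a plain Python re-implementation for illustration, not the R code used in the study, and the two short value lists are invented.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences of ordinal values."""
    if len(x) != len(y):
        raise ValueError("both sequences must have the same length")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(x)
    all_pairs = n * (n - 1) // 2
    # Pairs tied on x (or on y) are removed from the respective pair count.
    tied_x = sum(c * (c - 1) // 2 for c in Counter(x).values())
    tied_y = sum(c * (c - 1) // 2 for c in Counter(y).values())
    denominator = sqrt((all_pairs - tied_x) * (all_pairs - tied_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined if one variable is constant")
    return (concordant - discordant) / denominator


# Invented binary codings of two parameters over four paths.
si_change = [1, 1, 0, 0]
sv_change = [1, 1, 0, 0]
print(kendall_tau_b(si_change, sv_change))
```

The pairwise loop is quadratic in the number of paths, which remains manageable for a sample of about a thousand paths.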

As this first preliminary correlation analysis reveals, there is a correlation (revealed in correlation coefficients with τ > 0.3)20 between Semantic Integrity (SI) and the parameters of Syntagmatic Variability (SV) and Paradigmaticity (PM) (cf. SI/SV, τ = 0.44, p < .001; SI/PM, τ = 0.4, p < .001). In addition to the dependencies noted in sections 3.2 and 3.3, there is also covariation with Decategorization (DC) (SI/DC, τ = 0.31, p < .001). These results partly invite a functional explanation inasmuch as function-related parameters show high correlation coefficients (Semantic Integrity/Paradigmaticity, Semantic Integrity/Syntagmatic Variability). In addition, we observe a correlation between Semantic Integrity and Decategorization. Given the low correlation coefficients between Semantic Integrity and the other parameters (Phonetic Reduction, Bondedness, Paradigmatic Variability and Allomorphy), these findings also challenge undifferentiated generalizations on the covariation of meaning and form. A look at the values higher than 0.3 shows that there are other correlations which are concerned partially or exclusively with form-related parameters: SV/DC, τ = 0.45, PM/SV, τ = 0.39, PR/BD, τ = 0.33 (all ps < .001). Some of these correlations are to be expected and have a straightforward explanation. In the case of Syntagmatic Variability/Decategorization, one would expect that restrictions of morpheme order come with the inability of the target item to keep up with the whole set of categorial features (Decategorization) associated with the source item. Similarly, the emergence of more strongly integrated paradigms presupposes reduction of freedom in morpheme order. This also applies to the correlation between the increasing tightness of the morphological integration of a linguistic item (Bondedness) and its phonetic substance (Phonetic Reduction, in particular, value 3 for subsyllabic morphemes and value 4 for suprasegmental features and zero markers).

Looking at the individual parameters, it is remarkable that Paradigmatic Variability (Obligatoriness) shows no values above 0.30 for Kendall’s τ. This might be partially due to the methodological issue that the assignment of values 2 and 3 offers some leeway for interpretation (see the discussion of the respective values in the Questionnaire in Chapter 2), but it may also indicate that obligatoriness is not as important for assessing degrees of grammaticalization as suggested in the literature, since only a subset of grammaticalized markers may become obligatory. We will take up the issue of Obligatoriness in the conclusions (§ 5.5). In the case of Bondedness, there is a correlation with Phonetic Reduction (BD/PR, τ = 0.33). Moreover, its correlation with Allomorphy almost reaches the critical value (τ = 0.3). As a next step let us consider the correlations between grammaticalization parameters as they emerge from our data when we look at specific parameter values

20 We assume a threshold of τ = 0.30 to distinguish between weak(er) and moderate/strong correlations, in accordance with Hemphill (2003). Of course, we are aware that thresholds/cut-off values may differ for different research questions and fields.

Fig. 10: Heatmap of correlations between parameters based on concrete parameters values. The heatmap shows the correlation coefficients (Kendall’s tau) for the correlations between the parameters Semantic Integrity (SI), Phonetic Reduction (PR), Paradigmaticity (PM), Bondedness (BD), Paradigmatic Variability (PV), Syntagmatic Variability (SV), Decategorization (DC) and Allomorphy (AM) with regard to parameter values (levels) [1], [2] [3] and [4]. Background colour corresponds to correlation coefficients. Relationships with coloured background are statistically significant (significance level p < 0.05); non-significant relationships with background left blank.

(levels: [1], [2] [3] and [4]), rather than binary features. These correlations are represented above as another heatmap (Figure 10) for purposes of better comparability. The comparison of the two heatmaps in Figures 9 and 10 shows interesting similarities and differences. For evaluating these differences, one should bear in mind that the map with the values in Figure 10 just estimates correlations between high vs. low values between individual parameters but is non-committal as regards whether the values changed on a path from source to target. Consider the example of the morphologization of negative auxiliaries in Tungusic languages discussed in section 3.4.1 below. While in North-Tungusic languages, negative auxiliaries retain their auxiliary status (as exemplified in [35] from Ewen), in East-Tungusic languages, they become encliticized and eventually suffixed (as exemplified for Nanai in [36]). Now, this change does not ostensibly affect Semantic Integrity, but it does affect Bondedness. Consequently, we have [–] for SI in our notations, and [+] for BD. It is also remarkable that only formal properties (Bondedness) are affected, while the semantic properties remain unchanged, a result which runs against the predictions of the Meaning First Hypothesis. Thus, there is a mismatch between the two parameters in this respect, i.e., there is no correlation in terms of binary features. When characterized in terms of parameter values, the situation is radically different. The negative suffixes are characterized as highly grammaticalized on the Bondedness parameter (which is [3] for agglutinating affixes), but they will also

feature high at the level of Semantic Integrity (which is [3] for grammatical nonsyntactic markers). For that reason, Semantic Integrity and Bondedness do show a correlation on this path if measured in terms of values rather than binary features. This has to do with the fact that the estimation in terms of feature values looks for general correlations between high vs. low values for individual parameters, while it is non-committal as to whether the values underwent any change on the path from source to target. Turning back to the comparison of the two heatmaps, some converging results can be observed but one can also discern additional correlations in the second heatmap. In particular, the second map confirms the correlation between (the reduction of) Semantic Integrity and Paradigmaticity (τ = 0.37, p < .001), also found on the heatmap with binary features. However, the latter heatmap additionally shows a correlation with Paradigmatic Variability (τ = 0.33, p < .001), which was not visible on the binary feature map. Looking at interactions among the more form-based parameters we note a correlation between Bondedness and Allomorphy (τ = 0.33, p < .001), which is almost there also on the binary feature map (τ = 0.3, p < .001). But now, Bondedness also shows high correlation coefficients (τ > 0.30) with Syntagmatic Variability, Decategorization, and marginally (borderline case τ = 0.29) with Paradigmatic Variability (all ps < .001). The very strong correlation with Syntagmatic Variability (τ = 0.69, p < .001) is expected given that higher values of Bondedness necessarily involve restrictions in Syntagmatic Variability, but the connections with the other parameters are subtler even though not totally unexpected. Also, some other correlations are more visible on the value map, such as the connections of Paradigmaticity to Syntagmatic Variability and Paradigmatic Variability, but also to Bondedness. 
An overall picture revealed by the heatmap with values is therefore more coherent in the sense of correlations between multiple parameters. Moreover, Bondedness seems to show correlations with both the function-related parameters, on the one hand, and the form-related parameters, on the other hand. As an attempt to capture correlations between individual parameters, revealed by both heatmaps, we introduce the following Network Graph (Figure 11), where moderate/strong correlations (with high correlation coefficients) are represented by lines. The thick lines represent correlations which are visible on both heatmaps, thinner lines those found exclusively on the map with feature values (Figure 10); the broken line indicates the borderline case (τ = 0.29 in Figure 10). Overall, Semantic Integrity seems to involve stronger interactions with function-related parameters such as Paradigmaticity and Syntagmatic Variability, while Phonetic Reduction only shows a strong correlation with Bondedness. Thus, Bondedness seems to be a parameter subject to covariation both with form-related parameters (Phonetic Reduction, Allomorphy) and with function-related parameters (Paradigmaticity, marginally also with Paradigmatic Variability). Furthermore, as one can see from the heatmap on feature values (Figure 10), Bondedness shows a very high correlation coefficient (τ = 0.69, p < .001) with the syntactic parameter of Syntagmatic Variability. This is expected, given that Bondedness can be seen as a manifestation of Syntagmatic Variability in the domain of morphology.

Fig. 11: Network graph showing correlation between individual parameters.

One way to interpret the central position of Bondedness in the network is to relate it to two different scenarios of grammaticalization, the function-driven grammaticalization, on the one hand, and the form-driven grammaticalization on the other (labelled “Grammaticalization from above” vs. “Grammaticalization from below” in section 4.2.1). Needless to say, the correlations displayed in the graph above need to be confirmed (replicated) on the basis of a larger database. Moreover, additional analyses will have to be conducted for detecting unilateral dependencies between individual parameters. In spite of this, the graph in Figure 11 in its present form already offers much more specific claims about possible correlations between individual parameters than those that are found in the literature or can be deduced from the unidimensional Parameter Hierarchy in (24). A comparison of the Network Graph in Figure 11 with the unidimensional hierarchy in (24) above reveals interesting similarities and differences. On the one hand, both the Graph and the Hierarchy indicate some co-dependencies of parameters either related to meaning (in particular, Semantic Integrity with Paradigmaticity, as well as with Syntagmatic Variability), or to form (Bondedness, Phonetic Reduction, Allomorphy). These co-dependencies are visible as correlations (indicated by arrows) on the Graph, and are indicated by bracketing in the Hierarchy (24): [SI, PM, SV] > DC > PV > [BD, PR, AM]. On the other hand, the Graph in Figure 11 shows a more complex pattern of correlations between parameters, as compared to the Hierarchy.
This difference, has to do with the fact that the Hierarchy is more general in nature: differently from the Graph it reflects not just correlations, but also frequencies of individual parameters, frequencies on individual paths and, by extension, cross-linguistic frequencies. Therefore, the patterns displayed by the Graph and the Hierarchy are of complementary nature.
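The step from pairwise rank correlations to the lines of a network graph can be sketched as follows. This is a minimal illustration only: the use of Kendall's tau-b, the cut-off of 0.3 (chosen here because τ = 0.29 is called a borderline case) and the toy value lists are all assumptions and do not come from the database behind Figures 10 and 11.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b for two equally long sequences of ordinal values."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: enters neither count
        if dx == 0:
            ties_x += 1  # tied on x only
        elif dy == 0:
            ties_y += 1  # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else 0.0


def correlation_edges(values, threshold=0.3):
    """Return parameter pairs whose |tau| reaches the (assumed) threshold."""
    edges = {}
    for a, b in combinations(sorted(values), 2):
        tau = kendall_tau_b(values[a], values[b])
        if abs(tau) >= threshold:
            edges[(a, b)] = round(tau, 2)
    return edges


# Hypothetical parameter values (scale 1-4) for six invented paths.
toy_values = {
    "BD": [1, 2, 2, 3, 4, 4],
    "SV": [1, 1, 2, 3, 3, 4],
    "AM": [2, 1, 2, 1, 2, 1],
}

print(correlation_edges(toy_values))
```

With these invented values only the pair BD and SV is kept as a line; the two pairs involving AM fall just below the assumed cut-off and would therefore not be drawn.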


3.5 Areality in scenarios of grammaticalization

In analogy to section 3.3, we first look at the five most frequent change patterns of [+] and [–] values in the two macroareas21 for which we have the largest number of paths (= N), i.e., for Eurasia with 574 paths and for Africa with 245 paths (cf. Figures 12A and 12B, respectively). As the juxtaposition of the change patterns in Africa and Eurasia in Figures 12A and B shows, there are various differences, but these differences are not always easily interpretable. Thus, our data suggest that Decategorization overall plays a more important role in grammaticalization processes in Eurasia than in Africa, since the most frequent pattern in Eurasia (22.47 %), [+ – + – – + + –], features Decategorization in addition to increased Semantic Integrity, increased Paradigmaticity, and increased Syntagmatic Variability. In Africa, this pattern ranks second (25.31 %), while the most frequent pattern (32.65 %) involves increased Semantic Integrity, increased Paradigmaticity, and increased Syntagmatic Variability, but no increase in Decategorization. In our sample, Eurasia also shows higher rates for Paradigmatic Variability (cf. the second most frequent pattern [+ – + – + + + –] with 11.67 %), in spite of the fact that South East Asian languages generally do not feature obligatory categories (cf. Bisang 2011). Needless to say,

Fig. 12A: The five most frequent change patterns [+]/[–] in Africa.

21 The two macroareas of Africa and Eurasia correspond to those defined in Glottolog. They also correspond to the large linguistic areas in Dryer (1992).


Fig. 12B: The five most frequent change patterns [+]/[–] in Eurasia.

Tab. 6: Occurrence of parameter change patterns in Africa and Eurasia.

                      N = 32 patterns   N = 44 patterns   N = 7 patterns   N = 173 patterns
Eurasia (574 paths)   x                 x                 ø                ø
Africa (245 paths)    x                 ø                 x                ø

these conclusions should not be taken at face value until they are replicated on a larger sample. The data in Table 6 above contrast Eurasia with Africa and further highlight the differences between these two macro-areas. As one can see, Eurasia and Africa share 32 patterns (x = presence of N, ø = absence of N). There are 44 additional patterns which are only attested in Eurasia and another seven patterns which are limited to Africa. Moreover, 173 patterns are unattested in both macro-areas. This difference between Eurasia and Africa is significant (χ²(1) = 15.35, p < .001). While the analysis above clearly shows that there are areal differences in grammaticalization scenarios, the interpretation of these differences is more challenging, at least at the level of macro-areas. For that reason, we look for potential areal


differences between Africa and two subareas of Eurasia, i.e., Europe22 and Southeast Asia23 in a second analysis (cf. Figure 13). The data for Africa consists of 229 paths, while there are 95 paths for Europe and 129 paths for Southeast Asia. Since these figures are comparatively small, the results discussed with regard to these three areas can only be taken as some first indicators which need further testing on a larger database. Comparing the percentages of [+] values for change in the individual parameters (cf. section 3.2) in Figure 13, one can see that the overall distribution of the three parameters with the highest percentages (Semantic Integrity, Paradigmaticity, Syntagmatic Variability) and with the lowest percentages (Bondedness, Phonetic Reduction, Allomorphy) remains the same in each area (cf. section 3.4), confirming the Parameter Hierarchy in (24) above: [SI, PM, SV] > DC > PV > [BD, PR, AM]. What can be observed is that the two medial parameters on the hierarchy show divergent values in each area (Decategorization: 88 % in SE Asia, 60 % in Europe and 40 % in Africa; Paradigmatic Variability: 47 % in Europe, 23 % in SE Asia and 13 % in Africa). Moreover, the ranking of the three parameters with the lowest percentages differs in each area. The value for Phonetic Reduction is 0 % in Southeast Asia, in accordance with the observations by Bisang and others that Phonetic Reduction plays a limited role in EMSEA languages (cf. section 4.3.2). Less expected is the relatively low percentage of cases of functional parameters (SI, PM, SV) in Europe, as compared to Asia and Africa, but this is due to processes of “secondary grammaticalization” in European languages which do not necessarily involve semantic reduction. Indeed, secondary grammaticalization, understood as later developments “from grammatical to more grammatical” (cf. Norde 2012), does not necessarily reduce Semantic Integrity (in our metrics), as, for example, in the case of directional prefixes evolving into derivational aspect markers in Slavic mentioned in § 4.2.1 below. Further potential areal differences show up if one zooms into the individual values [1, 2, 3, 4] of the individual parameters. For the purpose of this study, we illustrate this only for the parameter of Paradigmaticity. Figure 14 presents the frequency ratios of change for Paradigmaticity in Africa, Europe and Southeast Asia. The comparison of values 2 and 3 shows certain similarities between Africa and SE Asia as far as ratios of parameter values are concerned, i.e., proportions of the respective parameter value in paths with a change (value 2: 74 % in Africa, 68 % in SE Asia; value 3: 18 % in Africa, 32 % in SE Asia), while Europe shows different frequency ratios of change (value 2: 25 %; value 3: 60 %). Moreover, there is no value 4 for Paradigmaticity in our data on SE Asian languages. This is to be expected,

22 For the purpose of this analysis languages of Europe are restricted to Germanic, Romance and Slavic languages (the data from Uralic languages are not included, since they are not restricted to Europe).

23 We use ‘Southeast Asia’ as a shorthand for East and mainland Southeast Asia (EMSEA). Our data are from three languages: Mandarin, Thai and Khmer.


Fig. 13: Frequency of parameter change (binary coding) for Europe, South East Asia and Africa.


Fig. 14: Comparison of Paradigmaticity ratios (parameter values) for Europe, South East Asia and Africa.

given that grammaticalization in SE Asian languages does not yield coherent paradigms (for some exceptions, cf. Bisang [2020]). Thus, as in the case of grammaticalization paths, areal variation in grammaticalization scenarios reveals itself again more clearly on a smaller scale. Moreover, the impact of typological factors is more noticeable in the context of scenarios, as compared to the context of paths, which is more prone to contact-induced replication.
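The significance value reported for Table 6 can be retraced with a short calculation. The sketch below is one reading that happens to reproduce the reported figure, not a statement of the procedure actually used: it assumes a Pearson chi-square test without continuity correction on a 2 × 2 table of area (Eurasia, Africa) by pattern status (attested, unattested), with 2⁸ = 256 possible patterns per area. The function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Counts as given in the text for Table 6.
shared, eurasia_only, africa_only, unattested = 32, 44, 7, 173

possible = 2 ** 8  # eight parameters, each either [+] or [-]
assert shared + eurasia_only + africa_only + unattested == possible

eurasia_attested = shared + eurasia_only  # 76 patterns
africa_attested = shared + africa_only    # 39 patterns

# Rows: Eurasia, Africa; columns: attested, unattested (assumed layout).
chi2 = chi_square_2x2(
    eurasia_attested, possible - eurasia_attested,
    africa_attested, possible - africa_attested,
)
print(round(chi2, 2))  # 15.35
```

Under these assumptions the statistic has one degree of freedom and matches the reported χ²(1) = 15.35. Other layouts of the same counts, for instance cross-tabulating attestation in Eurasia against attestation in Africa, give a different value.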

4 Typological variation in grammaticalization: A discussion

4.1 Setting the stage

Describing phenomena of grammaticalization in an individual language or language family often presents challenges, particularly if the language(s) in question are not that frequently discussed in the context of grammaticalization. The following list of features which are characteristic of Australian languages as presented in the chapter of McGregor is a good illustration of both potential candidates for typological variation (paths and scenarios) and the problems associated with claims of potential generalizations:

I. Prosodic changes do not always accompany grammaticalization, and thus bound items deriving from separate words may retain status as full phonological words;
II. Recategorialisation is common, though often with retention of (some or all) morphological properties of the item: that is, it is not decategorialised;
III. Only a small fraction of the grammatical morphemes of a typical language can be traced back to anything else with any degree of certainty;
IV. There is overall limited evidence of lexical sources for grammatical markers – evidence is generally best in the verbal domain;
V. Metaphoric transfer plays a limited role in grammaticalization.

Not all features seem to be of an equal status in this list. The first two features pertain to parameters of grammaticalization. The first one is related to Phonetic Reduction and Bondedness in our analysis, while the second one corresponds to Decategorization. The next two features address the methodological challenges of determining paths of grammaticalization from the synchronic and diachronic linguistic data as they are currently available. In such a context, the limited presence or even the absence of cases of documented grammaticalization in a language may suggest its reduced importance in this language (which might rely on other mechanisms of change, including morphological and syntactic reanalysis instead). But it could also be the result of its diachronic blurring to the extent that it is no longer observable today. This methodological problem will always set certain limits to the reliability of statistical analyses which is hard to quantify and will not be further discussed in this paper. In spite of this, the correlations in sections 3.3 and 3.4 and the overall frequency of certain patterns of parameter change in general and within macro-areas (section 3.5) are robust as far as one can see from our database. A more specific look at feature IV reveals that there are areal differences. In Australian languages, lexical sources can be detected more easily if they belong to the domain of the verb. In contrast, Quechuan languages allow the reconstruction of nominal morphology more readily than verbal morphology (cf. Adelaar, this volume). Another areal property of Australian is the minor importance of metaphoric transfer (feature V). As McGregor (this volume) points out, metaphoric transfer is prominent in many languages (cf. the transfer from  to ) but it is of minimal relevance in Australia. This short description of the specific situation of Australian languages nicely shows the context in which scenarios of grammaticalization must be discussed. 
Apart from the general problem of the extent to which phenomena of grammaticalization are still accessible, there is the issue of parameters and their interaction or non-interaction and there are various aspects of language-specific, genus-specific or area-specific properties. These specific properties should not be underestimated when discussing correlations between parameters. After all, grammaticalization never happens out of the blue. It is always related to the properties which are already there in a language and which have their effects at the time when a given  concept starts developing into a new  concept. The list of potentially relevant properties is long. They cover structural factors like word order or phonological constraints as well as morphological typology, the competing motivations of explicitness vs. economy and, finally, areality. Not all of these factors will be treated in equal detail, but they should at least be mentioned, since they seem to have their impact on the outcome of grammaticalization processes in the languages under investigation. The scenarios of grammaticalization as they show up in the interaction of parameters are the object of section 4.2. This section also includes a perspective on the taxonomy of grammaticalization processes, as well as related processes which share some features with grammaticalization (frequently subsumed under the rubric of reanalysis). Section 4.3 discusses typological and areal variation in grammaticalization on two levels. At the micro-level of specific constructions, we discuss factors which facilitate (or inhibit) processes of grammaticalization (cf. section 4.3.1). At the macro-level, the discussion is extended to the role of morphological types, following up on the work by Bisang (2011) on SE Asian languages of isolating typology (cf. section 4.3.2). Finally, the question of how grammaticalization processes relate to constructionalization is discussed in section 4.4.

4.2 Grammaticalization scenarios and parameters: qualitative reassessment

4.2.1 Frequent and rare scenarios

Our description of scenarios of grammaticalization in section 3 was based on the importance of individual parameters (section 3.2), on correlations between different parameters (section 3.3) and on patterns of parameter change (sections 3.4 and 3.5). This section will discuss prominent and rare scenarios of grammaticalization with examples from individual chapters. It will also address what we call ‘covert grammaticalization’, i.e., phenomena which do not manifest themselves in change of parameter values, and may be regarded as reanalysis rather than grammaticalization. Starting out from patterns of parameter change, it may come as a surprise to some readers that the prototypical case of grammaticalization with value changes in all parameters [+ + + + + + + +] is not among the dominant patterns. In fact, it takes the last position in Figure 8 (position 17 with a frequency of 1.3 %), even though it belongs to the more frequent patterns in the overall dataset. This may be partly due to the fact that certain parameters are affected infrequently in our data. A good case in point is the parameter of Allomorphy, which is attested only for 9 % of our 1003 paths. More generally, as one can also see from Figure 6, there is an overall


domination of function-related parameters over the form-related parameters (Bondedness, Phonetic Reduction) which occur at the lower end to the right of Figure 6. In diachronic terms, this suggests that form-related changes take place later than meaning related changes, which is in accordance with the Meaning First Hypothesis. This can be seen most clearly from chapters on variation in grammaticalization across languages within families. For example, while the verb ‘do’ shows grammaticalization across all Iranian languages, only Chorasmian shows the most advanced stage of grammaticalization, where the auxiliary is suffixed to the verb and phonetically reduced to k-:

(25) Chorasmian (Korn, this volume, ex. [37])
ka=fa=ma ne=pard=k-i
for==1 =restrain.=-2
‘for you cannot restrain me’

Overall, the most frequent type of grammaticalization attested in our data involves auxiliation, or, in a broader perspective, functionalization as it is introduced by Michaelis and Haspelmath (this volume) for cases of lexical items acquiring grammatical functions without concomitant downgrading, i.e., cliticization, agglutination or fusion. The most common type of functionalization in the verbal domain is auxiliation (in terms of Kuteva 2004), while the development of relational nouns into adpositions is most common in the nominal domain. The dominant pattern of functionalization involves Semantic Integrity plus increased Paradigmaticity and reduction of Syntagmatic Variability (see frequency distributions shown in Figure 8). The parameters of Bondedness, Phonetic Reduction and Allomorphy are of minor importance (see the Parameter Hierarchy in [24] above). The parameters of Decategorization (DC) and Paradigmatic Variability (PV/ Obligatoriness) take an intermediate position on the Hierarchy and create some subvarieties of auxiliation. 
Both distinctions are somewhat problematic inasmuch as certain distinctions, in particular the definition of specific values, are subject to stipulations (see chapter 2 for further discussion). In spite of these problems, these subvarieties can be distinguished with respect to individual languages, or across related languages. Thus, distinctions in degrees of Obligatoriness are well illustrated in different extensions of articles in varieties of Germanic, as discussed by Nübling and Kempf (this volume), and in Austronesian languages (Sulawesi), as discussed by Himmelmann (this volume, especially his Table 7). Himmelmann’s (2004) expansion-based approach to grammaticalization provides a good framework for capturing different degrees of obligatoriness in terms of context-expansion. The limitations of Decategorization in the context of high degrees of Semantic Integrity and Bondedness is nicely illustrated by verbal cases in Kayardild (Australian) as they are discussed in another context by Malchukov (2009). Verbal cases in Kayardild, as analyzed by Evans (1995), are truly remarkable in that they behave


syntactically like other cases but are additionally marked for verbal categories. Thus, the recipient in (26) is marked by the “verbal dative” which further copies the imperative inflection from the main verb (differences between the forms of the imperative and the verbal dative stem from differences in conjugations; conjugation markers are cumulative with the imperative in the examples below; [Evans 1995: 165]):

(26) Kayardild (Evans 1995: 336)
wuu.ja wirrin-da ngijin-maru.th
give. money- 1-v.
‘Give me the money!’

As observed by Malchukov (2009: 647), this pattern remains puzzling until one realizes that verbal cases originate from noun-verb compounds which explain their verbal inflection (the verbal dative in [26] stems from maru.tha ‘put’; Evans [1995: 164]). The diachronic scenario itself is common – many case markers originate from verbs, especially in languages with verb serialization. What is typologically extraordinary is that the verb becomes bound without being decategorized in the course of grammaticalization. Note also that this pattern, while truly remarkable, may be a reflection of areal trends in Australian languages. Recall that McGregor (this volume) notes absence of Decategorization as one of the general traits of Australian languages (cf. section 4.1). While the patterns of functionalization/auxiliation emerge as the most frequent pattern in our data, it is also instructive to consider rare types of scenarios, particularly those in which only one parameter has a [+] value. This has to do with the fact that Semantic Integrity is a very prominent parameter which occurs in combination with other parameters. Notice that the first pattern of change in Figure 8 in which the first value for Semantic Integrity is [–] is [– – – – – – – –], i.e., the pattern of ‘covert grammaticalization’ which cannot be detected by our metrics. Conversely, if other parameters are involved, Semantic Integrity will be involved as well. 
Exceptions pertain to special cases, as when negation is encliticized in East-Tungusic (cf. example [36] from Nanai below) without ostensible semantic reduction. Cases with Bondedness or Phonetic Reduction as the only parameters with a [+] value are rather exceptional. Cases in which only Phonetic Reduction and Bondedness are involved have been discussed earlier in terms of ‘chunking’ (Bybee and Beckner 2015: 508–509; cf. Krug 2000) or ‘univerbation’ (Lehmann 2002: 135; Lehmann forthcoming), and will be characterized as instances of “grammaticalization from below” in section 4.3.2 below. Some of these cases, like the development of inflected prepositions in Germanic discussed by Nübling and Kempf (cf. beim < bei + dem [ ] and the like), may be regarded as counterexamples to the Meaning First Hypothesis. They show an unusual scenario in which the form of the grammaticalized item (the article) is affected (Phonetic Reduction, Bondedness) rather than semantics. It should be noted though (as also pointed out to us by Damaris Nübling and Christian Lehmann, p.c.) that certain aspects of meaning change seem to be involved in this development as well; in particular the encliticized definite article cannot be used in a deictic function. And of course, it does not contradict the more general view on meaning-form covariation, given that articles already have an abstract meaning (recall the discussion of the two heatmaps above). Cases with a [+] value only in Paradigmaticity are unusual as well, though some cases of secondary grammaticalization may serve as examples. Thus, the development of the Slavic type of aspectual systems formed by directional prefixes increases the Paradigmaticity value (and possibly the value for Obligatoriness) but not the values for the other parameters. As is pointed out by Wiemer (this volume, § 1), this is an unusual scenario: “Ubiquitous and pervasive as this system is for all Slavic languages, it is highly ‘inconvenient’ in terms of standard parameters of grammaticalization.” This phenomenon, however, has typological parallels. One of them is the case of classificatory verbs with aspectual function in Nyulnyulan (McGregor, this volume), which are also ambiguous in relation to their status in terms of grammaticalization or lexicalization. The loss of Syntagmatic Variability as a single parameter independently of Semantic Integrity is another rare phenomenon. The clearest cases are those that hardly count as grammaticalization like the rise of V2 word order in Germanic discussed by Nübling and Kempf (this volume).24 As Bisang (2004) pointed out, loss of Syntagmatic Variability is a very prominent parameter in EMSEA languages but here it is mostly combined with loss of Semantic Integrity. Similarly, Allomorphy as a single signal of grammaticalization is highly unusual. 
If this pattern occurs at all, it is mostly found in cases of secondary grammaticalization in which other parameters changed their values at previous stages. This also explains some patterns in which Phonetic Reduction and Bondedness are not accompanied by semantic reduction, contrary to a general trend. After all, both negation in the Tungusic examples mentioned below in (35) and (36) and the suffixed articles in Germanic involve semantically highly grammaticalized items. Such cases involve “secondary grammaticalization” where an affix is not semantically generalized by our metrics but may undergo phonetic reduction.

24 But can count as ‘constructionalization’, which is a broader notion, since it does not need to involve a grammaticalized item (lexical source/construction marker); cf. Noël (2007); Traugott (2015). In the grammaticalization literature, word order change is most often treated as (syntactic) reanalysis not reducible to grammaticalization (cf. Hopper and Traugott 2003: 59–63).


4.2.2 “Covert grammaticalization” and reanalysis

This section is about cases of ‘covert grammaticalization’ that are not detected through our list of parameters. It should be noted that some of them might be discovered under different assumptions or with different lists of parameters. Thus, even if we generally follow Lehmann (1995), we did not use his parameter of Structural Scope. Similarly, other parameters mentioned in the literature can be added, such as Traugott and Dasher’s (2002) (Inter)subjectification. With these qualifications in mind, it is still revealing to look at cases of change which do not count as grammaticalization by our criteria, i.e., cases of [– – – – – – – –]. As it turns out, such cases do not qualify as typical instances of grammaticalization in the view of our contributors either. Thus, Himmelmann (this volume) finds only a few typical cases of grammaticalization in Sulawesi. What he finds, however, is a lot of reanalysis. The origin of a specific type of person conjugation expressed by prefixes is explained by Himmelmann (this volume, § 3.3) as follows: “It seems very likely that the prefixal series has its origins in a construction with the following features: an auxiliary-like matrix predicate followed by a subordinate predicate usually in a special subjunctive mode.” The following scheme from Himmelmann (this volume, ex. [40]) summarizes these developments:

(27) possible source constructions for Sulawesi person markers
a. [(XPi) [predicate/aux=pron.clitic i [verb YP]]] → [proclitici=verb XPi YP]
b. [(XPi) [aux=2nd position clitici verb YP]] → [proclitici=verb XPi YP]

The situation in Lauje in (28) represents a more advanced stage of reanalysis, while the situation in Cebuano represents the original construction. Due to reanalysis, genitive actor enclitics (niya in [29]) become proclitics and thus lose their status as second position clitics:

(28) Lauje (Himmelmann, this volume, ex. [27])
láupe 'u-otoi / no-'ootoi-im? / no-'ootoí-ny
not.yet -know / .-know-. / .-know-.
‘I don’t know yet.’ / ‘You don’t know yet?’ / ‘She doesn’t know yet.’

(29) Cebuano (Zorc 1977: 151)
waʔ niya saky-í ang taksi
. 3. ride.on-.  taxi
‘He did not ride in the taxi.’

Another example which also qualifies as reanalysis and is difficult to capture in terms of grammaticalization parameters is concerned with the origin of the passive marker in a preposition (for details, cf. Himmelmann, this volume).


Similar cases of syntactic reanalysis are attested in the following chapters: reanalysis of a ‘transimpersonal’ construction in Ket (cf. Vajda, this volume), reanalysis of a postposition to an auxiliary in Manding (cf. Creissels, this volume), reanalysis of clefts resulting in a new constituent order in Mayan (cf. Lehmann, this volume), insubordination in Korean (cf. Rhee, this volume) plus examples from Tungusic (cf. Malchukov, this volume), Cushitic (cf. Vanhove, this volume) and Romance (cf. Cennamo, this volume). The scenario of insubordination is especially telling as it shows unusual properties from the perspective of grammaticalization. Insubordination, as defined by Evans (2007), involves the reanalysis of an erstwhile subordinate clause into a main clause, accompanied by the loss of the matrix verb and the reanalysis of the subordinate (often non-finite) inflection in terms of finite inflection. While the two processes involved (Phonetic Reduction and semantic shift) are typical of grammaticalization, insubordination is peculiar in that Phonetic Reduction and semantic change target different constituents of the underlying structure. Thus, the deletion of the matrix verb in insubordination can hardly be seen as Phonetic Reduction, as it pertains to another element of the source construction (see also below). The following example illustrates insubordination in Korean, where non-finite forms (“connectives” or converbs) develop into finite forms (“sentence-enders”):

(30) From Connective to Sentence-Ender (Rhee, this volume, ex. [38])
a. -ketun Hypothetical conditional, Comparative conditional → Topic presentation, Reason, Incidentality
b. -nikka Cause/Reason, Contingency, Adversative → Addressee reconfirmation, Protest, Assertion
c. -myense Concurrence, Contrast → Addressee confirmation, Challenge, Derisive
d. -(nu)ntey Background, Adversative → Surprise, Reluctance, Reason, Background
e. -key Mode → Exhortative, Dubitative

Malchukov (2013, this volume) distinguishes between insubordination, on the one hand, and verbalization, on the other hand, as two related but different scenarios. In insubordination, the complement or adjunct clause is reanalyzed as a matrix clause, while the matrix verb is omitted. In verbalization, reanalysis involves a nominal predicate, which is reanalyzed as a verbal predicate. As shown by Malchukov (2013) for Tungusic, both scenarios may yield finitization of a participial predicate but proceed in somewhat different ways. The two scenarios can be schematically illustrated as follows:

Scenario 1. “Verbalization”: Reanalysis of nominal predicate into a verbal predicate
[Sb] [N/Part] [COP] → [Sb] [V2 Aux] ( → [Sb] [V])


(31) Ewen (Malchukov, this volume, ex. [28])
a. Bej [hör-če] [bi-si-n].
man go-. be--3
‘The man had left.’
b. Bej [hör-če bi-si-n].
man go-. be--3
‘The man had left.’

Scenario 2. “Insubordination (proper)”: Reanalysis of sentential arguments as main clauses
[Sb Part-agr.pos] [COP] → [Sb Part-agr.pos] ø → [Sb] [V-agr.pos]

(32) Ewen (Malchukov, this volume, ex. [27])
a. [Bej-il hör-ri-ten] bi-d’i-n.
man- go-.-3. be--3
‘The men probably left.’ (lit. ‘The men’s leaving will be.’)
b. Bej-il hör-ri-ten.
man- go--3()
‘The men left.’

As pointed out above, it is difficult to characterize insubordination in terms of grammaticalization parameters because different parameters are targeting different constituents (the matrix verb is deleted, while the participial form shows semantic change and increased obligatoriness). Verbalization is closer to typical cases of grammaticalization (contraction of the verbal predicate) but shows some common properties with insubordination (auxiliary omission). More generally, syntactic reanalysis pertains to more complex scenarios. The grammaticalization of a particular element is just one side of the development but what makes the situation more complex is that different grammaticalization parameters, if invoked, tend to be distributed across different items. As Harris and Campbell (1995: 65) observe, “many instances of reanalysis involve more than one of these aspects of underlying structure at once”. Another interesting case of reanalysis which is not “detectable” through our set of grammaticalization parameters is the reanalysis of the imperfective/simultaneous marker as a marker of switch-reference in Mian (cf. Fedden, this volume).

(33) Mian (Fedden, this volume, ex. [33])
ngaan-b-e=a naka=i wentê-n-ib=a
call.-(.)-3..= man=. hear.--3..=
‘While he was calling, the men heard (him), and then …’ [Dafinau]


As observed by Fedden (this volume), who follows Longacre (1983), this reanalysis is based on the inference that simultaneous actions are likely to involve different participants, while successive actions are associated with the same participants. Even though this may be regarded as a nice example of inference-based grammaticalization, it is not “detectable” through our set of grammaticalization parameters. Further, there are cases of “covert grammaticalization”, which are closer to bona fide cases of grammaticalization, even though they cannot be identified as such by our list of parameters and respective values. These include cases in which demonstratives develop into (3rd person) pronouns as in Mountain Ok and Cushitic. Such cases do not involve increasing values for any of our parameters, since both linguistic signs count as abstract and referential (value 2 for Semantic Integrity), and do not show any increase in Paradigmaticity, Syntagmatic Variability, Decategorization and Obligatoriness, nor do they show any Phonetic Reduction or Bondedness. The characterization of such cases as covert grammaticalization, however, crucially depends on the precise definition of particular values of individual parameters, and on some slightly different assumptions, they may be brought in line with more typical cases of grammaticalization (C. Lehmann, p.c.). Some other cases of covert grammaticalization include the latest stages of grammaticalization, which might be described as “tertiary grammaticalization”, exaptation, or, more broadly, morphological reanalysis (what Dahl [2004] calls “maturation”). Several examples of the sort have been discussed in the chapter on Romance languages, in cases in which previous Latin gender/number classes were reanalyzed in terms of other classes (Cennamo, this volume, Table 1); cf. neuter plural (-a) > feminine singular folium-a ‘leaf’; It. foglia ‘leaf’, Pt. folha, Cat. folla, Ro. foaie.
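The way a path ends up as [– – – – – – – –] in the metrics used here can be made concrete with a small sketch. It is illustrative only: the order of the eight parameters in the pattern, the function names and all values other than the Semantic Integrity values mentioned in the text (value 2 for demonstratives and pronouns; 4 for bound case markers and 3 for purposive complementizers, as noted in the discussion of degrammaticalization) are assumptions. An ASCII hyphen stands in for [–].

```python
# Assumed order of the eight parameters within a change pattern.
PARAMETERS = ("SI", "PR", "PM", "BD", "PV", "SV", "DC", "AM")


def change_pattern(source, target):
    """'+' where the target value exceeds the source value, '-' otherwise."""
    return "".join("+" if target[p] > source[p] else "-" for p in PARAMETERS)


def classify(source, target):
    """Rough three-way classification of a source-to-target path."""
    if any(target[p] < source[p] for p in PARAMETERS):
        return "degrammaticalization candidate"
    if change_pattern(source, target) == "-" * len(PARAMETERS):
        return "covert (no parameter change)"
    return "grammaticalization"


def values(**overrides):
    """Hypothetical value set: 1 everywhere unless overridden."""
    return {**dict.fromkeys(PARAMETERS, 1), **overrides}


# Demonstrative > 3rd person pronoun: SI stays at 2, nothing else changes.
demonstrative, pronoun = values(SI=2), values(SI=2)

# Bound case marker (SI 4) > purposive complementizer (SI 3).
case_marker, complementizer = values(SI=4, BD=3), values(SI=3, BD=3)

# Invented auxiliation path with change in SI, PM and SV only.
lexical_verb, auxiliary = values(), values(SI=2, PM=2, SV=2)

print(change_pattern(demonstrative, pronoun), classify(demonstrative, pronoun))
print(change_pattern(lexical_verb, auxiliary), classify(lexical_verb, auxiliary))
print(classify(case_marker, complementizer))
```

The sketch also shows why the classification depends on how the values are defined: if demonstratives and pronouns were assigned different Semantic Integrity values, the first path would no longer come out as covert.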

Degrammaticalization

There are few clear cases of degrammaticalization in our data (only eight examples altogether), and these are rather heterogeneous. Moreover, some of them may be regarded as artefacts of our way of measuring grammaticalization parameters. Thus, the development of a case marker into a conjunction would count as degrammaticalization (increase in Semantic Autonomy) if affecting bound forms (as is the case in Nyulnyulan; McGregor, this volume). This is due to the fact that bound case markers are assigned value 4 on the scale of Semantic Integrity, while purposive complementizers get value 3. Other potential cases have other explanations. Thus, Jacques (this volume) describes the following remarkable development in Japhug:

Pronouns and possessive prefixes are very similar (see Table 1), but it appears that in Japhug pronouns are derived from possessive prefixes rather than the opposite: pronouns other than 3 and 3 are built by combining the possessive prefix with the root -ʑo meaning originally ‘self ’ … Japhug thus exemplifies a pathway pronominal affix > pronoun … (Jacques, this volume, § 2.2)

However, he offers the following clarification later on: “It is not however a case of degrammaticalization in the strict sense, since the bound pronominal prefixes have not become free morphemes by themselves.” (Jacques, this volume, § 2.2). Indeed, this pattern is similar to cases of nucleus mismatch involving possessive markers used on dummy nouns giving rise to personal pronouns as discussed in section 2.2 above. Some other examples seem to be true instances of degrammaticalization. Thus, gender markers developing into diminutives (as is the case in Mountain Ok or Cushitic) are good examples from the perspective of Paradigmatic Variability because they are not obligatory in their function of diminutive markers. Similarly, the development from the spatial adverbial yeki to the 1st person pronoun in Korean (Seongha Rhee, p.c.) involves degrammaticalization, at least as far as Decategorization is concerned. Finally, some cases of degrammaticalization are contact-induced. Especially telling is the case of Ket (Vajda, this volume), which under the influence of neighboring Uralic and Altaic languages, changed its prefixing profile in its verbal templates to patterns with extensive suffixation. One of the conservative verb templates in Ket incorporates an action nominal/infinitive into slot 7 (P7) and the verb root appears in the original semantic head position (P0). This verbal template has undergone reanalysis to the effect that the verb root in the original position (P0) has eroded both semantically and phonologically, giving rise to suffixes marking transitivity and aspect, while the original action nominal in P7 was reinterpreted as a verb stem. For example, the base -bɛt in (34) originally meant ‘make’ (the superscripts in this representation refer to slots in the template):

(34) Ket (Vajda, this volume, ex. [18a])
d8-igbɛs6-ku 6-ɣ 5-o4-l 2-bɛt 0
18-arrive.7-2.6-5- 4/2-.0
‘I brought you. (many times)’

This development thus involved the reanalysis of incorporated infinitives as stems, and the reanalysis of auxiliaries as suffixes. Moreover, this change resulted in the partial degrammaticalization of prefixes in slot P8, which in certain contexts get encliticized/suffixed to a preceding word. As explained in detail by Vajda (this volume), for example, da8=don 7-si4-bɛd 0 ‘she makes a knife’ may be pronounced – depending on phonological and morphosyntactic context – as either da=donsibed or =da # donsibɛd (see Vajda, this volume, ex. [20]). In some Iranian languages like Balochi, certain pronominals are diachronically derived from verbal endings (Korn, this volume). In this case, however, the cause of degrammaticalization might be language-internal, i.e., it seems to be grounded in the reorganization of the agreement system (cf. Korn, this volume, for details). In contrast, the case of Ket is clearly due to language contact. Thus, apart from the more familiar cases of contact-induced grammaticalization (Heine and Kuteva 2005, 2006), contact-induced degrammaticalization is attested as well.
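
The parameter-based reasoning used in this section can be made concrete with a small sketch. It is only an illustration, not the procedure of the questionnaire itself: the function name, the profile format and the label "undetected" are hypothetical, and the sketch assumes that higher values on a parameter mean a more grammaticalized sign. The only values taken from the text are 4 (bound case marker) and 3 (purposive complementizer) for Semantic Integrity, and 2 for both demonstratives and 3rd person pronouns.

```python
# Hypothetical sketch: parameter names follow the text; the comparison logic
# and the "undetected" label are assumptions made for illustration only.
PARAMETERS = (
    "semantic_integrity", "paradigmaticity", "syntagmatic_variability",
    "decategorization", "obligatoriness", "phonetic_reduction", "bondedness",
)


def classify_change(source, target):
    """Compare two parameter profiles (dicts mapping parameter -> value).

    Assumption: a higher value means a more grammaticalized sign.
    Parameters missing from either profile are ignored.
    """
    shared = [p for p in PARAMETERS if p in source and p in target]
    increased = [p for p in shared if target[p] > source[p]]
    decreased = [p for p in shared if target[p] < source[p]]
    if increased and decreased:
        return "mixed"
    if increased:
        return "grammaticalization"
    if decreased:
        return "degrammaticalization"
    # No parameter changes: the change is invisible to the parameters,
    # as with the "covert" demonstrative > pronoun development.
    return "undetected"


# Bound case marker (value 4) > purposive complementizer (value 3)
case_to_conjunction = classify_change(
    {"semantic_integrity": 4}, {"semantic_integrity": 3}
)
# Demonstrative (value 2) > 3rd person pronoun (value 2)
demonstrative_to_pronoun = classify_change(
    {"semantic_integrity": 2}, {"semantic_integrity": 2}
)
print(case_to_conjunction, demonstrative_to_pronoun)
```

On these assumptions the first change is flagged as degrammaticalization purely because of how the scale values are assigned, which is the sense in which such cases may be artefacts of the measurement.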

4.3 Language typology and variation in grammaticalization scenarios

This section is about the structural properties responsible for variation in grammaticalization scenarios. Section 4.3.1 is concerned with construction-specific properties facilitating or inhibiting grammaticalization, i.e., word order and phonetic factors. Section 4.3.2 switches the perspective from micro-level to macro-level and discusses the general influence of language typology on the outcome of grammaticalization processes. This section follows up on Bisang’s (2004, 2011) work on the special properties of grammaticalization in SE Asian languages, which was the first demonstration of how grammaticalization depends on areality and language typology. Finally, this section addresses the relationship between morphological typology and grammaticalization.

4.3.1 Structural factors and construction-specific constraints: Word order and phonological properties

The role of structural factors has been discussed in the literature on grammaticalization, and more specifically in the literature on syntactic change and reanalysis (see also Narrog and Heine [2018] for a recent discussion). Thus, Givón (2009) showed that the propensity to reanalyze matrix verbs in certain complement clauses as TAM auxiliaries and further as verbal TAM markers is facilitated by SOV word order, which places the verb and erstwhile complements/auxiliaries in adjacent positions. A good example illustrating the role of word order in grammaticalization comes from Tungusic (cf. Malchukov, this volume). In most Tungusic languages, verbal negation is expressed by a negative auxiliary, which occurs preverbally and retains its word status. In East Tungusic, however, the auxiliary was postposed, and then encliticized and eventually fused with the lexical verb. In Northern Tungusic languages (like in Uralic), negation is periphrastic and is expressed by the inflected negative verb e- and a lexical verb in the special non-finite (connegative) form in –R(A), which is cognate with the aorist marker.

(35) Ewen (Malchukov, this volume, ex. [14])
e-he-m gaa-d.
not.do--1 know-.
‘(I) don’t know.’

In some Eastern Tungusic languages like Nanai, the negative verb was postposed to the lexical verb in analogy to auxiliaries in general and underwent suffixation:

(36) Nanai (Malchukov, this volume, ex. [15])
Gad-a-se-mbi < gad + e-se-mbi
take---1
‘I didn’t take.’

While structural factors clearly inhibited or promoted the grammaticalization of the negative marker in Tungusic, it is more difficult to explain the mechanisms which induced it. One possibility is analogy-driven grammaticalization. Tungusic in general is a suffixing language. Preposed material does not grammaticalize into prefixes but may lexicalize into particles, as indeed happened to some negative markers in Tungusic and also in Mongolian. The importance of structural factors like word order is also observed elsewhere. Lehmann (this volume) discusses structural conditions in Mayan languages inhibiting or allowing pronominal elements to grammaticalize into verbal prefixes. Example (37) from Ch’ol shows an earlier stage at which a pronominal element is (still) a second-position clitic. At a later stage illustrated by (38) from Tz'utujil, the pronominal element is realized as a prefix on the verb:

(37) Ch’ol (Lehmann, this volume, [23b])
tsa=c taj-a
=.1. meet-
‘I met him’

(38) Tz'utujil (Lehmann, this volume, [26b])
x-at=w-aajo’
-.2.=.1.-love
‘I loved you’.

Similar examples of the role of word order as a facilitating or inhibiting factor are discussed by Nübling and Kempf (this volume, § 4) for Germanic: “Word order in Germanic languages can on the one hand prevent grammaticalization processes, e.g., the amalgamation of main verb and auxiliary, but on the other hand it can be the target of grammaticalization itself.” And further:

In German, bracket constructions prevent further grammaticalization of the periphrastic verb forms … The same holds for the semantically and phonologically reduced future auxiliaries (< modals < movement verbs), whereas the famous framing constructions may impede further coalescence with the full verb; in many Germanic languages, auxiliaries and full verbs are separated by a middle field. Only English ’ll < will attaches to the preceding word (mostly pronouns). (Nübling and Kempf, this volume, § 4.1 and § 6)

While the structural factors discussed above all pertain to word order, another general factor responsible for idiosyncrasies and variation in grammaticalization scenarios is related to the phonological properties of a language and its elements undergoing grammaticalization. Obviously, this has its influence on the (amount of) Phonetic Reduction, in the first place, but it also affects other parameters, including Bondedness, and less directly, other functional parameters. More generally, diachronic idiosyncrasies can be attributed to the complex interplay of semantic with phonological and/or prosodic factors, which are orthogonal to semantics. Thus, Lehmann (this volume) comments on some grammaticalization processes in Mayan, highlighting complex processes of linguistic change through a mixture of grammatical and phonetic factors:

Intransitive completive verbs get a Set B index suffixed /…/. The monophonematic auxiliary therefore hits directly on the verb, which may start with a consonant. Yucatec has a phonological rule converting /t/ into /h/ in front of /t/. An extended version of this rule may have applied to the perfective auxiliary. At any rate, this auxiliary has an allomorph h with intransitive verbs. A preconsonantal /h/, however, generally disappears in Yucatecan. (Lehmann, this volume, § 3.5.1)

Similarly, rampant grammaticalization resulting in radical reduction in Korean (Rhee, this volume) seems to be phonetically conditioned. In Korean, many processes of grammaticalization are based on the “light verb” ha- ‘say, do’ and its further phonetic reduction. Thus, complementizers are derived from combinations of a mood marker, the locution verb ha- ‘say’ and the connective marker -ko ‘and’ by phonetic erosion:

(39) Korean (Rhee, this volume, ex. [32])
a. -ta/-la dec + ha ‘say’ + -ko conn > -tako/-lako dec.comp
b. -la imp + ha ‘say’ + -ko conn > -lako imp.comp
c. -nya int + ha ‘say’ + -ko conn > -nyako int.comp
d. -ca hort + ha ‘say’ + -ko conn > -cako hort.comp
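
The pattern in (39) can be restated as a toy string operation. This is a deliberately simplified sketch: the function name is hypothetical, and treating the erosion of ha- as outright deletion of the segment is an assumption made only for illustration; the forms themselves are those listed in (39).

```python
# Illustrative only: forms are taken from (39); modelling erosion as simple
# deletion of ha- is an assumption, not a claim about the actual sound changes.
def erode_ha(mood_marker, verb="ha", connective="-ko"):
    """Return the source string and the fused complementizer form."""
    source = f"{mood_marker} + {verb} + {connective}"
    # The light verb is dropped and the connective attaches to the mood marker.
    fused = mood_marker + connective.lstrip("-")
    return source, fused


for marker in ("-ta", "-la", "-nya", "-ca"):
    source, fused = erode_ha(marker)
    print(f"{source} > {fused}")
```

The sketch makes visible that the only material lost between source and outcome is the verb ha- itself.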

Rhee (this volume, § 5.2) observes that “[t]he verb ha- is light not only semantically but also phonetically, which makes it vulnerable to reduction and loss, as seen in its erosion in the development of complementizers.” Phonetic factors also seem to contribute to the contraction of the copula in Ecuadorian Quichua (Adelaar, this volume). In this language, the allomorph ga- of the verb ‘to be’ can merge with clitics that were originally attached to the previous word. In fact, as noted by Adelaar, the root may disappear completely if the subject is 1st or 2nd person (-ni and -ngi, respectively) as in Pedro mi-ni (< *Pedro-mi gani) ‘I am Pedro’. Himmelmann (this volume, § 6) also points out the importance of phonological factors in Sulawesi, for which he observes that “the late stages in the grammaticisation of articles involve phonological factors to date not widely noted in the literature”. Clearly, these phonetic conditions for grammaticalization cannot be predicted across languages, so they may be in part responsible for variation in grammaticalization outcomes (especially concerning Bondedness). Such observations are in line with suggestions that phonetic and prosodic factors are of universal importance in processes of morphologization (cf. Heath [1998] and Reinöhl [2017] on Hindi25).26 The role of prosody (and prosodic domains) as a factor enabling or inhibiting grammaticalization, and morphologization in particular, is also highlighted in Lehmann’s (this volume, 2.5.2.2) discussion of grammaticalization of agreement indices in Mayan languages. As he noted, in some languages like Yucatec, subject (A-set) agreement indices do not morphologize, but rather “in syntactic terms, they cliticize to the wrong side …”. However, Lehmann proceeds to observe that in other languages, like Tz'utujil, “it appears that the set A clitics did change their prosodic orientation and directly became prefixes to their host.” Heath (1998) also points out phonetic factors as enabling conditions for grammaticalization (morphologization). Interestingly, his scenario invokes phonetic properties of the target category in a renewal scenario, when a grammaticalized form blends with a phonetically similar bound form and ultimately replaces it (one of his examples involves the formation of the Germanic “weak” preterites mentioned in section 4.3.2 below). Thus, many cases treated as morphological reanalysis may actually be dependent on phonetic conditions.

25 In a follow-up study Reinöhl and Casaretto (2018) show how prosodic constraints inhibit certain grammaticalization paths. Thus, locative particles in Indo-Aryan languages fail to develop into adpositions due to the fact that they form prosodic units with the verbs rather than their nominal argument.
26 See also Wichmann (2011) for a general discussion of the role of prosody and for a claim that phonetic erosion in the processes of grammaticalization is a consequence of loss of prosodic prominence.

4.3.2 Variation in grammaticalization scenarios: role of areal and typological factors

As mentioned in the introduction, the question of variation in grammaticalization processes came to the fore in grammaticalization studies only recently. Earlier, researchers on grammaticalization were focused more on demonstrating universality, i.e., the cross-linguistic recurrence of grammaticalization paths and scenarios. Yet, the work by Bisang (2004, 2006, 2008, 2011, 2015a, 2015b, 2017b) on East and Mainland South East Asian (EMSEA) languages has introduced the observation of important cross-linguistic differences in grammaticalization scenarios to the field of grammaticalization studies. Bisang framed his discussion in terms of areal variation in grammaticalization, as he addressed both variation in grammaticalization paths and grammaticalization scenarios. It should be noted that the notion of areality has somewhat different relevance for the two domains. In the case of grammaticalization paths, there are clear areal patterns (as discussed in section 2.3), which result from the diffusion of grammaticalization paths across neighboring languages. Here, the challenge lies rather in the distinction of areal features from genealogical inheritance and drift, i.e., the parallel development of genealogically related languages not inherited from a protolanguage. With regard to grammaticalization scenarios, in general, and form-function covariation, in particular, the question is rather how to assess the role of areal and genealogical factors as opposed to typological properties. This is also true for the case of EMSEA languages, discussed by Bisang. A number of peculiarities of grammaticalization processes in this domain, first and foremost the absence of form-function covariation as expected under the Parallel Reduction Hypothesis, may not just be an areal feature of EMSEA languages but may be due to their isolating character. In the literature, different factors have been invoked for explaining the peculiarities of grammaticalization processes in EMSEA languages, including prosodic factors, linguistic contact and various factors which block the emergence of morphological paradigms and bound morphology. On the one hand, some authors (Ansaldo and Lim 2004; Schiering 2006) suggested that it basically depends on prosodic factors: Phonetic Reduction and Bondedness are more expected for stress-based languages than for mora-based languages. On the other hand, Bisang (2004, 2014, 2020) attributes the paucity of morphologization in EMSEA languages to the lack of obligatoriness and the multifunctionality of many grammatical markers. As Bisang (2004, 2014, 2020) points out, the emergence of bound morphology and morphological paradigms under these conditions is not excluded (cf. Arcodia [2013, 2015] for examples from Sinitic) but the emergence of complex morphological structures is comparatively unlikely.
Finally, one may suggest that analogical grammaticalization can play a role here, since it is less likely to produce affixes in a language which has no or few affixes (cf. also Narrog and Heine 2018: 4). Bisang (2009, 2015b, 2020) also provides a more general cognitive explanation for typological and areal variation in grammaticalization in terms of the competing motivations of economy and explicitness. Bisang starts from the idea that pragmatic inference contributes to economy in the sense that the speaker has to use less phonological substance to articulate the information s/he intends to pass on to the hearer. If economy in the sense of taking recourse to pragmatic inference wins, a marker of a grammatical category tends to be non-obligatory or multifunctional, leaving the contextually adequate information to the hearer’s inferential abilities. If explicitness wins, there is a tendency for a grammatical marker to become obligatory and/or to focus its function on a specific grammatical domain (e.g., tense or number) (Bisang 2004, 2015b, 2020). Thus, the competition between the general factors of explicitness and economy determines the outcome of grammaticalization processes. According to Bisang, grammaticalization processes in EMSEA languages are generally guided by economy. This shows its effects in phonology (limited reduction of phonetic substance, rareness of subsyllabic morphology; reduced coevolution of meaning and form), morphological typology (tendency to isolating structures) as well as in the prominent role of pragmatic inference even if a grammatical marker expresses a grammatical category. These cognitive explanations may be at work also in creole languages, which equally feature extensive functionalization but no morphologization (Michaelis and Haspelmath, this volume). In this respect, the situation seems to be similar to EMSEA languages as discussed by Bisang. However, in the case of creoles, there is a further factor to be considered. As McWhorter (2001, 2005) famously argues, the analyticity of creoles is due to the relatively short time of their existence. Michaelis and Haspelmath (this volume, footnote 4) agree with this view when they write that “the time factor is probably sufficient to explain why we do not see much coalescence (yet) in creoles”. EMSEA languages are clearly much older than creoles. Thus, the factor of a short life span can safely be excluded (pace Dahl 2018). There might, however, be other sociolinguistic factors at play, in particular the influence of the written language, which may have a conservative effect on the emergence of more complex inflectional morphology. Indeed, Arcodia (2013, 2015) observes that in spoken vernaculars of Sinitic the tendency towards Phonetic Reduction and Bondedness is more pronounced than in standard varieties. One should however keep in mind that the phenomena of ablaut-style stem alternations he describes cannot be fully compared to what is known from Indo-European languages (cf. Bisang 2014, 2020). Similar effects of the writing system as a factor inhibiting processes of grammaticalization have been reported elsewhere (cf. Narrog, Rhee, and Whitman [2018] on the impact of the written language on grammaticalization in Japanese and Korean). Overall, given our present knowledge, it is fair to say that multiple factors may determine grammaticalization processes. These range from sociolinguistic factors (written norms, life span, contact influence), to structural factors (prosody, analogy, word order), to general cognitive factors (motivations of economy and explicitness).

The case of EMSEA languages discussed above presents the clearest case of the impact of areal and typological features on the outcome of grammaticalization processes. The question to be addressed in this section is whether we can detect further correlations of grammaticalization scenarios with morphological typology. In this context, it is interesting to observe that polysynthetic languages often show processes of morphological reanalysis targeting certain slots in a morphological structure. This is observed in Iroquoian and Siouan and to a certain extent in Mayan and Sulawesi. For Sulawesi and other Western Malayo-Polynesian languages, Himmelmann (this volume, § 1.2) explicitly claims that few pure cases of grammaticalization are attested in Polynesian. Most cases are better understood in terms of reanalysis:

That is, what is amply attested is the reorganisation of a grammatical system centering on voice morphology and determiner-like elements. These changes, however, may not be instances of grammaticisation in the strict sense of an (essentially self-propelled) development along a unidirectional cline, but rather may instantiate reanalyses of various types, including analogical extensions.

But also for languages with polysynthetic traits, there is large variation. For example, in Hoocąk, prefixes are older and opaque, while suffixes are less bound and more transparent (Helmbrecht, this volume). Even in inflectional languages, the meaning-form covariation is not straightforward. As observed by Wiemer (this volume), in Slavic, most grammaticalization processes stop at the level of auxiliation. There are only very few attested instances of canonical grammaticalization resulting in morphologization. As pointed out by Wiemer (this volume, § 6): “However, apart from the definitorial property that by auxiliation a lexically autonomous unit loses its argument structure, the appearance of auxiliaries in Slavic has not, as a rule, been accompanied by erosion, a decrease in syntagmatic variability or an increase in bondedness.” Similarly, Germanic languages as described by Nübling and Kempf (this volume) reveal few documented paths of morphologization, since the general trajectory is rather deflection. One of the documented paths briefly addressed by Nübling and Kempf (this volume, § 3.3) is the well-known (but still controversially discussed) development of the preterite of weak verbs in Germanic (with a dental suffix; cf. Engl. love – loved), going back to an originally periphrastic construction with Proto-Germanic *dōn ‘do’. As mentioned above, for the case of the preterite, Heath (1998) invokes phonetic factors as facilitating morphologization (phonetic similarities of the older tense formatives with the cliticized marker). In comparison, agglutinating languages with more transparent morphology often show more evidence for grammaticalization (see chapters on Uralic, Tungusic, Korean). Consider the following examples from Korean, resulting in the univerbation of a string of verbal markers:

(40) Korean (Rhee, this volume, ex. [21])
a. keki ka-myen an toy-e
there go-if not be.good-
‘(Things) will not be as good (as they should be), if (you) go there.’
b. keki ka-myen.an.toy-e
there go-
‘(You) must not go there.’

However, even for agglutinating languages we observe significant variation. Thus, for Quechua, Adelaar (this volume) finds more examples of grammaticalization in the nominal domain, while verbal categories remain intransparent to etymology. In fact, this pattern finds parallels in other languages. As Janhunen (2012) points out for Mongolic, nominal morphology is generally more transparent and can be reconstructed more easily. The same is true of Tungusic and also of Ket. In Ket, for example, many local case markers/postpositions can be traced back to relational nouns, but most of its verbal morphology is opaque (Vajda, this volume). If verbal markers undergo change, these changes are instances of reinterpretation/exaptation rather than familiar cases of grammaticalization. What remains unclear is how to explain this tendency (at least for the languages mentioned). Under what conditions does it hold? Does it imply that verbal morphology is generally older in these languages? It seems that additional factors may play a role here. One factor could be that verbal morphology is often less transparent due to the (complex) grammaticalization scenarios involved in univerbation. This is also true of Quechua (Adelaar, this volume, § 3.3.5), in which changes in the verbal domain become unrecognizable due to processes of univerbation. In Ecuadorian Quichua, the combination -q ri- developed into the verbal suffix -gri-, as illustrated by miku-gri- ‘to go and eat’ (< *miku-q ri-). But it is also true of Tungusic (and Mongolian) where similar univerbation scenarios may result in opaque morphology: cf. Manchu -mbi < me + bi (a complex suffix is reanalyzed from a suffix-cum-aux combination). Interestingly, Maisak (this volume) distinguishes between conventional grammaticalization (from verbal and nominal sources) and ‘grammaticalization of constructions’. Among the latter, he lists cases of univerbation, as in (41):

(41) “Grammaticalization of constructions” in Lezgic (Maisak, this volume, § 6):
Converb (perfective) + copula > aorist (perfective past)
Converb (perfective) + existential verb > resultative / perfect / unwitnessed past
Participle (perfective) + copula > experiential/existential perfect
Converb (imperfective) + copula > habitual
Converb (imperfective) + existential verb > progressive / general present
Participle (imperfective) + copula > generic present / future

The univerbation of the aorist participle in Agul is illustrated below:

(42) Agul (Maisak, this volume, ex. [44a]):
aq’u-nde (< *aq’u-na i-de)
do.-: do.- -

However, the tendency of nominal morphology to be more easily reconstructable than verbal morphology can hardly be universal, since there are other languages showing the reverse pattern. In Beja (Cushitic; Vanhove, this volume) and in Nyulnyulan (McGregor, this volume) verb morphology can be reconstructed more easily than nominal morphology. Altogether, it might be said that apart from the distinct profile of isolating languages in processes of grammaticalization, morphological typology as a whole does not allow clear-cut predictions, since processes of grammaticalization might differ across domains. Thus, in Hoocąk, there is almost no nominal morphology, so the question of grammaticalization in the sense of morphologization does not arise for this domain.

A potential exception to this claim is the very interesting observation by Mithun (this volume) that functional reanalysis often seems to follow formal reanalysis in Iroquoian, with fusion happening before generalization in meaning. As noted by Mithun (this volume), formal (reductive) processes often precede functional ones at least in verb morphology, starting out from univerbation with verbs retaining their original meaning. Moreover, Mithun also observed the absence of form-function covariation in more recent cases of grammaticalization: In Southern Iroquoian (Cherokee), the cislocative (hither) has developed into a future, and in Northern Iroquoian, the translocative (thither) has developed into a past. In none of these cases have the functional developments been followed by further formal reduction. The affixes descended from roots via compounding have retained the full forms of their root sources, and those descended from other affixes have retained the same forms as their affix sources. (Mithun, this volume, § 5)

It is not clear to what extent this is generally true for polysynthetic languages, but if confirmed, it might again be an effect of analogical grammaticalization (high degree of synthesis promoting Bondedness). And it is telling that other languages with polysynthetic traits (like Ket and Sulawesi) do not document many cases of secondary grammaticalization either. Most cases reported are rather instances of morphological or syntactic reanalysis. For Lezgic, Maisak (2016, this volume) notes that morphological fusion in the processes of univerbation may precede syntactic reanalysis. There may be a more general issue involved here, which we discuss in terms of the following dichotomy: “grammaticalization from above” (semantically driven) vs. “grammaticalization from below” (phonetically driven).27 The former is in accordance with gradualist grammaticalization scenarios (Meaning First Hypothesis), where functional change induces increased frequency, and ultimately results in phonological reduction and morphologization. Thus, “grammaticalization from above” relates to the typical grammaticalization scenarios in which functional reduction precedes and conditions formal reduction. “Grammaticalization from below” does not seem to be driven by functional considerations (in contradiction to the Meaning First Hypothesis), but rather by frequencies (‘string frequencies’; Krug [2000]). Such cases have more often been discussed in terms of reanalysis, and may be illustrated by the development of postposed articles in Germanic:

(43) Icelandic/Old Norse (Nübling and Kempf, this volume, § 2.4.2.1)
ON maðr hinn gamli > maðrinn gamli > maðrinn (gamli)
‘the old man’ (lit. ‘man the old’) (Barðdal et al. 1997: 302)

27 The terms themselves were introduced in the volume by Bisang, Himmelmann, and Wiemer (2004), in particular, in Gaeta’s (2004) contribution, even though they are used in a somewhat different sense.

Here, phonetic reanalysis precedes functional change. A similar case is the development of ‘inflected prepositions’ in German, where a definite article encliticizes to a preposition; cf. beim < bei+ dem and the like.28 In this case, however, functional repercussions are more noticeable (as mentioned in § 4.2.1). As observed in this context by Nübling and Kempf (this volume, § 2.4.2.2), encliticized articles “clearly exceed the full article in frequency, build small paradigms, and can even be obligatory, provided they follow specific prepositions (such as bei, in, an, zu, von, vor ‘by, in, at, to, of, before’)”. Similarly, in Manding (Creissels, this volume), the merger of the noun plus determiner in (44) clearly results in Phonetic Reduction and Bondedness without any functional consequences:

(44) Manding (Creissels, this volume, ex. [1a])
Mùsôo máŋ nǎa. (mùsôo < mùsú + ò)
woman. . come
‘The woman did not come.’

The status of such cases, as well as other cases of phonetically driven grammaticalization, is controversial, particularly with respect to the central question of whether these phenomena count as instances of grammaticalization. Reductive processes of this type have also been discussed in terms of “chunking” by Bybee (see Bybee 2011; Bybee and Beckner 2015); another, more traditional, term is ‘univerbation’ (Lehmann [2002: 135; forthcoming] attributes this term to Karl Brugmann). As Bybee points out, when chunking affects items which are not semantically and/or grammatically related, this does not produce meaning shifts, i.e., the meaning remains compositional (this is also true of the German definite articles encliticized/fused with the prepositions). Thus, chunking of contiguous units does not necessarily engender meaning-form coevolution in such cases. On the other hand, when the contiguous units are semantically and/or grammatically related, the outcome of chunking can be legitimately characterized as a case of grammaticalization or lexicalization (cf. the emergence of secondary prepositions like in spite of). While the distinction between “grammaticalization from above” (function-driven) vs. “grammaticalization from below” (phonetically driven, aka ‘chunking’ or ‘univerbation’) is intuitively plausible, it is difficult to qualify many phenomena in these terms, also because of the conundrum concerning the validity of the concept of word (cf. Haspelmath’s review of the polysynthesis volume by Fortescue, Mithun, and Evans [2017]). In any case, chunking or univerbation, as far as it produces conventional, rather than occasional language change, is not confined to grammaticalization as such, but equally pertains to lexicalization (see Lehmann [forthcoming], for a comprehensive discussion of univerbation).

28 See also Himmelmann (2014) for a general discussion of ‘ditropic clitics’.


Final remarks on grammaticalization, reanalysis and grammatical-context sensitivity

Reanalysis is generally defined in one way or another as the assignment of a new morphosyntactic analysis to a given linguistic structure (e.g., Harris and Campbell 1995: 50–51; Hopper and Traugott 2003: 88). What is less clear is its relation to grammaticalization. In this chapter, we adopt the view of Lehmann (2004)29 that reanalysis and grammaticalization intersect even though they are basically independent.30 In the following discussion, we would like to make two points (partially in response to the challenges posed by the data discussed in individual contributions): (i) grammaticalization is context-sensitive, in the sense that its outcome depends on the morphosyntactic properties of the construction in which a given linguistic item occurs, but this context-sensitivity is a matter of degree; and (ii) while certain grammaticalization types can alternatively be viewed as reanalysis (“downgrading reanalysis” of Newmeyer [2001], or “reanalysis of category labels” in Harris and Campbell [1995: 92]), crucially, some other types of reanalysis cannot be reduced to grammaticalization, since they involve changes to more than one aspect of the underlying structure at once (Harris and Campbell 1995: 92, 65). While the role of grammatical context is generally acknowledged in the literature (including Bybee, Pagliuca, and Perkins [1994], and Lehmann [2004]), in practice, grammaticalization studies often pursue an item-based approach. This is most obvious in Heine’s work (cf. Heine, Kuteva, and Narrog [2017] for a clear statement of this position),31 which describes grammaticalization paths in terms of semantics:  > ,  > , etc. (see Appendix in chapter 2).
While this approach is admittedly reductive, it is able to produce highly interesting cross-linguistic generalizations in the form of a catalogue of grammaticalization paths (Heine and Kuteva 2002; Kuteva et al. 2020). However, this approach reaches its limits when certain paths depend heavily on grammatical context. Thus, ‘go’ develops into a future marker most often in the context of subjunctive or related markers, while it may develop into a perfective or directional marker in other contexts. Disregarding the effects of the constructional environment within which a lexical item occurs comes at a price, which one may call “rampant polygrammaticalization”, i.e., the claim that one and the same source can grammaticalize into a large number of target functions. If seen from the perspective of grammatical context, that number can be reduced, and the different functions get their motivations. In some cases, the item-based approach can even lead to contradictions, when both paths  > , and  >  are postulated, whereas it is of course the combination of the two that produces the target meaning. More generally, it should be acknowledged that both divergence/polygrammaticalization (one source, multiple targets) and convergence (multiple sources, one target) weaken item-based approaches to grammaticalization. Context-sensitivity of grammaticalization processes is well illustrated in many contributions to this volume. The following citation from Montaut (this volume) makes this point very explicitly with respect to Hindi and other Indo-Aryan languages:

In the verbal category, the number of words involved in the various paths is limited (‘stay’, ‘do’, ‘be’, ‘go’), the grammaticalization paths proliferating, particularly in Standard Hindi: constructions are more relevant than the lexical word used (‘be’, ‘stay’, ‘go’ are examples of multiple grammaticalizations, but in different constructions). (Montaut, this volume, § 3.5.3)

29 As Lehmann (2004) points out, reanalysis differs from grammaticalization inasmuch as it is not unidirectional, is not necessarily associated with loss of autonomy, is not gradual and consists only of two stages, i.e., it is not involved in sequences of stages.
30 For a good summary of different approaches, cf. Traugott (2011). In addition to the one we adopt, she discusses the following other approaches: (i) grammaticalization is a subtype of reanalysis (UG-based approaches), (ii) reanalysis is largely irrelevant to grammaticalization (Haspelmath 1998) and (iii) grammaticalization is driven by analogy rather than reanalysis (Fischer 2007).
31 Also cf. Heine and Kuteva (2002) and Kuteva et al. (2020) for a consistent implementation of this approach.

While context plays a prominent role in grammaticalization processes, it is more challenging to capture its role systematically in typological work if one wants to avoid an open-ended catalogue of paths which ultimately makes the finding of generalizations impossible. In practical terms, one possible solution is to include the relevant context-related information in one way or another. By way of exemplification, the two paths mentioned above may be formalized in the following way:

(45)  /_ / → 
(46)  +  >  > 

The first formalization in (45) is to be read as ‘ develops into a future marker in the context of a subjunctive gram’. This mode of representation is adopted from phonology. The second formalization in (46) represents a case where it is less clear what counts as the nucleus, so the two items are treated on a par ( /_/ >  and  /_/ >  are equivalent). Since such cases are reminiscent of nucleus mismatches or rather nucleus indeterminacy, they deviate more from simpler grammaticalization scenarios. The representations in (45) and (46) explicitly acknowledge the role of context in the development of individual grammaticalization paths without losing their potential for addressing compositionality issues or cross-linguistic use. Note that in order to ensure cross-linguistic comparability, the context should be defined in semantic terms (as presented in [45]), or else in broad syntactic terms (part-of-speech distinctions), as in cases where the evolution depends on the lexical class of the host category. For example, the evolution of ‘give’ into a benefactive applicative, on the one hand, and into a dative marker, on the other hand, can be represented in the following way:  /_ V/ → ,  /_ N/ → . Integrating context in this way can certainly make modelling of grammaticalization processes more predictive, but we have not implemented it consistently (at this stage of the project), also because splitting source constructions necessitates a far larger database in order to detect (areal) clustering. Finally, it should be noted that different modes of representation may be justified for different scenarios to different degrees. In particular, they depend on the extent to which a single nucleus can be straightforwardly identified for particular paths. In some simpler scenarios, this is indeed possible. Recall the development of spatial terms into locative adpositions, or of the numeral ‘one’ into an indefinite article, discussed in § 2.1. And in some other scenarios, constructional details are obliterated in the course of convergent developments associated with more than one process of reanalysis. However, we also find more complex scenarios, where multiple elements are targeted in the production of some entirely new constructions with new meanings. These scenarios (recall the case of insubordination discussed in section 4.2.2 above) are more difficult to characterize in terms of grammaticalization parameters; in fact, they are more often treated in the literature as cases of reanalysis rather than of grammaticalization. Recall the quotation above from Harris and Campbell (1995: 65), who observe that “many instances of reanalysis involve more than one of the aspects of the underlying structure at once”. Overall, it can be acknowledged that different grammaticalization scenarios show construction-sensitivity to different degrees. Some cases (which may be called endocentric grammaticalization scenarios) have a clear nucleus (called ‘construction marker’ in Himmelmann [2004]). For these scenarios, an item-based formulation of paths in the manner of Heine (e.g.,  > ) is most applicable.
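The contrast between item-based and context-sensitive path formulations can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions and is not part of the MAGRAM database or its coding scheme: the class `Path`, the helper `targets_per_source`, and the label strings (`GO`, `GIVE`, `SUBJ`, `FUT`, `BEN.APPL`, `DAT`, `V`, `N`) are hypothetical names chosen for this example. It records the ‘go’ and ‘give’ developments described above together with their contexts and shows how the number of targets per source shrinks once paths are grouped by context.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    """A grammaticalization path; `context` is None for an item-based path."""
    source: str
    target: str
    context: Optional[str] = None

    def notation(self) -> str:
        # Item-based: SOURCE > TARGET; context-sensitive: SOURCE /_CONTEXT/ → TARGET
        if self.context is None:
            return f"{self.source} > {self.target}"
        return f"{self.source} /_{self.context}/ → {self.target}"


def targets_per_source(paths, use_context):
    """Map each source (or each source/context pair) to its set of targets."""
    grouped = defaultdict(set)
    for p in paths:
        key = (p.source, p.context) if use_context else p.source
        grouped[key].add(p.target)
    return dict(grouped)


# Hypothetical labels (assumptions) for the developments described in the text.
PATHS = [
    Path("GO", "FUT", context="SUBJ"),      # 'go' > future with a subjunctive gram
    Path("GIVE", "BEN.APPL", context="V"),  # 'give' > benefactive applicative, verbal host
    Path("GIVE", "DAT", context="N"),       # 'give' > dative marker, nominal host
]

item_based = targets_per_source(PATHS, use_context=False)
contextual = targets_per_source(PATHS, use_context=True)

for p in PATHS:
    print(p.notation())
print(len(item_based["GIVE"]), "targets for GIVE when context is ignored")
print(len(contextual[("GIVE", "V")]), "target for GIVE in a verbal context")
```

Treating the context as an optional field keeps simple paths in their familiar item-based form, while context-dependent paths are split only where the data require it.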
Other, more complex scenarios, as we find them in the case of nucleus mismatches, may be called exocentric grammaticalization. Scenarios of this type do not allow such formulations and will result in multiple paths (or rampant polygrammaticalization) as a side effect if no further qualifications are given (cf. the list of multiple-target concepts in section 2.1 for some examples). The distinction between what we call endocentric vs. exocentric grammaticalization scenarios roughly corresponds to the distinction between ‘ordinary grammaticalization’ vs. ‘construction-based grammaticalization’ in Maisak (this volume; see also example [41] above). Beyond endocentric and exocentric grammaticalization, we find the more complex instances of reanalysis generating new constructions like insubordination. The cline between endocentric grammaticalization and reanalysis is represented in Figure 15 below:

endocentric grammaticalization – exocentric grammaticalization (nucleus mismatches, nucleus indeterminacy) – reanalysis

Fig. 15: A cline from grammaticalization to reanalysis.


We call the scale in Figure 15 a cline rather than a tripartition, since different grammaticalization phenomena will show grammatical-context sensitivity to different degrees, thus falling into a gray zone between endocentric and exocentric grammaticalization, or between exocentric grammaticalization and reanalysis. This is even true for some acknowledged paths of grammaticalization, which might show a “concerted” array of changes involving different constituents and thus cannot be easily reduced to combinations of individual changes (cf. renewal scenarios, like the well-known “negative cycle” familiar from French; Hopper and Traugott [2003: 65–66]). The gradualness of the distinctions along the cline is also reflected in the cautious wording that the contributors use to characterize the nature of diachronic changes. Thus, Wiemer (this volume, § 5) characterizes certain developments in Slavic not as grammaticalization proper, but rather as “phenomena which either depend on grammaticalization or which are basically characterized by changes that are concomitant to grammaticalization, but can also be explained simply by reanalysis, analogical extension, functional expansion (or shift) via the conventionalization of implicatures, exaptation or any combination thereof”. Clearly, more work needs to be done in order to establish the typology of diachronic phenomena on the margins of grammaticalization, and in particular in the “gray zone” between grammaticalization and reanalysis, which we call exocentric grammaticalization at the moment. For now, it can be concluded that the phenomena subsumed under exocentric grammaticalization should be approached from different perspectives, making use of tools developed in research on grammaticalization, on the one hand, and methods developed in Diachronic Construction Grammar, on the other hand (see contributions to Barðdal et al. [2015] for a representative sample of work in the latter tradition).

Conclusions

Our introductory chapter addressed the question of cross-linguistic variation in grammaticalization paths and grammaticalization scenarios, seeking to uncover universal trends and areal signals in our data. Our results, both qualitative and quantitative, rely heavily on the contributions to the present volume, even though the research questions are mostly derived from previous work. For paths, we focused on the question of which paths are most common and which are rare or language-specific. Since no path is clearly universal in the sense of being deterministically present in all languages, the question here is rather which paths require the fewest enabling conditions, i.e., which paths can be instantiated under different structural conditions. As for the latter question, we have been able to identify more frequent and less frequent paths, as well as certain paths which show areal clustering (also visible on the NeighborNet plots in Figures 5A and 5B). We have also discussed the reasons behind language-particular and rare or “exotic” grammaticalization paths.


In some cases, it was possible to assimilate a rare path to a familiar one by means of reductive analysis (cf. section 2.2 on granularity). In other cases, we were able to identify the motivations for seemingly peculiar cases (cf. section 2.2 on nucleus mismatches). In the context of grammaticalization scenarios, our most general result was the demonstration of the different frequencies with which parameters are instantiated. To the extent that our dataset is representative, it does not confirm the stronger version of the meaning/form coevolution hypothesis (Parallel Reduction Hypothesis), but it is compatible with a weaker form of this hypothesis, predicting that functional changes precede (and presumably condition) formal changes (Meaning First Hypothesis). Furthermore, our results suggest that the distinction between the Parallel Reduction Hypothesis and the Meaning First Hypothesis is too coarse-grained to capture the specifics of different parameters in interaction, and even the more fine-grained proposal by Narrog and Heine (2018) may not be fine-grained enough. For now, the heatmaps (Figures 9 and 10), as well as the Network Graph in Figure 11 generalizing over the correlations displayed by the heatmaps, are the most faithful representations of co-dependencies between individual parameters. Overall, the Network Graph in Figure 11 reveals stronger dependencies between a group of function-related parameters (Semantic Integrity, Paradigmaticity, Syntagmatic Variability), on the one hand, and some dependencies between formal parameters (Bondedness, Phonetic Reduction, Allomorphy), on the other hand, with Paradigmatic Variability and Decategorization depending on both. It further shows that Bondedness correlates both with function-related parameters, on the one hand, and with form-related parameters (Phonetic Reduction), on the other hand.
Thus, the Network Graph in Figure 11, as well as the unidimensional Parameter Hierarchy in (24), represent typological constraints on covariation of parameters in grammaticalization processes, as they emerge from our data. As mentioned above, the patterns displayed in the Graph and the Hierarchy show both similarities and differences. While both capture certain patterns of co-dependencies between individual parameters, the Graph reveals more specific correlations, whereas the Hierarchy is more general in nature, as it captures frequencies of individual parameters on individual paths as well as, by extension, cross-linguistic frequencies. For the issue of areal variation in scenarios of grammaticalization, our methodology showed less clear-cut results, at least at the level of macro-areas. Better signals are obtained for specific micro-areas, which is true of both grammaticalization paths and grammaticalization scenarios. While variation is also detected across macro-areas, it is more readily interpretable if one zooms into smaller areas which share certain typological features (e.g., Southeast Asian languages). Weak signals for areal clustering at the level of macro-areas may thus be due to limited data. On the other hand, given that similarities at the level of macro-areas might be due to prehistoric population dispersals (Nichols 1997; Bickel 2017), they might not be expected for the grammaticalization paths under discussion because these are chronologically fairly recent.


Where does this leave the question of grammaticalization theory as a coherent concept and of grammaticalization as a natural class of phenomena, a topic of heated discussion in the literature? On the one hand, our results support the approach developed by leading grammaticalization theorists like Lehmann, Heine and Bybee, which treats grammaticalization as a unitary phenomenon, since covariation between parameters can indeed be detected. On the other hand, dependencies between individual parameters are shown to be more complex and variegated, as also revealed by the Network Graph in Figure 11 and the two heatmaps (Figures 9 and 10). Most damaging for classical approaches is perhaps the fact that meaning/form covariation is rarely observed in our data. As was shown, Phonetic Reduction and Bondedness rather lag behind functional and syntactic changes.32 There can be different reactions to this challenge. On the one hand, one may suggest viewing grammaticalization from a canonical perspective (as advocated in Corbett’s canonical typology). As is clear from Corbett’s work (see, e.g., Brown, Chumakina, and Corbett 2013), the canon is different from a prototype in that it is not necessarily the most frequent instantiation of its type; rather, it is the most distinctive in relation to other phenomena. On this interpretation, Lehmann’s concept of “normal grammaticalization process” (mentioned in footnote 6) would correspond to a canon rather than to a prototype. Thus, we can also argue that obligatoriness belongs to canonical grammaticalization, since it can be used to distinguish grammaticalization from processes of lexicalization (cf. Diewald 2010), without making the claim that obligatorification is part of the most frequent grammaticalization phenomena. Another answer to the above challenge, suggested to us by Christian Lehmann (p.c.), is that there may be a gnoseological bias involved here. Cases where form is affected may be more difficult to identify as belonging to certain grammaticalization paths, so specialists may be reluctant to postulate a path in the first place.

Last but not least, our study has shown that establishing co-dependencies between parameters in general, and co-evolution of form and meaning in particular, crucially depends on the method employed. Recall the comparative discussion of the two heatmaps in § 3.4, which showed that when dependency is measured in terms of binary features (‘change’ vs. ‘no change’ on a path from source to target), one might miss a correlation if the source item already shows grammaticalization along some of the parameters. One of our examples was the encliticization of negative auxiliaries in East Tungusic languages, which taken at face value might be seen as problematic for both the Parallel Reduction Hypothesis and the Meaning First Hypothesis, unless one realizes that the meaning of the negative auxiliary is already fairly abstract (i.e., it shows reduced Semantic Integrity).

Clearly, these and many other questions have to be addressed in further research. It should be reiterated that the MAGRAM project was explicitly conceived as a pilot project in this domain, which should be followed up and verified in subsequent work. This work would have to extend the database in order to confirm or disconfirm covariation between individual parameters and should be able to detect areal patterns where we were not able to do so because of limitations of data. Such work is likely to enrich the framework with other parameters which we were not able to incorporate, first and foremost frequency (Bybee [2011]; see also Diessel and Hilpert [2016] for a general discussion of frequency effects in grammar and Sun and Saavedra [forthcoming] for a pilot corpus study of the role of frequency in cross-linguistic research). Note that the covariation between individual parameters observed above does not necessarily suggest a causal link. In fact, the causal chain might be mediated through frequency. Future studies should also be able to integrate sociolinguistic settings into their models of grammaticalization in order to identify factors promoting/inhibiting grammaticalization processes, making the whole approach more predictive.33 While all these questions remain tasks for future studies, we believe the MAGRAM project, and the current volume in particular, has made another important step towards modelling grammaticalization processes and ultimately towards contributing to the formation of a predictive theory of grammaticalization. Clearly, achieving this ultimate goal is possible only through a concerted effort by the whole linguistic community.34

32 The cases discussed above in terms of ‘grammaticalization from below’ (aka chunking or univerbation) provide a counterexample to this tendency, but such cases occur only marginally in our data. Moreover, their status as grammaticalization phenomena is controversial for several reasons. On the one hand, this type of language change does not always lead to conventional grammaticalization outcomes (cf. Bybee [2011] on chunking). On the other hand, the outcome of chunking/univerbation may be either grammaticalization or lexicalization; for this reason, Lehmann considers ‘univerbation’ a separate phenomenon (see Lehmann [forthcoming] for a detailed discussion of univerbation).

33 As an example of such sociolinguistic factors, consider a suggestion by Narrog, Rhee, and Whitman (2018), who attribute the presence of more extensive grammaticalization in Korean compared to Japanese to the fact that a conventionalized written variety was created earlier in Japanese than in Korean.
34 Apart from the groundbreaking work cited in this paper, we would like to point out recent publications relevant to the topic, including the revised version of Heine and Kuteva’s Lexicon (Kuteva et al. 2020), and the recent volume by Narrog and Heine (2018) dealing with typological variation in grammaticalization, which is similar in spirit to the present volume, but is not questionnaire-based. Another very interesting project aiming in the same direction, especially as far as the database and the issues of grammaticalization scenarios are concerned, is the Berlin-based project on ‘clusters of change’ (see Norde and Beijering 2014), which has developed independently and in parallel to the Mainz project. Clearly, all these results must be integrated in future research on grammaticalization.

Acknowledgements

Author contributions: Walter Bisang and Andrej Malchukov contributed to every stage of the reported project from its conception to data analysis as well as writing the present chapter. Iris Rieder, Linlin Sun and Marvin Martiny contributed to the data coding and data analysis, while Svenja Luell conducted the statistical analysis. All the authors contributed to the conceptual development of the project and they also provided comments on the first draft. The qualitative and quantitative results of the reported research rely heavily on the contributions of the language experts contributing to this volume, whose unfailing support is gratefully acknowledged. We are indebted to Arne Nagels for advice on statistical matters. We are very grateful to our contributors who provided last-minute comments on the first version of our paper: Denis Creissels, Guillaume Jacques, Agnes Korn, Christian Lehmann, Timur Maisak, Marianne Mithun, Damaris Nübling and Björn Wiemer. Our special thanks go to Edith Moravcsik for valuable advice and to Christian Lehmann for his extensive and insightful critique. A further thank you for very helpful comments goes to Laura Becker and Thomas Stolz. Needless to say, all remaining shortcomings are our own. Finally, we would like to thank the German Research Foundation (DFG), which supported this project under the number BI 591/12–1. Walter Bisang would also like to thank the University of Zhejiang (China) for awarding him a Chair Professorship. Part of the paper was written during his stay there in Hangzhou.

Abbreviations

Abbreviations follow the Leipzig Glossing Rules. Additional abbreviations include  – possessive/subject function,  – animate,  – absolutive function,  – conjoint,  – completive,  – change of state,  – sentence-ender,  – general,  – hesternal past,  – locative voice,  – medial,  – motion,  – non-realis,  – past perfect,  – pronominal clitic (enclitic pronoun),  – perfective,  – potential,  – realis,  – simultaneous,  – undergoer voice, v – verbal dative

References

Aikhenvald, Alexandra Y. 2002. Language contact in Amazonia. Oxford: Oxford University Press.
Ansaldo, Umberto & Lisa Lim. 2004. Phonetic absence as syntactic prominence. Grammaticalization in isolating tonal languages. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline – the nature of grammaticalization, 345–362. Amsterdam & Philadelphia: John Benjamins.
Arcodia, Giorgio Francesco. 2013. Grammaticalization with coevolution of form and meaning in East Asia? Evidence from Sinitic. Language Sciences 40. 148–167.
Arcodia, Giorgio Francesco. 2015. More on the morphological typology of Sinitic. Bulletin of Chinese Linguistics 8. 5–26.
Barðdal, Jóhanna, Nils Jörgensen, Gorm Larsen & Bente Martinussen. 1997. Nordiska. Våra språk förr och nu. [Scandinavian languages and their history]. Lund: Studentlitteratur.
Barðdal, Jóhanna, Elena Smirnova, Lotte Sommerer & Spike Gildea (eds.). 2015. Diachronic Construction Grammar. Amsterdam: John Benjamins.


Bickel, Balthasar. 2017. Areas and universals. In R. Hickey (ed.), The Cambridge handbook of areal linguistics, 40–54. Cambridge: Cambridge University Press.
Bisang, Walter. 1992. Das Verb im Chinesischen, Hmong, Vietnamesischen, Thai und Khmer. Tübingen: Narr.
Bisang, Walter. 1996. Areal typology and grammaticalization: processes of grammaticalization based on nouns and verbs in East and mainland South East Asian languages. Studies in Language 20(3). 519–597.
Bisang, Walter. 2004. Grammaticalization without coevolution of form and meaning as an areal phenomenon in East and mainland Southeast Asia – the case of tense-aspect-mood (TAM). In Walter Bisang, Nikolaus Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its components and its fringes, 109–138. Berlin & New York: Mouton de Gruyter.
Bisang, Walter. 2006. South East Asia as a linguistic area. In Keith Brown (ed.), Encyclopedia of language and linguistics, vol. 11, 587–595. Oxford: Elsevier.
Bisang, Walter. 2008. Grammaticalization and the areal factor: The perspective of East and mainland Southeast Asian languages. In María José López-Couso & Elena Seoane (eds.), Rethinking grammaticalization. New perspectives, 15–35. Amsterdam & Philadelphia: John Benjamins.
Bisang, Walter. 2009. On the evolution of complexity – sometimes less is more in East and mainland Southeast Asia. In Geoffrey Sampson, David Gil & Peter Trudgill (eds.), Language complexity as an evolving variable, 34–49. Oxford: Oxford University Press.
Bisang, Walter. 2011. Grammaticalization and typology. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 105–117. Oxford: Oxford University Press.
Bisang, Walter. 2013. Word class systems between flexibility and rigidity: an integrative approach. In Jan Rijkhoff & Eva van Lier (eds.), Flexible word classes: Typological studies of underspecified parts of speech, 275–303. Oxford: Oxford University Press.
Bisang, Walter. 2014. On the strength of morphological paradigms – a historical account of radical pro-drop. In Martine Robbeets & Walter Bisang (eds.), Paradigm change in historical reconstruction: The Transeurasian languages and beyond, 23–60. Amsterdam & Philadelphia: John Benjamins.
Bisang, Walter. 2015a. Problems with primary vs. secondary grammaticalization: the case of East and mainland Southeast Asian languages. Language Sciences 47. 132–147.
Bisang, Walter. 2015b. Hidden complexity – the neglected side of complexity and its consequences. Linguistics Vanguard 1(1). 177–187.
Bisang, Walter. 2017a. Classification between grammar and culture – a cross-linguistic perspective. In Tanja Pommerening & Walter Bisang (eds.), Classification from Antiquity to Modern Times, 199–230. Berlin: de Gruyter.
Bisang, Walter. 2017b. Grammaticalization. Oxford research encyclopedia, linguistics. Online publication. DOI: 10.1093/acrefore/9780199384655.013.103.
Bisang, Walter. 2020. Grammaticalization in Chinese – a cross-linguistic perspective. In Janet Xing (ed.), A typological approach to grammaticalization and lexicalization: East meets West, 17–54. Berlin: de Gruyter.
Bisang, Walter, Nikolaus Himmelmann & Björn Wiemer (eds.). 2004. What makes grammaticalization? A look from its components and its fringes. Berlin & New York: Mouton de Gruyter.
Bisang, Walter & Andrej Malchukov (eds.). 2017. Unity and diversity in grammaticalization scenarios. (Studies in Diversity Linguistics 16). Berlin: Language Science Press.
Boye, Kasper & Peter Harder. 2012. A usage-based theory of grammatical status and grammaticalization. Language 88. 1–44.
Brown, Dunstan, Marina Chumakina & Greville G. Corbett (eds.). 2013. Canonical morphology and syntax. Oxford: Oxford University Press.


Bybee, Joan. 2011. Usage-based theory and grammaticalization. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 69–79. Oxford: Oxford University Press.
Bybee, Joan, William Pagliuca & Revere D. Perkins. 1994. The evolution of grammar. Tense, aspect, and modality in the languages of the World. Chicago: University of Chicago Press.
Bybee, Joan & Clay Beckner. 2015. Emergence at the cross-linguistic level. Attractor dynamics in language change. In Brian MacWhinney & William O’Grady (eds.), The handbook of language emergence, 183–200. Malden: Wiley Blackwell.
Campbell, Lyle & Richard D. Janda. 2001. Introduction: Conceptions of grammaticalization and their problems. Language Sciences 23. 93–112.
Comrie, Bernard. 1998. Rethinking the typology of relative clauses. Language Design 1. 59–86.
Craig, Colette. 1991. Ways to go in Rama: a case study in polygrammaticalisation. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization, 455–492. Amsterdam & Philadelphia: John Benjamins.
Croft, William A. 2001. Radical Construction Grammar. Syntactic theory in typological perspective. Oxford: Oxford University Press.
Cuyckens, Hubert. 2018. Reconciling older and newer approaches to Grammaticalization. German Cognitive Linguistics Association 6. 183–196.
Cysouw, Michael. 2014. Inducing semantic roles. In Silvia Luraghi & Heiko Narrog (eds.), Perspectives on semantic roles, 23–68. Amsterdam & Philadelphia: John Benjamins.
Dahl, Östen. 1995. Areal tendencies in tense-aspect systems. In Pier Marco Bertinetto, Valentina Bianchi, Östen Dahl & Mario Squartini (eds.), Temporal reference, aspect and actionality, vol. 2, 11–28. Torino: Rosenberg & Sellier.
Dahl, Östen. 2004. The growth and maintenance of linguistic complexity. Amsterdam & Philadelphia: John Benjamins.
Dahl, Östen. 2018. Grammaticalization in the languages of Europe. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 79–97. Oxford: Oxford University Press.
De La Fuente, Jose Andres Alonso. 2011. Tense, voice and aktionsart in Tungusic: Another case of “analysis to synthesis”? Wiesbaden: Harrassowitz.
Diessel, Holger & Martin Hilpert. 2016. Frequency effects in grammar. In M. Aronoff (ed.), Oxford research encyclopedia of linguistics. New York: Oxford University Press.
Diewald, Gabriele. 2010. On some problem areas in grammaticalization theory. In Katerina Stathi, Elke Gehweiler & Ekkehard König (eds.), Grammaticalization: Current views and issues, 17–50. Amsterdam & Philadelphia: John Benjamins.
Dryer, Matthew S. 1989. Article-noun order. Chicago Linguistic Society 25. 83–97.
Dryer, Matthew S. 1992. The Greenbergian word order correlations. Language 68. 81–138.
Dunn, Michael, Stephen C. Levinson, Eva Lindström, Ger Reesink & Angela Terrill. 2008. Structural phylogeny in historical linguistics: methodological explorations applied in Island Melanesia. Language 84(4). 710–759.
Enfield, Nicholas James. 2003. Linguistic epidemiology. Semantics and grammar of language contact in Mainland Southeast Asia. London: Routledge Curzon.
Epps, Patience. 2008. From “wood” to future tense: nominal origins of the future constructions in Hup. Studies in Language 32. 382–403.
Evans, Nicholas. 1995. A grammar of Kayardild. Berlin: Mouton de Gruyter.
Evans, Nicholas. 2007. Insubordination and its uses. In Irina Nikolaeva (ed.), Finiteness: Theoretical and empirical foundations, 366–431. Oxford: Oxford University Press.
Fedden, Sebastian. 2011. A grammar of Mian (Mouton Grammar Library 55). Berlin: De Gruyter Mouton.
Fischer, Olga. 2007. Morphosyntactic change: Functional and formal perspectives. Oxford: Oxford University Press.

Walter Bisang, Andrej Malchukov, et al.

Fortescue, Michael, Marianne Mithun & Nicholas Evans (eds.). 2017. The Oxford handbook of polysynthesis. Oxford: Oxford University Press.
Gaeta, Livio. 2004. Exploring grammaticalization from below. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 45–77. Berlin & New York: Mouton de Gruyter.
Gisborne, Nikolas & Amanda Patten. 2011. Construction: Grammar and grammaticalization. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 92–104. Oxford: Oxford University Press.
Givón, Talmy. 2009. Multiple routes to clause union: the diachrony of complex verb phrases. In Talmy Givón & Masayoshi Shibatani (eds.), Syntactic Complexity. Diachrony, Acquisition, Neurocognition, Evolution, 81–118. Amsterdam: John Benjamins.
Harris, Alice C. & Lyle Campbell. 1995. Historical syntax in cross-linguistic perspective. Cambridge: Cambridge University Press.
Haspelmath, Martin. 1997. From space to time: temporal adverbials in the world’s languages. Munich: Lincom Europa.
Haspelmath, Martin. 1998. Does grammaticalization need reanalysis? Studies in Language 22. 315–351.
Haspelmath, Martin. 2008. The European linguistic area. Standard Average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, vol. 2, 1492–1510. Berlin & New York: Mouton de Gruyter.
Haspelmath, Martin & Robert Forkel. 2015. CLLD: Cross-Linguistic Linked Data. Leipzig: Max Planck Institute for Evolutionary Anthropology. http://clld.org.
Heath, Jeffrey. 1998. Hermit crabs: Formal renewal of morphology by phonologically mediated affix substitution. Language 74(4). 728–759.
Heine, Bernd & Mechthild Reh. 1984. Grammaticalization and reanalysis in African languages. Hamburg: Buske.
Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization: A conceptual framework. Chicago: University of Chicago Press.
Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: Cambridge University Press.
Heine, Bernd & Tania Kuteva. 2006. The Changing Languages of Europe. Oxford: Oxford University Press.
Heine, Bernd & Tania Kuteva. 2007. The genesis of grammar: A reconstruction. Oxford: Oxford University Press.
Heine, Bernd, Tania Kuteva & Heiko Narrog. 2017. Back again to the future: How to account for directionality in grammatical change. In Walter Bisang & Andrej Malchukov (eds.), Unity and diversity in grammaticalization scenarios, 1–30. Berlin: Language Science Press.
Helmbrecht, Johannes. 2017. On the grammaticalization of demonstratives in Hoocąk and other Siouan languages. In Walter Bisang & Andrej Malchukov (eds.), Unity and diversity in grammaticalization scenarios, 137–172. Berlin: Language Science Press.
Hemphill, J. F. 2003. Interpreting the magnitudes of correlation coefficients. American Psychologist 58(1). 78–79.
Hilpert, Martin. 2008. Germanic Future constructions: A Usage-based approach to language change (Constructional Approaches to Language 7). Amsterdam & Philadelphia: John Benjamins.
Himmelmann, Nikolaus P. 2004. Lexicalization and grammaticalization: Opposite or orthogonal? In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 21–42. Berlin & New York: Mouton de Gruyter.

Himmelmann, Nikolaus P. 2005. Gram, construction, and class formation. In Clemens Knobloch & Burkhard Schaeder (eds.), Wortarten und Grammatikalisierung, 79–93. Berlin & New York: Mouton de Gruyter.
Himmelmann, Nikolaus P. 2014. Asymmetries in the prosodic phrasing of function words: Another look at the suffixing preference. Language 90(4). 927–960.
Hopper, Paul J. & Elizabeth C. Traugott. 2003. Grammaticalization. Cambridge: Cambridge University Press.
Huson, Daniel & David Bryant. 2006. Application of phylogenetic networks in evolutionary studies. Molecular Biology and Evolution 23(2). 254–267.
Jacques, Guillaume. 2017. The origin of comitative adverbs in Japhug. In Walter Bisang & Andrej Malchukov (eds.), Unity and diversity in grammaticalization scenarios (Studies in Diversity Linguistics 16), 31–44. Berlin: Language Science Press.
Janhunen, Juha. 2000. Reconstructing Pre-Proto-Uralic typology: Spanning the millennia of linguistic evolution. In Anu Nurk, Triinu Palo & Tonu Seilenthal (eds.), Congressus Nonus Internationalis Fenno-Ugristarum, 7.–13. 8. 2000 Tartu, 59–76. Tartu, Estonia: Eesti Fennougristide Komitee.
Janhunen, Juha. 2012. Mongolian (London Oriental and African Language Library 19). Amsterdam & Philadelphia: John Benjamins.
Johanson, Lars. 2011. Grammaticalization in Turkic languages. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 752–761. Oxford: Oxford University Press.
Johanson, Lars & Éva Á. Csató. 2018. Grammaticalization in Turkic. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 146–166. Oxford: Oxford University Press.
Krug, Manfred. 2000. Emerging English modals: A corpus-based study of grammaticalization. Berlin & New York: Mouton de Gruyter.
Kuryłowicz, Jerzy. 1965. The evolution of grammatical categories. Diogenes 51. 55–71.
Kuteva, Tania. 2004. Auxiliation: An enquiry into the nature of grammaticalization. Oxford: Oxford University Press.
Kuteva, Tania, Bernd Heine, Bo Hong, Haiping Long, Heiko Narrog & Seongha Rhee. 2020. World lexicon of grammaticalization (2nd revised edition). Cambridge: Cambridge University Press.
Lehmann, Christian. 1995 [1982]. Thoughts on grammaticalization. A programmatic sketch. Munich: Lincom Europa.
Lehmann, Christian. 2002a. New reflections on grammaticalization and lexicalization. In Ilse Wischer & Gabriele Diewald (eds.), New Reflections on Grammaticalization (Typological Studies in Language 49), 1–18. Amsterdam & Philadelphia: John Benjamins.
Lehmann, Christian. 2002b. Thoughts on grammaticalization (2nd revised edn). Erfurt: Seminar für Sprachwissenschaft der Universität (ASSidUE, 9).
Lehmann, Christian. 2004. Theory and method in grammaticalization. Zeitschrift für germanistische Linguistik 32. 152–187.
Lehmann, Christian. 2015. Valency classes in Yucatec Maya. In Andrej Malchukov & Bernard Comrie (eds.), Valency classes in the world’s languages, 1407–1461. Berlin & New York: Mouton de Gruyter.
Lehmann, Christian. Forthcoming. Univerbation. Available at: http://www.christianlehmann.eu/publ/lehmann_univerbation.pdf
Longacre, Robert E. 1983. Switch Reference systems in two distinct linguistic areas: Wojokeso (Papua New Guinea) and Guanano (Northern South America). In John Haiman & Pamela Munro (eds.), Switch Reference and Universal Grammar, 185–207. Amsterdam & Philadelphia: John Benjamins.
Maisak, Timur. 2016. Morphological fusion without syntactic fusion: the case of the “verificative” in Agul. Linguistics 54(4). 815–870.

Malchukov, Andrej. 2009. Rare and exotic cases. In Andrej Malchukov & Andrew Spencer (eds.), The Oxford handbook of case, 635–651. Oxford: Oxford University Press.
Malchukov, Andrej. 2013. Verbalization and insubordination in Siberian languages. In Martine Robbeets & Hubert Cuyckens (eds.), Shared grammaticalization: With special focus on Transeurasian languages, 177–208. Amsterdam & Philadelphia: John Benjamins.
Malchukov, Andrej & Bernard Comrie (eds.). 2015. Valency classes in the world’s languages. Berlin & New York: Mouton de Gruyter.
Malchukov, Andrej & Patryk Czerwinski. Forthcoming a. Verbal categories in the Transeurasian languages. In Martine Robbeets (ed.), The Oxford guide to the Transeurasian languages. Oxford: Oxford University Press.
Malchukov, Andrej & Patryk Czerwinski. Forthcoming b. Complex constructions in the Transeurasian languages. In Martine Robbeets (ed.), The Oxford guide to the Transeurasian languages. Oxford: Oxford University Press.
McWhorter, John H. 2001. The world’s simplest grammars are creole grammars. Linguistic Typology 5. 125–166.
McWhorter, John H. 2005. Defining Creole. Oxford: Oxford University Press.
Meillet, Antoine. 1912. L’Évolution des formes grammaticales. Scientia (Rivista di scienza) 12(26.6). 384–400.
Narrog, Heiko & Toshio Ohori. 2011. Grammaticalization in Japanese. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 775–85. Oxford: Oxford University Press.
Narrog, Heiko & Bernd Heine (eds.). 2011. The Oxford handbook of grammaticalization. Oxford: Oxford University Press.
Narrog, Heiko & Bernd Heine (eds.). 2018. Grammaticalization from a Typological Perspective. Oxford: Oxford University Press.
Narrog, Heiko, Seongha Rhee & John Whitman. 2018. Grammaticalization in Korean and Japanese. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 166–188. Oxford: Oxford University Press.
Newmeyer, Frederick J. 1998. Language form and language function. Cambridge, MA: MIT Press.
Newmeyer, Frederick J. 2001. Deconstructing grammaticalization. Language Sciences 23. 187–230.
Nichols, Johanna. 1997. Modeling ancient population structures and movement in linguistics. Annual Review of Anthropology 26. 359–384.
Nichols, Johanna. 1998. The Eurasian spread zone and the Indo-European dispersal. In Roger Blench & Matthew Spriggs (eds.), Archeology and language, 220–266. London: Routledge.
Noël, Dirk. 2007. Diachronic construction grammar and grammaticalization theory. Functions of Language 14(2). 177–202.
Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press.
Norde, Muriel. 2012. Lehmann’s parameters revisited. In Kristin Davidse, Tine Breban, Lot Brems & Tanja Mortelmans (eds.), Grammaticalization and language change: New reflections, 73–110. Amsterdam: John Benjamins.
Norde, Muriel & Karin Beijering. 2014. Facing interfaces: A clustering approach to grammaticalization and related changes. Folia Linguistica 48(2). 385–424.
R Core Team. 2018. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
Reinöhl, Uta. 2017. Grammaticalization and the rise of configurationality in Indo-Aryan. Oxford: Oxford University Press.
Reinöhl, Uta & Antje Casaretto. 2018. When grammaticalization does not occur: Prosody-syntax mismatches in Indo-Aryan. Diachronica 35(2). 238–276.
Robbeets, Martine. 2009. Insubordination in Altaic. Voprosy Filologii: Serija Uralo-Altajskie Issledovanija 1. 61–80.

Robbeets, Martine. 2017. The development of finiteness in the Transeurasian languages. Linguistics 55(3). 2–35.
Schiering, René. 2006. Cliticization and the evolution of morphology: A cross-linguistic study on phonology and grammaticalization. Constance, Germany: University of Constance dissertation.
Schlegel, August Wilhelm. 1818. Observations sur la langue et la littérature provençales.
Stolz, Thomas. 2001. To be with X is to have X: comitatives, instrumentals, locative, and predicative possession. Linguistics 39. 321–350.
Sun, Linlin. 2015. Flexibility in the parts-of-speech system of Classical Chinese. Mainz, Germany: University of Mainz dissertation.
Sun, Linlin & David Correia Saavedra. Forthcoming. Measuring grammatical status in Chinese through quantitative corpus analysis. Corpora 15(3).
Tabor, Whitney & Elizabeth C. Traugott. 1998. Structural scope expansion and grammaticalization. In Anna Giacalone Ramat & Paul J. Hopper (eds.), The limits of grammaticalization (Typological Studies in Language 37), 229–272. Amsterdam & Philadelphia: John Benjamins.
Traugott, Elizabeth C. 2011. Grammaticalization and mechanisms of change. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 19–30. Oxford: Oxford University Press.
Traugott, Elizabeth C. 2015. Toward a coherent account of grammatical constructionalization. In Jóhanna Barðdal, Elena Smirnova, Lotte Sommerer & Spike Gildea (eds.), Diachronic Construction Grammar, 51–81. Amsterdam: John Benjamins.
Traugott, Elizabeth C. & Richard B. Dasher. 2002. Regularity in semantic change. Cambridge: Cambridge University Press.
Traugott, Elizabeth C. & Graeme Trousdale. 2013. Constructionalization and constructional changes. Oxford: Oxford University Press.
Wei, Taiyun & Viliam Simko. 2017. R package “corrplot”: Visualization of a Correlation Matrix (Version 0.84). https://github.com/taiyun/corrplot.
Wichmann, Anne. 2011. Grammaticalization and prosody. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 331–341. Oxford: Oxford University Press.
Wichmann, Søren. 2015. Statistical observations on implicational (verb) hierarchies. In Andrej Malchukov & Bernard Comrie (eds.), Valency classes in the world’s languages, 155–183. Berlin & New York: Mouton de Gruyter.
Wickham, Hadley. 2016. ggplot2: Elegant graphics for data analysis. New York: Springer.

Walter Bisang, Andrej Malchukov, Iris Rieder, and Linlin Sun

2 Measuring Grammaticalization: A questionnaire

1 Grammaticalization Parameters

1.1 Parameters: an overview

Our questionnaire measures grammaticalization in terms of eight parameters. The first six parameters correspond to Lehmann’s (1995) parameters as summarized in Table 1:

Tab. 1: Parameters for measuring autonomy (Lehmann 1995).

              Paradigmatic                   Syntagmatic
Weight        1. Semantic integrity          Structural scope
              2. Phonetic reduction
Cohesion      3. Paradigmaticity             4. Bondedness
Variability   5. Paradigmatic variability    6. Syntagmatic variability

The parameters as they are shown in Table 1 and as they are used in our questionnaire deviate from Lehmann’s (1995) parameters in the following way:
– We split Paradigmatic Weight into the two logically independent parameters of Semantic Integrity (reduction of semantic weight, desemanticization) and Phonetic Reduction (loss of phonetic integrity, phonetic attrition).
– We do not use structural scope because it proved to be theoretically and empirically most challenging (cf. Tabor and Traugott 1998; Lehmann 2004; Diewald 2010; Norde 2012).

We add the following two parameters, which are frequently discussed in the literature on grammaticalization:
– Parameter 7: Decategorization (cf. Hopper and Traugott 2003)
– Parameter 8: Allomorphy (thus, we single out allomorphy from paradigmaticity, parameter 3)

https://doi.org/10.1515/9783110563146-002

Additional remarks on our methodology:
1. The above eight parameters are logically independent. Even if all of them instantiate loss of autonomy in Lehmann’s (1995) framework, there does not seem to be covariation in the sense that the change of one parameter automatically entails the change of all other parameters. In fact, this project expects interesting cross-linguistic variation. Thus, the extent to which there are correlations and the extent to which these correlations are subject to cross-linguistic variation will be one of the results of our database.
2. There are four values for each parameter, ranging from 1 for “lowest value” to 4 for “highest value”. The details will be explained separately for each parameter in section 1.2.
3. In principle, these values can be assigned to a linguistic sign in an absolute and in a relative way:
– Absolute assignment: Here, the value refers to the value of that sign in its target function.
– Relative assignment: Here, we look at whether the value of the sign has changed from source to target.
This distinction is relevant for finding out if a given target has changed its value in the process of change from source to target, but it is irrelevant for the definition of the parameters. Although the definitions of the parameters might suggest different perspectives (parameters 6 and 7 are defined relatively, the other parameters in absolute terms), the binary values (‘+’/‘–’) in fact represent the relative perspective, while the ‘level’ values (1, 2, 3, 4) represent the absolute perspective, as explained below. The values which we need for our statistics consist of two parts, a value (1, 2, 3 or 4) plus a ‘+’ or a ‘–’ sign.
(i) The assignment of the value is absolute. We look at the properties of the target and we assign it one of the possible values as defined for each parameter. If a linguistic sign is an agglutinative affix, it will get the value 3 for parameter 4 ‘bondedness’.
(ii) The assignment of the ‘+’ / ‘–’ sign depends on whether there was a change of value between the source and the target. Thus, the assignment of the ‘+’ / ‘–’ sign is based on a relative perspective:
– If there is no change from source to target, the ‘–’ sign is written in front of the relevant value of the target: e.g., –3 for parameter 4 ‘bondedness’, if the source is an agglutinative affix and the target remains an agglutinative affix.
– If there is a change in value, a ‘+’ sign will be added to the value of the target, e.g., +4. This means that the parameter value for the target is 4 and that the value for the source was lower than 4.
Notice that we do not indicate a concrete value for the source concept in our statistics. We are only interested in whether there is a change from source to target (+) or not (–).
4. If markers develop different values in different slots of a paradigm, the marker with the highest value will be selected. This can be illustrated by an example from Beja (Cushitic): Table 2 shows the perfective forms of ‘say’ and the suffixes for the imperfective derived from it:

Tab. 2: From the paper of Vanhove (this volume, Table 2).

          ‘say’        Imperfective marker (2)
1SG       a-ni         -ani
2SG.M     ti-ni-ja     -tnija
2SG.F     ti-ni:       -tini:
3SG.M     i-ni         -i:ni
3SG.F     ti-ni        -tini
1PL       ni-di        -nej/-naj
2PL       ti-di:-na    -te:n(a)
3PL       e:-n(a)      -e:n(a)

A look at parameter 2 (Phonetic Reduction) shows that there is no change in the 1SG (value 1), while we have “loss of phonetic substance with effects on syllable structure” (value 3) in the cases of 2SG.M, 1PL and 2PL. Since the latter case has a higher grammaticalization value, this value is chosen: +2 (the markers in question are syllabic and not subsyllabic).
5. In many cases, grammaticalization does not affect a single source item but a more complex construction consisting of more than one item. In such cases, our measurement will be focused only on the nucleus of that construction, which we define as the most “lexical” component of the source concept (cf. the notion of a ‘construction marker’ in Himmelmann [2005: 80]). Thus, the measurement of the English future marker be going to will be focused on the (lexical) nucleus go (rather than –ing or to). Of course, what is assessed is the target be going to, but for assessing the change from source to target (the + or – values), it is necessary to integrate the properties of the source as well.
6. The distinction between tokens and types of sources and targets is of no relevance for this project, since the values (e.g., +3) remain the same, irrespective of whether we evaluate a token or a type.
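The value notation described in remarks 2–4 can be sketched in code. What follows is only a minimal illustration and not part of the questionnaire: the names `Score`, `score` and `select_for_paradigm` are hypothetical, an ASCII hyphen stands in for the ‘–’ sign, and a change is recorded whenever the source and target levels differ (the text itself only discusses sources with a lower value than the target).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    """One parameter value: absolute level of the target plus a change flag."""
    level: int     # 1 (lowest) to 4 (highest), read off the target
    changed: bool  # True if the source had a different level

    def __post_init__(self):
        if self.level not in (1, 2, 3, 4):
            raise ValueError(f"level must be 1-4, got {self.level}")

    def __str__(self):
        # '+' marks a change from source to target, '-' marks no change.
        return ("+" if self.changed else "-") + str(self.level)


def score(source_level: int, target_level: int) -> Score:
    """Build the signed value; the source level itself is not stored."""
    return Score(target_level, source_level != target_level)


def select_for_paradigm(slot_scores):
    """Remark 4: if slots of a paradigm differ, keep the highest level."""
    return max(slot_scores, key=lambda s: s.level)


# Agglutinative affix that remains an agglutinative affix (bondedness).
print(score(3, 3))  # -3
# Target at level 4, source lower than 4.
print(score(2, 4))  # +4
# Two paradigm slots with different outcomes; the higher one is chosen.
print(select_for_paradigm([score(1, 1), score(1, 2)]))  # +2
```

Storing only the target level and a change flag mirrors the statement in remark 3 that no concrete value is recorded for the source concept.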

1.2 The individual parameters

Parameter 1: Semantic Integrity
A grammatical marker is more general to the extent that it is semantically compatible with more lexical host items (Bybee [1985] on semantic generality).
1 The linguistic sign has a lexical meaning.
2 The linguistic sign has an abstract meaning, which is however referential/denotational rather than relational (e.g., ‘people’, existential verb, pronouns).
3 The linguistic sign has an abstract non-denotational or relational meaning (e.g., prepositions, auxiliaries, numeral classifiers).
4 The linguistic sign only has a syntactic function.

The whole continuum from 1 to 4 can be exemplified by the common development from body part > locative case through a number of intermediate stages (Lehmann 1995; Heine and Kuteva 2002): e.g., ‘head’ (value 1) > ‘top’ (2) > ‘on’ (3) > locative case (4).

The distinction between ‘referential’ and ‘relational’ (cf. values 2 and 3) should be understood in the conventional sense that lexical categories have denotations, while grammatical categories do not. Lehmann (2002: 139) refers to this distinction also as autosemantic vs. synsemantic. He notes that any full lexical item can signify by itself a certain concept (object, process, property) and independently refer to a certain class of such concepts, but when grammaticalized, it loses this ability and depends on another item for expressing certain distinctions. “A number or gender markers does not signify a number or gender concepts as such, but only insofar as these are features of other concepts” (Lehmann 2002: 139).¹

The difference between values 3 and 4 corresponds to the distinction between “wide” vs. “narrow” definitions of grammatical (inflectional) meaning. The wide perspective (value 3) covers semantically meaningful inflectional categories (like tense in verbs; cf. “inherent inflectional morphemes” in terms of Anderson [1985]²). Narrow inflectional meaning (value 4) is characterized by the reduction to purely syntactic functions (case on nouns; agreement on verbs, adjectives, etc.).

¹ If the distinction between the values 2 and 3 is not clear for individual expressions, a somewhat simplified version of the tests designed by Boye and Harder (2012) can be used to distinguish between lexical vs. grammatical categories (cf. addressability and focalizability). Thus, only lexical (‘referential’, ‘autosemantic’) but not grammatical (‘synsemantic’) items can be questioned or focused: cf. demonstratives vs. the definite article in English: (a) Which book? – This. vs. *The; (b) It is this I like vs. *It is the I like.
² Anderson (1985) distinguishes the following three types of inflectional forms: relational inflectional morphemes, agreement morphemes, inherent inflectional morphemes.

Parameter 2: Phonetic reduction
1 The linguistic sign consists of 2 or more syllables.
2 The linguistic sign is (i) a monosyllabic word or (ii) a full syllable with no word status.
3 The linguistic sign is a subsyllabic morpheme.
4 The linguistic sign is (reduced to) a suprasegmental feature or is lost.

Parameter 3: Paradigmaticity (paradigmatic cohesion)
This parameter is concerned with the size of the paradigm and its degree of homogeneity.
1 The linguistic sign belongs to a major (open) word class.
2 The linguistic sign is an element of a minor (closed) word class.
3 The linguistic sign expresses a grammatical category but is not fully paradigmatic.
4 The linguistic sign is a member of a small and homogeneous paradigm (it occurs in the same morphological slot as the other members of the same grammatical domain, its phonological shape in terms of number of phonemes or syllable structure is similar to at least most of the other members of the paradigm).

Examples of value 1 are elements that belong to a major word class (N, V, A), while elements with value 2 belong to a minor word class (adpositions, auxiliaries, but also pronouns). The linguistic expressions with value 3 express a grammatical category but are not fully integrated formally and/or semantically as, e.g., analytic tense forms in languages with synthetic tenses (cf. English has done, German werde gehen [will go]). Similarly, aspect forms in Russian showing semantic and formal idiosyncrasies are not fully paradigmatic. The same holds for gender marking on nouns, which does not form a paradigm for individual lexical items, i.e., a specific noun does not occur in different gender forms in most cases (exceptions are cases like Spanish muchach-a ‘girl’ vs. muchach-o ‘boy’). Differentiation of markers with values 2 vs. 3 is not always straightforward, but can be decided for individual cases on the basis of frequency of individual markers, functional coherence and the number of grammaticalized markers (high frequency, semantic coherence and few markers involved speak for 3, that is for an emerging paradigm). Finally, value 4 will be reserved for canonical inflectional paradigms (tense on verbs, case on nouns), where individual lexemes have at least two forms in a paradigm. On this view, gender has value 4 only on adjectives (which have agreement), but not on nouns.

Parameter 4: Bondedness (syntagmatic cohesion)
This parameter measures the degree of cohesion/fusion between the host and the linguistic sign that undergoes grammaticalization.
1 The linguistic sign is a free morpheme or is the lexical root of a word.
2 The linguistic sign is a clitic (its use is not limited to a single word class).
3 The linguistic sign is an agglutinative affix (affixed to individual words which are members of a single word class).
4 The linguistic sign is part of a portmanteau morpheme or is a suprasegmental (e.g., tonal marker) or a process morpheme (Ablaut, …), or a zero morpheme.

Parameter 5: Paradigmatic variability
Degree of obligatoriness:
1 No obligatoriness: The use of a linguistic sign is not imposed by the system, which is true of all lexical items, but also of grammatical items at early stages of grammaticalization.
2 The obligatoriness of a linguistic sign is restricted to a small set of clearly defined constructions in certain contexts (cf. examples³).
3 The linguistic sign is obligatory in most contexts (e.g., an article is obligatory for indefinites except when used as nominal predicates).
4 The linguistic sign is generally obligatory (i.e., it always cooccurs with the relevant host) and the speaker has to select one value out of a set of markers expressing the relevant category (e.g., tense marking system with three values: PRS, PST, FUT).

Clarification on obligatoriness based on Lehmann’s (1995) view: By this we mean the freedom of the language user with regard to the paradigm as a whole. The paradigm represents a certain grammatical category, and its members, the subcategories (or values) of that category. There may then be a certain freedom in either specifying the category by using one of its subcategories or leaving the whole category unspecified. To the extent that the latter option becomes constrained and finally impossible, the category becomes obligatory. (Lehmann 1995: 124)

Generally, the opposite values (not obligatory: 1 vs. obligatory: 4) are least controversial. Intermediate values relate to cases in which a category is obligatory in a small number of instances (value: 2), or in a large number of instances (value: 3). A good example is provided by the different degrees of grammaticalization in plural marking: not grammaticalized (value: 1), grammaticalized only on pronouns and kin terms (value: 2), grammaticalized/obligatory for all nouns except for non-specific ones (value: 3), and finally grammaticalized for all (count) nouns (value: 4). For the values 3 and 4, we define individual obligatoriness domains as for example:
– Tense: Occurrence in an independently utterable declarative clause.
– Aspect: Like tense.
– Evidentials: Like tense.
– Person: Like tense.
– Number: Occurrence with all count nouns, or like tense in the case of verb agreement.
– Definiteness/indefiniteness: Occurrence with all count nouns.
– Gender: Occurrence with all nouns (±including proper names) plus agreement.

³ (i) Accusative in some languages with differential object marking is only found on pronouns, but not on (common) nouns. (ii) The selection of passive voice due to syntactic reasons as in the case of conjunction reduction (Equi NP deletion). Here the verb occurs in two forms and the speaker is forced into selecting either the active voice with the coreference of agent and intransitive S as in (1) or the passive form with the coreference of patient and intransitive S as in (2): (1) John_i sees the dog and ø_i runs away; (2) The dog_i was seen by John and ø_i ran away. Thus, English passives get the value 2. In contrast, symmetric voice systems as we find them in Austronesian get the value 4 because the selection of a specific voice marker out of a set of voice markers is compulsory in a finite assertive clause. (iii) The obligatory expression of the subject in the subordinate clause of Chinese control constructions if the matrix-clause subject differs from the subordinate subject. E.g., Lĭ Lín_i yào tā_k măi shū [Li Lin want s/he buy book] ‘Li Lin_i wants him/her_k to buy a book’ vs. Lĭ Lín_i yào ø_k măi shū [Li Lin want buy book] ‘Li Lin_i wants ø_i to buy a book’.

A remark on the independence of the parameters with regard to paradigmaticity (parameter 3) and obligatoriness: A grammatical category can be obligatory but not (fully) paradigmatic. This can be seen from the case of gender marking, which tends to be maximally 3 (on parameter 3) if marked on the noun even though it is obligatory (according to parameter 5). There also exists the opposite case in which a category is paradigmatic but not obligatory. For example, different slots in polysynthetic languages might be seen as paradigmatic (accommodating different aspectual, valency and other categories which are mutually exclusive for a certain position class) even though not all of them are obligatory. In fact, the minimal verb form in such languages might include very few slots which are obligatory (often person, number and some subcategories of TAM). Thus, paradigmaticity is clearly related to obligatoriness, but there is no one-to-one relation between the two. At a more general level, paradigmaticity relates to individual markers (e.g., perfect tense), while obligatoriness relates to a category as a whole.
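The lack of a one-to-one relation between the two parameters can be made concrete with a small cross-tabulation. This is only a sketch under stated assumptions: the two records restate the cases described above, the exact levels marked as assumed are chosen for the sake of the example, and the function name `crosstab` is hypothetical.

```python
from collections import Counter


def crosstab(records, row_key, col_key):
    """Count how often each (row value, column value) pair occurs."""
    return Counter((r[row_key], r[col_key]) for r in records)


# Illustrative records only; they are not entries from the database.
records = [
    # Gender marked on the noun: obligatory (4) but at most 3 on paradigmaticity.
    {"marker": "gender on noun", "paradigmaticity": 3, "obligatoriness": 4},
    # Optional slot in a polysynthetic verb: paradigmatic but not obligatory.
    # Both levels (4 and 1) are assumptions made for this example.
    {"marker": "optional slot", "paradigmaticity": 4, "obligatoriness": 1},
]

table = crosstab(records, "paradigmaticity", "obligatoriness")

# Pairs off the diagonal show that the two levels need not coincide.
mismatches = {pair: n for pair, n in table.items() if pair[0] != pair[1]}
print(mismatches)  # {(3, 4): 1, (4, 1): 1}
```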

Parameter 6: Syntagmatic variability
The syntagmatic variability of a sign is the ease with which it can be shifted around in its context. In the case of a grammaticalized sign, this concerns mainly its positional mutability with respect to those constituents with which it enters into construction. Syntagmatic variability decreases with increasing grammaticalization. (Lehmann 1995: 140)
1 Word order is not constrained, i.e., it is as free as the lexical items of a language to which the source concept belongs.
2 The position is more constrained, but is syntactically transparent in the sense that it corresponds to the normal syntactic position of the source concept.
3 The linguistic sign is assigned to a position that is no longer transparent (undergoes ‘positional adjustment’ in terms of Lehmann [1995: 169] or ‘permutation’ in terms of Heine [Heine and Reh 1984: 132]).
4 The linguistic sign becomes bound (is assigned a morphologically fixed position in a morphological template).

The distinction between values 2 and 3 depends on whether the grammatical marker retains its position or undergoes restructuring (in terms of position or type [nonbound vs. bound, overt vs. zero]). A good example of restructuring is the grammaticalization of participles (deinen Vorschlägen entsprechend [your propositions according]) to prepositions (entsprechend deinen Vorschlägen [according your propositions]) (cf. Lehmann 1995: 142). In this case, entsprechend in its function as a preposition gets the value +3. Another example of restructuring is provided by auxiliaries in Romance languages. While the preferred word order for the verb in Latin was clause final (epistulam scriptam habeo [letter written have]) or clause initial (habeo epistulam scriptam [have letter written]), it is now in a fixed position immediately preceding the lexical verb, which was a rather marked position in Latin: Ital. Ho scritto una lettera (= habeo scriptam epistulam) (cf. Lehmann 1995: 141). For that reason, the Italian auxiliary again gets the value +3. If the source is already a bound morpheme as in many cases of secondary grammaticalization (i.e., a “grammatical marker becoming more grammatical”, along the aforementioned parameters), the value of the target is –4.

Parameter 7: Decategorization

1 No change of categorial properties of the source word class (in particular, inflectional categories of nouns and verbs).
2 Partial absence of certain categorial properties (for example, German kann [3.::can] does not have non-finite forms, but retains tense and person distinctions).
3 Absence of most categorial properties (for example, English modal verbs like can retain tense distinctions, but lose person).
4 Total decategorization.

Parameter 8: Allomorphy

1 No allomorphy.
2 The linguistic sign shows moderate (phonologically conditioned) allomorphy.
3 There is morphologically conditioned allomorphy.
4 There is lexically conditioned allomorphy.

In Bybee, Pagliuca, and Perkins’s approach (1994: 110–113), allomorphy contributes to the loss of autonomy of a sign, and phonologically conditioned allomorphy is regarded as instantiating less dependency than morphologically or lexically conditioned allomorphy. We also adopt a similar approach here albeit in a simplified way.
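The four-point scale for Parameter 8 can be written down as a simple lookup. The Python sketch below is only an illustration of that scale; the names `Conditioning` and `allomorphy_value` are hypothetical and not part of the questionnaire.

```python
from enum import Enum


class Conditioning(Enum):
    """Kind of conditioning behind a marker's allomorphs (illustrative labels)."""
    NONE = "none"
    PHONOLOGICAL = "phonological"
    MORPHOLOGICAL = "morphological"
    LEXICAL = "lexical"


# Values 1-4 as listed under Parameter 8.
ALLOMORPHY_VALUE = {
    Conditioning.NONE: 1,
    Conditioning.PHONOLOGICAL: 2,
    Conditioning.MORPHOLOGICAL: 3,
    Conditioning.LEXICAL: 4,
}


def allomorphy_value(conditioning: Conditioning) -> int:
    """Return the Parameter 8 value for a given type of conditioning."""
    return ALLOMORPHY_VALUE[conditioning]
```

A higher value corresponds to a less autonomous sign, in line with the reasoning attributed to Bybee, Pagliuca, and Perkins above.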

Source concepts to be evaluated

Before going into the details about the features for measuring grammaticalization, indicate which of the following 30 source concepts (a selection from Heine and Kuteva [2002]) undergo grammaticalization in the language you analyze (Yes vs. No). Please also note the target function(s).

Tab. 3: 30 source concepts (Heine and Kuteva 2002).
Columns: No; Source Concept; Target function(s); Yes; Target functions Other.
[Table body (the 30 source concepts and their target functions) not recoverable from the source text.]

Notes to Tab. 3:
4 Nonsubordinating clause-linker: S1, (and) then S2.
5 That is, predicative possessive (marker of possessive ‘have’-constructions; Heine and Kuteva [2002]).
6 That is, attributive possessive, for example, expressed by genitive or an adposition (Heine and Kuteva 2002).

We ask the contributors to check whether any of the grammaticalization paths listed above (in our 30-concept list in Table 3) is found in the language of their expertise. If there is a path of grammaticalization for a given source concept, please provide an example of the grammaticalized structures and evaluate the grammaticalization path in accordance with the parameters (and their values) specified above. In addition, we ask our contributors to check the tables documenting the grammaticalization paths extracted from individual papers for accuracy of value assignment (especially in cases where the author is explicitly asked to provide feedback, but also elsewhere).

Form for analyzing the grammaticalization value of individual markers/linguistic signs

Language:
SOURCE concept:
TARGET concept:

Columns: Parameter; Value; Explanations and examples

1. Semantic integrity:
2. Phonetic reduction:
3. Paradigmaticity (paradigmatic cohesion):
4. Bondedness (syntagmatic cohesion):
5. Paradigmatic variability:
6. Syntagmatic variability:
7. Decategorization:
8. Allomorphy:
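For contributors who prefer to keep their answers in machine-readable form, the form can be mirrored by a small record type. The following Python sketch is an assumption-laden illustration, not part of the questionnaire: the class `PathRecord` and its methods are hypothetical, and only the parameter labels and the entsprechend example (value 3 for syntagmatic variability) are taken from the text.

```python
from dataclasses import dataclass, field

# Parameter labels copied from the form.
PARAMETERS = (
    "1. Semantic integrity",
    "2. Phonetic reduction",
    "3. Paradigmaticity (paradigmatic cohesion)",
    "4. Bondedness (syntagmatic cohesion)",
    "5. Paradigmatic variability",
    "6. Syntagmatic variability",
    "7. Decategorization",
    "8. Allomorphy",
)


@dataclass
class PathRecord:
    """One filled-in form: a source-to-target path in one language (hypothetical type)."""
    language: str
    source: str
    target: str
    # Maps a parameter label to a (value, explanation) pair.
    entries: dict = field(default_factory=dict)

    def set_value(self, parameter: str, value: int, explanation: str = "") -> None:
        """Store a value and explanation; reject labels that are not on the form."""
        if parameter not in PARAMETERS:
            raise KeyError(f"unknown parameter: {parameter}")
        self.entries[parameter] = (value, explanation)

    def missing(self) -> list:
        """Return the parameters that have not been filled in yet."""
        return [p for p in PARAMETERS if p not in self.entries]


# Example from the discussion of Parameter 6: German participle > preposition.
record = PathRecord(language="German", source="participle", target="preposition")
record.set_value("6. Syntagmatic variability", 3, "entsprechend deinen Vorschlägen")
```

Calling `record.missing()` then lists the seven parameters that still need a value, which is one way to check a form for completeness.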

References

Anderson, Stephen R. 1985. Inflectional morphology. In Timothy Shopen (ed.), Language typology and syntactic description, Vol. III: Grammatical categories and the lexicon, 150–201. Cambridge: Cambridge University Press.
Boye, Kasper & Peter Harder. 2012. A usage-based theory of grammatical status and grammaticalization. Language 88. 1–44.
Bybee, Joan L. 1985. Morphology: A study of the relation between meaning and form (Typological Studies in Language 9). Amsterdam & Philadelphia: John Benjamins.
Bybee, Joan L., Revere D. Perkins & William Pagliuca. 1994. The evolution of grammar: tense, aspect, and modality in the languages of the World. Chicago IL: University of Chicago Press.
Diewald, Gabriele. 2010. On some problem areas in grammaticalization theory. In Katerina Stathi, Elke Gehweiler & Ekkehard König (eds.), Grammaticalization: Current views and issues, 17–50. Amsterdam & Philadelphia: John Benjamins.
Heine, Bernd & Mechtild Reh. 1984. Grammaticalization and reanalysis in African languages. Hamburg: Buske Verlag.
Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Himmelmann, Nikolaus P. 2005. Gram, construction, and class formation. In Clemens Knobloch & Burkhard Schaeder (eds.), Wortarten und Grammatikalisierung, 79–93. Berlin & New York: Mouton de Gruyter.
Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization (2nd edn). Cambridge: Cambridge University Press.
Lehmann, Christian. 1995 [1982]. Thoughts on grammaticalization. A programmatic sketch. Munich: Lincom Europa.
Lehmann, Christian. 2002. Thoughts on grammaticalization (2nd revised edn). Erfurt: Seminar für Sprachwissenschaft der Universität (ASSidUE, 9).
Lehmann, Christian. 2004. Theory and method in grammaticalization. Zeitschrift für germanistische Linguistik 32. 152–187.
Norde, Muriel. 2012. Lehmann’s parameters revisited. In Kristin Davidse, Tine Breban, Lot Brems & Tanja Mortelmans (eds.), Grammaticalization and language change: New reflections, 73–110. Amsterdam: John Benjamins.
Tabor, Whitney & Elizabeth C. Traugott. 1998. Structural scope expansion and grammaticalization. In Anna Giacalone Ramat & Paul J. Hopper (eds.), The limits of grammaticalization (Typological Studies in Language 37), 229–272. Amsterdam & Philadelphia: John Benjamins.

Damaris Nübling and Luise Kempf

3 Grammaticalization in the Germanic languages

1 Introduction to the Germanic languages

1.1 The Germanic language family

The Germanic languages are among the best-investigated language families, synchronically as well as diachronically. This holds at least for the major languages: English, German, Dutch, Swedish, Danish, and Norwegian. Some of them look back on more than one thousand years of attested history: The first written records date from the 7th and 8th centuries for English and German, respectively. However, there are centuries where written records are scant. Thanks to the results of the comparative historical method, Proto-Germanic (PG), spoken until around 500 BC, may be reconstructed as a common source (Ramat 1981; Krahe and Meid 1969; Bammesberger 1986, 1990; Ringe 2006). PG first split into Northwest and East Germanic; later, Northwest Germanic separated into West and North Germanic (Figure 1). East Germanic with its most famous representative Gothic died out in the 5th or 6th century, but a very early record of this language, the Gothic Wulfila Bible from the 4th century, survives. Apart from a number of runic inscriptions, this is the earliest Germanic record of all. Unlike the Romance languages with Latin as their well-attested common source, there is only indirect knowledge of the Germanic proto-language. Today, around 12 modern Germanic standard languages can be distinguished (Figure 1).

[Figure 1: family tree. Proto-Germanic splits into Northwest Germanic and East Germanic (†) (Gothic etc., 4th cent.). Northwest Germanic splits into North Germanic (East Scandinavian, West Scandinavian) and West Germanic (South Germanic, Anglo-Frisian). Modern languages with million speakers: German 103, Yiddish 3, Lux. 0.4, Dutch 20, Afrik. 5–6, Frisian 0.4, Engl. 300, Danish 5, Swedish 9, Norwegian 4.3, Faroese 0.05, Icelandic 0.3.]

Fig. 1: Family tree of the Germanic languages (around 450 million speakers) based on Henriksen and van der Auwera (2002) and extended by official numbers for the smaller languages.

https://doi.org/10.1515/9783110563146-003
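The total of around 450 million speakers given in the caption of Fig. 1 can be checked against the individual figures. The short Python sketch below does this arithmetic; taking the midpoint 5.5 for the range 5–6 given for Afrikaans is an assumption made only for the sake of the sum, and the spelled-out language names stand for the abbreviations in the figure.

```python
# Speaker numbers in millions, as given in Fig. 1.
speakers = {
    "German": 103,
    "Yiddish": 3,
    "Luxembourgish": 0.4,
    "Dutch": 20,
    "Afrikaans": 5.5,  # assumption: midpoint of the range 5-6
    "Frisian": 0.4,
    "English": 300,
    "Danish": 5,
    "Swedish": 9,
    "Norwegian": 4.3,
    "Faroese": 0.05,
    "Icelandic": 0.3,
}

# Sum over all twelve languages.
total = sum(speakers.values())
print(f"about {round(total)} million speakers")
```

The sum comes out at roughly 451 million, which agrees with the rounded figure in the caption.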


English has by far the largest number of speakers, followed by German, which is mainly spoken in Germany, Switzerland, and Austria. Norwegian comprises two closely related languages: Bokmål, which is highly influenced by Danish and belongs to East Scandinavian, and the more conservative Nynorsk, which belongs to the West Scandinavian branch. The following passages are limited to those Germanic languages surviving to our day.

1.2 Typological characteristics

Proto-Indo-European (PIE) is assumed to have been a highly inflecting synthetic language with a large number of grammatical categories. PG preserved a part of these inflectional affixes and is therefore classified as belonging to the inflecting languages. However, it underwent intense language contact. The contact language has not yet been identified, but a number of hypotheses are currently under lively debate. Due to this language contact, most of the PIE inflectional categories were lost. Although the PG verbal and nominal systems preserved a considerable number of inflectional affixes and allomorphs, only a few of the grammatical categories coded by them survived. In noun inflection, the categories of number (singular, dual, plural) and case (six cases) were preserved. As for verbs, personal endings (1st‒3rd person in singular, dual, and plural), two tenses (present and preterite < PIE perfect) and mood (indicative, two subjunctives, imperative) survived. There were no synthetic perfect and pluperfect tenses and no future. The inherited strong verb system with ablaut (i.e., vowel change, cf. Engl. drink – drank – drunk) was expanded by the regular formation of weak verbs, which contain a stable root vowel and form their preterite with a so-called dental suffix, cf. Engl. love – loved, Germ. lieben – liebte. This suffix is the result of an early grammaticalization process, i.e., the cliticization of the preterite of the reduplicating PG verb *dōn ‘do’, which in Gothic (4th century) is still visible in the plural of weak verbs, cf. salbō-dēdun ‘(they) anointed’ (see Szczepaniak 2011: 115); this clitic subsequently developed into an obligatory inflectional marker, Engl. -(e)d, Germ. -t-, Swed. -de/-te (depending on the preceding sound). Today, weak verbs are productive in every Germanic language, whereas strong verbs (with ablaut) are no longer productive.
This, however, does not mean that they are dying out as is often predicted: Apart from Afrikaans, all Germanic languages preserve between about 100 (Swedish) and several hundred (Icelandic) strong verbs with high token frequencies and covering the most elementary meanings. Today, the Germanic languages range from highly inflecting languages such as Icelandic, Faroese, and German to deflecting ones such as English and Afrikaans, both of which experienced profound language contact. Figure 2 arranges the Germanic languages along a rough scale of morphological complexity, based on their nominal and verbal systems. In addition, the number of nominal genders is indicated, since it has an important effect on nominal complexity (see Section 2.1).

[Figure 2: scale running from highly inflecting/conservative to deflecting/innovative: Icelandic, Faroese, German, Luxemb., Nynorsk, Swedish, Bokmål, Danish, Dutch, Frisian, English, Afrikaans. The languages are additionally grouped by number of nominal genders: 3 genders, 2 genders, no gender.]

Fig. 2: The Germanic languages ranged according to the degree of inflection and deflection.

Regarding the direction of grammatical determination or modification, PIE was a rather clear postdetermining language (with suffixes), whereas the Germanic languages have developed many predetermining grammatical morphemes such as (obligatory) personal pronouns (subject pronouns), prepositions and (in the West Germanic languages) definite articles. Furthermore, adjectives precede the noun they modify. As the Germanic languages have inherited many PIE endings (and sometimes even strengthened or extended them), they exhibit a mixture of post- and predetermining grammatical morphemes. Word formation affixes, which have developed mostly in PG and later, can be found in both pre- and postdetermining position.

1.3 North versus West Germanic languages

There are three basic features distinguishing North from West Germanic languages; two of these are directly related to grammaticalization and will be addressed in more detail below:

1) In the West Germanic languages, the definite article precedes the noun, whereas in the North Germanic languages, the article originally followed the noun and was later cliticized leading to suffixes (normally spelled without the hyphen; see [1]):

(1) West-Germanic                  North-Germanic
    Dutch: het huis                Swedish: hus-et
    German: das Haus               Icelandic: hús-ið
    :.. house                      house-..

2) The North Germanic languages developed a synthetic passive deriving from the Old Norse postverbal reflexive pronoun sik, which was contracted and assimilated to -st in Icelandic and Faroese and further reduced to -s in Continental Scandinavian (2):

(2) Icel. hún heyr-ði-st           Swed. hon hör-de-s
          she hear--                     she hear--
    ‘she was heard’

The third feature is phonological and concerns Old Norse u-umlaut, rounding short [a] in front of [u] to [ɔ] (which later changed to [œ]), cf. Engl. ale, bear versus Swed. öl ‘beer’, björn ‘bear’.

2 Grammaticalization of nominal categories

2.1 Declension class and gender

Declension class: Most Germanic languages have preserved some of the PG declension classes. They only become overtly recognizable in the form of case (mostly genitive) and number (plural) allomorphs. English lost declension class and gender altogether (the irregular plurals teeth, geese, mice, lice, women, men, etc. are the last remnants of the former declension classes). In other languages such as Dutch, noun classes have become predictable by a grammaticalized link between the number of syllables and the kind of plural the respective word will take: Monosyllabic words take the syllabic plural ending -en, di- and polysyllabic words take asyllabic -s, thus mostly yielding the ideal trochaic plural form (output-oriented rule). The systems of Luxembourgish, German, and Icelandic are considerably more complicated. They contain a wealth of declension classes, which often even alter the root vowel by umlaut. The class membership cannot easily be determined from the word itself, but rather has to be learned by rote. In German, there are about 10 declension classes, in Icelandic some dozens. Some languages established a close connection between class and gender (e.g., Icelandic and German), whereas others have separated them (e.g., Dutch and Frisian). Dammel, Kürschner, and Nübling (2010) provide a survey of the nominal inflection of ten Germanic languages and show, for instance, that the number of declension classes roughly correlates with the number of genders. From a grammaticalization perspective, it is more important that in some languages the declension class system was reorganized according to animacy, individuation, countability, or other semantic features (Köpcke 1995; Kürschner 2008b; Nübling 2008; Kürschner and Nübling 2011).
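The output-oriented Dutch rule described above (monosyllabic words take -en, longer words take -s) can be stated as a one-line function. The Python sketch below is a deliberately simplified illustration: the function name is hypothetical, the syllable count is supplied by the caller rather than computed, and exceptions and spelling adjustments in actual Dutch are ignored.

```python
def dutch_plural_suffix(syllables: int) -> str:
    """Return the plural ending predicted by the syllable-count tendency.

    Monosyllabic words take syllabic -en; di- and polysyllabic words take
    asyllabic -s, so that the plural tends to come out as a trochee.
    """
    if syllables < 1:
        raise ValueError("a word has at least one syllable")
    return "-en" if syllables == 1 else "-s"


# boek 'book' (1 syllable)   -> boek-en
# tafel 'table' (2 syllables) -> tafel-s
print(dutch_plural_suffix(1), dutch_plural_suffix(2))
```

Because the choice depends only on the shape of the word, no class membership has to be memorized, which is the contrast with German and Icelandic drawn in the text.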
In German, there is the masculine weak declension class, which comprised a diverse array of nouns in Middle High German; later, non-humans left this class, leading to the present situation in which the class contains and productively incorporates only nouns of trochaic structure for male humans. Higher animates such as Fink ‘finch’ and Bär ‘bear’ are leaving this class, while apes are still welcome (Affe ‘ape’, Schimpanse ‘chimpanzee’; see Wurzel [1986]; Köpcke [1995, 2000a, 2000b, 2002] and, including German dialects, Nübling [2008]). These processes can be seen as cases of reanalysis and thus resemanticization. If we assume that in former times declension classes represented semantic classes before they desemanticized, this current case of resemanticization constitutes a case of degrammaticalization. An exhaustive diachronic comparison of the nominal classes in German, Dutch, Swedish, and Danish is provided by Kürschner (2008a).

In German, a real classification system with classifiers preceding proper names has recently emerged. Here, the combination of +/– (former) definite article with three genders has led to a six-class system which provides information about the ontological class the named object belongs to. Thus, a feminine name with an overt classifier such as die Bismarck may refer to a ship, an airplane, a company, or simply a female named Bismarck, whereas the neuter das Bismarck denotes a hotel, a restaurant, or a beer brand. Neuter names without an overt classifier denote towns, countries, and continents (Ø Bismarck (n.) is the capital of North Dakota). Masculine names often refer to mountains, cars, and males (see Nübling 2015a, 2020). Similar systems seem to exist in other Germanic languages as well; however, they have not been described yet (except for Swedish, see Fraurud [2000]).

Gender: Except for English and Afrikaans, all Germanic languages have preserved two or even all three of the PG nominal genders. In pronouns, the PIE three-gender system is fully preserved throughout (see Table 1).

Tab. 1: Nominal and pronominal gender (based on Audring 2010).

Language                                                nominal gender   pronominal gender
Icelandic, Faroese, German, Luxemb., Yiddish, Nynorsk   m f n            m f n
Bokmål, Swedish, Danish, Dutch, Frisian                 c n              m f n
English, Afrikaans                                      (none)           m f n

m = masculine, f = feminine, n = neuter, c = common gender

[Figure 3: Degree of individuation: human > animal > bounded object/abstract > specific mass > unspecific mass/abstract. Personal pronouns mapped onto this hierarchy, from left to right: feminine and masculine; masculine; neuter.]

Fig. 3: The redistribution of pronominal gender mapped onto the Individuation Hierarchy (adapted from Audring 2009: 127).

In languages with nominal gender, it is inherent to each noun.1 In languages without nominal gender, pronominal gender is animacy-driven, cf. Engl. it for inanimates and she, he for animates in combination with sex (similar in Afrikaans). This goes back to a reanalysis of obsolete forms. In languages with two nominal genders, the transition to a semantically based pronominal gender can be observed well: For contemporary Dutch, Audring (2009, 2010) points out that the feminine and masculine pronouns zij and hij refer to humans, whereas hij also tends to refer to countable things (including animals) and concepts; the neuter form het specializes in unspecific mass and abstract nouns (see Figure 3). Thus, pronominal gender detaches from nominal gender and is loaded with semantic information. Here, countability and the degree of individuation are crucial. This also holds for Norwegian (Bokmål) with its so-called pancake-sentences (Enger 2004), meaning that the predicate adjective (target) tends not to be in agreement with the subject (controller) in number and gender. The adjective is often marked as singular neuter if the concept of the pluralized noun can be conceived as a (non-countable) mass noun, i.e., grammatical agreement is overruled and replaced by semantic agreement (3):

(3) Bokmål
    Pannekake-r er god-t
    pancake-  good-.
    ‘Pancakes are good’

This is a case of reanalysis. A similar development can be observed in Swedish. Nominal gender can be considered the last stage in a lengthy grammaticalization process assumed to have started with classifiers providing semantic information about the denoted object. This is not true for gender anymore (complete desemanticization). Nominal gender, which is usually expressed on different targets (articles, adjectives, pronouns), can be considered form without meaning. In German (and other Germanic languages), there are only a few semantic gender assignment rules: Fruits, for example, are almost always feminine (die Banane, Mango, Ananas), and alcoholic drinks are mostly masculine (der Wein, Likör, Schnaps). In most Germanic languages, nouns for males are masculine and those for females feminine. However, these rules, or rather tendencies, do not really help in memorizing gender, which is arbitrary in most cases: Gender is one of the most difficult features, especially for L2-learners. Its main function, at least in German, can be found in syntax: Gender plays the primary role in the famous NP framing constructions (cf. Section 4.1). As already mentioned, gender was re-functionalized and degrammaticalized in German by developing into a classifier system for proper names (die Bismarck → a female person or a ship, das Bismarck → a hotel or a restaurant, der Bismarck → a male person or a car). In combination with the (obligatory) presence or the (obligatory) absence of the definite article, German names are or rather (as this process is ongoing) tend to be integrated in a system of six onymic classes.

1 The exceptions concerning proper names were mentioned above (referential gender). In German dialects and in Luxembourgish, a pragmatic gender evolved with female personal names indicating the relationship between speaker and female referent (see Nübling 2015b, 2017).
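The onymic classification just described pairs gender with the presence or absence of the article. The Python sketch below records only those cells of the six-class system for which the text names referents; the table and function names are hypothetical, the two remaining cells are left open, and mountains (mentioned above for masculine names without an explicit statement about the article) are omitted.

```python
from typing import Optional, Tuple

# (gender, overt article?) -> typical referents named in the text.
# Only four of the six combinations are described there.
ONYMIC_CLASSES = {
    ("f", True): ("female person", "ship", "airplane", "company"),  # die Bismarck
    ("n", True): ("hotel", "restaurant", "beer brand"),             # das Bismarck
    ("n", False): ("town", "country", "continent"),                 # Ø Bismarck
    ("m", True): ("male person", "car"),                            # der Bismarck
}


def onymic_referents(gender: str, has_article: bool) -> Optional[Tuple[str, ...]]:
    """Return the referent types recorded for a gender/article pair, if any."""
    return ONYMIC_CLASSES.get((gender, has_article))


print(onymic_referents("n", False))
```

A lookup for a combination that the text does not describe, such as a feminine name without article, returns `None` rather than a guess.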

2.2 Number

Today, number can take one of two values, singular and plural. In PG, there was also a dual. Number is marked on nouns, pronouns, articles, and verbs, in highly inflecting languages also on adjectives (in Scandinavian on predicative adjectives as well). The number category was inherited from PIE. Later, there was a period of extensive formal reorganization. After the PG period, the amalgamated case/number suffixes were separated in most Germanic languages (except Icelandic and Faroese), causing the more relevant number suffix to precede the case suffix.2 The latter was later severely reduced, also due to initial stress. The plural suffix could trigger umlaut (fronting of the stem vowel) if it contained i or j, leading to multiple stem alternations, which in some languages (German, Luxembourgish, Yiddish) even became highly productive. In these languages, umlaut was morphologized, i.e., functionalized and grammaticalized (morpho-phonological stage). In many nouns, umlaut is even the only plural marker (Germ. Vater – Väter ‘father – fathers’). Here, non-linearity and the degree of fusion increased, which Dahl (2011: 158‒159) terms maturation. Other languages such as Dutch and English eliminated these assimilation products (Nübling 2013). All in all, the German NP is characterized by cooperative inflection: Every member contributes directly or indirectly to the expression of four nominal categories. This can be shown by the definite plural article die, which is homophonous with the feminine singular article. This syncretism explains why all feminine nouns have an obligatory overt plural marker: die Schüssel-n ‘the bowls’ (Schüssel is feminine). In contrast, masculine and neuter nouns show zero plurals (depending on their length and declension class): die Schlüssel-Ø ‘the keys’ (Schlüssel is masculine).

2 The relevance concept used here is taken from Bybee (1985, 1994). She defines it as follows: “A meaning element is relevant to another meaning element if the semantic content of the first directly affects or modifies the semantic content of the second” (Bybee 1985: 13). The degree of relevance determines the order of the morphemes: “If two meaning elements are, by their content, highly relevant to one another, then it is predicted that they may have lexical or inflectional expression, but if they are irrelevant to one another, then their combination will be restricted to syntactic expression” (Bybee 1985: 13). Thus, the number category is more relevant for the noun, which usually denotes objects, than case; therefore, number affixes are coded closer to the noun (even triggering umlaut of the root vowel) than case affixes. The same holds for the verb: aspect, tense and mood are more relevant categories than number or person; therefore, their expression often affects the lexeme.

2.3 Possession

In most Germanic languages, the possessive relation is marked on the possessor, not on the possessum. Apart from possessive pronouns, the genitive case is the most important possessor marker: English my mother’s dog, the hotel owner’s car, Swedish min mors hund, hotellägarens bil. Both constructions may be replaced by of: the dog of my mother, the car of the restaurant owner (Swedish av-constructions occur less frequently and rather in the substandard). The more animate the possessor, the more common the prenominal genitive. In some languages, only the prepositional construction is possible, e.g., Luxembourgish and Faroese (for the complete replacement of the genitive by prepositional constructions see Section 2.5). Other languages still require or allow the genitive for personal names as possessors only, e.g., Dutch: Peters hond. Otherwise, a periphrasis occurs: de hond van mijn moeder, de auto van de hotelbesitter. German uses 1) prenominal genitives (personal names: Peters Hund) and 2) postnominal genitives (the rest: der Hund meiner Mutter / das Auto des Hotelbesitzers ‘my mother’s dog/the hotel keeper’s car’). 3) There are also popular paraphrases with von: der Hund von meiner Mutter (lit.) ‘the dog of my mother’. 4) For animate possessors, spoken German provides another common periphrasis, the so-called possessive dative: meiner [_] Mutter ihr [her] Hund. Inanimate possessors are excluded: *dem Haus seine Fenster → die Fenster von dem Haus ‘the windows of the house’. Here, a new distinction has been grammaticalized which constitutes an innovation. A degrammaticalization is taking place in languages such as English, Swedish, and Danish in cases where the so-called group genitive occurs (Norde 1997, 2006, 2009; Herslund 2001). Here, the genitive marker seems to have developed from an inflectional suffix ([N]-s) into a phrasal clitic with a syntactically conditioned distribution ([NP]=s).
In Old Swedish, the inflectional genitive ending depended on the gender and declension class of the noun and showed concordial marking (“word-marking genitive” according to Norde [2006], from which the following corpus-based data are taken):

(4) Old Swedish
    en-s rik-s man-s hws
    a-.. rich-.. man-.. house
    ‘a rich man’s house’

In the course of time, -s also attached to feminine and plural nouns (which was not possible originally), i.e., this system gradually disappeared and developed into an intermediary stage of a so-called “phrase-marking genitive” (Norde 2006: 206):

(5) Middle Swedish
    kom iak heem til fadhir min-s hws (14th ct.)
    came I home to father-Ø my- house
    ‘I came home to my father’s house’

In the 16th century, the first group genitives with clitic -s are attested:

(6) Swedish
    [konung-en i Danmarck]s krigzfolck
    [king- in Denmark]= forces
    ‘the king of Denmark’s forces’

Today, the clitic attaches to the last word of the genitive NP irrespective of its word class membership and thus has become a possessive marker with enlarged scope (Nübling 1992: 112‒118; Norde 2006: 205), see (7):

(7) Swedish (Teleman et al. 1999, 3: 131)
    med [familj-en ovanpå]s ungar
    with [family- from above]= children
    ‘with the children of the family living above’

We find a similar tendency in English (8) and other Germanic languages:

(8) English
    [that man upstairs]’s daughter
    [that man we met]’s daughter
    [the person I talked to]’s theories

Usually, clitics develop into affixes. In the cases above, the direction is inverted. For some languages such as English, it has been objected that the possessive ’s was the successor of a former reduced possessive pronoun his/-s or that the genitive -s adopted at least its position; in the latter case Lehmann (2004: 176) describes it as an “analogically-oriented degrammaticalization”. For Swedish, however, such a possessive pronoun is not attested (for a critical discussion and further references cf. Janda [2001]; Lehmann [2004]; Börjars and Vincent [2011]).

A new system of alienable and inalienable possessive markers has emerged in Fering-Öömrang, a North Frisian dialect. Here, the definite article split into two paradigms, a full one (“D-article”) and a reduced one, which lost its initial d- and begins with a- (“A-article”). Specific uses of the A-article mark (in-)alienability. Here, only two A-articles exist, a (the former masculine and feminine article) and at (the former neuter article). Today, a marks inalienability and at alienability, which can be shown with body parts (here: blees ‘bladder’): hat hee’t me a blees means ‘she has problems with her bladder’, whereas at blees refers to a pig’s bladder which in former times was used as a toy (Ebert 1998). This development can be analyzed as a case of exaptation.

Determiners

PG did not have articles. The discourse status of referents had to be deduced from the pragmatic context, syntax, and, to some extent, the adjectival inflection (hidden complexity). All Germanic languages developed overt definite and indefinite articles, giving rise to a common feature as well as a new grammatical category (innovation). Himmelmann (2001: 838) points out that, compared to all languages of the world, only a few languages develop definite as well as indefinite articles:

It is common to think of definite and indefinite articles as a ‘natural pair’, i.e., as occurring together in one morphosyntactic paradigm. Crosslinguistically, however, this is the exception rather than the rule […]. There are many languages with definite articles lacking indefinite articles […].

During the last two millennia, many western European languages developed articles. Some linguists assume language contact as the main factor and presume that Greek, the Romance and the Germanic languages constitute the center of this development which spread to different degrees to other languages (see Heine and Kuteva [2006] with further references). Without historical data, it is hard to decide whether or – if this was the case – to which extent a parallel development can be motivated by language contact.

The indefinite article

The Germanic languages did not develop indefinite plural articles. The most common source of the indefinite article is the numeral ‘one’: Engl. I saw a dog – I saw Ø dogs. Icelandic is the only Germanic language which did not even develop an indefinite singular article (9):

(9) Icelandic
ég sá hund-inn – ég sá hund-Ø
I saw dog-... – I saw dog-...
‘I saw the dog’ – ‘I saw a dog’

The indefinite article grammaticalized in the Middle Ages. Its first occurrences in German can be found in late Old High German (OHG), e.g., samo so in einero uesti ‘as in a fortress’ (Notker, 10th century; for a historical survey see Szczepaniak [2016]). In every Germanic language, the indefinite article (if present) occurs in front of the noun: Germ. ein Haus – Engl. a house – Swed. ett hus. With regard to the grammaticalization parameter of bondedness, the indefinite article is less grammaticalized than the definite one because it is not agglutinated to the noun. Nevertheless, it is always unstressed (whereas the numeral may be stressed) and often reduced. This holds not only for English, where it is rather obvious (compare one with a(n)), but also for other languages where no orthographic difference exists. Compare German ein: da läuft nur 'ein ['aɪn] Hund ‘there is only one dog (and not more) walking’ vs. da läuft nur ein [(ə)n] 'Hund ‘there is just a dog (and not a pig) walking’. The article can be (and mostly is) reduced to a mere consonant; if the vowel is articulated at all, it is weakened to schwa, which is a typical co-evolution of meaning and form. In spoken German, the indefinite article usually cliticizes to the preceding word (here: nur=n), and only sentence-initially to the following word: Ein [ṇ]= Hund ist ein nettes Haustier ‘Dogs make nice pets’ (Dedenbach 1987).

Indefinite articles indicate that the referential set is not in the universe of discourse. After an introduction together with an indefinite article, the object can be taken up again accompanied by a definite article (or by the demonstrative): Yesterday, I saw a dog in my garden. The (That) dog belongs to my neighbor.
Furthermore, the (less grammaticalized) specific and referential use has to be distinguishable from the (more grammaticalized) non-specific and non-referential, generic use (see Figure 4): German ich habe einen Hund gekauft ‘I bought a dog’ designates a specific dog, whereas in the following sentence, the same article is used generically (and can be replaced by the definite article or the plural): Ein Hund gefällt mir besser als eine Katze ‘I like dogs more than cats’ [sic]. The predicative use of the indefinite article represents the same high degree of grammaticalization: er ist ein treuer Hund ‘he is a faithful dog’. In Swedish (and Danish), indefinite articles are used less frequently for predicative generic reference: Swed. Hon har (en) hund – Germ. Sie hat einen Hund – Engl. She has a dog (Skrzypek 2012), i.e., the indefinite article has reached different grammaticalization stages in the Germanic languages. Unfortunately, there are no cross-linguistic studies comparing these languages. All in all, indefinite articles still include singular objects. However, the number information is not as crucial as it is for the numeral ‘one’.

Fig. 4: Grammaticalization cline from the numeral ‘one’ to the indefinite article (adapted from Himmelmann 2001; Szczepaniak 2011, 2016). [Cline: numeral ‘one’ (‘exactly one object’) > indefinite article (‘belonging to a class of objects’); specific > non-specific; referential > non-referential (generic, predicative use).]

The definite article

The definite article emerged from a demonstrative (hinn in Old Norse, the(r) in West-Germanic; here, only the m.sg.nom. is mentioned). Both demonstratives underwent considerable semantic bleaching and formal reduction, and new demonstratives developed (see also Lehmann 1995b, Hilpert 2011). Definite articles occur in every Germanic language.

The definite article in the North-Germanic languages

As the morphosyntactic behavior of the article is rather different in North- and West-Germanic, we will first concentrate on the development in Scandinavian. The grammaticalization of the definite article started in the late Middle Ages (14th century). ON/Icel. hinn ‘that, the other’ was/is a distal demonstrative. In the past, it often followed the noun, especially if an adjective followed, which explains its subsequent cliticization and suffixation: ON maðr hinn gamli lit. “man the old” – ‘the old man’ > maðrinn gamli > maðrinn (gamli) (see Barðdal, Jörgensen, and Larsen [1997: 302], who claim that the grammaticalization of the demonstrative into a definite article started in constructions of adjectival attribution). During the grammaticalization into a definite marker, it lost its stress and eventually attached to the noun, where it lost initial h- and often the first vowel i- as well. This rather complex development resulted in two different systems, a) Icelandic (with Faroese), behaving more conservatively, and b) Swedish (with Norwegian and Danish), see Figure 5.

Proto-Germanic: *hest-az hin-az (horse-.. that-..)
> Old Norse: hest-r=in-n (horse-=..)
> a) Icelandic: hest-ur=in-n (horse-=..)
> b) Swedish: häst=en (horse=..)

Fig. 5: The suffixation of the definite article in Scandinavian languages. N.B.: In ON, Icelandic, and Swedish orthography, no hyphens are used; in this paper, the old inflectional endings are marked by “-”, the definite suffix by “=”. The glossings are simplified.

Tab. 2: The (in)definite declension of Icelandic hestur (m.) and Swedish häst (c.) ‘horse’.

            case       indefinite sg.   indefinite pl.   definite sg.   definite pl.
Icelandic   nom.       hest-ur          hest-ar          hest-ur=inn    hest-ar=nir
            gen.       hest-s           hest-a           hest-s=ins     hest-a=nna
            dat.       hest-i           hest-um          hest-i=num     hest-u=num
            acc.       hest             hest-a           hest=inn       hest-a=na
Swedish     nom.       häst             häst-ar          häst=en        häst-ar=na
            genitive   häst-s           häst-ar-s        häst=en-s      häst-ar=na-s

“-”: number/case suffix; “=”: definite suffix

Most ON and Icelandic nouns mark case, number, and gender by their inherited suffixes. The same holds for the former demonstrative, which developed into a postponed article and eventually fused with the noun. This led to the emergence of double (redundant) inflection, which is fully preserved in Icelandic and Faroese (see Table 2 for a full paradigm). The Continental Scandinavian languages eliminated the nominal case endings and thus produced the following sequence of nominal categories following the relevance principle according to Bybee (1994: 2559): (-)=..-. Thus, the definite marker directly follows the noun if it is a singular (häst=en). If it is a plural, number is expressed twice. The definite suffix includes information about number and gender. If the noun is in the genitive, the superstable marker -s is added last. In these languages, only a binary case system remains: genitive and non-genitive. If the noun is accompanied by an attribute, Norwegian and Swedish use two different definite articles framing the NP (double determination): a younger so-called adjective-article and the old definiteness suffix, which results in NPs such as Swed. den stora hunden ‘the big dog’ (Old Norse did not exhibit preponed free d-articles, as is still the case in Icelandic today).

(10) Swedish
Jag såg hund-en – Jag såg den stor-a hund-en
I saw dog-.. – I saw ... big- dog-..
‘I saw the dog’ – ‘I saw the big dog’

In Danish, the most progressive Scandinavian language, the adjective-article precludes the suffixed marker: hund-en ‘the dog’, but den store hund ‘the big dog’.

Here, contrary to Swedish, the suffixation of the definite article is not realized consistently. Norwegian represents an intermediate state. The younger adjective-article derives from the German d-article and is a loan from Middle Low German, which was widely spoken in Continental Scandinavia during the Hanse period, especially in Denmark. According to Dahl (2004), double determination (“over-determination”) is the result of two competing grammaticalization areas: the older suffixation in North-East Scandinavia and the “West-Germanic” model of preponed determiners in the south. The area in between combined these two models, resulting in double determination. Danish practices a division of labor by using the “West-Germanic” type if modifiers occur (den store hund) and the “North-Germanic” type if the NP is not modified (hunden). Evidence for this account is provided by Scandinavian dialects (for further details, see Dahl [2004]).

The definite article in the West-Germanic languages

The West-Germanic languages grammaticalized a preponed article, as can be demonstrated by German der, which still functions as a demonstrative if stressed (dér) and as a definite article if unstressed (der). Furthermore, a new demonstrative, dieser, arose when der started grammaticalizing (definiteness cycle). When der developed into an article, it lost its deictic component and its dependence on context, i.e., it is not reliant on pragmatics anymore (the notion of pragmatics is taken from Szczepaniak [2011]). In German, this process already started in OHG. It follows the animacy hierarchy and also depends on referentiality, as shown in a corpus-based study undertaken by Flick (2017). The other West-Germanic languages have been explored to a lesser extent, which explains why we focus on German.

As Figure 6 shows, the demonstrative can be used 1) to introduce new objects and 2) to refer anaphorically to objects previously mentioned, i.e., familiar objects or facts. Even if the object was mentioned some weeks ago, it may be 3) remembered later, e.g., wirst du diesen Hund nun kaufen? ‘will you buy this dog now?’. This pragmatic definiteness 1)–3) also holds for the full (unstressed) article: wirst du den Hund nun kaufen? Semantic definiteness can, however, only be expressed by the definite article and relies on general knowledge (4) (Himmelmann 2001). In when I came home, the dog was already asleep – Germ. als ich nach Hause kam, schlief der Hund schon, the addressee must be familiar with the fact that the speaker is the owner of a single dog; if not, the reference to this particular dog will fail. In 5), associative-anaphoric use, the referent is neither unique nor previously mentioned; it is rather taken for granted because it belongs to the general scenario: my dog ate but did not like the rice – Germ. mein Hund fraß, aber mochte nicht den Reis. Rice is a natural part of a meal.
Fig. 6: From demonstrative to definite article in German: Extensions (adapted, based on Szczepaniak 2011). Symbols: “++” means ‘frequently used as’, “+” ‘can be used as’, “–” ‘cannot be used’; the grey shading reflects increasing grammaticalization. [Table: the uses 1) situational, 2) anaphoric, 3) anamnestic (recognitional), 4) abstract-situational, 5) associative-anaphoric, 6) expletive (in front of names), 7) generic (non-referential), and 8) noun marker/syntactic use (if name/noun is modified) are rated for the demonstrative (stressed), the full definite article, and the enclitic definite article (unstressed; obligatory after certain prepositions: im, ans, vom, zum, zur …). Uses 1)–3) belong to pragmatic definiteness (depends on context), the later uses to semantic definiteness (does not depend on context); the functions run from introducing and activating through marking definiteness to marking noun class/framing NPs.]

6) A further increase in grammaticalization is the article in front of a name. Proper names are inherently definite, so the article is functionally redundant and therefore expletive (der Rhein, die Schweiz, in spoken German die Susanne, der Peter). As it is still connected to a referential entity, it is separated from the next section, where the article exclusively fulfills non-referential, secondary functions such as 7) marking nouns or 8) framing NPs. The generic article (7) is non-referential and relates to a whole class: Germ. Der Mensch stammt vom Affen ab ‘Man descended from apes’. As the English translation shows, this article use is not possible in English. The same holds for 8): der arme Peter ‘poor Peter’.³ In Standard German, personal names usually do not take articles unless there is an adjective occupying the middle field. In this case, the article functions as an obligatory left brace and plays a purely syntactic role. In German dialects (Bavarian), the article occurs even more frequently, as Eroms (1989: 106) states:

Die hervorstechendste Auffälligkeit im Dialekt ist nämlich die Tatsache, dass ein Substantiv so gut wie immer artikelbegleitet ist. […] Es kann davon ausgegangen werden, dass der Artikel im Dialekt die Grundaufgabe hat, die nominale Qualität seines Substantivs zu signalisieren.

The most striking peculiarity in the dialect [Bavarian] is, in particular, the fact that a noun is almost always accompanied by an article. […] It can be assumed that the article in the [Bavarian] dialect has the main purpose of signaling the nominal quality of its noun.

This generalized article use is not possible in English. There is, however, a bigger difference between both languages: The German article often attaches to the preceding preposition and achieves a high degree of fusion as a clitic. This only holds for dative and accusative masculine and neuter forms: in dem > im, in das > ins, von dem > vom, vor das > vors, über den > übern, and even more widely (including feminine forms) in spoken German: mit dem > [mɪm], auf dem > [aʊfm], auf der > [aʊfɐ]. This process is historically documented by Christiansen (2012, 2016), who shows that it already started in the Middle Ages, predominantly with articles in the dative masculine and neuter. Today, these (simple and special) clitics fill the rightmost column in Figure 6 and occupy the most grammaticalized stages 4) to 8). They clearly exceed the full article in frequency, build small paradigms, and can even be obligatory, provided they follow specific prepositions (such as bei, in, an, zu, von, vor ‘by, in, at, to, of, before’): Thus, in the prepositional phrase im Schwarzwald ‘in the Black Forest’ (i)m is the only correct form; *in dem Schwarzwald is not acceptable. There are many contexts where the full article must be replaced by a clitic (Nübling 1998, 2005). If there is no preposition, the full article has to be used: in Touristen tun dem Schwarzwald gut ‘tourists benefit the Black Forest’, the article cannot cliticize and fuse with tun. Other Germanic languages do not exhibit such highly fused articles, except Luxembourgish, which is also the only Germanic standard language with obligatory proclitic articles in front of personal names (d’Claudine, de Pierre).

³ Otherwise, the English generic article is used in a similar way as in German.

Case

The earliest records of the Germanic languages show five cases: nominative, genitive, dative, accusative, and instrumental, the latter of which later merged with the dative. While Icelandic and German have preserved the four cases best, other languages have reduced them and now use either syntactic means of expression or prepositions. Faroese abolished the synthetic genitive (except for some relics) and replaced it with the preposition hjá + dative. A similar development occurred in Luxembourgish (vun), Dutch (van), and, to a lesser degree, in German (von) and English (of). In English, the synthetic dative was replaced by the preposition to (except in front of a pronoun), cf. Germ. ich schreibe meinem (dat.) Vater einen (acc.) Brief vs. Engl. I write a letter to my father. These are cases of renewal. Furthermore, many verbs and adjectives govern different cases. In many languages, the genitive in particular was abolished and replaced by a different case or by prepositions. Some German examples: sich erinnern + gen. > sich erinnern an (+ dat.) ‘to remember sth./so.’, sich schämen + gen. > sich schämen für (+ acc.) ‘to be ashamed of sth.’, sich bewusst sein + gen. > sich bewusst sein über (+ dat.) ‘to be aware of sth.’. These are so-called empty or neutral prepositions, as they are completely de-semanticized.

Grammaticalization of verbal categories

Passive voice in the Germanic languages

It is assumed that PG had no passive construction. The modern Germanic languages have developed analytic passive constructions, Scandinavian even a synthetic one (see Section 1.3). Here, we skip the stative passive, which is usually formed by a finite BE + past participle (PP). We focus on the dynamic passive, which requires a transitive verb and converts the accusative object of the active into the subject (nominative) of the passive (the former subject/agent can be omitted or added in a prepositional phrase), see the German example in (11):

(11) Sie stahl (dem Mann) den Schlüssel
She stole (the. man) the. key
‘she stole the key (from the man)’
→ der Schlüssel wurde (dem Mann) (von ihr) gestohlen
→ the. key was (the. man) (by her) stolen
→ ‘the key was stolen (from the man) (by her)’

English is the only language lacking different auxiliaries for these two plain passives. Thus, the key was stolen is ambiguous. When the passives emerged, the construction with BE was the first and only one. Later, dynamic passives were expressed by different means. Icelandic can still use vera ‘be’ for the dynamic passive: barnið er klætt (af mér) ‘the child is (being) dressed (by me)’.

Tab. 3: (Dynamic) Passives in the Germanic languages.

Language        analytic passive          synthetic passive
English         be + PP
Frisian         wurde + PP
Dutch           worden + PP
German          werden + PP
Luxembourgish   ginn + PP
Danish          blive + PP                -s
Swedish         bli + PP                  -s
Faroese         verða + PP, blíva + PP    -st
Icelandic       vera + PP                 -st

Table 3 shows the principal ways of forming dynamic passives. The West Germanic languages use a verb going back to PG *verþ-a- ‘turn’ > ‘become’. Faroese and Icelandic are the only Scandinavian languages (except Nynorsk) still using this old verb, albeit rather rarely (the paradigm of bli in spoken Swedish contains the suppletive preterite form vart instead of blev). More important is the synthetic -st ending originating from the postponed ON reflexive sik: heyrast ‘to be heard’, fæðast ‘to be born’, byggjast ‘to be built’. On the other hand, -st is far from being a monofunctional passive marker. It is also a marker for the medio-passive (anticausative), which historically was an important step towards the passive (Haspelmath 1987). Here, the verbal event is related to the subject: Icel. ég klæðist ‘I dress myself/I am dressed by myself’. The marker -st can also express reciprocity: heilsast ‘to say hello to each other’, bítast ‘to bite each other’, and sometimes imperfective aspect. Quite often, verbs in -st are completely lexicalized (anda ‘breathe’ – andast ‘die’) or even lack a counterpart without -st (fyrnast ‘get old’ – *fyrna, öðlast ‘gain’ – *öðla). Sometimes, plain verbs are synonymous with their reflexive counterparts (biðja = biðjast ‘ask’, leita = leitast ‘look for’). Birkmann (1997) and Enger (2002) argue for a grammaticalization from sik > clitic > derivational affix (> inflectional affix) with derivation as a possible final stage. Nowadays, there are relevance-driven re-orderings in spoken Icelandic (Birkmann 1997: 88):

(12) Við köll-um-st > við köllu-st-um
We call-1.- > we call--1.
→ ‘we are called’

The highly relevant voice category moves to the verbal stem, whereas the agreement category person and number shifts to the right periphery of the word form (Schmuck 2013: 283‒293).
Danish, Swedish, and Bokmål exhibit two frequently used passives: 1) bli, a loan from Middle Low German blīven ‘remain’, which was grammaticalized and reduced to the copula bli ‘become’ (replacing former varda) only in Scandinavian (Braunmüller 2015). It later became a passive auxiliary, gradually replacing 2) the older -s passive. Currently, both passives are in use with some division of labor (Swedish): -s passives are more frequently used in written Swedish and describe actions whose agent is unimportant, unknown, or deducible from the context: middag serveras kl. 7 ‘dinner is served at seven’, tavlan målades i slutet av 1600-talet ‘the picture was painted at the end of the 17th century’; det skrivs mycket ‘a lot is written’ (Holmes and Hinchliffe 1994; Teleman et al. 1999). The bli-passive occurs in spoken Swedish and is used for singular actions and events with an agent; it also appears in the past tense (preferably with adverbs): han blev påkörd av en bil igår och ena benet blev brutet ‘he was hit by a car yesterday, and one of his legs was broken’; Göran blev sparkad av Inger ‘Göran was kicked by Inger’.

While the Swedish -s passive occurs in every tense, the Danish and Norwegian -s passives are clearly more restricted (e.g., they occur neither in the perfect and pluperfect tense nor in the preterite of strong verbs), less common, and thus less grammaticalized. As already mentioned for Swedish, the use of Danish and Norwegian -s and bli passives also differs between the spoken register (bli is more frequent) and the written register (-s is more frequent). According to Laanemets (2009), written Swedish uses -s passives in 99 % of cases and spoken Swedish in 84 % (with bli in the remaining 16 %), in contrast to spoken Danish and Norwegian (Bokmål) with 64 % and 62 % -s passives (the rest is formed with bli) (see also Schmuck 2013: 278‒281).

The most exceptional grammaticalization took place in Luxembourgish, where ginn, originally and still today 1) a full verb meaning ‘give’, developed into 2) the inchoative copula ‘become’ and then 3) into the passive auxiliary: Lux. 1) ech ginn der e Buch ‘I give you a book’ → 2) ech gi krank ‘I get sick’ → 3) ech gi gesinn ‘I am seen’ (divergence). Copulas are often the predecessors of passive auxiliaries, cf. German werden, Swed./Dan. bli/blive and Lux. ginn. Here, participles used as predicative adjectives (step 2) must have been reanalyzed as verbal participles (step 3). As an example, the German ambiguous adjective/past participle enttäuscht ‘disappointed’ can be understood as a pure adjective (ich wurde enttäuscht ‘I became disappointed’ – copula); at the same time, however, it may be reanalyzed as a dynamic passive: ich wurde enttäuscht < jemand enttäuschte mich ‘I was disappointed < somebody disappointed me’. Step 2) → 3) can be explained in these terms. Step 1) → 2) is more complicated since ‘give’ is not a typical source for copulas (Heine and Kuteva 2004). Lux. ginn is a trivalent verb, which had to be dramatically reduced (loss of the indirect argument, semantic bleaching, formal reduction) in order to become an inchoative copula (Nübling 2006).
Most Germanic languages developed another passive, the so-called recipient passive (or addressee passive). Here, the indirect object (recipient) of the active sentence is promoted to the subject (albeit not the agent) of the passive sentence (Lehmann 1991). The passive auxiliary derives from a ‘receive’-verb and usually needs a direct object (ditransitive): Germ. kriegen/bekommen, Lux. kréien, Dutch krijgen, Engl. get, Swed./Dan. få. In (13), Germ. Mann ‘man’ is the recipient of the action:

(13) Sie stahl dem Mann den Schlüssel
she stole the. man the. key
‘she stole the key from the man’
→ der Mann bekam den Schlüssel (von ihr) gestohlen
→ the. man got the. key (by her) stolen
→ ‘the man had the key stolen (by her)’

Most languages exhibit quite grammaticalized periphrases, which can be deduced from the fact that the ‘receive’-verb can be combined with privative verbs such as ‘steal, take away, remove’, i.e., the subject can take the semantic role of the malefactive instead of the benefactive. Furthermore, the direct object may sometimes be omitted (das Kind kriegt eine Geschichte vorgelesen ‘the child is read a story’), i.e., an extension to intransitive dative-verbs can be observed (sie bekommen applaudiert ‘they are applauded’). This applies all the more to Luxembourgish. In this case, even inanimate subjects (recipients) can be used, which is not possible in German yet. All in all, the recipient passives are increasing in frequency, especially in spoken language, and their ongoing grammaticalization can be observed well (Szczepaniak 2011: 152‒158; Wiemer 2011; Lenz 2011, 2013).

. Aspect .. Perfective and imperfective Each of the Germanic languages has developed a periphrastic resultative perfect, which in the beginning was in close connection to the present (present perfect) and was formed by  + PP or  + PP. Some languages preserved both auxiliaries (with different functions), e.g., German, Dutch, Luxembourgish, Danish, Icelandic, while others generalized ‘have’ (Swedish, English). The perfect started out as a present with a possessive meaning with respect to the ‘have’-construction, see the OHG example in (14) taken from Tatian (163, 10‒12): (14) phígboum habeta sum giflanzotan in sinemo fig tree-... had somebody planted.... in his uuingarten vineyard ‘Somebody had a fig tree (which was) planted in his vineyard’ Sentence (14) originally meant that somebody owned a fig tree which had been planted (by someone else) in his vineyard. The past participle giflanzotan syntactically belonged to the fig tree (it agreed with it in case, gender and number) and characterized it. Only later, a reanalysis took place where the action of planting was associated with the subject: ‘I myself planted the fig tree’, giving rise to the verbal construction. Subsequently, verbal agreement was reduced and a direct object was not obligatory anymore. A similar process has to be assumed for ‘be’, which often was combined with verbs of motion verbs. Icel. hann er kominn (lit.) ‘he is come-’ means that he has just arrived (and still is here). In the beginning, the resultative function of the perfect was dominant. Later, the functional domain expanded to a perfective where the action before the result became increasingly important. In these cases, it finally touched upon the function of the old inherited and synthetic preterite (which for its part developed from the PIE perfect) as the imperfective aspect focuses on the past action. In most lan-

Grammaticalization in the Germanic languages

ASPECT

125

TENSE

HAVE

(possessive) BE

(copula)

resultative Icelandic

perfective Swedish English

past

Dutch

German Luxembourgish

Fig. 7: Grammaticalization cline from aspect to tense.

Fig. 8: Functional expansion of the perfect in German, Dutch, Swedish and English (based on Dammel, Nowak, and Schmuck 2010: 348).

guages, a primarily aspectual opposition emerged between the (resultative or perfective) perfect and the (imperfective) preterite. This, to some extent, still holds for Icelandic, which furthermore makes a distinction between ‘be’ and ‘have’-perfects occurring with the same verbs: hann hefur komið ‘he has come’ means that he came over (maybe several times) in the past but is not present now. In Swedish and English, the original aspectual distinction is well preserved. Perfect and past tense form an opposition; definite past time adverbials such as yesterday, Swed. igår preclude the use of the perfect (see Figures 7 and 8). Engl. (and Swedish) I have lived in Berlin for 10 years means that I am still in Berlin and may stay for some more years. It cannot be translated into a German (or Dutch) perfect, because the sentence ich habe 10 Jahre in Berlin gelebt refers to a completed event (now I am living elsewhere). Thus, Dutch and especially German and Luxembourgish went further and de-aspectualized their perfect, causing it to compete with the preterite, which is already disappearing in consequence. Examples can be found in Figure 8 (for comparative data on frequencies of perfect in corpora see Section 3.3.1).


All in all, German and Luxembourgish temporalized their aspect leading to the extinction of the preterite, at least in Luxembourgish and in South German dialects (Dammel, Nowak, and Schmuck 2010; Schmuck 2013; Gillmann 2011, 2016). Section 3.3.1 will continue with this topic.

Progressives

Tab. 4: Progressive constructions in Germanic languages (adapted from Ebert 2000). Column headings: prepositional construction (V = infinitive); postural verb construction/pseudo-coordination; other constructions.

English: be V-ing
Frisian: oan't V wêze; sit te V [infinite] (stean, lizze, hingje, rinne)
Dutch: aan het V zijn; zit (O) te V [infinite] (staan, liggen, hangen, lopen)
German: am V sein (beim V sein)
Danish: være ved at V; sidder og V [finite] (stå, ligge)
Swedish: sitter och V [finite] (stå, ligga); hålla på att V (inf.)
Icelandic: vera að V

Most Germanic languages have developed periphrastic progressive constructions exhibiting different degrees of grammaticalization. The English ing-construction (she is reading) is the most grammaticalized, the German am Xen sein (sie ist am Lesen) presumably the least grammaticalized, as it is far from being obligatory and often restricted to intransitive verbs in spoken German (Lehmann 1991). These are prepositional constructions (am being a contraction of an ‘at’ + dem ‘def. article’). The Cologne dialect is known for frequent and rather complex progressives; in contrast to standard German, it allows for verbal dependents (das Buch ‘the book’) between the auxiliary and the full verb complex: sie ist das Buch am Lesen ‘she is reading the book’ (cf. Andersson 1989; Lehmann 1991; Bhatt and Schmidt 1993; Flick and Kuhmichel 2013; Flick 2016). The same holds for Dutch, using its aan het V zijn-constructions even more frequently than the Cologne variety. Prepositional constructions occur in many languages. Swedish favors the third type of progressive constructions, so-called pseudo-coordinations [V1 C V2]: These are subordinative constructions combining a postural (local), semantically weak verb (mostly ‘sit, stand, lie’) with the conjunction ‘and’ and a full finite verb reflecting the categories of the auxiliary, cf. Swed. hon sitter [finite.pres.] och läser [finite.pres.] lit. ‘she sits and reads’ → ‘she is reading’ (Kvist Darnell 2008; Hesse 2009, 2011). ‘Sit’ is the most frequently used auxiliary. The person is typically really sitting during his/her activity (more than 80 % of the subjects are animate). ‘He is cooking’ is combined with stå ‘stand’, i.e., the postural verbs still carry some of their original meaning (persistence); stå and especially ligga also take inanimate subjects. In Dutch and Frisian, a further auxiliary, ‘run’, is in use, indicating that the person is walking around while doing something: Fris. Rindert rint in ferske te sjongen ‘Rindert walks singing a song’ (Tiersma 1999: 116). Furthermore, there are different degrees of obligatoriness. Here as well, English and Swedish use their specific constructions rather frequently. Finally, the combinability with tense and voice, with objects, and with specific types of full verbs differs cross-linguistically. A broad cross-linguistic comparison has yet to be conducted. (Norwegian behaves similarly to Swedish and Danish, Afrikaans similarly to Dutch.)

Inchoative constructions occur as well. They are most common in Scandinavian, using ‘take’ as their auxiliary, which is often formally reduced, e.g., Swed. jag ska ta och handla ‘I am starting to shop’ (see Hesse 2009: 123‒132).

3.3 Tense

In Section 1.2, the grammaticalization of weak verbs in PG has already been mentioned. These verbs form their preterite with a dental suffix (Engl. love – loved) going back to an originally periphrastic construction with PG *dōn ‘do’. Today, several Germanic languages, above all English, have started a new grammaticalization cycle with highly obligatory ‘do’-constructions. This topic cannot be elaborated on in any more detail here. We will first have a short look at the temporalization of former aspect forms (3.3.1). As the Germanic languages developed rather differing future constructions, these will be treated in more detail (3.3.2).

3.3.1 The temporalization of perfect to a general past

As mentioned in Section 3.2.1, German and Luxembourgish lost the aspectual distinction between the former perfective (perfect) and imperfective (preterite). Since Early New High German, the perfect forms have been increasing in frequency, while the older preterites are dramatically decreasing. Today, many preterites can be replaced by a perfect, albeit not vice versa, since it is the perfect that has extended its functional domain at the expense of the preterite (Dentler 1997, 1998). This has even led to the complete loss of preterites in South Germany (Fischer 2015, 2018). The success of the formally more complex perfect constructions can also be explained by their ability to support the framing constructions. Framing constructions separate the components of grammatical constructions such as article and noun (das [große …] Auto ‘the [big …] car’) or auxiliary and full verb (sie hat [den Wettbewerb dreimal] gewonnen ‘she has [the contest three times] won’ – ‘she won the contest three times’) and constitute a rather important typological feature of German syntax. This explains many specific morphological developments which did not occur in other Germanic languages (see section 4.1 and Ronneberger-Sibold 1991, 1993, 1997; Abraham and Conradie 2001). A bipartite perfect construction allows for framing, a single preterite does not. According to Solms (1984), the decrease of preterites started in late MHG; around 1500, the perfect occurrences surpassed the preterites. In Standard German, preterites often still co-exist and can mostly be replaced with perfect forms. However, the loss of the preterite is assumed to be continuing. Dammel, Nowak, and Schmuck (2010) analyzed written interviews given by native-speaking football players and trainers in Germany, the Netherlands, Sweden and the UK. With regard to the occurrences of full verbs, German has a perfect-to-preterite ratio of 21 % : 79 %, Dutch of 46 % : 54 %, Swedish of 57 % : 43 %, and English of 68 % : 32 %. If auxiliary verbs are included, the perfect and preterite forms are almost equally frequent in German (52 % : 48 %). In the other Germanic languages, the frequency of the perfect is lower than that of the preterite: 40 % : 60 % in Dutch, 30 % : 70 % in Swedish, and 28 % : 72 % in English. The whole replacement process in German is due to the diachronic temporalization of the perfective aspect (Schmuck 2013: 213‒266). Other languages, especially Luxembourgish, Yiddish, Afrikaans, and Dutch, show similar tendencies (Dammel, Nowak, and Schmuck 2010; Abraham and Conradie 2001).
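The comparison of perfect and preterite shares can be made explicit with a small calculation. The following is only a sketch: it takes the four percentage pairs reported above for the counts that include auxiliary verbs, and the names SHARES, perfect_to_preterite, and RANKING are introduced here purely for illustration.

```python
# Perfect : preterite shares in percent, auxiliary verbs included,
# as reported in the text for the interview sample.
SHARES = {
    "German": (52, 48),
    "Dutch": (40, 60),
    "Swedish": (30, 70),
    "English": (28, 72),
}


def perfect_to_preterite(perfect: float, preterite: float) -> float:
    """Return the perfect share divided by the preterite share."""
    return perfect / preterite


# Order the languages from most to least perfect-dominant.
RANKING = sorted(
    SHARES,
    key=lambda lang: perfect_to_preterite(*SHARES[lang]),
    reverse=True,
)

if __name__ == "__main__":
    for lang in RANKING:
        perfect, preterite = SHARES[lang]
        ratio = perfect_to_preterite(perfect, preterite)
        print(f"{lang}: {perfect} : {preterite} (ratio {ratio:.2f})")
```

A ratio above 1 means that perfect forms outnumber preterites; on these figures this holds for German only.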
As a recent phenomenon, German has started using so-called double perfect constructions:

(15) German
das hab(e) ich schon gesagt gehabt
that have I already say. have.
‘I have already said that.’

The reason has not been explained yet. It is debated whether the construction is replacing pluperfect forms by exchanging the preterite auxiliary for its perfect form (das hatte ich schon gesagt > das habe ich schon gesagt gehabt) or whether a new aspect category is arising (Buchwald-Wargenau 2012).

3.3.2 The development of future constructions

In WALS, Dahl and Velupillai (2013) make the following statement:

“It is relatively rare for a language to totally lack any grammatical means for marking the future. Most languages have at least one or more weakly grammaticalized devices for doing so.”


PG lacked inflectional future marking. Its daughter languages, however, established quite grammaticalized periphrastic constructions with a wide range of auxiliaries. English was the only language to develop a kind of morphologized future auxiliary as will is often cliticized to ’ll. Since some linguists only count inflectional futures as real futures, Dahl (2000: 325‒326) states: “[…] there is a ‘futureless’ area in Northern Europe which includes all Finno-Ugrian and Germanic languages except English”. Future is defined as “prediction […] of the speaker that the situation in the proposition, which refers to an event taking place after the moment of speech, will hold” (Bybee, Pagliuca, and Perkins 1994: 244). General indicators of a highly grammaticalized future are the following: First, the future construction should be obligatory and not replaceable by the present. This leads to high token frequencies of the auxiliaries. Thus, the English future is far more mandatory than the German future. Di Meola (2013) analyzed German texts and found that only 28.7 % of the clauses with future reference are realized with the future construction (even less in spoken German with 8.2 %). The rest appears in the present tense (similar data hold for Dutch). Second, the auxiliary should be as free as possible from modal or motional meaning; as a consequence, it also occurs with inanimate subjects. This criterion is rather well fulfilled by Icelandic munu and by English will/’ll and be going to, made visible by the fact that the subjects of both expressions can be inanimate: There will be a storm / There is going to be a storm. The difference is that the will-construction indicates an assumption, while the other is usually used if the storm’s black clouds are already on the horizon, i.e., its arrival is the immediate future. Third, the auxiliary can be combined with a wide range of verbs, even the verb expressing the source concept of the auxiliary itself. 
Thus, it is acceptable to say she is going to go to the cinema in English, whereas in Dutch, gaan cannot be followed by gaan: *ik ga naar de bioscoop gaan. All in all, in Germanic there are two important grammaticalization paths:

1) From modality to future:
obligation, volition, (ability) [modality] > intention > (remote) future (prediction), generalised (persistence)

Obligation (shall – I shall not pass) and volition (will – I will not smoke) are the most important sources. By and by, they lose their modality and move towards intention, which still requires an animate subject (I shall get to London); thus, these two stages are agent-oriented, while the next stage, the future, also allows for inanimate subjects (it will rain). Usually, a remote future emerges (we will/we’ll all die).

2) From movement to future:
physical motion: GO (de-andative), COME (de-venitive) > inchoative > [intention] > immediate future > distant future

Another grammaticalization path starts with a physical movement (go, come) implying a goal situated in the future. During grammaticalization, the direction and spatial meaning is lost. According to Heine (1993: 97), pure intention is the next step. As de-venitive ‘come’-constructions often lack agent-oriented plans, Hilpert (2008) questions this stage. Thus, a direct path from motion to an immediate and afterwards distant future seems to be more plausible. Futures with motion verbs usually denote imminent events. These two paths are universally the most frequent. Bybee and Pagliuca (1987) describe a third one, which is used less. It starts with an inchoative copula, which is extended to a future verb. This path occurred in German. Table 5 provides an overview of the sources of future auxiliaries. As already mentioned, the English future marking is highly obligatory. In this sense, Dahl (2000: 325) states: “English normally has obligatory [future] marking […]. In fact, English turns out to be relatively isolated in the Germanic area in this respect”. The futurate present is not common and only used for expressing planned events: The train leaves at four p.m. Engl. shall is less grammaticalized than will; most advanced is the most reduced and completely de-modalised clitic ’ll, which confirms the co-evolution of meaning and form. The de-andative going to-future also developed a shorter gonna-expression. When both future constructions take the

Tab. 5: Sources of future auxiliaries in the Germanic languages.

| Language | 1) modals | 2) motion verbs | 3) copula |
|---|---|---|---|
| English | will / ’ll (volition), shall (obligation) | be going to | |
| Frisian | sille (obligation) | | |
| Dutch | zullen (obligation) | gaan | |
| German | | (gehen) | werden |
| Swiss G. dialects | | gō go, chō cho | |
| Yiddish | veln (volition) & | | & vern |
| Danish | ville (volition), skulle (obligation) | | |
| Norwegian | ville (volition), skulle (obligation) | komme til å | |
| Swedish | skola (obligation) | komma (att) | |
| Icelandic | munu | fara að | |

same lexical material (she will have a baby – she is going to have a baby), the will-future indicates a more remote future or an assumption (some day she will probably have a baby), while the going to-future points to an immediate future that will definitely occur (she is already pregnant, thus the baby is on the way). Frisian is the only Germanic language to use a former verb of obligation exclusively, sille, as a future auxiliary. Dutch uses zullen and gaan. Zullen is completely unrestricted and marks the general distant future including uncertain events. Dutch gaan marks the proximate future, sometimes even inchoative, and it marks certain events. Unlike with Engl. going to, the range of full verbs that can follow is restricted in Dutch; for instance, gaan cannot be combined with hebben, zijn and gaan ‘have, be, go’. As a further difference, mainly intransitive, non-agentive imperfective verbs (regenen, zitten, gebeuren ‘rain, sit, happen’) follow, most typically weather events: het gaat regenen ‘it starts raining/it will soon be raining’. Rather often, the subject referent is inanimate (het ‘it’), i.e., intentionality usually is not included. English going to, however, takes mostly transitive, perfective and telic verbs (marry, say, put, die) with a high degree of agentivity and dynamism. Human subjects are common, i.e., intention is an important feature. These findings are based on collexeme analyses, which are part of Construction Grammar. These analyses measure the strength of association between lexical items and grammatical constructions. Collexeme analyses are strictly corpus-driven and token-based; in this context, they aim to find out which full verbs and subjects (co-)occur most frequently with a given auxiliary (Hilpert 2008). Interestingly, the Swedish de-venitive construction komma (att) behaves like Dutch gaan.
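How the association strength mentioned above can be quantified may be sketched as follows. The text does not specify the statistic; the sketch assumes the one-sided Fisher exact test on a 2×2 table, which is a common choice in collostructional work. The function name fisher_attraction_p and all counts are hypothetical and serve illustration only.

```python
from math import comb


def fisher_attraction_p(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for a 2x2 table.

    a: tokens of the verb inside the construction
    b: tokens of the verb elsewhere
    c: tokens of other verbs inside the construction
    d: tokens of other verbs elsewhere

    Returns the probability of observing at least `a` co-occurrences
    if verb and construction were independent (fixed margins).
    """
    verb_total = a + b          # all tokens of the verb
    cxn_total = a + c           # all tokens of the construction
    n = a + b + c + d           # all verb slots in the corpus
    k_max = min(verb_total, cxn_total)
    favourable = sum(
        comb(verb_total, k) * comb(n - verb_total, cxn_total - k)
        for k in range(a, k_max + 1)
    )
    return favourable / comb(n, cxn_total)


if __name__ == "__main__":
    # Hypothetical counts: a verb with 10 tokens, 8 of them in a
    # construction that accounts for 20 of 100 verb slots overall.
    p = fisher_attraction_p(8, 2, 12, 78)
    print(f"p = {p:.2e}")
```

The smaller the p-value, the more strongly the verb is attracted to the construction; ranking verbs by this value yields the list of typical collexemes.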
First of all, komma (att) is currently in the process of losing the infinitival marker att as an effect of the increased token frequency of this highly grammaticalized auxiliary (co-evolution). Originally, the construction started with komma till att + inf. (as still holds for Norwegian) and was shortened to komma att [ɔ]. Modern spoken Swedish uses komma + inf. The de-venitive future is (originally) not speaker-oriented (often with inanimate subjects) and used for unplanned, even inevitable future events, typically weather phenomena: det kommer (att) regna ‘it is going to rain’. Thus, it constitutes a proximate future. Typical collexemes are non-dynamic, non-agentive, and mostly atelic verbs, including s-passives (finnas, bli, glömma, kosta, hända, skratta ‘exist, become, forget, cost, happen, laugh’): Hon kommer att bli en rik kvinna ‘she will be a rich woman’. Hilpert (2008: 53) states: “The verb komma itself deictically focuses on the end point of a movement. The construction thus evokes the end point of an action, at which the agent’s intentions are less important than at the beginning of that action” (see also Hilpert 2006). However, some intentional verbs such as säga, göra, skicka, köpa ‘say, do, send, buy’ have recently been emerging as further possibilities and typically use the 1st ps.sg. for their subjects. By far the most frequent future auxiliary is ska (< modal skall) for intention-based actions (Dahl 1992).

De-venitive verbs are also attested for Swiss German dialects, which normally use the present tense for future events. However, they show some incipient grammaticalizations of gō ‘go’ and chō ‘come’, often with syntactic doubling (the second part being an uninflected particle and sometimes reduced to schwa): gō go and chō cho (Lötscher 1993; Glaser and Frey 2011). The gō construction takes animate subjects, mostly the first person, and the speaker (or agent) usually moves his/her body away, i.e., the original meaning is still present: ich gang ga der Onkel bsueche ‘I am going to visit my uncle’, si goot goge ychauffe ‘she is going to shop’ (syntactic triplication). The chō construction behaves differently. First persons as subjects are not very common, and if used they are followed by verbs denoting body states, e.g., si chunt z schwitze ‘she is getting sweaty’ (Bickel 1992). Most frequently, the subjects are inanimate. The deixis is inverted, i.e., somebody/something moves towards the speaker: s chunt cho rägne ‘it is going (“coming”) to rain’, er chunnt cho luege ‘he is coming to look’. All in all, Swiss German chō (cho) corresponds to Swedish komma (att) and Dutch gaan in that intention is not emphasized.

Danish uses two modals as future auxiliaries: ville expresses the more neutral and common future, while skulle still carries some modal meaning. In the first person, it often implies a promise: jeg skal hjælpe dig ‘I will/shall help you’. Thus, ville is more grammaticalized than skulle, and both verbs also occur as modals. In contrast, Icelandic munu is completely de-modalized, i.e., it is a pure future verb. For inchoative and proximate futures, fara að ‘go to’ is used: hún fer að horfa á sjónvarpið ‘she is going to watch TV’.

German chose a special grammaticalization path. Although Middle and Early New High German already had quite grammaticalized modals with possible future reference (sollen, wollen), the inchoative copula werden was rather quickly extended to a future auxiliary from the 14th century onward and finally replaced the modals (Bogner 1989; Heine 1995; Diewald and Wischer 2013).
First, a present participle was added, which later lost its final -d and was reanalyzed as an infinitive: sie wird/ward sprechend(e) ‘she starts/started speaking’ (inchoative) > sie wird/ward sprechen ‘she will speak’ (future)/*’she started speaking’. When the stage of the future meaning was reached, the past tense of this construction died out (Wiesinger 2001). Afterwards, sollen and wollen were again confined to their modal function (which happened, according to the corpus-based study of Bogner [1989], in the 16th century). The highly successful auxiliary werden had the advantage that it was loaded with neither modal nor motional meaning. It just meant ‘become’ and was already poly-grammaticalized, because it was involved in a wide range of periphrastic constructions (werden is also used as copula, passive auxiliary, epistemic modal verb, and subjunctive marker; [see Diewald and Wischer 2013]). Nevertheless, the werden-future is mostly used in written texts (28.7 %). Most German dialects manage without future auxiliaries. Until now, the futurate present has a rate of 92 % in spoken German and 71 % in written texts. The claim made by Leiss (1985) that the German werden-future is a loan construction from Czech budu + inf. has been rejected by Harm (2001) and Diewald and Habermann (2005). Counter-arguments are the low status of Czech, which generally did not serve as donor language for German;


furthermore, werden futures hardly occur in spoken language and dialects, and their emergence in written German was not in the German-Czech contact zone. Diewald and Habermann (2005) rather claim that Latin was the contact language, i.e., translations in the 14th and 15th centuries from Latin to German created the need for pure future constructions. This fact favored werden and not the semi-grammaticalized modals sollen and wollen. In spoken German, a slightly grammaticalized andative future construction (motion-cum-purpose construction) occurs, which still requires motion of the body: sie geht singen (lit.) ‘she goes sing’ means that she has to move her body to another place in order to sing; this motion can also include the use of a vehicle (a similar development is reported for Norw. gå og + full finite verb, [cf. Hesse 2011: 91‒103]). Yiddish chose another extraordinary path: According to Weissberg (1988) and Abraham (1989), it combined forms of veln ‘will’ and vern ‘become’ to form a new suppletive future auxiliary paradigm (the former vil-forms are printed in bold): ikh vel (< vil) loyfn ‘I’ll walk’, du vest (< verst), er vet (< vert), mir veln (< veln), ier vet (< vert), zey veln (loyfn). It makes sense – and represents a case of persistence – that the first persons should use ‘will’, while ‘become’ fills most of the rest of the paradigm. The lowering of i>e in the vil>vel-forms can be explained by paradigmatization as it makes the new paradigm more homogeneous.

3.4 Mood: The subjunctive

All Germanic languages originally inherited synthetic indicative, optative, subjunctive, and imperative forms. With regard to the subjunctive, all languages except Icelandic, Faroese, and German gave up the old synthetic subjunctive II forms (Table 6). German, however, has also developed an analytic construction with würde

Tab. 6: The subjunctive: synthetic and analytic forms.

| Language | Synthetic | analytic (auxiliary, 3rd ps.sg.) |
|---|---|---|
| English | | would [wʊd] + inf. |
| Frisian | | soe [suə/su:] + inf. |
| Dutch | | zou [zɑu] + inf. |
| German | with most frequent verbs | würde (täte) + inf. |
| Luxembourgish | | géif (déit) + inf. |
| Danish | | skulle + inf. |
| Swedish | | skulle + inf. |
| Faroese | all verbs | |
| Icelandic | all verbs | |

(the subjunctive of werden) as its auxiliary (Smirnova 2006). Here, token frequency is the most decisive factor: auxiliaries, modals, and other highly frequent verbs preserve the old forms (layering). The other Germanic languages have already lost their synthetic forms and also use the subjunctive forms of their future auxiliaries. In some cases, these forms have been dramatically reduced (Engl. would lacks [l], the Frisian and Dutch forms have lost their dental suffix). Luxembourgish géif is the subjunctive of ginn ‘become’ as copula and passive auxiliary, the competitor déit of doën ‘do’. The same holds for German dialects, which often use tät(e) instead of würde.

4 Grammaticalization of complex constructions

When looking at the grammaticalization of complex constructions in the Germanic languages, two factors have to be taken into consideration, one being the basic word order, the other the impact of literary language. Both play ambivalent roles in grammaticalization. Word order in Germanic languages can on the one hand prevent grammaticalization processes, e.g., the amalgamation of main verb and auxiliary, but on the other hand it can be the target of grammaticalization itself. Written language with its need for clarity (overtness) has brought about, e.g., highly paradigmatic conjunction systems, but can also prevent advanced grammaticalization processes where, e.g., relativizers would lose their inflection or be omitted altogether.

4.1 Word order

In terms of word order, the Germanic languages are characterized by their famous verb-second feature (V2), which requires the finite verb to appear in second position in declarative main clauses (see Swedish examples in [16]) and, depending on the language, other clause types as well.⁴ This feature occurs in all modern Germanic languages except one: English has generalized SV order into an obligatory rule for declarative sentences (Faarlund 2008), thus arriving at V3 structures when a nonsubject constituent is fronted (see the translation of [16b]).

⁴ Harbert (2007: 399) points out that this feature is a “hallmark of the GMC languages, though it is not exclusive to them; […] Old French was a (strict) V2 language […] as are a number of other Romance varieties, medieval and modern, and it may perhaps be interpreted as an areal phenomenon.”

(16) Swedish
a. Jag dricker kaffe på morgonen.
‘I drink coffee in the morning.’
b. På morgonen dricker jag kaffe.
‘In the morning, I drink coffee.’

Another highly characteristic feature of Germanic languages is marking polar questions by means of fronting the finite verb. Typologically, it is quite uncommon to use word order alteration for marking polar questions.⁵ All Germanic languages exhibit this feature and in most Germanic languages, the verb position generally functions as a marker of sentence moods/sentence types (declarative, interrogative, imperative, optative, exclamatory). In the diachronic development of finite verb positions in Germanic, grammaticalization processes as well as independent reanalysis processes seem to have been at work. The following paragraphs will describe the central developments and then discuss which aspects of grammaticalization can be observed. PG, like its predecessor Proto-Indo-European, is generally considered an SOV language (cf. Pittner 1995: 211; Harris and Campbell 1995: 214). Verb-final declarative sentences are attested in the early Germanic languages (Runic Germanic, Ancient Nordic, Gothic, Old English [OE], Old High German [OHG], and Middle Dutch, cf. Harbert [2007: 398]; Faarlund [2008: 1708]). The verb-final pattern was dominant in PG and in Gothic (Eythórsson 1995: 20‒24; Kiparsky 1995; Harbert 2007: 353‒354, 360), yet V2 already occurred as an alternative construction. Originally, only auxiliaries and other unaccented words were placed in the second position, more precisely: cliticized to the first independent element in the sentence (Wackernagel’s law, Hopper 1975: 18‒19, 45). It is assumed that this prosodic principle was reanalyzed as a syntactic condition and consequently it was extended to all finite verbs (cf. Faarlund 1990: 59f; Harris and Campbell 1995: 233). If these assumptions are correct, the early Germanic languages can be considered in a state of transition.
This situation, in addition to the residual inflectional richness, may account for the considerable word order variation displayed by Gothic, Ancient Nordic, Old English, and Old High German (Hopper 1975: 22; Mitchell 1985: I, 405 f.; Pintzuk 1993; Eythórsson 1995: 2‒25, 104, 294; Kiparsky 1995; Harbert 2007: 361, 405, 407). In addition to verb-final and proper V2-constructions, there are clauses where pronouns appear between the topic and the finite verb, both in Old English and in Old High German (Lippert 1974; Admoni 1990; Dittmer 1992; Dittmer and Dittmer 1998; Greule 2000; Harbert 2007: 409; Axel 2009; Szczepaniak 2013: 743 f.). Yet another option for declarative sentences was to leave the prefield vacant, resulting in V1-order. These constructions served specific pragmatic purposes: Since the prefield is a universal topic position, they were used in existential sentences. They were moreover used to introduce new discourse referents or situations, at the beginnings of new sections, at turning points in the story − thus often together with telic motion verbs − (Hinterhölzl, Petrova, and Solf 2005; Petrova and Solf 2008), or when emphasizing the entire content of the sentence (Lenerz 1985: 119; Pittner 1995: 223).

(17) OHG (Tatian 6, 1, Lenerz 1985: 103)
Uuarun tho hirta in thero lantskeffi uuahhante
Were there herdsmen in the country watching
‘There were herdsmen on watch in the country.’

(18) OE (Faarlund 2008: 1708)
Wæs Hæsten þa þær cumen mid his herge
was Hæsten then there come with his army
‘Hæsten had then come there with his army’

From discourse-pragmatic conditioning as shown in the examples above, V1-order later grammaticalized to indicate specific sentence moods. More precisely, V1 became a part of complex constructions that would, in modern Germanic languages, signal polar interrogatives (Are you with me?), imperatives (Stay away from Jim!), conditional sentences (Were I to choose between …, German Hätte ich Geld … ‘If I had any money …’), and optative sentences (German Hätte ich doch bloß …! lit. “Had I only …”, ‘If only I had …’). In these constructions, V1 signals sentence mood cooperatively with other means, viz. intonation/punctuation, inflectional mood (Section 3.4), and modal particles (Section 5) (cf. Szczepaniak [2013, 2015] on German). As an underlying cognitive scenario for this development of V1, Lenerz (1984: 153) suggests that the empty topic position had characterized the respective sentences as entirely rhematic, i.e., not making a statement on any given theme. It was thus possible to pragmatically infer that the whole proposition was questioned (→ interrogative) or set into irrealis modality (→ imperatives, conditionals, optatives).

⁵ In Dryer’s (2013) sample, only 13 of 955 languages display the feature, including seven Germanic languages (Norwegian/Bokmål, Swedish, Danish, English, Frisian, Dutch, and German), but also Spanish and Czech as well as two Austronesian and two South American languages.
Once these connections had tightened, the realis use of V1-sentences by and large declined, confining the pattern to irrealis modalities. In declarative sentences, the prefield had to be filled, giving rise to expletive constructions (There were …, German Es ritten drei Reiter … lit. “It rode three horsemen …”, ‘Three horsemen rode …’). The specialization does not seem to have affected spoken language to the same extent: In German, realis V1 still occurs in jokes, vibrant narratives, and exclamatives (cf. Lehmann 1991, § 5.2; Auer 1991; Önnerfors 1993, 1997; d’Avis 2013). This could be accounted for by the additional means that spoken language has at its disposal, such as intonation, facial expression and contextuality, which can make up for a lack of semiotic clarity in word order. The realis V1-pattern is also retained in modern Icelandic and in Yiddish (Vikner 1995; Hall 1967: 32, 149‒152), possibly having to do with the preserved inflectional mood in the former (section 3.4), while in the latter, it seems to be influenced by orality − Hall (1967) and Weissberg (1988: 155) quote examples from jokes and folk songs.

(19) Yiddish (Hall 1967: 149)
A yid is amol gekumen kayn Pariz.
A jew is once come to Paris.
Hot im Zayner a bakanter gefirt oyf dem yidišn beys-oylem.
Has him His a acquaintance led on the Jewish cemetery.
‘A Jew once came to Paris. One of his acquaintances took him to the Jewish cemetery.’

Modern English has grammaticalized word order not only to mark sentence types, but also to express syntactic relations − which compensates for the poverty of nominal inflection (cf. section 2.1). Subjects are marked by preceding the finite verb (strict SV). This development, however, heavily limits information structuring possibilities (see Los and Van Kemenade [2012] for more detail). This situation has promoted the evolution of a number of compensation strategies: besides expletive there, also, e.g., topicalization and cleft sentences. Also, the rise of specific passive constructions can be understood in this context, e.g., prepositional passives (This bed has been slept in) or recipient passives (see Section 3.1). Yet another aspect of grammaticalizing word order lies in marking dependent clauses. German, Dutch, and Afrikaans have retained the ancient verb-final order (junk, Lass 1990) in dependent clauses (cf. Harris and Campbell 1995: 215), thus taking the connection between verb position and sentence mood a step further: While in main clauses, verb first or second position helps to encode sentence mood (conditional, declarative, etc.), in dependent clauses, verb final order signals the lack of independent sentence mood (cf. Szczepaniak 2015: 119‒121).
The functionalization of the three verb placement options (V1, V2, V-final) can be considered an instance of paradigmatization. In a similar vein, obligatorification (i.e., a limited choice of verb position pattern) and fixation of the once relatively free (i.e., pragmatically customizable) word order can be mentioned. There is still quite some variation in how to construct, e.g., an optative or a conditional sentence, but word order for an unmarked declarative, polar interrogative, or imperative sentence is fairly fixed. Dependent verb final order in German (even though already at 74 % in Old High German) has increased under the influence of written language in the times following the invention of letterpress printing (Härd 1981: 116‒119; Ebert 1986; Takada 1994; Polenz 1991: 200 f.; Szczepaniak 2015: 116 f.). It served as an important signal for the reader to decode the ever-increasing hypotactic sentence structure, which


peaked in the 17th century (Admoni 1967: 166 f., Admoni 1985: 1540; Szczepaniak 2015: 111). In this context, it is interesting to note that in Afrikaans dependent verb final is fading in spoken language, while still being prevalent in writing (Ponelis [1993: 340, 345], pointing out the influence of English). The final position of the finite verb in dependent clauses results in a bracketing construction composed of the initial complementizer (opening bracket) and the verb (closing bracket). No element can be positioned outside this bracket in modern German dependent clauses.⁶ In addition to the dependent clause bracket, German has also developed bracketing constructions for main clauses. They are formed by either a periphrastic verb construction (e.g., perfect, cf. section 3, Sie hat ihn gestern nicht gesehen, lit. “She has him yesterday not seen”), a particle verb (Sie nimmt diese Tabletten häufig ein/mit, lit. “She takes these pills often in/with”, ‘She often takes these pills/along’), or a predicative copula construction (Sie ist seit drei Jahren Lehrerin, lit. “She is since three years teacher.F”, ‘She’s been a teacher for three years’). Main clause brackets have evolved later than dependent clause brackets and are less rigid, but their use has intensified in literary language (Schildt [1976: 271] detects a rise from 68.1 % around 1500 to 81.4 % around 1700; in the other cases, no bracket was formed, or some constituents were exbraciated). A third type of bracketing occurs in the NP framing constructions (den schönen Hund ‘[the pretty dog]A’, cf. section 2.1, example (20) below, and Szczepaniak [2011: 107]). All three types of bracketing constructions (main clause, dependent clause, and NP bracket) can be conceived of as structuring devices which guide the attention of the reader by packaging associated information (cf. Thurmair’s [1991] article on the famous notion of ‘waiting for the verb’).
The coherence between the two bracket members, in each of the cases, is strengthened by the fact that usually neither component is unambiguous by itself. This is exemplified by the NP in (20). The glosses show the possible meanings of each component. No single component marks ‘accusative singular masculine’ unambiguously, but the combination does, since this is the only option all components share.

(20) New High German
d-en    schön-en     Hund-Ø
-..     pretty-..    dog(.)-.
-.      pretty-.     dog(.)-.
                     dog(.)-.
                     dog(.)-.
‘[the pretty dog]A’

6 At least not in unmarked written German; cases of exbraciated adjuncts seem acceptable for expressing afterthoughts in spoken German, e.g., weil er ihn gestern gesehen hat im Park lit. “because he him yesterday seen has in the park”, ‘because he saw him yesterday in the park’.

Grammaticalization in the Germanic languages


This situation can be compared to the cooperation of different means marking sentence mood: We are looking at a semiotic interdependency, or a genuinely constructional meaning, where the whole signifies more than the sum of the parts (hence Ronneberger-Sibold’s [1980: 56‒64, 1991, 1993] suggestion of a combining type in morphological typology, which goes back to Werner [1979]). The rise of the bracketing constructions, undoubtedly an instance of emerging grammar, is difficult to capture in terms of grammaticalization. In his article on Word order change by grammaticalization, Lehmann (1992) views word order change mainly as a consequence of the grammaticalization of other constructions. Adopting this view, the rise of bracketing constructions can be analyzed as an epiphenomenon of the grammaticalization of other constructions such as periphrastic verb constructions (e.g., past perfect, passive, future) or the definite and indefinite articles: As those elements have become more grammaticalized, their syntagmatic variability has been reduced. This analysis, however, leaves unanswered the question of why pairs of associated elements such as auxiliary and full verb have come to be positioned discontinuously. If the grammaticalization of periphrastic verb forms were the driving force of word order fixation, then one might have expected associated elements to have gradually been placed closer to one another (as occurred, for instance, in the grammaticalization of the subordinating conjunction obwohl, see De Groodt [2002] and section 4.4 below). Different positional options would have been possible and in fact can be observed, for instance, in Yiddish (see the examples in (21) below). In German, bracket constructions prevent further grammaticalization of the periphrastic verb forms: Due to their placement in the opening and closing bracket, respectively, auxiliary and full verb are not regularly adjacent in independent clauses, thus no cliticization can occur.
Another unanswered question with the “epiphenomenon analysis” would be why many different framing constructions (see Ronneberger-Sibold 1994) have coevolved in German. Their collective/successive rise does not appear to be a coincidence, but rather seems to be connected to a common purpose − likely that of structuring information and guiding the recipient’s attention. The opening and closing elements of the brackets mark the beginnings and endings of informational units (cf. Ronneberger-Sibold 1994: 115). An alternative analysis would be to view the evolution of bracketing constructions as a grammaticalization phenomenon in its own right, where an initially pragmatic tendency to indicate the boundaries of informational units has become a grammatical(ized) word order rule. Yiddish split from German before the main clause bracket was fully developed and never grammaticalized it. The main verb can take various positions, but forming a full frame is disfavored (Weissberg 1988: 152) or even ungrammatical in ditransitive constructions. Among the prevalent types, “VO order has become increasingly frequent through time” (Harbert 2007: 360; see also Birnbaum 1979; den Besten and Moed-van Walraven 1986: 125f; Santorini 1993: 234).


(21) Yiddish
a. Ix hob Moišn gezen
   I have Moiše seen
b. Ix hob gezen Moišn
   I have seen Moiše
c. Er hot gegebn Moišn dos bux
   He has given Moiše the book
d. Er hot Moišn gegebn dos bux
   ??Er hot Moišn dos bux gegebn

The lack of bracketing constructions, or the alternation OV − VO, as well as the possibility of V1 realis sentences point to a rather free syntactic system. This can be seen in connection with the proximity to spoken language: As Fleischer (2004: 217, 2010: 164f) points out, modern Eastern Yiddish started standardizing relatively late (in the late 18th century) and it is based on spoken varieties.

Complement clauses

Complement clauses in Germanic languages instantiate the most grammaticalized type of clause combining in Hopper and Traugott’s (2003: 177‒178) cline, shown here in Figure 9. They are usually morphosyntactically marked as dependent, and they are embedded because they function as arguments of their matrix clauses, e.g., the subject (That the Titanic sank was unexpected).

parataxis     >    hypotaxis     >    subordination
−dependent         +dependent         +dependent
−embedded          −embedded          +embedded

Fig. 9: Cline of clause combining (Hopper and Traugott 2003: 178).

To be more precise, they conform best to the right-most “cluster point” of the cline, but the match is not perfect: Not all instances are marked as dependent (the complement clause in (22) has main clause syntax), and also the equivalence to a subject complement may not be absolute, as inversions are not always possible (see (23)):

(22) German
Er sagt, er weiß nichts davon.
He says he knows nothing thereof.


(23) English
*Did that the Titanic sank surprise you? (cf. Hopper and Traugott 2003: 194, referencing Koster 1987)

The degree of grammaticalization of complement clauses can be measured by a number of criteria (see Lehmann 1988: 217), one of them being the morphology of the predicate. English, offering a four- or even five-way contrast, displays an unusually rich system; the options are listed below (cf. Noonan 2007: 53‒55, 61, 145, 147). They are arranged by increasing desententialization or decategorialization, i.e., loss of verbal inflectional properties.
‒ indicative (that- or if-clause, e.g., I doubt if Martha knows)
‒ “a rather moribund subjunctive” (I insisted that Rea live here, Noonan [2007: 145, 61])
‒ infinitive (to or bare, e.g., They saw the car pull out)
‒ participial clause (Sam saw Alison defeating Alexander)
‒ gerundial or verbal noun clause (Alison’s defeating Alexander is significant)

The examples also show varying degrees of what Lehmann (1988: 204 ff) terms “interlacing”, i.e., the combined clauses can share participants, tenses, or moods. For instance, in They saw the car pull out, the car formally serves as the object of the matrix, but logically it is the subject of the complement clause, as can be seen in They saw that the car pulled out (cf. Hopper and Traugott 2003: 178). The sharing of a participant, in this example, is syntactically realized via raising the complement clause subject the car to a matrix clause object (cf. Noonan 2007: 79‒83). Germanic languages have grammaticalized subject-to-object raising to varying degrees. While in English, a few matrix predicates obligatorily require raising (want, let, make), in German raising is more restricted, although it is obligatory with lassen ‘let’ (→ bare infinitive in complement clause) and possible with some perception verbs (→ bare infinitive) as well as with scheinen ‘seem’ (→ zu-infinitive in complement clause) (cf. Hawkins 1986: 75‒85; Noonan 2007: 82; Horie 2008: 989).
(24) English                                German
     *I want that he steals the chicken.    Ich will, dass er das Huhn stiehlt.
     I want him to steal the chicken.       *Ich will ihn das Huhn zu stehlen.
     I saw him steal the chicken.           Ich sah ihn das Huhn stehlen.

A third aspect illustrated by the examples is the overtness of linking: It ranges from if (a complementizer signaling irrealis modality), through that and to (universal subordinators of finite/non-finite clauses) to asyndetic linkage. That and to have developed via canonical grammaticalization paths. English to and German zu derive from the PG allative preposition *tō. In Old English and Old High German, it combined with a nominalized verb form inflected for dative, e.g., to wyrcanne ‘to perform’. The allative directional meaning of to/zi had extended to a meaning of purpose:


(25) OHG (Otfrid V, 12, 27, cited in Haspelmath 1989: 289)
er ward zi manne, bi si zi irsterbanne
he became to man with them to die
‘[Christ] became a man in order to die with them.’

The construction then grammaticalized, the nominalization becoming an infinitive (losing its dative inflection, decategorialization) and to/zi becoming an infinitive marker and a complementizer when used with a complement-taking matrix verb (desemanticization). For more detail and cross-linguistic comparison see Haspelmath (1989); Hopper and Traugott (2003: 188‒190); Heine and Kuteva (2004: 37f), and Noonan (2007: 58, 69, offering numerous references). As a finite clause complementizer, Germanic languages have grammaticalized the neuter nominative/accusative form of the demonstrative pronoun, i.e., the PIE form *tod. This form is the origin of Gothic þat-ei (here combined with the particle ei ‘in that case, then, thereby, etc.’), West Germanic that/dat/dass, and North Germanic að/at/att (Harbert 2007: 415‒417). Traditionally, a reanalysis of a sentence structure of the following type has been assumed (e.g., by Behaghel 1928: 766): “She said that: there is no money” > “She said that there is no money” (Heine and Kuteva 2004: 106). This view has been questioned and refined, e.g., by Lühr (2004) and by Axel-Tober (2012, 2013), who, like Hopper and Traugott (2003: 190‒194) and Szczepaniak (2011: 167‒169), emphasize the importance of correlative structures, i.e., constructions with both an anticipatory and a resumptive that, cf. (26).
(26) OE (ChronA (Plummer) 755.23, gloss by Hopper and Traugott 2003: 191)
Ða on morgenne gehierdun þæt þæs cyninges þegnas þe
when/then in morning heard:  : king’s thanes who
him beæftan wærun þæt se cyning ofslægen wæs, Þa ridon hie
him behind were  the king slain was then rode they
þider
thither
‘When in the morning the king’s thanes who had been left behind heard that he had been killed, then they rode up there.’ − or − ‘Then in the morning the king’s thanes heard this (these thanes had been left behind earlier) that the king had been slain. Then they rode up there.’

Axel-Tober (2012: 91‒126, 2013: 259‒261) suggests an analysis that can be roughly summed up as the grammaticalization path demonstrative pronoun > relative pronoun > relative particle > complementizer. Modern English complementizer-that supports both factive and irrealis interpretations of the complement clause, depending on the matrix predicate (cf. [27] vs. [28]). The complementizer if, by contrast, does not support a factive interpretation of the complement clause with the matrix predicate knows, but instead leads to an


irrealis reading with both matrix predicates, neutral to know as well as to doubt, “which expresses a negative propositional attitude” (Noonan 2007: 115).

(27) Alf knows that Zeke came.
(28) I doubt that Zeke came.
(29) Alf knows if Zeke came.
(30) I doubt if Zeke came.

The complementizer if, according to Noonan (2007: 57), derives from the conjunction if. This path conforms to Hopper and Traugott’s (2003) cline of clause combining, the conjunction and the complementizer marking hypotaxis and subordination, respectively. As a conjunction, if heads conditional clauses. As a complementizer, it retains traces of this semantics, marking its clause as non-actual or irrealis (Noonan 2007: 115; cf. the example above, where it cancels a factive reading). Due to this persistence, complementizer-if can be assessed as being less grammaticalized than complementizer-that. The German complementizer ob (cognate and functionally similar to if) may not have developed along the path conjunction > complementizer, but from the polar question particle oba (see section 4.4). The Yiddish complementizer vos, as illustrated in the example below, displays an interesting evolution: Fleischer (2004: 233) suggests the grammaticalization path interrogative pronoun > relative pronoun > relative particle > complementizer. Note that the last three steps parallel the scenario suggested by Axel-Tober (2012) for German dass.

(30) Yiddish (Fleischer 2004: 239)
s’ iz efsher an aveyre, vos mir lakhn azoy
It is perhaps a sin  we laugh so
‘It is perhaps a sin that we laugh that way’

Relative clauses

Relative clauses (RCs) in Germanic are predominantly of the embedded, externally headed, postnominal type, which we will focus on (for an overview on this and other types see Nikolaeva [2006]; for a more comprehensive account see Lehmann [1984, section III.]). RCs can be linked to their matrix clauses by a relative pronoun (inflected, e.g., who, whom, whose), by a relative particle (uninflected, e.g., that), or asyndetically. Prototypical relative pronouns (as in [31]) carry out three functions at once (Lehmann 1984: 246‒252; Zifonun 2003: 80; Pittner 2009: 747; Zifonun 2017: 1751): In addition to a) marking subordination, they also express b) the syntactic function


Tab. 7: Central types of RCs in Germanic languages and varieties.

relative pronoun
  pronominalization: German der Mann, den ich sah ‘the man whom I saw’
  gap formation: –
relative particle
  pronominalization (with resumptive pronoun): Zurich Germ. de Bueb, wo mer em es Velo versproche händ, lit. “the boy, where we him a bicycle promised have”, ‘… to whom we had promised a bicycle’ (Salzmann : ); spoken Engl. it’s something that I keep returning to it (Miller 2006)
  gap formation: Engl. the man that I saw; Swedish en organisation som vi kontrollerar ‘an organization that we control’ (Teleman et al. : )
asyndeton
  pronominalization (with resumptive pronoun): spoken Engl. some eggs you’re not sure about their age (Miller 2006)
  gap formation: Engl. the man I saw

within the RC (e.g., direct object as in [31]) and c) the connection (i.e., referential identity) to the head noun (e.g., by gender, number, or animacy agreement).

(31) German
Der    Mann,      den    ich    sah
:..    man(..)    ...    I      saw
‘the man whom I saw’

The three functions do not need to be combined within one form; b) and c) may also be realized by a resumptive pronoun as in it’s something that I keep returning to it (spoken English, Miller [2006: 509]). None of the three functions needs to be expressed explicitly (consider the man I saw). The central types of RCs in Germanic languages and varieties are presented in Table 7, a modified version of Zifonun’s (2007: 207; 2017: 1767) overview. The rows are organized by different linkage types; the first two linkage types are explicitly subordinating, the last is not (cf. Zifonun 2017: 1740). The columns distinguish between pronominalization and gap formation: in the first case, an anaphoric pronoun (relative or resumptive) expresses the syntactic function within the RC as well as the referential identity with the head noun; in the case of gap formation, these functions remain implicit (note that this is not possible with relative pronouns). Resumptive pronouns are a feature typical of substandard or dialectal varieties (of Germanic and other European languages, [cf. Zifonun 2007: 207]), but they also occur e.g., in Romanian, where they combine with a relative. More typically, they


occur in combination with relative particles. In general, resumptive pronouns make for fairly analytic constructions, suiting well the needs of spoken language in that they are easier to plan and parse than more synthetic constructions. Asyndetic relative clauses, too, seem to be associated with spoken varieties; examples will be discussed below. Relative pronouns, in turn, are a famous typological quirk of Standard Average European languages (Comrie 1998; Haspelmath 2008: 1494 f.). Only 12 out of 166 languages in the WALS sample use relative pronouns in relativization on subjects, 10 of them being European standard languages (Comrie and Kuteva 2013). The association between relative pronouns and (written) standard languages appears quite coherent from a functional viewpoint. With their high informational density, relative pronouns offer both the explicitness and the brevity favored in writing. Their double nature can, however, give rise to syntactic conflicts: Due to their role as subordinators, relative pronouns have to precede the RC. This can lead to pied-piping, arguably (Miller 2006: 509) a construction difficult to process (cf. some eggs about whose age you’re not sure as opposed to the spoken version some eggs you’re not sure about their age). Most of the Germanic languages apply various types of relativizing techniques, layering in their systems long-standing as well as more recently grammaticalized relative constructions. All of the early Germanic languages make use of relative particles: Gothic ei, Old Norse es/er, OE þe, OHG the (Hopper 1975, cf. also Kiparsky 1995: 150). Relative pronouns are attested in the oldest East and West Germanic varieties, i.e., in Gothic, OE, OHG, and Old Saxon (Harbert 2007: 436). All of these languages also apply combinations of a relative pronoun and a relative particle, see Harbert (2007: 434‒436). As for asyndetic relative clauses, Hopper (1975: 30), Lehmann (1984: 378 f., 1995: 1207 f.)
and Zifonun (2003: 73) assume that they already existed in PG, while Harbert (2007: 436) remains doubtful, pointing out that they are not found in Gothic or OE. Early Germanic relative pronouns derive from demonstratives (cf. section 2.4.2 on demonstrative > definite article; → polygrammaticalization). Though crosslinguistically not uncommon (Heine and Kuteva 2004: 114 f.), this path sets Germanic apart from the surrounding European languages (Celtic, Italic, and Slavic, see Harbert 2007: 436). Relative pronouns based on demonstratives (“D-relatives”) are, according to Harbert, the basic type in Gothic and OE, and they also play a substantial role in OHG (cf. Schrodt 2004: 176 f.). In the course of its history, German vastly expanded the use of D-relatives, at the expense of relative particles and asyndetic RCs, which were both still common in OHG (see examples (32) and (33)) but given up by the end of the Early New High German period.

(32) OHG (Otfrid 1.26.9, cited in Schrodt 2004: 175)
in doufe    the  unsih reinot   ther ginadigo got
in baptism       us    purifies the  gracious God
‘in baptism, in which God the gracious purifies us’


(33) OHG (Otfrid 1.17.74, cited in Schrodt 2004: 174)
in droume sie  in   zelitun then weg Ø sie  faran  scoltun
in dream  they them told    the  way Ø they travel should
‘they [the angels] revealed to them [the Magi] in their dreams the route that they should take’

This development, however, holds only for Standard German, while contemporary German dialects display rather different systems (Fleischer 2004, 2005). As opposed to the Standard system, they make much less use of relative pronouns but favor relative particles instead, sometimes in combination with pronoun retention (cf. the Zurich German example in Table 7). This distribution of RC types among Standard and dialects conforms to the general tendencies of spoken and written language sketched above. Since the Standard German system is not paralleled by any of the German dialects, Fleischer (2005: 184‒185) assesses it as unnatural and likely to have been shaped by the process of standardization, which was influenced by other European standard languages. Another interesting result of Fleischer’s (2004, 2005) surveys is that the investigated relativization systems (12 contemporary German dialects + Yiddish) conform throughout to Keenan and Comrie’s (1977: 66) Noun Phrase Accessibility Hierarchy, reproduced below, although only the categories up to (and including) oblique arguments were investigated.

SU > DO > IO > OBL > GEN > OCOMP
subject > direct object > indirect object > oblique argument > genitive/possessor > object of comparison

The hierarchy refers to the syntactic function within the RC, i.e., the position relativized into (for instance, the subject position in the boy whoS lived). It states that the syntactic functions are accessible to relativization in descending order. When lower positions on the hierarchy are relativized, they typically require more explicit markers than higher positions.
For instance, the Alemannic dialects of Oberrotweil and Basel use the relative particle wo in all positions, but they support it by pronoun retention and other means in the oblique position and, in the Basel system, also in the indirect object position (see Table 8).

(34) Basel dialect (Fleischer 2004: 225, citing Binz 1888: 61)
Dä  Ma,  woni im  s   Mässer gä    ha
the man  _I   him the knife  given have
‘the man to whom I have given the knife’

Alemannic wo, derived from the adverb wo ‘where’, belongs to a more recent layer of relativizers; it gradually replaced its predecessor da ‘there’, beginning in Early


Tab. 8: Types of relative clauses in Oberrotweil and Basel Alemannic and Yiddish (cited from Fleischer 2004: 227, 2010: 164; P = preposition, pr. = resumptive personal pronoun).

             SU                  DO                         IO                       OBL
Oberrotweil  wo                  wo                         wo                       wo … + P + pr.; wo … + da + P
Basel        wo                  wo                         wo + pr.                 wo … + P + pr.; wo … + da + P
Yiddish      vos (+ pr.); velkh  vos (… + pr.); velkh; ver  vos … + pr.; velkh; ver  vos … + P + pr.; P + velkh; ver

New High German (cf. Zifonun 2003: 75). It is used in all other dialects of the sample as well (not in Yiddish and the Lubica variety), but predominantly in the right-hand positions of the hierarchy. This brings Fleischer (2004: 233 f.) to the conclusion that its grammaticalization started out with obliques and spread from there. The opposite seems to hold for the was-type, instantiated by Yiddish vos, which presumably spread from higher positions to lower ones. Yiddish is among those languages in which “W-relatives have completely displaced D-relatives” (Harbert [2007: 438], also naming English and Afrikaans). Starting out as an interrogative pronoun, was ‘what’ developed into a relative pronoun and in Yiddish, losing its inflection, went on to become a relative particle. Yiddish velkh (< ‘which’) and ver (< ‘who’) retain their inflection. The former belongs to the written language (Lockwood 1995: 126; Harbert 2007: 438) and is presumably borrowed from German, where it is “a creation of chancery style, most likely in imitation of French lequel or Latin qui” (Lockwood 1968: 248, cited in Fleischer 2004: 232); for the latter, Heine and Kuteva (2006: 214) suggest English influence. For the W-relatives in English, Dutch, and Afrikaans, different kinds of Romance influence (French, Latin, Portuguese) are discussed (see Harbert [2007: 441‒443, 447] for references). The development of W-relatives in the West Germanic languages thus represents an “instance of similarity by convergence, rather than by inheritance” (Harbert 2007: 437). The Yiddish relativization system reflects the status of a standard language on the one hand (cf. relative pronouns velkh and ver) with a solid and rather recent foundation in spoken varieties on the other hand (cf. relative particle vos and pronoun retention; the type combining both seems to have been influenced by spoken Slavic varieties, [cf. Fleischer 2010: 164]). 
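The accessibility pattern reported for the two Alemannic systems in Table 8 can be made concrete with a small check that marking never becomes less explicit further down the hierarchy. This is a minimal Python sketch: the binary coding (0 for bare wo, 1 for wo supported by a resumptive pronoun or by da + P) is an assumption of this illustration rather than Fleischer's own coding, and the function names are hypothetical.

```python
# Keenan and Comrie's (1977) Noun Phrase Accessibility Hierarchy, highest first.
HIERARCHY = ("SU", "DO", "IO", "OBL", "GEN", "OCOMP")

# Assumed binary coding of Table 8 (illustrative only):
# 0 = bare particle wo, 1 = wo supported by a resumptive pronoun or da + P.
SYSTEMS = {
    "Oberrotweil": {"SU": 0, "DO": 0, "IO": 0, "OBL": 1},
    "Basel": {"SU": 0, "DO": 0, "IO": 1, "OBL": 1},
}

def explicitness_never_decreases(system):
    """True if marking is at least as explicit at each lower position."""
    scores = [system[pos] for pos in HIERARCHY if pos in system]
    return all(a <= b for a, b in zip(scores, scores[1:]))

def first_supported_position(system):
    """Highest position at which the particle needs additional support."""
    for pos in HIERARCHY:
        if system.get(pos, 0) > 0:
            return pos
    return None

for name, system in SYSTEMS.items():
    print(name, explicitness_never_decreases(system), first_supported_position(system))
# Oberrotweil True OBL
# Basel True IO
```

A system that marked subjects more explicitly than direct objects, e.g. `{"SU": 1, "DO": 0}`, would fail the check; only the positions investigated in the surveys (up to oblique arguments) are coded here.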
Also, the German dialects display typical spoken features (relative particles, resumptive pronouns, little usage of relative pronouns). However, they almost entirely lack asyndetic relative clauses. This can be linked to the dependent clause bracket having grammaticalized into an absolute rule (cf. section 4.1; Pittner 1996: 121) so that dependent clauses in contemporary German are obligatorily bracketed between an overt subordinator and the finite verb


at the end. Interestingly, the only dialect allowing asyndetic relative clauses (the dialect of Schleswig, [Fleischer 2004: 226]) is also less strict about verb-final order in dependent clauses. The North Germanic languages have taken a rather different route than West Germanic. Old Norse es/er was supplemented and in most varieties replaced early on by sem and persists only in written Icelandic. Sem originally was a comparative particle. Lehmann (1995a: 1210) illustrates how the relevant reanalysis may have taken place with the following examples:

(35) Old Icelandic
rauðr sem blōð
‘red as blood’

(36) Old Icelandic
svā vitran mann sem Þū ert
‘such wise man as you are’

(37) Old Icelandic
Þat var et sama sem hann hafn̡e [sic] honum greitt
‘That was the same [price] as/that he had him paid’

These examples of comparative constructions show increasing degrees of similarity to relative constructions. When the object of comparison is a noun and the comparison amounts to an identity as in the last two examples, there is no longer any structural difference from relative clauses (Lehmann 1995a: 1210). The relative particle sem (or its variants like Swedish/Danish som) remains the dominant relativizer in the North Germanic languages to the present day, even when other relativizers are available (Koefoed 1958; Harbert 2007: 447). W-relatives play only a marginal role in the North Germanic languages; they are used much less or are restricted to formal registers (Teleman et al. 1999: 493; Harbert 2007: 448).

Adverbial clauses

Also, in terms of adverbial clauses (e.g., temporal or causal adverbials), spoken vs. written modality seems to be an influential factor. Kortmann (2001: 842) observes that the Western and Central European languages, “especially those representing Standard Average European (SAE), are particularly rich in ACs [adverbial conjunctions], both as regards number and semantic diversity”. As a major reason for this, he suggests that “complex sentences are primarily a phenomenon of planned language use, notably of written language”. The systems of SAE languages comprise 60 ACs on average. Considerably smaller inventories are found e.g., in Faroese, which developed its written standard relatively late, and in Fering (a North Frisian dialect) (Kortmann 1998: 525).


The evolution of rich conjunctional systems typically involves functional specialization on the level of individual conjunctions and paradigmatization on the level of the conjunctional system as a whole. For English, Rissanen (2011) points out the relevance of borrowing from French, Latin, and Greek, which enriched the inventories of Middle and Early Modern English (e.g., because, in case). Szczepaniak (2011: 165‒172, 2015) discusses the evolution of the conjunctional system in German, including the enormous growth of the inventory since OHG as well as the development of individual conjunctions such as weil ‘because’, see below. Ágel (2012), in a diachronic survey on junctional profiles, contrasts language of distance vs. language of immediacy. In his sample, distance texts tend towards stronger syntactic integration (particularly subordination, see Figure 9 above)7 and make more use of semantically distinct subordinators (weil ‘because’, indem ‘by [X-ing]’), while immediacy texts more often apply semantically underspecified or paratactic coordination, where the nature of the relation has to be inferred. Moreover, he observes that diachronically (17th‒19th century) immediacy texts assimilate to distance texts in terms of increasing integrative linkage and use of distinct subordinators. Clause linkage in German has thus undergone a drift from hidden to overt realization of coherence relations. The paths leading to individual ACs are manifold. Just as in the case of relativizers, interrogatives constitute one of the source domains. In all eight Germanic languages in Kortmann’s (1998: 520f) sample, ACs of both ‘simultaneity overlap’ and ‘place’ are derived from interrogatives, e.g., Engl. when, where, German wo ‘where’, wie ‘when’ < interrogative ‘how?’. Furthermore, Germanic languages figure very prominently in the AC type formed by a preposition that takes a demonstrative or definite article as its complement, e.g., Engl. 
in that ‘as, because’, Fering äfter dat ‘after’, German nachdem, lit. “after-:::”, ‘after’; Icelandic eftir þvi sem, lit. “after : ”, ‘(just) as’; Yiddish nokh dem vos, lit. “after :: ”, ‘after’ (Kortmann 1998: 492 f.). There are also typical paths of change within the conjunctional systems. The famous pathway from temporal to logical relations is represented, e.g., by English while, German während, both conveying ‘simultaneity overlap’ as well as an ‘adversative/concessive’ relation, and also by a number of ACs that have developed additional causal meanings, e.g., Engl. since, German nachdem, or shifted to causal meaning altogether, e.g., Swedish (high register/archaic) enär (< ‘when, while’), German weil. The latter derives from Middle High German (al) die wīle sō/unde/daz, lit. “(all) the while that”, ‘as long as, while’. The syntagma was reanalyzed as a complex subjunction and eventually lost its complementizer. Early New High German die weil was rather ambiguous; depending on the context, it could mean ‘since’, ‘while’, ‘as long as’, ‘after’, ‘because’, ‘while, whereas’ (Szczepaniak 2011: 171). With the growth of the inventory, the ACs on average narrowed their meanings,

7 See also Raible’s (1992) study, which explores junction types in Romance and other languages.


(die) weil specializing in causal meaning. The transition from temporal overlap to causation can be traced well in the ambiguous Early New High German example in (38).

(38) Early NHG (Hans Neidhart, 1486, cited from Bonner Frühneuhochdeutschkorpus)
Aber sie wolt in nit ein lassen die weil der ritter bei ir was
But she wanted him not in let the while the knight by her was
‘But she did not want to let him in while/because the knight was with her’

In recent German, weil has developed the additional function of a discourse marker: Its function of introducing the cause of a proposition has expanded to introducing the cause of the preceding utterance, and, arguably, further on to introducing a new aspect of what has been said before, and finally to generally marking the continuation of the utterance (turn keeping signal). The example in (39) is taken from a study by Auer and Günthner (2005: 340), who also discuss whether the development can be counted as an instance of grammaticalization.

(39) spoken German
und den profs wars eigentlich im grund gnommen au
and to.the professors was=it technically in=the ground taken also
scheißegal weil phh … ja … also des geht denen halt au
shit:equal because  … yes … well this goes them  also
am arsch vorbei
by=the butt past
‘and the professors, they basically didn’t care either, because, well, they don’t really give a damn’

Another famous path − from conditional to concessive meaning, cf. Kortmann (1997: 199 f.) − is reflected in the history of German ob. Like Engl. if, German ob has a somewhat doubtful etymology (cf. Lühr 2007), though both are assumed to be related to OHG iba ‘condition, stipulation, doubt’ and Old Norse if(i), ef(i) ‘doubt, hesitation’ (OED online s. v. if).
The OHG predecessor variants of ob occurred in four uses: as a question particle, introducing independent polar interrogatives; as a complementizer, heading indirect interrogatives; and as a conjunction, heading conditional as well as concessive clauses. Szczepaniak (2013: 756‒758) sketches a bifurcated grammaticalization path, the question particle resulting a) in a complementizer heading indirect interrogatives and b) in a conditional conjunction (cf. the OHG example in [40]) that spread to concessive uses as well.

(40) OHG (Hildebrandlied, 830/40, cited from Schlosser 2004: 68)
ibu du  mi enan sages,  ik mi de  odre   uuet
if  you me one  name.2S I  me the others know
‘if you name one [of your kin], I’ll know the others’

On the way to present-day German, the conditional use was lost, conceivably due to the general specialization and paradigmatization of the German conjunctional system, in which wenn (cognate with when) took over the conditional function (‘if’). As for the concessive use, ob has undergone univerbation processes with a number of different particles (wohl ‘well’, gleich ‘even’, schon ‘already’, zwar ‘although’; cf. De Groodt 2002), yielding four synonymous markers of concession (obwohl, obgleich, obschon, obzwar). Here, a selection process is taking place, in which obzwar has been discarded and obwohl/obschon are crystallizing as the dominant types in Germany and Switzerland, respectively. Obwohl not only survives, but has also, like weil, taken on the additional function of a discourse marker. As such, it can be used to introduce utterances that contradict what has been said before (see Günthner 2000, 2003).

Other patterns of grammaticalization

Most of the Germanic languages, especially German, are well-known for modal particles, i.e., mostly unstressed short words with a fixed syntactic position which are typical for spoken language. In German some dozens of these small particles are available, which are syntactically restricted to the middle field and evolved out of adverbs, adjectives, and other particles (Autenrieth 2002; Diewald 1999). They express the speaker’s attitude to the (truth of the) proposition or expectations about the hearer. In sie wollen […] arbeiten ‘they want […] to work’, the bracketed position can be filled by halt (sie wollen halt arbeiten), which can be paraphrased as ‘this is an undeniable fact’, by eben ‘just’, ja ‘as you know’, doch ‘as you should know’, wohl ‘supposedly’. There are also some stressed modal particles such as dóch ‘I didn’t expect this’ and schón ‘I admit’. In exclamatory sentences – die haben [vielleicht/aber/ja] gearbeitet! ‘these […] worked!’ –, vielleicht means ‘to an unexpectedly high degree’, aber ‘I didn’t expect that’ and ja ‘I am surprised’ (Harbert 2007: 32‒36).

Conclusion

In this final section, phenomena and variation among Germanic languages and varieties will be briefly recapitulated, with respect to the written/spoken modality, the nature of grammaticalization, in cross-linguistic comparison, and with respect to language contact. In quite a number of the phenomena mentioned above, an evolution from hidden to overt complexity can be observed, which can be viewed in connection with the rise of written language and the increasing number of translations mostly from Latin to German. An example would be the grammaticalization of different future
constructions as well as the rise of rich and highly distinct adverbial conjunction systems in English and German (as opposed to smaller inventories in Faroese and Fering). Also, the preference for relative pronouns as opposed to relative particles or asyndetic RCs in Standard German points in this direction. The grammaticalization of word order as a co-indicator of sentence mood is rather ambiguous with respect to the notion of hidden vs. overt complexity. On the one hand, word order has, by and large, become more rigid since PG; on the other hand, overt markers such as question and declarative particles have been lost (cf. Szczepaniak 2013: 748 f.). The rise of (in)definite articles is another piece of evidence for the development from hidden to overt complexity, which, however, has to be regarded as a development independent of written language. In the Germanic languages, co-evolution of meaning and form occurs very often, e.g., in the development of the (in)definite articles (< demonstratives and < numeral ‘one’, respectively), which, especially in German and Lux., even cliticize to their preceding prepositions. The same holds for the semantically and phonologically reduced future auxiliaries (< modals, < movement verbs), whereas the famous framing constructions may impede further coalescence with the full verb; in many Germanic languages, auxiliaries and full verbs are separated by a middle field. Only English ’ll < will attaches to the preceding word (mostly pronouns). The same conflict applies to passive and present perfect constructions. Even ACs confirm the co-evolution principle by their erosion ((al) die wīle daz > weil ‘because’) and univerbation (ob (…) wohl > obwohl ‘although’). Bisang (2011: 114) finds a correlation between erosion and grammaticalization only in stress-based language types, which perfectly applies to the Germanic languages as they all belong (although to different degrees) to the stress-based type (Germanic initial stress).
Many of the grammaticalization paths discussed above are rather unusual from a typological, European, or even Germanic perspective. A number of phenomena constitute major Standard Average European features, but are rare in the European peripheries and/or in other languages of the world. This concerns the grammaticalization of a definite as well as an indefinite article (found in fewer than 8 % of the languages of Dryer’s [1989: 85] sample, [cf. Haspelmath 2008: 1494]), relative pronouns (“relative clauses formed using the relative pronoun strategy are quite exceptional outside Europe, except as a recent result of the influence of European languages”, Comrie [1998: 61]), and the ‘have’-perfect (“almost exclusively found in Europe”, Haspelmath [2008: 1495], referencing Dahl [1995, 1996]). The pronounced tendency to grammaticalize adverbial conjunctions from prepositions plus definite articles/demonstratives or the syntactic rule of V2-order is characteristic of the Germanic branch even among the European languages. Within the Germanic languages, some individual grammaticalization paths are unusual, such as the development of ‘give’ into a passive auxiliary in Luxembourgish (which is also typologically rare, cf. 3.1). Yiddish and Icelandic both stand out relatively often from the other Germanic languages, e.g., in that they have both retained the realis V1-pattern or in that their
inventories of adverbial conjunctions are, on average, unusually analytic (Ice. eftir því að ‘after’, Yiddish nokh dem vos ‘after’; cf. Kortmann 1998: 525 f.). Furthermore, Yiddish has an unusually free word order; Icelandic stands out as a highly inflecting language. Moreover, Icelandic is the only Germanic language not to have grammaticalized an indefinite article and the only one to have retained the patronymic by-name system. These findings are interesting given that neither language is located in the Germanic languages’ heartland. This, in the case of Icelandic, implies isolation and thus conservation of archaic features and in the case of Yiddish a stronger influence of Slavic languages (in addition to Hebrew). A contact-induced feature that has spread very widely across Germanic languages is the grammaticalization of W-relatives out of interrogative pronouns or adverbs. This path is likely to have been transferred from various Romance languages to West Germanic languages at different times, cf. section 4.3. Since W-relatives are marginal and restricted to formal registers in the North Germanic languages, they are possibly calques from West Germanic (cf. Swedish vilken ‘which’, German welchen ‘which.A’). Within the Germanic languages, there is some sort of unidirectional influence from the West to the North Germanic languages (due to the Hanse contact in the Middle Ages): Here, the preponed definite d-article and the periphrastic bli-passive have to be mentioned (grammatical replication from Low German as a model language to Scandinavian as replica language [Heine and Kuteva 2011: 292]). However, not every similarity between neighboring languages can be interpreted as contact-induced: Thus, the proposed influence of a Czech loan construction containing budu + infinitive on the grammaticalization of the German werden-future has been rejected.8

8 Heine and Kuteva (2011: 297) mention a de-allative future for Luxembourgish (which hardly ever occurs) and interpret it as a French loan construction. Considering the other Germanic languages with many de-allative futures, such an explanation is superfluous.

Acknowledgements

We would like to thank Christian Lehmann, Christer Lindqvist and Andrej Malchukov for their useful comments on a previous version of this paper. Also, we are grateful to Anke Lensch and Mehmet Aydın for proofreading. Any remaining errors and shortcomings are of course ours.

References

Abraham, Werner. 1989. Futur-Typologie in den germanischen Sprachen. In Werner Abraham & Theo Janssen (eds.), Tempus – Aspekt – Modus. Die lexikalischen und grammatischen Formen im Deutschen, 345‒389. Tübingen: Niemeyer.

Abraham, Werner & Jac Conradie. 2001. Präteritumschwund und Diskursgrammatik. Amsterdam & Philadelphia: Benjamins.
Admoni, Wladimir G. 1967. Der Umfang und die Gestaltungsmittel des Satzes in der deutschen Literatursprache bis zum Ende des 18. Jhs. Beiträge zur Geschichte der deutschen Sprache 89. 144‒199.
Admoni, Wladimir G. 1985. Syntax des Neuhochdeutschen seit dem 17. Jahrhundert. In Werner Besch, Anne Betten, Oskar Reichmann & Stefan Sonderegger (eds.), Sprachgeschichte. Ein Handbuch zur Geschichte der deutschen Sprache und ihrer Erforschung Volume 2, 1538‒1556. Berlin & New York: de Gruyter.
Admoni, Wladimir. 1990. Historische Syntax des Deutschen. Tübingen: Niemeyer.
Ágel, Vilmos. 2012. Junktionsprofile aus Nähe und Distanz. In Jochen Bär & Marcus Müller (eds.), Geschichte der Sprache ‒ Sprache der Geschichte. Probleme und Perspektiven der historischen Sprachwissenschaft des Deutschen. Oskar Reichmann zum 75. Geburtstag, 181–206. Berlin & Boston: Akad.-Verl.
Andersson, Sven. 1989. On the generalization of progressive constructions. “Ich bin das Buch am Lesen” – status and usage of three varieties of German. In Lars-Gunnar Larsson (ed.), Proceedings of the Second Scandinavian Symposium on Aspectology, 95–106. Uppsala: Almqvist & Wiksell.
Audring, Jenny. 2009. Reinventing pronoun gender. Utrecht: LOT.
Audring, Jenny. 2010. Deflexion und pronominales Genus. In Antje Dammel, Damaris Nübling & Sebastian Kürschner (eds.), Kontrastive germanistische Linguistik Vol. 2, 693‒717. Hildesheim: Olms.
Auer, Peter. 1991. Zur Verbspitzenstellung im Gesprochenen Deutsch. Deutsche Sprache 3. 193‒222.
Auer, Peter & Susanne Günthner. 2005. Die Entstehung von Diskursmarkern im Deutschen – ein Fall von Grammatikalisierung? In Torsten Leuschner, Tanja Mortelmans & Sarah De Groodt (eds.), Grammatikalisierung im Deutschen, 335–362. Berlin & New York: de Gruyter.
Autenrieth, Tanja. 2002. Heterosemie und Grammatikalisierung bei Modalpartikeln. Eine synchrone und diachrone Studie anhand von eben, halt, e(cher)t, einfach, schlicht und glatt. Tübingen: Niemeyer.
Axel, Katrin. 2009. The verb-second property in Old High German. Different ways of filling the prefield. In R. Hinterhölzl & S. Petrova (eds.), Information structure and language change. New approaches to word order variation in Germanic, 17–44. Berlin: de Gruyter.
Axel-Tober, Katrin. 2012. (Nicht-)kanonische Nebensätze im Deutschen. Synchrone und diachrone Aspekte. Berlin & Boston: de Gruyter.
Axel-Tober, Katrin. 2013. Unselbstständiger dass- und ob-VL-Satz. In Jörg Meibauer, Markus Steinbach & Hans Altmann (eds.), Satztypen des Deutschen, 247‒265. Berlin & New York: de Gruyter.
Bammesberger, Alfred. 1986. Der Aufbau des germanischen Verbalsystems. Heidelberg: Winter.
Bammesberger, Alfred. 1990. Die Morphologie des urgermanischen Nomens. Heidelberg: Winter.
Barðdal, Jóhanna, Nils Jörgensen & Gorm Larsen. 1997. Nordiska. Våra språk förr och nu [Nordic. Our languages then and now]. Lund: Studentlitteratur.
Behaghel, Otto. 1928. Deutsche Syntax Volume 3. Heidelberg: Winter.
Bhatt, Crista & Claudia Maria Schmidt. 1993. Die am + Infinitiv-Konstruktion im Kölnischen und im umgangssprachlichen Standarddeutschen als Aspekt-Phrase. In Werner Abraham & Josef Bayer (eds.), Dialektsyntax (Linguistische Berichte), 71–98.
Bickel, Balthasar. 1992. Future from the fringes: the marking of future time reference in Züritüütsch. In Östen Dahl (ed.), Future Time Reference in European Languages, 72‒85. Stockholm: Univ. Stockholm, Dept. of Linguistics.

Birkmann, Thomas. 1997. Das neuisländische Mediopassiv: Flexion oder Wortbildung? In Thomas Birkmann, Heinz Klingenberg, Damaris Nübling & Elke Ronneberger-Sibold (eds.), Vergleichende germanische Philologie und Skandinavistik, 81‒89. Tübingen: Niemeyer.
Birnbaum, Solomon. 1979. Yiddish. A survey and a grammar. Toronto: Univ. of Toronto Press.
Bisang, Walter. 2011. Grammaticalization and typology. In Heiko Narrog & Bernd Heine (eds.), The handbook of grammaticalization, 105‒117. Oxford: Oxford Univ. Press.
Bogner, Istvan. 1989. Zur Entwicklung der periphrastischen Futurformen im Frühneuhochdeutschen. Zeitschrift für deutsche Philologie 108. 56‒85.
Börjars, Kersti & Nigel Vincent. 2011. Grammaticalization and directionality. In Heiko Narrog & Bernd Heine (eds.), The handbook of grammaticalization, 163‒176. Oxford: Oxford Univ. Press.
Braunmüller, Kurt. 2015. Zum Passiv im Nordgermanischen. Kungl. Humanistiska Vetenskaps-Samfundet i Uppsala, Årsbok 2015. 5‒27.
Buchwald-Wargenau, Isabel. 2012. Die doppelten Perfektbildungen im Deutschen. Berlin & Boston: de Gruyter.
Bybee, Joan. 1985. Morphology. Amsterdam: Benjamins.
Bybee, Joan. 1994. Morphological universals and change. In R. E. Asher (ed.), The encyclopedia of language and linguistics Vol. 5, 2557‒2562. Oxford: Pergamon Press.
Bybee, Joan & William Pagliuca. 1987. The evolution of future meaning. In Anna Giacalone Ramat, Onofrio Carruba & Giuliano Bernini (eds.), Papers from the 7th Intern. Conference on Historical Linguistics, 109‒122. Amsterdam: Benjamins.
Bybee, Joan, William Pagliuca & Revere Perkins. 1994. The evolution of grammar. Tense, aspect, and modality in the languages of the World. Chicago: Univ. of Chicago Press.
Christiansen, Mads. 2012. Die Präposition-Artikel-Enklise im Mittelhochdeutschen und Frühneuhochdeutschen. Beiträge zur Geschichte der deutschen Sprache und Literatur 134(1). 1‒24.
Christiansen, Mads. 2016. Von der Phonologie in die Morphologie. Diachrone Studien zur Präposition-Artikel-Enklise im Deutschen. Hildesheim: Olms.
Comrie, Bernard. 1998. Rethinking the typology of relative clauses. Language Design 1. 59‒86.
Comrie, Bernard & Tania Kuteva. 2013. Relativization on subjects. In Matthew S. Dryer & Martin Haspelmath (eds.), The World atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info/chapter/122, Accessed on 2016–05–17.)
D’Avis, Franz. 2013. Exklamativsatz. In Jörg Meibauer, Markus Steinbach & Hans Altmann (eds.), Satztypen des Deutschen, 171‒201. Berlin & New York: de Gruyter.
Dahl, Östen. 1992. The marking of future time reference in Continental Scandinavian. In Ö. Dahl, Caspar de Groot & Hannu Tommola (eds.), Future time reference in European languages (Eurotyp Working Papers VI/2), 60‒71. Stockholm: Univ. Stockholm.
Dahl, Östen. 1995. Areal tendencies in tense-aspect systems. In Pier Marco Bertinetto (ed.), Temporal reference, aspect and actionality Volume 2: Typological perspectives, 11‒28. Turin: Rosenberg & Sellier.
Dahl, Östen. 1996. Das Tempussystem des Deutschen im typologischen Vergleich. In Ewald Lang & Gisela Zifonun (eds.), Deutsch – typologisch, 359‒368. Berlin: de Gruyter.
Dahl, Östen. 2000. The grammar of future time reference in European languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 309‒328. Berlin & New York: de Gruyter.
Dahl, Östen. 2004. Definite articles in Scandinavian: Competing grammaticalization processes in standard and non-standard varieties. In Bernd Kortmann (ed.), Dialectology meets typology. Dialect grammar from a cross-linguistic perspective, 147‒180. Berlin & New York: de Gruyter Mouton.
Dahl, Östen. 2011. Grammaticalization and linguistic complexity. In H. Narrog & B. Heine (eds.), The handbook of grammaticalization, 153‒162. Oxford: Oxford Univ. Press.

Dahl, Östen & Viveka Velupillai. 2013. The future tense. In Matthew S. Dryer & Martin Haspelmath (eds.), The World atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info/chapter/67, Accessed on 2016–03–01.)
Dammel, Antje, Sebastian Kürschner & Damaris Nübling. 2010. Pluralallomorphie in zehn germanischen Sprachen. Konvergenzen und Divergenzen in Ausdrucksverfahren und Konditionierung. In Antje Dammel, Damaris Nübling & Sebastian Kürschner (eds.), Kontrastive Germanistische Linguistik, Vol. 2, 473‒522. Hildesheim: Olms.
Dammel, Antje, Jessica Nowak & Mirjam Schmuck. 2010. Strong verb levelling in four Germanic languages. A category frequency approach. Journal of Germanic Linguistics 22(4). 341‒365.
De Groodt, Sarah. 2002. Reanalysis and the five problems of language change: a case study on the rise of concessive subordinating conjunctions with ob- in Early Modern German. Sprachtypologie und Universalienforschung (STUF) 55(3). 277–288.
Dedenbach, Beate. 1987. Reduktions- und Verschmelzungsformen im Deutschen. Frankfurt: Lang.
Den Besten, Hans & Corretje Moed-van Walraven. 1986. The syntax of verbs in Yiddish. In Hubert Haider & Martin Prinzhorn (eds.), Verb second phenomena in Germanic languages, 111–135. Dordrecht & Riverton: Foris Publications.
Dentler, Sigrid. 1997. Zur Perfekterneuerung im Mittelhochdeutschen. Die Erweiterung des zeitreferentiellen Funktionsbereichs von Perfektfügungen. Göteborg: Acta universitatis Gothoburgensis.
Dentler, Sigrid. 1998. Gab es den Präteritumschwund? In John Ole Askedal (ed.), Historische germanische und deutsche Syntax, 133‒147. Frankfurt: Lang.
Di Meola, Claudio. 2013. Die Versprachlichung von Zukünftigkeit durch Präsens und Futur I. Eine ebenenübergreifende Untersuchung samt kontrastivem Ausblick auf das Italienische. Tübingen: Stauffenburg.
Diewald, Gabriele. 1999. Die Entwicklung der Modalpartikel aber: ein typischer Grammatikalisierungsweg der Modalpartikeln. In H. Spillmann & I. Warnke (eds.), Internationale Tendenzen der Syntaktik, Semantik und Pragmatik, 83‒91. Frankfurt: Lang.
Diewald, Gabriele & Mechthild Habermann. 2005. Die Entwicklung von werden + Infinitiv als Futurgrammem. Ein Beispiel für das Zusammenwirken von Grammatikalisierung, Sprachkontakt und soziokulturellen Faktoren. In Torsten Leuschner, Tanja Mortelmans & Sarah De Groodt (eds.), Grammatikalisierung im Deutschen, 229‒250. Berlin & New York: de Gruyter.
Diewald, Gabriele & Ilse Wischer. 2013. Markers of futurity in Old High German and Old English: A comparative corpus-based study. In Gabriele Diewald, Leena Kahlas-Tarkka & Ilse Wischer (eds.), Comparative studies in Early Germanic languages, 195‒216. Amsterdam & Philadelphia.
Dittmer, A. & E. Dittmer. 1998. Studien zur Wortstellung. Satzgliedstellung in der althochdeutschen Tatianübersetzung. Göttingen: Vandenhoeck & Ruprecht.
Dittmer, E. 1992. Die Wortstellung im AHD Tatian. Althochdeutsch. In Y. Desportes (ed.), Syntax und Semantik. Akten des Lyoner Kolloquiums zur Syntax und Semantik des Althochdeutschen. 1.–3. März 1990, 247–258. Lyon: Centre d’Etudes Linguistiques Jacques Goudet.
Dryer, Matthew S. 1989. Article-noun order. Chicago Linguistic Society 25. 83‒97.
Dryer, Matthew S. 2013. Polar questions. In Matthew S. Dryer & Martin Haspelmath (eds.), The World atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://wals.info/chapter/116, Accessed on 2016–04–30.)
Ebert, Karen. 1998. Genussynkretismus im Nordseeraum: die Resistenz des Fering. In Winfried Boeder & Johannes Bechert (eds.), Sprache in Raum und Zeit, Bd. 2: Beiträge zur empirischen Sprachwissenschaft, 269‒281. Tübingen: Narr.
Ebert, Karen. 2000. Progressive markers in Germanic languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 605‒653. Berlin & New York: de Gruyter.

Ebert, Robert Peter. 1986. Historische Syntax des Deutschen II: 1300–1750. Bern: Lang.
Enger, Hans-Olav. 2002. The story of Scandinavian -s(t) retold: Grammaticalising a clitic to a derivational affix. Folia Linguistica Historica 23(1‒2). 79‒105.
Enger, Hans-Olav. 2004. Scandinavian pancake sentences as semantic agreement. Nordic Journal of Linguistics 27(1). 5‒34.
Eroms, Hans-Werner. 1989. Regionalsprachliche Artikelparadigmen und die grammatikalische Behandlung der Artikel im Deutschen. In Hans-Werner Eroms (ed.), Probleme regionaler Sprachen, 103‒123. Hamburg: Buske.
Eythórsson, Thórhallur. 1995. Verb position and verb movement in Early Germanic. Cornell University PhD dissertation.
Faarlund, Jan Terje. 1990. Syntactic change. Toward a theory of historical syntax (Trends in linguistics: Studies and monographs, 50). Berlin & New York: de Gruyter Mouton.
Faarlund, Jan Terje. 2008. From Ancient Germanic to modern Germanic languages. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language Typology and Language Universals. Volume 2, 1706‒1719. Berlin & New York: de Gruyter.
Fischer, Hanna. 2015. Präteritumschwund in den Dialekten Hessens. Eine Neuvermessung der Präteritalgrenze(n). In Michael Elmentaler, Markus Hundt & Jürgen Schmidt (eds.), Deutsche Dialekte. Konzepte, Probleme, Handlungsfelder, 107‒133. Stuttgart: Franz Steiner.
Fischer, Hanna. 2018. Präteritumschwund im Deutschen. Dokumentation und Erklärung eines Verdrängungsprozesses. (Studia Linguistica, 132). Berlin: de Gruyter.
Fleischer, Jürg. 2004. A typology of relative clauses in German dialects. In Bernd Kortmann (ed.), Dialectology meets typology. Dialect grammar from a cross-linguistic perspective (Trends in Linguistics Studies and Monographs 153), 211‒243. Berlin & New York: de Gruyter.
Fleischer, Jürg. 2005. Relativsätze in den Dialekten des Deutschen: Vergleich und Typologie. In Helen Christen (ed.), Dialektologie an der Jahrtausendwende (Linguistik online 24, 3), 171–186.
Fleischer, Jürg. 2010. Relativsätze im Deutschen und Jiddischen. In Antje Dammel, Damaris Nübling & Sebastian Kürschner (eds.), Kontrastive Germanistische Linguistik, Vol. 1, 145‒169. Hildesheim: Olms.
Flick, Johanna. 2016. Der am-Progressiv und parallele am V-en sein-Strukturen im Spannungsfeld zwischen grammatischen und lexikalischen Konstruktionen: Kompositionalität, Variabilität und Netzwerkbildung. Beiträge zur Geschichte der deutschen Sprache und Literatur 138(2). 163‒196.
Flick, Johanna. 2017. Die Entwicklung des Definitartikels im Althochdeutschen. Eine kognitiv-linguistische Korpusuntersuchung. Hamburg: University of Hamburg dissertation.
Flick, Johanna & Katrin Kuhmichel. 2013. Der am-Progressiv in Dialekt und Standard. In Petra Vogel (ed.), Sprachwandel im Neuhochdeutschen, 52‒76. Berlin & Boston: de Gruyter.
Fraurud, Kari. 2000. Proper names and gender in Swedish. In Barbara Unterbeck, Matti Rissanen, Terttu Nevalainen & Mirja Saari (eds.), Gender in grammar and cognition, 167‒219. Berlin & New York: de Gruyter.
Gillmann, Melitta. 2011. Die Grammatikalisierung des sein-Perfekts – eine korpuslinguistische Untersuchung zur Hilfsverbselektion der Bewegungsverben im Deutschen. Beiträge zur Geschichte der deutschen Sprache und Literatur 133(2). 203‒234.
Gillmann, Melitta. 2016. Die Perfektkonstruktionen haben + V-PP und sein + V-PP aus gebrauchsbasierter Perspektive. Eine Korpusuntersuchung im Althochdeutschen, Altsächsischen und Neuhochdeutschen. Berlin & New York: de Gruyter.
Glaser, Elvira & Natascha Frey. 2011. Empirische Studien zur Verbverdoppelung in schweizerdeutschen Dialekten. Linguistik online 45(1).
Greule, Albrecht. 2000. Syntax des Althochdeutschen. In Werner Besch, Anne Betten, Oskar Reichmann & Stefan Sonderegger (eds.), Sprachgeschichte. Ein Handbuch zur Geschichte der deutschen Sprache und ihrer Erforschung, Volume 2, 1207–1213. Berlin: de Gruyter.

Günthner, Susanne. 2000. From concessive connector to discourse marker: The use of obwohl in everyday German interaction. In Elizabeth Couper-Kuhlen & Bernd Kortmann (eds.), Cause − condition − concession − contrast. Cognitive and discourse perspectives, 439‒468. Berlin: de Gruyter.
Günthner, Susanne. 2003. Lexical-grammatical variation and development: The use of conjunctions as discourse markers in everyday spoken German. In Regine Eckardt, Klaus von Heusinger & Christoph Schwarze (eds.), Words in time. Diachronic semantics from different points of view, 375‒403. Berlin & New York: de Gruyter.
Hall, Richard Michael Ryan. 1967. Yiddish syntax: Phrase Structure Rules and optional singulary transformations of the modern language. New York: Ann Arbor PhD dissertation.
Harbert, Wayne. 2007. The Germanic Languages. Cambridge: Cambridge University Press.
Härd, John Evert. 1981. Studien zur Struktur mehrgliedriger deutscher Nebensatzprädikate. Diachronie und Synchronie. Göteborg: Acta Universitatis Gothoburgensis.
Harm, Volker. 2001. Zur Herausbildung der deutschen Futurumschreibung mit werden + Infinitiv. Zeitschrift für Dialektologie und Linguistik 68. 288‒307.
Harris, Alice C. & Lyle Campbell. 1995. Historical syntax in cross-linguistic perspective. Cambridge: Cambridge Univ. Press.
Haspelmath, Martin. 1987. Transitivity alternations of the anticausative type. Cologne: Institut für Sprachwissenschaft der Universität zu Köln.
Haspelmath, Martin. 1989. From purposive to infinitive. A universal path of grammaticalization. Folia Linguistica Historica 10(1‒2). 287‒310.
Haspelmath, Martin. 2008. The European linguistic area. Standard Average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Volume 2, 1492‒1510. Berlin & New York: de Gruyter.
Hawkins, John A. 1986. A comparative typology of English and German. London: Univ. of Texas Press.
Heine, Bernd. 1993. Auxiliaries. Cognitive forces and grammaticalization. New York & Oxford: Oxford Univ. Press.
Heine, Bernd. 1995. On the German werden future. In Werner Abraham, Talmy Givón & Sandra Thompson (eds.), Discourse grammar and typology, 119‒137. Amsterdam & Philadelphia: Benjamins.
Heine, Bernd & Tania Kuteva. 2004. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd & Tania Kuteva. 2006. The changing languages of Europe. Oxford: Oxford University Press.
Heine, Bernd & Tania Kuteva. 2011. The areal dimension of grammaticalization. In H. Narrog & B. Heine (eds.), The handbook of grammaticalization, 291‒301. Oxford: Oxford University Press.
Henriksen, Carol & Johan van der Auwera. 2002. The Germanic languages. In E. König & J. van der Auwera (eds.), The Germanic languages, 1‒18. London & New York: Routledge.
Herslund, Michael. 2001. The Danish -s genitive: From affix to clitic. Acta Linguistica Hafniensia: International Journal of Linguistics 33(1). 7‒18.
Hesse, Andrea. 2009. Zur Grammatikalisierung der Pseudokoordination im Norwegischen und in den anderen skandinavischen Sprachen. Tübingen & Basel: Francke.
Hesse, Andrea. 2011. Zur Entwicklung der aspektuellen Bedeutung bei der skandinavischen Pseudokoordination. NOWELE 60–61. 147–169.
Hilpert, Martin. 2006. A synchronic perspective on the grammaticalization of Swedish Future constructions. Nordic Journal of Linguistics 29(2). 151‒173.
Hilpert, Martin. 2008. Germanic Future constructions. A usage-based approach to language change. Amsterdam & Philadelphia: Benjamins.
Hilpert, Martin. 2011. Grammaticalization in Germanic languages. In H. Narrog & B. Heine (eds.), The handbook of grammaticalization, 708‒718. Oxford: Oxford University Press.

Himmelmann, Nikolaus. 2001. Articles. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Sprachtypologie und sprachliche Universalien (HSK 20.1), 831–841. Berlin & New York: de Gruyter.
Hinterhölzl, Roland, Svetlana Petrova & Michael Solf. 2005. Diskurspragmatische Faktoren für Topikalität und Verbstellung in der ahd. Tatianübersetzung (9. Jh.). In Shinichiro Ishihara, Michaela Schmitz & Anne Schwarz (eds.), Interdisciplinary studies on information structure, Volume 3, 245‒257. Potsdam: Universitätsverlag Potsdam.
Holmes, Philip & Jan Hinchliffe. 1994. Swedish. A comprehensive grammar. London & New York: Routledge.
Hopper, Paul J. 1975. The syntax of the simple sentence in Proto-Germanic. The Hague & Paris: Mouton.
Hopper, Paul & Elizabeth Traugott. 2003. Grammaticalization. Cambridge: Cambridge Univ. Press.
Horie, Kaoru. 2008. Complement clauses. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Volume 2, 979‒993. Berlin & New York: de Gruyter.
Janda, Richard D. 2001. Beyond ‘pathways’ and ‘unidirectionality’: on the discontinuity of transmission and the counterability of grammaticalization. Language Sciences 23(2‒3). 265‒340.
Keenan, Edward L. & Bernard Comrie. 1977. Noun Phrase Accessibility and Universal Grammar. Linguistic Inquiry 8(1). 63‒99.
Kiparsky, Paul. 1995. Indo-European origins of Germanic syntax. In Adrian Battye & Ian Roberts (eds.), Clause structure and language change, 140‒169. New York & Oxford: Oxford University Press.
Koefoed, Hans Anton. 1958. Danish. London: Hodder and Stoughton.
Köpcke, Klaus-Michael. 1995. Die Klassifikation der schwachen Maskulina in der deutschen Gegenwartssprache. Zeitschrift für Sprachwissenschaft 14(2). 159‒180.
Köpcke, Klaus-Michael. 2000a. Chaos und Ordnung ‒ Zur semantischen Remotivierung einer Deklinationsklasse im Übergang vom Mhd. zum Nhd. In Andreas Bittner, Dagmar Bittner & Klaus-Michael Köpcke (eds.), Angemessene Strukturen: Systemorganisation in Phonologie, Morphologie und Syntax, 107‒122. Hildesheim: Olms.
Köpcke, Klaus-Michael. 2000b. Starkes, Schwaches und Gemischtes in der Substantivflexion des Deutschen. Was weiß der Sprecher über die Deklinationsparadigmen? In Rolf Thieroff, Matthias Tamrat, Nanna Fuhrhop & Oliver Teuber (eds.), Deutsche Grammatik in Theorie und Praxis, 155‒170. Tübingen: Niemeyer.
Köpcke, Klaus-Michael. 2002. Wie entwickeln sich die Deklinationsklassen im Deutschen? In Peter Wiesinger (ed.), Zeitenwende ‒ die Germanistik auf dem Weg vom 20. ins 21. Jahrhundert, Vol. 2: Entwicklungstendenzen der deutschen Gegenwartssprache, 101‒108. Bern: Lang.
Kortmann, Bernd. 1997. Adverbial subordination. A typology and history of adverbial subordinators based on European languages. (Empirical Approaches to Language Typology 18). Berlin & New York: de Gruyter.
Kortmann, Bernd. 1998. Adverbial subordinators in the languages of Europe. In J. van der Auwera (ed.), Adverbial constructions in the languages of Europe, 457‒561. Berlin & New York: de Gruyter Mouton.
Kortmann, Bernd. 2001. Adverbial conjunctions. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals, Volume 1, 842‒854. Berlin & New York: de Gruyter.
Koster, Jan. 1987. Why subject sentences don’t exist. In S. Jay Keyser (ed.), Recent transformational studies in European languages, 53‒64. Cambridge, Mass.: MIT Press.
Krahe, Hans & Wolfgang Meid. 1969. Germanische Sprachwissenschaft. Berlin: de Gruyter.

Kürschner, Sebastian. 2008a. Deklinationsklassen-Wandel: Eine diachron-kontrastive Studie zur Entwicklung der Pluralallomorphie im Deutschen, Niederländischen, Schwedischen und Dänischen. Berlin & New York: de Gruyter.
Kürschner, Sebastian. 2008b. Semantische Konditionierung in der Pluralallomorphie deutscher Dialekte. In Franz Patocka & Guido Seiler (eds.), Dialektale Morphologie, dialektale Syntax, 141‒156. Wien: Praesens Verl.
Kürschner, Sebastian & Damaris Nübling. 2011. The interaction of gender and declension in Germanic languages. Folia Linguistica 45(2). 355‒388.
Kvist Darnell, Ulrika. 2008. Pseudosamordningar i svenska: särskilt sådana med verben sitta, ligga och stå [Pseudo coordinations in Swedish: especially those with the verbs sitta, ligga, stå]. Stockholm: Inst. för lingvistik, Stockholm Univ.
Laanemets, Anu. 2009. The passive voice in written and spoken Scandinavian. In Mark Fryd (ed.), The passive in Germanic languages, 144‒166. Groningen: Univ. of Groningen Press.
Lass, Roger. 1990. How to do things with junk: exaptation in language evolution. Journal of Linguistics 26. 79‒102.
Lehmann, Christian. 1984. Der Relativsatz (Language Universals Series 3). Tübingen: Narr.
Lehmann, Christian. 1988. Towards a typology of clause linkage. In John Haiman & Sandra Thompson (eds.), Clause combining in grammar and discourse, 181‒225. Amsterdam & Philadelphia: Benjamins.
Lehmann, Christian. 1991. Grammaticalization and related changes in contemporary German. In Elizabeth Traugott & Bernd Heine (eds.), Approaches to grammaticalization: Volume II. Types of grammatical markers, 493‒535. Philadelphia.
Lehmann, Christian. 1992. Word order change by grammaticalization. In Marinel Gerritsen & Dieter Stein (eds.), Internal and external factors in syntactic change, 395‒416. The Hague & Berlin: de Gruyter Mouton.
Lehmann, Christian. 1995a. Relativsätze. In Joachim Jacobs et al. (eds.), Syntax. Ein internationales Handbuch zeitgenössischer Forschung, Volume 2, 1199‒1216. Berlin & New York: de Gruyter.
Lehmann, Christian. 1995b. Thoughts on grammaticalization. München: LINCOM Europa.
Lehmann, Christian. 2004. Theory and method in grammaticalization. Zeitschrift für germanistische Linguistik 32(2). 152‒187.
Leiss, Elisabeth. 1985. Zur Entstehung des neuhochdeutschen analytischen Futurs. Sprachwissenschaft 10. 250‒273.
Lenerz, Jürgen. 1984. Syntaktischer Wandel und Grammatiktheorie. Eine Untersuchung an Beispielen aus der Sprachgeschichte des Deutschen. Tübingen: Niemeyer.
Lenerz, Jürgen. 1985. Diachronic syntax: Verb position and COMP in German. In Jindřich Toman (ed.), Studies in German grammar, 103‒132. Dordrecht: Foris Publ.
Lenz, Alexandra. 2011. Zum kréien-Passiv und seinen “Konkurrenten” im schriftlichen und mündlichen Luxemburgischen. In Peter Gilles & Melanie Wagner (eds.), Linguistische und soziolinguistische Bausteine der Luxemburgistik. Frankfurt a. M.: Lang.
Lenz, Alexandra. 2013. Vom ‘kriegen’ und ‘bekommen’. Kognitiv-semantische, variationslinguistische und sprachgeschichtliche Perspektiven. Berlin & Boston: de Gruyter.
Lippert, J. 1974. Beiträge zur Technik und Syntax althochdeutscher Übersetzungen unter besonderer Berücksichtigung der Isidorgruppe und des althochdeutschen Tatian. München: Fink.
Lockwood, William Burley. 1968. Historical German syntax. Oxford: Clarendon Press.
Lockwood, William Burley. 1995. Lehrbuch der modernen jiddischen Sprache. Hamburg: Hamburg.
Los, Bettelou & Ans van Kemenade. 2012. New perspectives, theories and methods: Information structure and syntax in the history of English. In Alex Bergs & Laurel Brinton (eds.), Historical Linguistics of English (HSK 34.2), 1475–1490. Berlin & Boston: de Gruyter Mouton.

Grammaticalization in the Germanic languages


Lötscher, Andreas. 1993. Zur Genese der Verbverdopplung bei gaa, choo, laa und aafaa (“gehen”, “kommen”, “lassen” und “anfangen”) im Schweizerdeutschen. In Werner Abraham & Josef Bayer (eds.), Dialektsyntax, 180–200. Opladen: Westdt. Verl.
Lühr, Rosemarie. 2004. Der Nebensatz in den Westgermania. In Thorwald Poschenrieder (ed.), Die Indogermanistik und ihre Anrainer. Dritte Tagung der Vergleichenden Sprachwissenschaftler der Neuen Länder, 161‒179. Innsbruck: Inst. für Sprachwiss.
Lühr, Rosemarie. 2007. Konnektoren im älteren Deutsch. In H.-U. Schmid (ed.), Beiträge zur synchronen und diachronen Sprachwissenschaft, 39–51. Leipzig: Verl. der Sächs. Akad. der Wiss.
Miller, J. 2006. Relative Clauses in Spoken Discourse. In K. Brown (ed.), Encyclopedia of language & linguistics, Volume 10, 508‒551. Amsterdam & Heidelberg: Elsevier.
Mitchell, Bruce. 1985. Old English syntax (2 volumes). Oxford: Clarendon Press.
Nikolaeva, I. 2006. Relative clauses. In K. Brown (ed.), Encyclopedia of language & linguistics, Volume 10, 501–508. Amsterdam & Heidelberg: Elsevier.
Noonan, Michael. 2007. Complementation. In Timothy Shopen (ed.), Language typology and syntactic description. Volume 2: Complex constructions, 52‒150. Cambridge: Cambridge University Press.
Norde, Muriel. 1997. The history of the genitive in Swedish. A case study in degrammaticalization. Amsterdam: Vakgroep Skandinavische taal- en letterkunde.
Norde, Muriel. 2006. Demarcating degrammaticalization: the Swedish s-genitive revisited. Nordic Journal of Linguistics 29(2). 201–238.
Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press.
Nübling, Damaris. 1992. Klitika im Deutschen. Tübingen: Narr.
Nübling, Damaris. 1998. Wann werden die deutschen Präpositionen flektieren? Grammatisierungswege zur Flexion. In Ray Fabri, Albert Ortmann & Teresa Parodi (eds.), Models of inflection, 266‒289. Tübingen: Niemeyer.
Nübling, Damaris. 2005. Von in die über in’n und ins bis im: Die Klitisierung von Präposition und Artikel als “Grammatikalisierungsbaustelle”. In Torsten Leuschner, Tanja Mortelmans & Sarah De Groodt (eds.), Grammatikalisierung im Deutschen, 105‒131. Berlin & New York: de Gruyter.
Nübling, Damaris. 2006. Auf Umwegen zum Passivauxiliar − Die Grammatikalisierungspfade von geben, werden, kommen und bleiben im Luxemburgischen, Deutschen und Schwedischen. In C. Moulin & D. Nübling (eds.), Perspektiven einer linguistischen Luxemburgistik, 171‒202. Heidelberg: Winter.
Nübling, Damaris. 2008. Was tun mit Flexionsklassen? Deklinationsklassen und ihr Wandel im Deutschen und seinen Dialekten. ZDL 75(3). 282‒330.
Nübling, Damaris. 2013. Zwischen Konservierung, Eliminierung und Funktionalisierung: Der Umlaut in den germanischen Sprachen. In Jürg Fleischer & Horst Simon (eds.), Sprachwandelvergleich. Comparing diachronies, 15‒42. Berlin & Boston: de Gruyter.
Nübling, Damaris. 2015a. Die Bismarck – der Arena – das Adler. Vom Drei-Genus- zum Sechs-Klassen-System bei Eigennamen im Deutschen: Degrammatikalisierung und Exaptation. Zeitschrift für Germanistische Linguistik 43(2). 306‒344.
Nübling, Damaris. 2015b. Between feminine and neuter, between semantic and pragmatic gender assignment: Hybrid names in German dialects and in Luxembourgish. In Jürg Fleischer, Elisabeth Rieken & Paul Widmer (eds.), Agreement from a diachronic perspective, 235‒265. Berlin & Boston: de Gruyter Mouton.
Nübling, Damaris. 2017. Funktionen neutraler Genuszuweisung bei Personennamen und Personenbezeichnungen im germanischen Vergleich. In Johannes Helmbrecht, Damaris Nübling & Barbara Schlücker (eds.), Namengrammatik. Linguistische Berichte, Special Volume 23, 173‒211. Hamburg: Buske.


Nübling, Damaris. 2020. Die Bismarck – der Arena – das Adler. The emergence of a classifier system for proper names in German. In Renata Szczepaniak & Johanna Flick (eds.), Walking on the Grammaticalization Path of the Definite Article. Functional Main and Side Roads, 228–249. Amsterdam: John Benjamins.
OED Online. Oxford English Dictionary. Oxford University Press, March 2016. Web. URL: http://www.oed.com/, Accessed on 2016–03–01.
Önnerfors, Olaf. 1993. Über narrative Verb-erst-Deklarativsätze im Deutschen. Sprache und Pragmatik 31. 1–52.
Önnerfors, Olaf. 1997. Verb-erst-Deklarativsätze. Grammatik und Pragmatik. Stockholm: Almqvist & Wiksell.
Petrova, Svetlana & Michael Solf. 2008. Rhetorical relations and verb placement in Early Germanic: A Cross Linguistic Study. In C. Fabricius-Hansen & W. Ramm (eds.), ‚Subordination‘ versus ‚coordination‘ in sentence and text: A Cross-linguistic perspective, 329–351. Amsterdam: Benjamins.
Pintzuk, Susan. 1993. Verb seconding in Old English: Verb Movement to Infl. The Linguistic Review 10. 5–35.
Pittner, Karin. 1995. The case of German relatives. The Linguistic Review 12(3). 197‒231.
Pittner, Karin. 1996. Attraktion, Tilgung und Verbposition: zur diachronen und dialektalen Variation beim Relativpronomen im Deutschen. In Ellen Brandner & Gisella Ferraresi (eds.), Language change and Generative Grammar (Sonderheft Linguistische Berichte 1995/96), 120‒153. Opladen: Westdt. Verl.
Pittner, Karin. 2009. Relativum. In Ludger Hoffmann (ed.), Handbuch der deutschen Wortarten, 727‒757. Berlin & New York: de Gruyter.
Polenz, Peter von. 1991. Deutsche Sprachgeschichte vom Spätmittelalter bis zur Gegenwart. Berlin & New York: de Gruyter.
Ponelis, Fritz. 1993. The development of Afrikaans. Frankfurt a. M.: Lang.
Raible, Wolfgang. 1992. Junktion. Eine Dimension der Sprache und ihre Realisierungsformen zwischen Aggregation und Integration. Heidelberg: Winter.
Ramat, Paolo. 1981. Einführung in das Germanische. Tübingen: Niemeyer.
Ringe, Donald. 2006. From Proto-Indo-European to Proto-Germanic. Oxford: Oxford University Press.
Rissanen, Matti. 2011. On the long history of English adverbial subordinators. In Anneli Meurman-Solin & Ursula Lenker (eds.), Studies in variation, contacts and change in English. Volume 8: Connectives in synchrony and diachrony in European languages. URL: http://www.helsinki.fi/varieng/series/volumes/08/rissanen/, Accessed on 2016–03–01.
Ronneberger-Sibold, Elke. 1980. Sprachverwendung – Sprachsystem. Ökonomie und Wandel. Tübingen: Niemeyer.
Ronneberger-Sibold, Elke. 1991. Funktionale Betrachtungen zu Diskontinuität und Klammerbildung im Deutschen. In Norbert Boretzky, Werner Enninger, Benedikt Jeßing & Thomas Stolz (eds.), Sprachwandel und seine Prinzipien, 206–236. Bochum: Brockmeyer.
Ronneberger-Sibold, Elke. 1993. ‘Typological conservatism’ and framing constructions in German morphosyntax. In J. van Marle (ed.), Historical linguistics, 295–314. Amsterdam & Philadelphia: Benjamins.
Ronneberger-Sibold, Elke. 1997. Typology and the diachronic evolution of German morphosyntax. In Jacek Fisiak (ed.), Linguistic reconstruction and typology, 315‒335. Berlin & New York: de Gruyter.
Salzmann, M. 2006. Resumptive pronouns and matching effects in Zurich German relative clauses as distributed deletion. Leiden Working Papers in Linguistics 3(1). 17–50.
Santorini, Beatrice. 1993. Jiddisch als gemischte OV/VO Sprache. In Werner Abraham & Josef Bayer (eds.), Dialektsyntax, 230–244. Opladen: Westdt. Verl.


Schlosser, Horst Dieter (ed.). 2004. Althochdeutsche Literatur. Berlin: Erich Schmidt.
Schmuck, Mirjam. 2013. Relevanzgesteuerter verbalmorphologischer Umbau. Eine kontrastive Untersuchung zum Deutschen, Niederländischen und Schwedischen. Hildesheim: Olms.
Schrodt, Richard. 2004. Althochdeutsche Grammatik Volume 2: Syntax. Tübingen: Niemeyer.
Skrzypek, Dominika. 2012. Grammaticalization of (in)definiteness in Swedish. Poznań: Wydawnictwo Naukowe UAM.
Smirnova, Elena. 2006. Die Entwicklung der Konstruktion würde + Infinitiv im Deutschen: eine funktional-semantische Analyse unter besonderer Berücksichtigung sprachhistorischer Aspekte. Berlin & New York: de Gruyter.
Szczepaniak, Renata. 2011. Grammatikalisierung im Deutschen. Tübingen: Narr.
Szczepaniak, Renata. 2013. Satztyp und Sprachwandel. In Jörg Meibauer, Markus Steinbach & Hans Altmann (eds.), Satztypen des Deutschen, 738–763. Berlin & New York: de Gruyter.
Szczepaniak, Renata. 2015. Syntaktische Einheitenbildung ‒ typologisch und diachron betrachtet. In Christa Dürscheid & Jan Georg Schneider (eds.), Handbuch Satz, Äußerung, Schema, 104–124. Berlin & Boston: de Gruyter.
Szczepaniak, Renata. 2016. Vom Zahlwort eins zum Indefinitartikel ein(e). Rekonstruktion des Grammatikalisierungsverlaufs im Alt- und Mittelhochdeutschen. In Andreas Bittner & Klaus-Michael Köpcke (eds.), Regularität und Irregularität in Phonologie und Morphologie. Diachron, kontrastiv, typologisch, 247‒261. Berlin & Boston: de Gruyter.
Takada, Hiroyuki. 1994. Zur Wortstellung des mehrgliedrigen Verbalkomplexes im Nebensatz im 17. Jahrhundert. Mit einer Beantwortung der Frage, wie und warum die Wortstellung von Grimmelshausens “Simplicissimus” geändert wurde. Zeitschrift für Germanistische Linguistik 22. 190–219.
Teleman, Ulf, Staffan Hellberg, Erik Andersson & Lisa Christensen. 1999. Svenska Akademiens Grammatik. Stockholm: Norstedts Ordbok.
Thurmair, Maria. 1991. Warten auf das Verb. Die Gedächtnisrelevanz der Verbklammer im Deutschen. Jahrbuch Deutsch als Fremdsprache 17. 174–202.
Tiersma, Pieter Meijes. 1999. Frisian reference grammar. Dordrecht: Foris Publ.
Vikner, Sten. 1995. Verb movement and expletive subjects in the Germanic languages. New York & Oxford: Oxford University Press.
Weissberg, Josef. 1988. Jiddisch. Eine Einführung. Bern/Frankfurt: Lang.
Werner, Otmar. 1979. Kongruenz wird zu Diskontinuität im Deutschen. In Bela Brogyanyi (ed.), Studies in diachronic, synchronic, and typological linguistics. Festschrift for Oswald Szemerényi on the occasion of his 65th birthday (Current Issues in Linguistic Theory, 11), 959‒988. Amsterdam & Philadelphia: Benjamins.
Wiemer, Björn. 2011. The grammaticalization of passives. In H. Narrog & B. Heine (eds.), The handbook of grammaticalization, 535‒546. Oxford: Oxford University Press.
Wiesinger, Peter. 2001. Zum frühneuhochdeutschen Ausdruck der Aktionsart im Präteritum beim steirischen Dichtermönch Andreas Kurzmann um 1400. In Sheila Watts, Jonathan West & Hans-Joachim Solms (eds.), Zur Verbalmorphologie germanischer Sprachen, 175‒188. Tübingen: Niemeyer.
Wurzel, Wolfgang. 1986. Die wiederholte Klassifikation von Substantiven. Zur Entstehung von Deklinationsklassen. ZPSK 39. 76‒96.
Zifonun, Gisela. 2003. Sprachtypologische Fragestellungen in der gegenwartsbezogenen und der historischen Grammatik des Deutschen, am Beispiel des Relativsatzes. In Anja Lobenstein-Reichmann & Oskar Reichmann (eds.), Neue historische Grammatiken. Zum Stand der Grammatikschreibung historischer Sprachstufen des Deutschen und anderer Sprachen, 59‒85. Tübingen: Niemeyer.
Zifonun, Gisela. 2007. Relativsyntagmen im Deutschen und in europäischen Vergleichssprachen: funktionale Domäne und ausgewählte Varianzparameter. Deutsche Sprache 35(3). 190‒212.


Zifonun, Gisela. 2017. Relativsyntagmen. In Lutz Gunkel, Adriano Murelli, Susan Schlotthauer, Bernd Wiese & Gisela Zifonun (eds.), Grammatik des Deutschen im europäischen Vergleich: das Nominal. Teilband 2, Nominalflexion, Nominale Syntagmen, 1736‒1806. Berlin & Boston: de Gruyter.

Michela Cennamo

4 Mechanisms and paths of grammaticalization and reanalysis in Romance

1 The Romance languages: A structural and typological overview

1.1 Latin and Romance

The Romance languages have evolved from a common source, Latin, a variety of the Italic branch of Indo-European. Latin was originally spoken in a west-central area of Italy, Latium (Lazio), probably from around 1000 BC (Harris 1988: 1, among others). During the Republican age, and especially with the rise and expansion of the Roman Empire, it spread beyond Italy to northern and eastern Europe as well as to western north Africa. A wealth of documentation (both literary and sub-literary) is available roughly from 200 BC, with only a few texts before then (Clackson 2016: 3 and references therein). With the end of the Western Empire, by the end of the fifth century AD (officially in 476, the date of the deposition of the last Emperor, Romulus Augustulus), the changed socio-cultural conditions, leading to the loss of the awareness of belonging to a ‘unitary “Roman” world’ (Varvaro 1991, among others), determined a profound restructuring in the rules and relations governing linguistic variation in the Latin-speaking world, resulting in the seeming ‘fragmentation’ of the original linguistic unity and the emergence of Romance vernaculars. The rigid unitary Latin norm, which until that time had kept the local, centrifugal forces in check, collapsed, and the latter often emerged as the new norm(s) in various areas of the Romània, so that “diastratic variation changes and crystallizes into diatopic variation” (Varvaro 1991: 49). Latin continued to be used as the language of culture and of the clergy during late Antiquity and the early Middle Ages, whilst the ‘local norms’, reflecting the spoken language(s), were used more and more in everyday communication.
https://doi.org/10.1515/9783110563146-004

The diglossic situation whereby Latin, the high prestige language, alternated with the low prestige, local spoken varieties (‘vernaculars’) (Kabatek [2016: 627]; Wright [1991, 2016] for a different view) – the latter emerging in written texts by the late eighth (e.g., the Indovinello Veronese in Italy) and ninth (e.g., the Strasbourg Oaths in northern France) centuries, although sporadically until the beginning of the twelfth century (Frank-Job and Selig 2016: 24) – gradually evolved into a new, triglossic picture from the late twelfth century onwards (until the Renaissance). Then the writing of chronicles, juridical and administrative acts in the local vernaculars (attested initially in southern France, subsequently in the Iberian Peninsula, northern France and Italy) leads to a clear divergence between


Fig. 1: The Romance language areas of the world. From Bernd Kortmann & Johan van der Auwera (eds.), The languages and linguistics of Europe: A comprehensive guide, 70–71. Berlin: de Gruyter.

Fig. 2: The Romance language areas in Europe. From Bernd Kortmann & Johan van der Auwera (eds.), The languages and linguistics of Europe: A comprehensive guide, 70–71. Berlin: de Gruyter.

the high prestige Latin, and written Romance and spoken Romance (and other) varieties, depending on the area, until the establishment of the Romance vernaculars as the dominant languages of official documents, with Latin gradually retreating for this purpose by the sixteenth century (Kabatek 2016: 627–628, among others, and references therein). Over time, standardization processes and socio-political reasons led to a number of officially recognized national standard languages – nowadays including French,


Spanish, Portuguese, spoken also in the Americas and in Africa (see Figure 1), as well as less widely spoken languages such as Italian, Romanian, Catalan – and a number of languages and varieties (at times only conventionally labelled as such but languages in their own right, like the Italo-Romance dialects; Maiden and Parry [1997: 2–3]; Vincent [2014: 2–5]), with no such official status, such as Romance Creoles (Papiamento, Haitian Creole, Chavacano Creole), Judaeo-Spanish, Aragonese, Asturian, Galician, Gascon, Francoprovençal, Occitan, Aromanian, Megleno-Romanian, Istro-Romanian, Sardinian, Italo-Romance, Raeto-Romance, just to mention a few (Harris [1988: 1–25]; Pountain [2016: 634] and the overview and structural descriptions in Ledgeway and Maiden [2016]). As recently pointed out by Bossong (2016: 68), the list of Romance languages potentially varies depending on the criteria adopted (see also Kabatek and Pusch 2011: 73–74). The areal distribution of the languages is shown in Figure 1 and Figure 2 (from Kabatek and Pusch 2011: 70–71), illustrating the areas where the Romance languages are spoken as L1 as well as those where they are either the official/national languages or lingua francas.

1.2 Morphosyntactic typology, structural characteristics and classifications

Latin was a canonical dependent-marking language, with head-marking patterns on the verb (signalled by verb agreement, indexing person and number of the subject, generally unexpressed if pronominal, so-called null anaphora¹ [Nichols 1986: 77, 104; Ledgeway 2012: 286]) and a flat, non-configurational syntax (Vincent 1997a, 1998) in the nominal, verbal, adjectival and adverbial domains, unlike at the sentence and prepositional levels, where there is evidence of ‘incipient’ configurationality (Ledgeway 2012: 77–80). Thus, the syntactic dependency relations between constituents are ‘lexocentrically encoded’ (Bresnan 2001: 109–112; Ledgeway 2012: 71, note 84, 72–77), by means of case-marking and agreement (puella rosas donat [girl.F.NOM.SG rose.F.ACC.PL donate.PRS.3SG] ‘the girl donates roses’). In contrast, the Romance languages are dependent-marking languages with a well-developed system of head-marked coding, in different grammatical domains (e.g., the nominal, verbal and clause/sentence level) and a configurational organization of their grammar (Vincent 1997b, 1998; Ledgeway 2011b, 2012: ch. 6). Thus, syntactic relations between constituents are expressed by means of hierarchically ordered strings of elements with fixed positions and limited order flexibility (It. i miei studenti amano la musica lit. ‘the my students enjoy music’ [*studenti miei i] [*students mine the] ‘my students enjoy music’).

¹ Latin, however, also allows object pro-drop (Vincent 1997b: 151; Ledgeway 2012: 71–75) and generally the omission of arguments (subject, direct object, indirect object, prepositional object etc.) if pragmatically salient and recoverable (I thank Adam Ledgeway for raising this point).


The typological changes taking place in the transition from Latin to Romance indeed involve the rise of configurationality and the emergence of head-marked coding systems (Vincent [1997b, 1998, 2003] and the more recent perceptive and thorough analysis for all grammatical domains by Ledgeway [2011b, 2012, 2016b, 2017b]), together with head-marked patterns of active-inactive/stative syntax (depending on the structure) (Cennamo 2009) in several grammatical domains (La Fauci 1994). These innovations result in several of the grammaticalization phenomena observable in Romance and discussed in the present chapter, such as (i) the rise of articles, with a ‘dedicated position for (in)definiteness marking’ (Ledgeway 2011c: 721–723, 2012: 80–118, 2016b: 764–767), (ii) the emergence of clitics (Vincent 1997b, 1998), (iii) the proliferation of auxiliaries, (iv) the replacement of non-finite complementation (e.g., accusative + infinitive) with a rich system of complementizers introducing the different (subordinate) clause types (so-called functional heads in the generative framework), (v) the fixing of SVO as the sentence unmarked word order, after an initial V medial position (SVO/OVS) in Late Latin (Herman 2000: 86) leading to V2 syntax in Medieval Romance (but in old Sardinian, most typically V1, Lombardi [2007] and recent nuanced discussion in Wolfe [2015]), a characteristic still retained in some Romansh and Ladin varieties (Salvi 2016: 1009–1010) and subsequently lost (see full discussion in Ledgeway 2012: 59–80, 2016b: 771; Salvi 2016; Wolfe 2018 and references therein).
The typological and structural changes characterizing the Romance languages with respect to Latin (on which see Vincent [2016b] for a recent overview) point to a northern-southern differentiation within the Romània, based on morphosyntactic characteristics (Zamboni [1998: 130–131] and recent discussion in Bentley [2020]; Drinka [2020]), albeit with phonological reflexes as well (Zamboni 2000: 87–88), differing from previously proposed classifications, such as the western-eastern one, based on phonological properties (von Wartburg 1950; Bossong 2016: 71) and the centre-periphery distinction, based on lexical distribution (Bartoli [1929, 1933] and discussion in Ledgeway [2012: 314]). More specifically, northern Romance (comprising northern and southern Gaul, northern Italian dialects, Ræto-Romance) and southern Romance (including Ibero-Romance, central/southern Italian dialects, Sardinian and Daco-Romance) are differentiated by a number of opposing structural characteristics, namely (i) differential marking of the S/A² argument (through subject clitics/preverbal order) and rise of the partitive vs differential marking of highly individuated Os (animate/human/definite/referential); (ii) development of obligatory subject clitic pronouns vs null subjects; (iii) past participle agreement vs lack of past participle agreement; (iv) preference for the present perfect in aoristic function

² S, A/O are syntactico-semantic primitives, referring to the sole participant of an intransitive predicate and to the Agent/Patient participants of a transitive predicate. In their canonical realizations they coincide with the grammatical categories of subject and object (Dixon [1979, 1994: 6–8], also Mithun and Chafe [1999] and more recent discussion in Haspelmath [2011]).


vs use of the simple past and lack of the present perfect; (v) presence of a split intransitivity contrast marked through auxiliary selection (‘have’ vs ‘be’) vs generalization of one auxiliary (either ‘have’/’hold’ or ‘be’) (Zamboni 1998: 130–131, 2000: 102–105; Ledgeway 2012: 314). The present discussion takes a broader perspective on grammaticalization, which is viewed as involving different types and degrees of ‘tightness’ affecting linguistic expressions in all grammatical domains (Haspelmath [1998a: 318]; Wiemer and Bisang [2004: 4]; Carlier, De Mulder, and Lamiroy [2012: 293], among others, and discussion in Lehmann [2015: 11–17]). Thus, not only changes from “more to less lexical or from less to more grammatical” (Börjars and Vincent 2011: 163) will be considered, but also changes concerning the higher ‘internal’ dependency among constituents (Haspelmath 2004: 10), such as the fixing of word order (Lehmann [1992]; Ledgeway [2011c: 727–728] and Sitaridou [2012: 596–598] for Romance) and the degree of syntactic cohesion between the matrix causative verb and its dependent infinitive with causatives (Soares da Silva 2012; Vincent 2016a). The study also addresses processes of reanalysis, with which grammaticalization interacts (Hopper and Traugott [2003: 59]; Traugott [2011: 21], with a discussion of different aspects and definitions of the relationship between grammaticalization and reanalysis and related references).

2 Grammaticalization of nominal categories

2.1 Declension class(es) and gender

Most Romance languages display three types of endings for the singular of nouns and adjectives, namely -a, for feminine gender, -o/-u for masculine gender, -e for either masculine or feminine (e.g., Sp. chica ‘girl’ [F], chico ‘boy’ [M], clase ‘class’ [F]), continuing the Latin declensions: first (puellam ‘girl’/bonam ‘good’), second (puerum ‘boy’/bonum ‘good’) and third (canem ‘dog’/celerem ‘swift’). In languages/varieties where the distinction between final unstressed -e and -o was phonologically neutralized (e.g., French, Occitan, Romansh), the three-way opposition has become a two-way distinction, between the first declension, retaining the final vowel and associated with feminine gender, and a non-first declension, lacking the final vowel, and associated with either gender, while class membership is quite arbitrary (Maiden 2011: 159–163, 2016b: 508). Genders, the relationship between nouns and their modifiers (Maiden 1995: 106), or, more precisely, “classes of nouns reflected in the behaviour of associated words, being fundamentally a matter of morphosyntactic agreement” (Martin Maiden, p.c.), are marked on articles, (attributive) adjectives, (strong/clitic, third person) pronouns, including possessives and determiners, participles, first and second person plural pronouns incorporated into an indefinite adjective/pronoun (e.g., Rom. noantri/noantre [we.other.M.PL/F.PL] ‘we’; voantri/voantre [you.other.M.PL/F.PL]


Tab. 1: Reanalysis of Latin gender classes.

Latin > Romance

neuter plural (-a) > feminine singular
folia ‘leaf’ > It. foglia, Pt. folha, Cat. fulla, Ro. foaie ‘leaf’

fruit names: malum (N.SG)/mala (N.PL) ‘apple/apples’ > It. mel-a (F.SG) ‘apple’; mel-e (F.PL) ‘apples’

tree names: feminine singular (morphologically masculine) pirus ‘pear tree’ (F) > masculine: per-o ‘pear tree’, per-i ‘pear trees’

‘you’) and marginally, in some southern Italo-Romance varieties, adjectival adverbs (Ledgeway 2011a: 37; Loporcaro 2016b: 924), intertwining with number (Maiden 2016c: 700, and § 2.2). Most typically, Romance languages show a two-gender system (but see discussion further below and § 2.2), feminine vs non-feminine, the latter conventionally labelled masculine (reflecting the CL formal identity between the masculine and neuter in the various declensions for some cases and further subsequent phonetic and morphological changes in Romance [Maiden 1995: 108]). Although there is a semantic core for gender assignment, reflecting sex congruence, especially for human beings (It. ragazzo/ragazza ‘boy/girl’), proper names (It. Claudio/Claudia) and domestic animals (Pt. cavalo/egua ‘horse/mare’), there is a high degree of arbitrariness. Thus, for instance, the word ‘fox’ is feminine in Italian (la volpe) and masculine in French (le renard), while Spanish displays two words of different gender, masculine and feminine respectively, reflecting sex (zorro ‘fox’/zorra ‘vixen’), as does Romanian, where the masculine form vulpoi(ul) is derived from the feminine vulpe(a) (Loporcaro 2016b: 926). In some lexical domains, as with tree and fruit names, the apparent arbitrariness in gender assignment results from the reanalysis of previous Latin gender classes. Originally neuter Latin fruit names (malum/mala ‘apple/apples’) became (mostly) feminine, owing to the reanalysis of the neuter plural ending -a as feminine singular (It. mela ‘apple’), prompted by formal identity with the inflectional ending of first declension nouns, e.g., puella(m) ‘girl’ (Maiden 1995: 108–109, 2011: 172, 2016c: 111–112 and references therein).
On the other hand, tree names, which in Latin belonged to the second declension and were assigned to the feminine gender (e.g., pirus ‘the pear tree’), are reanalysed as masculine, owing to their morphologically masculine inflection (Maiden 2011: 172; Loporcaro 2016b: 926, 2018: 55). In Romanian, however, some fruit names are neuter (e.g., măr(ul)/mere(le) ‘(the) apple / (the) apples’), although most are feminine (Loporcaro 2016b: 926), whereas in some central-southern Italian dialects (e.g., Agnone [Molise]) fruit names are neuter and contrast with masculine fruit trees (Loporcaro and Paciaroni 2011: 410–411; Loporcaro 2016b: 927).


Tab. 2: Two-/three-/four-gender systems and reanalysis of Latin gender classes in some domains.

Two-gender system:
Feminine: It. ragazz-a/e ‘girl/s’
Masculine: It. ragazz-o/i ‘boy/s’

Three-gender system (neuter = alternating gender): Ro., It. (small class, e.g., body parts), Neap. (neuter with mass nouns) (Loporcaro & Paciaroni 2011):
Feminine: Ro. băutur-a ‘the drink’ / băuturi-le ‘the drinks’
Masculine: Ro. student-ul ‘the student’ / studenţ-i ‘the students’
Neuter (alternating): Ro. vin-ul ‘the wine’ / vinuri-le ‘the wines’; It. bracci-o / bracci-a ‘arm/arms’

Four-gender system: SID (e.g., Aviglianese, Molfettese) (Loporcaro & Paciaroni 2011):
Feminine: Molf. la vo ǝʃǝ ‘the voice’ / rǝ vvauʃǝ ‘the voices’
Masculine: Molf. u ffi errǝ (da st ǝrá) ‘the iron’ / lǝ ffi errǝ ‘the irons’
Neuter: Molf. rǝ ffi errǝ (material; mass noun) ‘the iron’
Alternating: Molf. u vit ǝr ǝ ‘the glass’ (singular: masculine article) / rǝ vvεtǝrǝ ‘the glasses’ (plural: feminine article)

In Romanian and marginally in Italo-Romance, there is also evidence for three-gender systems (e.g., Neapolitan, where neuter /o/ and feminine plural /e/ prompt phonosyntactic doubling, Ledgeway [2009: 150–154]; Loporcaro [2016b: 932]; Maiden [2016d] for a perceptive, different view) and even four-gender systems in central-southern Italo-Romance. The former continues the Latin (three-gender) system (masculine, feminine, neuter), which was eliminated and reanalysed at a different pace in different Romance areas (Loporcaro 2016b: 930–931). This state of affairs is instantiated by the so-called genus alternans (Loporcaro and Paciaroni 2011; Loporcaro 2018: 225–230), a phenomenon whose (potentially triggering) morphological conditions are inherited from IE (Maiden [2016d]), whereby, for instance, some nouns display feminine agreement in the plural (-e in Romanian, -a in Italian) and masculine agreement in the singular (e.g., Ro. brațele ‘(the) arms’, brațul ‘(the) arm’; It. (le) braccia ‘the arms’, (il) braccio ‘the arm’, see Section 2.2). Four-gender systems are attested in old Neapolitan (with modern Neapolitan exhibiting instead a three-gender system, masculine, feminine and neuter with mass nouns [Ledgeway 2009: 150–164; Loporcaro 2016b: 932]), Salento and northern Calabria, where the new alternating system involves feminine agreement in the singular and masculine in the plural (Loporcaro 2016b: 932–933).


2.2 Number

In the Romance languages number is expressed inflectionally on nouns, pronouns, adjectives, determiners and verbs. Three types of plural endings (mainly continuing the Latin tripartite system from accusative forms in -s, -i and -a [neuter plural]) can be found:³ (i) the sigmatic plural in the west (Iberian peninsula, western Occitan, written French, Romansh, Friulan, Ladin and Sardinian), (ii) the vocalic plural in the east (Italo-Romance, Dalmatian, Daco-Romance), (iii) the invariant form in the north (spoken French, Francoprovençal, northern Occitan) (Maiden 2016c). They are realized, respectively, as -s and -i/-e/-a, depending on the inflectional class and gender, with the invariant type characterized by identity of the singular and plural forms. In the vocalic type (in which there occurs cumulative marking of number and gender) the -i ending is the most frequent form, marking masculine plural and feminine plurals for nouns whose singular form ends in -e (It. voc-e/-i ‘voice’). The distribution of the three types of endings varies according to the language/variety and their individual phonological rules, the sigmatic system probably underlying historically the expression of plural number in all Romance languages, with the feminine plural -e derived from the path *-as > *-ai̯ > -e (Maiden 2016c: 698 and references therein) and the masculine -i being both a reflex of third declension -es and of second declension masculine nominative plural -i (Maiden 1995: 102; 2016c: 699–700). The Latin neuter plural -a instead is reanalysed as a feminine plural marker for masculine singular nouns deriving from continuants of Latin (inanimate) second declension nouns in - (e.g., It. muro ‘wall’ vs mura ‘walls’) in some areas (Daco-Romance, southern Italo-Romance, with traces also in standard Italian and Dalmatian [Vegliote]), owing to its identity with the singular feminine ending -a(m) (e.g., filia(m) ‘daughter’).
This led to the so-called genus alternans (Igartua 2006; Loporcaro and Paciaroni 2011), a typologically unusual pattern (attested in several Australian languages) (Loporcaro and Paciaroni 2011: 414), whereby a gender opposition is reanalysed as a number distinction (cf. § 2.1). In some varieties (e.g., Italian, Portuguese, French), neuter -a was also reanalysed as a feminine singular marker (It. foglia, Pt. folha, Cat. fulla ‘leaf’) (Maiden 2016c: 700, 2016d) and even as a masculine plural ending in some regions (e.g., Corsica) – a much later and localized phenomenon (Martin Maiden, p.c.) – spreading to animate nouns as well (Cors. avvukatu ‘lawyer’ – avvukata ‘lawyers’) (Maiden 2016c: 701). An unusual morphological structure is displayed by the plural -ora (> -uri in Romanian), the continuant of the plural of (neuter third declension) Latin nouns of the type CORPUS ~ CORPORA ‘body/bodies’; this form, unlike the other Romance plural desinences (monomorphemic and monosegmental), is disyllabic, involving the morphological reanalysis of -, a positional variant of root final -, reinterpreted as part of an inflectional ending consisting of two formatives, - + -, thus showing a ‘double, syntagmatic,

3 These, however, are typological tendencies rather than absolutes (Martin Maiden, p.c.).


morphological marking of the plural’ and remaining confined to inanimates and exhibiting genus alternans (Maiden 2016c: 702–703, 2016d). Reanalysis is also attested for nouns denoting non-separable pairs/sets (e.g., scissors, trousers) that are idiosyncratically either singular or plural in some Romance languages (cf. Fr. un pantalon ‘[a pair of] trousers’ vs Sp. pantalón-/pantalones-) (Maiden 2016c: 706 and further discussion therein).

2.3 Possession

Possession is instantiated by various constructions in Romance, falling within the nominal (attributive) and verbal (predicative) domains. In the former the possessor-possessee relation is encoded by means of prepositional phrases headed by the prepositions de/di ‘of’ and by the oblique case in Romanian (§ 2.4.2.2). The latter domain involves the continuants of the Latin location and equation schemas – instantiated by the ‘be’ construction, with the possessor encoded as an oblique argument and the possessee in the nominative, as in the Romanian external possession pattern (Ro. Ioana îmi este cumnată [Ioana I. be..3 sister-in-law] lit. ‘Ioana to-me is sister-in-law’ ‘Ioana is my sister-in-law’, [Niculescu 2013: 186]) – and the action schema, realized by the ‘have’ pattern, with the possessor in the nominative and the possessee in the accusative, the latter structure used for both alienable and inalienable possession (It. Marco ha una bella intelligenza [Mark have..3 a fine mind] ‘Mark has a fine mind’). The distribution of these two verbal (possessive) constructions varies: Italian, Spanish and northern Italian dialects only employ the ‘have’ pattern, whilst French, Romanian and upper southern Italian dialects allow both (although with definiteness restrictions on the nominal expressing the possessee in French) (la maison est à moi/Jean/*la femme [the. house. be..3 to me/John/the woman], Bentley and Ciconte [2016: 851]). Portuguese has grammaticalized two different verbs for the action and accompaniment schemas, respectively the verbs ter ‘hold’ and estar ‘stay’ plus a comitative phrase introduced by the preposition com ‘with’, the last pattern employed for alienable possession (BPt. estou com vinte reais no bolso [stay.1 with twenty reais in pocket] ‘I’ve got twenty reais in my pocket’, Whitlam [2011: 297]), a pattern also available in colloquial regional Southern Italian registers (e.g., Neap. sto con dieci euro [stay..1 with ten euros] ‘I have ten euros’, Ro. El este cu două mașini [he stay..3 with two cars] ‘He has two cars’ [example from Martin Maiden, p.c.]). The same holds generally for Ibero-Romance and upper southern Italo-Romance (Bentley and Ciconte 2016: 852), with Neapolitan exhibiting the grammaticalization of tenere ‘hold’ as a durative and result copular-like, linking element, used for permanent/inherent situations, e.g., tenere pacienza [hold. patience] ‘to be patient’ (lit. ‘to have patience’), differing from avé ‘have’, occurring for contingent situations, e.g., agge pacienza! [have. .2 patience] ‘be patient’ (Ledgeway 2009: 645–647; Bentley and Ciconte 2016: 852).


2.4 Determiners

Among determiners (demonstratives, quantifiers, possessives and determiner-like adjectives, e.g., It. altro ‘other’), two major Romance innovations involving grammaticalization are the rise of articles (§ 2.4.1), a category that was lacking in Latin, and the change of demonstratives and possessives into determiner-like elements (§ 2.4.2), all of them testifying to the emergence of a Determiner Phrase in Romance, with the fixing of a dedicated prenominal determiner position, in the passage from Latin to Romance (Vincent 1997b, 1998; Ledgeway 2011c, 2012: 81–119, 2016b: 764).

2.4.1 Articles

Indefinite and definite articles result from the grammaticalization of the Latin numeral for ‘one’, UNUS/-A, and of the distal demonstrative ILLE ‘that’/intensifier IPSE ‘-self, very’, respectively. The former is a Romance development (Cat./It./Sp. un/una, Fr. un/une, Pt. um/uma, Ro. un/o), attested by the thirteenth-fourteenth century, displaying in some varieties a full and a reduced form (Cal. unu/una vs nu/na − na figliul-a ‘a daughter’) (Ledgeway 2012: 89–109, 2016b: 764, 2017b). During the transitional stage in which it is used only for specific and new referents, probably a reflex of its original numeral function, it cannot be used for generic, unique and abstract referents, for which bare nominals occur (OTsc., Nov.4: donami cavallo da cavalcare [give.me horse to ride] ‘give me a horse to ride’ [generic], from Ledgeway [2016b: 764]). In modern Romance languages, the indefinite article is employed with both generic and specific referents. In some varieties (e.g., Ibero-Romance and Occitan), there is also the plural form of the indefinite article (Log. fau unas cocas ‘I make some cakes’, Ledgeway [2016b: 765]), although it should be regarded as an indefinite quantifier rather than a plural indefinite article. Definite articles evolve from the weakened form of the distal demonstrative ILLE ‘that’ (denoting spatial distance between speaker and addressee), attested by the 6th–8th century AD (Herman 2000: 84 f.; Adams 2013: 506–527; Ledgeway 2016b: 765; 2017b, among others), which gradually loses two of its meaning components, namely spatial distance and individuation of an entity or a set of entities (Maiden 1995: 119–120). Instead, the property of shared cognition between interlocutors becomes prominent and ultimately the only relevant characteristic left, with continuants of Latin ILLE (originally both anaphoric and cataphoric) coming to mark anaphoricity and definiteness (Cat./Sp. el/la, Fr./Occ. le/la, It. il/la, Pt./SItR. o/a, Ro. -(u)l/-a).
In some Catalan varieties and Sardinian a reflex of the Latin intensifier IPSE (used only in anaphoric function) is found instead (Bal./Costa Brava Cat. es/sa and Sardinian su/sa, Ledgeway [2016a: 765] and references therein). Balearic Catalan also displays the grammaticalization of the descendant of Latin  + a reflex of Latin / ‘master/mistress’ (> en/na/n’), used with proper names, e.g., en Nicolau (Dols)/na Mercè Rodoreda, n’Eiximenis (Wheeler, Yates, and Dols 1999: 67–68; Ledgeway 2012: 99–100; 2017b: 847). It also shows the alternation of ILLE- and IPSE-based/derived paradigms, the latter known as article salat (Badia i Margarit 1995: 444–446; Ledgeway 2012: 89–96), characteristic of the spoken language and popular literature (Wheeler, Yates, and Dols 1999: 45). This retains deictic force, showing anaphoric and cataphoric reference, and is employed for definite, specific, [±given] referents, unlike ille-based forms, which occur with definite, generic and non-topical referents (tu éts sa dona que jo vaig veure? ‘Are you the lady that I saw?’ vs el Papa ‘the Pope’, la terra ‘the earth’, examples and discussion from Ledgeway [2016b: 766–767]).

Tab. 3: Continuants of the Latin numeral ‘one’ and distal demonstrative and intensifier.

| Indefinite | Definite | Definite | Definite | Definite |
|---|---|---|---|---|
| < Lat. UNUS/-A | < Lat. ILLE | < Lat. IPSE | < Lat.  + / | Alternation reflex of ILLE- (deictic) and IPSE-based paradigms (article salat, +deictic) |
| It., Cat., Fr., Sp., Pt., Ro. | Cat., Sp., Fr./Occ., It., Pt./SItR., Ro. | Srd., Bal./Costa Brava Cat. | Cat. | Cat., Bal.Cat. |

2.4.2 Demonstratives and possessives

Alongside the lexicalization of a dedicated position for (in)definiteness marking, manifested by the emergence of articles, other determiner-like elements such as demonstratives (§ 2.4.2.1) and possessives (§ 2.4.2.2), the latter having both an adjectival and pronominal status in Latin, gradually acquire a determiner function, occurring in a fixed position within the Nominal group (Ledgeway [2016b: 767]; Ledgeway and Smith [2016] for a full discussion).

2.4.2.1 Demonstratives

Thus, demonstratives – which in Romance only preserve the spatial-related dimension of the original Latin paradigm, losing their discourse-related function (realized, respectively, by , ,  vs , , ) (Giusti [2016: 550]; Maiden [1995: 115–117] for Italian) – gradually lose their adjectival status, grammaticalizing into determiner status, although to a different extent within Romance. In Catalan, Occitan, Romanian and Spanish, demonstratives retain their adjectival status, since two paradigms occur: one where the demonstrative is in complementary distribution


Tab. 4: Grammaticalization of demonstratives as determiner-like elements.

| Demonstratives > Determiner-like function | Languages |
|---|---|
| Prenominal position in complementary distribution with article: determiner-like function | Cat., Occ., Ro., Sp. |
| Postnominal position: canonical adjectival function | Cat., Occ., Ro., Sp. |
| Prenominal position only: determiner function only | It., Pt. |

with the article (Occ. aqueste (*lo) brave ome [this (*the) good man] ‘this good man’) and one where it preserves its adjectival status, occurring in its canonical postnominal position, with the prenominal determiner position filled by the article (Occ. lo brave ome aqueste lit. ‘the good man this’, examples and discussion from Ledgeway [2016b: 767]); see also the Romanian adjectival article cel, from the grammaticalization of ‘endophoric distal demonstrative’ acel / acela (< Lat. / + ), through Balkan influence (Nicolae 2013b: 308–318). In Italian and French, however, the demonstrative has fully grammaticalized into a determiner, since only the paradigm with its prenominal, determiner position is found (It. questo buon uomo vs *il questo buon uomo ‘this good man’). Latin demonstratives ILLE ‘that’ and/or IPSE ‘-self, the very’ (depending on the language/variety) also grammaticalize as third person pronouns in Romance, losing their adjectival, i.e., modifier function, while retaining their pronominal value, strengthening their lexical status and autonomy (Cappellaro 2016: 724 and references therein). More specifically, their semantic differentiation, whereby ille has both a distal and an anaphoric function, whereas IPSE conveys an emphatic/contrastive meaning (Vincent 1997b: 154), is lost, with IPSE losing its anaphoric-contrastive meaning, and their continuants alternating in the same contexts, as witnessed in early Italo-Romance (e.g., oFl., egli udì … [he listen...3] ‘He listened to …’ vs … esso … ritenne he [keep...3] ‘He … kept in his mind’ [Decameron, Day 1, nov. 6.6–7; Branca 1976, LIZ 4.0]) (Cappellaro 2016: 726) and merging into one suppletive paradigm in some languages/varieties, as in (formal) standard Italian for third person pronouns in subject function, e.g., egli ‘he’ ()/ella-essa ‘she’ (), essi ‘they’ ()/esse ‘they’ () (Cappellaro 2016: 725–726 and references therein).

2.4.2.2 Possessives

The grammaticalization of possessives involves the gradual change from adjectival to determiner function, with the loss of their pronominal function, and an intermediate possible stage involving a dual adjectival-determiner paradigm, formally distinct in Spanish and French (Fr. mon [clitic]/mien [tonic]; Sp. mi [clitic]/mío [tonic] vs It. mio, Ro. meu ‘my’ for the masculine 1 possessive, Lyons [1986]; Ledgeway [2016b: 767]), where they have a quirky adjectival status, with a narrow(er) distribution (Van Peteghem 2012). French and Brazilian Portuguese (Thomas 1969: 80; Teyssier 1984: 105) instantiate a more advanced stage in the grammaticalization process, since possessives have lost their adjectival function,4 having only a clitic determiner paradigm (pre/postnominal in several Romance languages [BPt. nosso vizinho ‘our neighbour’], but postnominal in Romanian, Ledgeway [2016b: 767]). Italian, Asturian and European Portuguese lie at the bottom of the cline, in that possessives are mainly adjectival, therefore co-occurring with a determiner (It. il mio amico ‘my friend’ vs *mio amico5), although displaying also a clitic determiner position (It. mio padre ‘my father’ vs *il mio padre), generally confined to singular, unmodified kinship terms (*miei padri ‘my fathers’, Ledgeway [2016b: 767]). Spanish appears to be intermediate on the cline, with a prenominal determiner possessive paradigm and a postnominal adjectival one, whilst Romanian exhibits a different pattern, with postnominal possessives (carte-a mea ‘book-the mine’ ‘my book’) and the ‘possessive or genitival article’ al (>  ‘to, at’ >  ‘of’ +  ‘that’), a ‘prepositional determiner’ (Ledgeway 2012: 116), displaying number and gender agreement with the possessee (noile cărţi ale profesorului new.. book.. al.. professor... ‘The professor’s new books’ [Stan 2013: 265–267] and references therein).

Tab. 5: Distribution and grammaticalization path of possessives.

| Grammaticalization cline | Possessives | (Dual adjectival-determiner paradigms) > | Adjectival > | Determiner |
|---|---|---|---|---|
| Paradigms and varieties | | Adjectival paradigm: tonic, postnominal (Sp., Ro.); tonic, pre/postnominal (Occ., Cat.). Determiner paradigm (singular persons): clitic, prenominal (except in Ro.). Postnominal possessive; possessive article (Ro.) | Adjectival paradigm only: Ast., It., EPt. (Determiner-like function of adjectival possessive with singular, unmodified kinship terms: Ast., It., EPt.) | Clitic paradigm only (pre/postnominal): Fr., BPt. |
| Stage | Initial stage (loss of pronominal function) | Intermediate stage (probably characteristic of early stages of Romance) | Further stage | More advanced stage |

4 In French, for instance, the strong possessive form does not have an adjectival distribution (Van Peteghem 2012: 631).
5 The pattern, however, is possible in some constructions, e.g., with equative clauses, in predicative function: Marco è mio amico Mark be..3 my friend (lit.) ‘Mark is my friend’ (I thank Martin Maiden for raising the issue).


2.5 Case

With the exception of Romanian (where a nominative/accusative vs genitive/dative distinction is retained with feminine singular nouns and adjectives, Dragomirescu and Nicolae [2016: 913] and references therein), the Romance languages do not have morphological case for nouns. They only retain minimal case distinctions in the pronominal system (and with determiners in some languages), most typically opposing nominative/accusative in tonic first and second person singular pronouns (It. io ‘I’, tu ‘you’/me ‘I’, te ‘you’, Sp. yo ‘I’, tú ‘you’/mí ‘I’, ti ‘you’, with dative in accusative function), as well as first and second person plural in Romanian. Catalan keeps the distinction for the first person only, while there is a wide dialectal variation in Italo-Romance, where one also finds a three-way distinction (e.g., Avigliano, Basilicata), which also occurs in other Romance varieties, distinguishing nominative, dative and accusative after prepositions (Ro. eu ‘I’, tu ‘you’/mie ‘to me’, ție ‘to you’/mine ‘me.’, tine ‘you.’). Third person singular pronouns in Romanian display the nominative/accusative vs genitive/dative distinction with nouns, also attested in old Italian (Dragomirescu and Nicolae [2016: 915] for discussion and references). In modern Italian, case distinctions are difficult to detect in the third person, the same forms being used as nominative and oblique in the singular (lui, esso ‘he’; lei, essa ‘she’), while other forms (egli ‘he’, ella ‘she’) occur only as nominative (Salvi and Vanelli 2004: 192). Within the pronominal system, several varieties show the grammaticalization of the Latin comitative forms MECUM (ME ‘I.’ + CUM ‘with’ ‘with me’), TECUM (TE ‘you.’ + CUM ‘with’ ‘with you’), SECUM (SE ‘he.’ + CUM ‘with’ ‘with him(self)’ [± anaphoric]), NOBISCUM (NOBIS ‘we.’ + CUM ‘with’ ‘with us’), VOBISCUM (VOBIS ‘you.’ + CUM ‘with’ ‘with you’).
The original Latin sequence of preposition (CUM ‘with’) enclitic to the pronoun lost its original meaning (i.e., it semantically bleached) and, coalescing with it, was reanalysed as an allomorph of the pronoun, so that a new preposition (e.g., Neap. co ‘with’) is now prefixed to the complex pronoun in comitative function (Neap. co mmico, Sp./Pt. conmigo/comigo, ‘with me’, southern Lazio dialects kot'tiku ‘with you’, ko n'nosco ‘with us’), the third person forms losing the reflexive interpretation (e.g., Tsc. stetti seco stay..1 .with ‘I stayed with him’, Maiden [1995: 170–171; 194: note 66]). Grammaticalization of the possessive pronoun (possessive enclitics) appears in several varieties, restricted to kinship terms (cf. southern Italo-Romance sorema ‘sister-mine’, Ro. maică-sa ‘mother-his/her’). The Romanian forms show a more advanced grammaticalization stage, with the (strong) enclitic possessive having an affixal status and as such being followed by case inflexional endings, e.g., sor(ă)-mea [-] ‘sister-my’, sor(ă)-mii [-], sor(ă)-meo [] (Nicolae 2013a: 341 f.; Dragomirescu and Nicolae 2016: 915). Indeed, the enclitic possessive pronoun in these patterns appears to be an “agreement inflection on the noun for the person features of its argument” (Giusti 2016: 544).


A minimal case-system is also found with object clitics – a category that did not exist in Latin and a major innovation in Romance syntax, derived from Latin weak pronouns (Salvi 2011: 326–327 and references therein) – displaying accusative and dative marking, for direct and indirect object functions, respectively, with different ordering according to the language/variety, clustering into one form in Italian (1a):

(1) a. (It.) glielo ho dato
             he..it.. have.1 give....
    b. (Fr.) Je le lui ai donné
             I it. he. have.1 give....
             ‘I gave it to him’

Some Romance accusative/dative clitic pronouns result from the grammaticalization of locative adverbs, still functioning as such, e.g., It. (accusative/dative) 1 ci ‘(to) us/ourselves’ < Lat.  ‘hence’ (Maiden 1995: 167), Fr. y ‘there, to it’, Occ. i ‘there/him (dat.)’ (< Lat. IBI ‘there’ or HIC ‘here’, Oliviéri and Sauzet [2016: 331]), It. (accusative/dative 2.) vi ‘(to) you/yourselves’ < IBI ‘there’ (Pescarini 2016: 742–743), alongside their grammaticalized uses, involving also their proform function in existential/presentative sentences (northern Italo-Romance gh(e) < j <  ‘here’/ // ‘there, therein, in that place’, Benincà [2007], Log./Nuo. bi, It. vi < ibi, Cat. hi < , Badia i Margarit [1951: 266], Fr. y, Prv. i < , , southern and northern Italo-Romance nd(i), ne < INDE ‘thence’ Blasco Ferrer [2003: 61]; Bentley [2013]; Bentley and Ciconte [2016: 855]). The continuants of Latin INDE ‘thence’ have also acquired a partitive/genitive function in Italian (Salvi and Vanelli 2004: 201), French and Catalan (Fr./Cat. en, It. ne, Dragomirescu and Nicolae [2016: 921]), replacing an indefinite or quantificational O (2):

(2) a. (It.) Vuoi del pane? Sì Ne voglio
             want..2 some bread yes of-it want..1
             ‘Would you like some bread? Yes, please, I would like some’
    b. (It.) ne vidi cinque (sc. ragazzi)
             of-them see..1 five (boys)
             ‘I saw five of them’

Remnants of ablative forms occur in the (western) Romance descendants from the grammaticalization of ablative nouns such as Latin MENTE ‘(with a) mind’ < MENS ‘mind’ (unattested in Romanian), which has turned into a derivational affix, combining with adjectives (Maiden [1995: 93–94] for Italian; Detges [2015] for a pan-Romance perspective; also Bauer [2003], Norde [2009: 41–47] for a critical assessment of the diachrony of the change). The change involves the gradual bleaching of the original meaning of the word ‘mind’, coming to denote ‘manner’, through metonymic inferencing, along the path ‘mental state of the participant in the event’ > ‘way in which the event is perceived’ > ‘manner in which the event takes place’ (Norde 2009: 43–44; Detges 2015) and shows different paths and degrees of grammaticalization (see Norde [2009: 41–46] and references therein for the theoretical implications of the divergent developments of this suffix in Romance). Traces of the ablative also occur in prepositions, continuants of Latin nouns and present participles in Italian (e.g., senza ‘without’ < Lat. ABSENTIA ‘in the absence’; durante ‘during’ < Lat. DURANTE ‘(while) lasting’ < DURARE ‘harden, endure, last’; nonostante ‘despite’ < NON OBSTANTE ‘not withstanding’ < Lat. OBSTARE ‘stand before/against; be opposite’; ora ‘now’ < HORA ‘at (the) hour’, Maiden [1995: 98]).

2.5.1 Adpositions

With the demise of the case system, several functions originally conveyed by case in Latin come to be realized by means of prepositions. These, although continuing to occur in their original function, also develop grammaticalized uses involving different degrees and types of semantic bleaching and reanalysis. This is the case with the continuants of the Latin prepositions AD ‘motion towards’, DE ‘(down) from’ (on whose diachrony see also Vincent’s [2017: 293–296] recent, insightful discussion), PER ‘through/by’, CUM ‘with’. The grammaticalization of the Latin preposition DE ‘of, (down) from’ as a marker of objective genitive in Romance (Cat. les flors de aquesta noia ‘the flowers of this girl’) is a slow and gradual process, detectable in early medieval Latin texts and attested in early Romance (e.g., OFr. and ORo.), where, however, one also finds the descendant of Latin AD ‘to(wards)’, spreading from dative to genitival function, occurring in old French (la chambre a la pucele lit. ‘the chamber to the maiden’, ‘the maiden’s chamber’), and still vital. For instance, it is very productive in the adnominal possession construction in contemporary colloquial French (la tête/la femme à Jean lit. ‘the head/woman to John’, ‘John’s head/woman’), Istro-Romanian (a le mil’ere ‘to the woman’, ‘of the woman’), Aromanian and old Romanian (urdirea a lumiei [making. to world..] ‘the making of the world’, Dragomirescu and Nicolae [2016: 915]), and contemporary Romanian, which displays gender and number agreement with the possessor for this invariable genitival marker (al, a, ai, ale < Lat. AD + suffixal definite article), used for alienable possession (caietul de matematică al fetei [notebook.. of mathematics al. girl..] ‘the girl’s mathematics exercise book’ Dragomirescu and Nicolae [2016: 918]).
The grammaticalization path of the continuants of Latin DE (di in Italian) also involves the development of a partitive and pseudo-partitive function (the latter


being non-referential), respectively with a definite article (Sp. dos de los estudiantes ‘two of the students’, Pt. dois dos estudiantes < de ‘of’ + los/os ‘the..’ [with fusion of de with the article], and a bare noun, It. un bicchiere di vino ‘a glass of wine’). In Romanian, the distinction has been grammaticalized through the use of two distinct prepositions: (i) the partitive construction, involving de + the genitival marker al (un prieten de-al Mariei, lit. ‘a friend of Maria’), alongside din (< de ‘of’ + în)/dintre (< de ‘of’ + între ‘between’) (doi din/dintre studenți ‘two of the students’), (ii) the pseudo-partitive pattern, involving the use of de and cu ‘with’ (un pahar cu vin lit. ‘a glass with wine’ ‘a glass of wine’ Dragomirescu and Nicolae [2016: 918]). In French and Italian de/di gradually come to mark an ‘indefinite quantity’, ultimately merging with the definite article, fully grammaticalizing into a marker of indefiniteness, i.e., into a determiner function (the so-called partitive article, lacking in Romanian, Portuguese and Spanish, where, however, it is well attested until the fifteenth century, Carlier and Lamiroy [2014]; Luraghi and Kittilä [2014: 23–24, 51–52]). French instantiates the highest degree of grammaticalization, the partitive construction being obligatory (Pierre mange du pain [Peter eat..3  bread] ‘Peter eats bread’, Carlier [2007]), unlike Italian, where the pattern is optional, its distribution reflecting regional and register variation (Luraghi 2017) (preferisco delle mele/mele prefer.1 of.the apples/apples ‘I prefer some apples/apples’, Carlier and Lamiroy [2014]). As for the dative, apart from clitic pronouns and the Romanian inflectional dative illustrated above, the Romance languages employ the continuants of Latin AD ‘to(wards)’, with the exception of Romanian la ‘to’ (< Lat. ILLAC ‘there’ + AD), the ‘reinforced variant’ of the preposition a in old Romanian, spreading from adverbial phrases to mark indirect objects, mainly in contexts where it could have a beneficiary function/interpretation (Nedelcu 2013: 454–455), e.g., Ro. la un fiĉór to a boy ‘to a boy’ (non-standard, dialectal form, Martin Maiden, p.c.), Marche and Romagnol ma ‘to(wards)’ < Lat. IN MEDIO AD ‘in middle to’, Portuguese para ‘for, towards’ < Lat. PER ‘through’ + AD ‘to, toward’, widely used in contemporary colloquial Brazilian Portuguese to mark indirect objects (Whitlam 2011: 59; 181–182; also Dragomirescu and Nicolae 2016: 919), Neapolitan vicino a ‘close to’ (Ledgeway 2009: 849 f.). Thus, the continuants of Latin AD in Romance are grammaticalized as dative markers, with other prepositions being pressed into this function as well in some languages/varieties. The preposition a ‘to(wards)’ is also reanalysed as a marker of non-canonical Os, i.e., animate, human, highly individuated, topical arguments (A), differentiating A from O, once case marking no longer differentiates the core arguments of the clause, after the collapse of the case system, with the use of the accusative in A function, and the merging of the nominative with oblique cases owing to the loss of vowel quantity and loss of final consonants (Maiden 2011: 155f; Sornicola 2011: 9). The phenomenon – known as prepositional accusative/D(ifferential) O(bject) M(arking) – is well attested in the Romance languages (with Romanian grammaticalizing


however the preposition pe ‘on’ (< Lat. PER ‘through’) into this function). More specifically, it is found in Ibero-Romance (e.g., Spanish) – although restricted to pronouns (strong personal pronouns and reciprocals) and banned with personal direct objects in Catalan (Escandell-Vidal 2009: 837 f.) – central (Loporcaro and Paciaroni 2011: 273 f.), southern Italian dialects, where it also occurs with inanimate Os (e.g., Aliano, Basilicata, Manzini and Savoia [2005: II: 508, 515]; Ledgeway [2016a: 268, 2018] and further references therein), marginally also in some northern Italian varieties (e.g., Trieste, Genova, Rohlfs [1969: § 633]; Berretta [1989]) with traces also in some early northern varieties such as Ligurian (Parry 2003), or Romanian (Ledgeway [2011b: 436]; Mardale [2008, 2015]; Onea and Mardale [2018] for a diachronic study). In some Latin American varieties (e.g., Puerto Rican and Argentinian Spanish) specificity and definiteness are the main parameters involved, overriding animacy (cosecharon al maíz [harvest..3 a. maize] ‘they harvested the maize’, Dragomirescu and Nicolae [2016: 921] and references therein). This pattern can be regarded as the last stage in the grammaticalization process of the preposition a, one where the preposition is semantically bleached and acquires an affix-like status, since it is reanalysed as a case-marker (cf. also Bossong [1998] for an overview in the European languages).
Alongside parameters related to the inherent properties of O, pragmatic factors such as topicality have also been pointed out, especially for the initial stage of the grammaticalization process (Iemmolo 2010; Dalrymple and Nikolaeva 2011), instantiated in spoken regional varieties of French, where the differential marking of O is optional and pragmatically determined, occurring for contrast/emphasis, most typically associated with right/left dislocation (Iemmolo 2011: 257), although also reflecting the lexico-aspectual properties of verbs and the inherent properties of the O argument (Fagard and Mardale [2014], also for a corpus-based investigation and further references therein). Adpositions may also result from the grammaticalization of lexical items, like French chez ‘at (the house of), in (the works of)’, deriving from Latin () ‘house’, involving the categorial reanalysis of a noun as a (locative) preposition, resulting from (i) a lexical change, the loss of the (OFr.) allomorph chiese, (ii) an irregular phonological development (() > *cas > chies > chiez), (iii) a categorial shift (N > P) and (iv) the desemanticization of the word ‘home’ that acquires a ‘generalized and abstract location meaning’ (Longobardi [2001: 276–277]; Fagard and Mardale [2012: 311]; also discussion in Fuß and Trips [2004: 18–19]). In Catalan, the grammaticalization of the continuant of Lat. CASA ‘house’ as a preposition is still under way, with the word ca (< casa de ‘house of’), (and its variants cal(s) contraction with the masculine definite articles, e.g., cal dentist ‘the dentist’s’ Wheeler, Yates, and Dols [1999: 44–45]), retaining its categorial and semantic features, co-occurring with a directional preposition when used for directed motion and only allowing human complements (e.g., He anat a cal metge [have...1 go.. to home.the doctor] ‘I went to the doctor’s’), unlike French chez (Aujourd’hui on parlera de la tuberculose chez les animaux ‘Today we will talk about tuberculosis in animals’) (Lamiroy and Pineda 2018: 308).


3 Grammaticalization of verbal categories

3.1 Voice/valency

The Romance languages have a rich and complex system of voice and valency changing operations and alternations, pivoted on the reflexive morpheme SE and on a number of verbal periphrases, in passive(-like), light verb and impersonal(-like) functions, resulting from the reanalysis of the continuants of the Latin reflexive morpheme SE (acc./abl.)/SIBI (dat.) (§ 3.1.1), the grammaticalization of lexical verbs denoting (a) activity (e.g.,  ‘do’,  ‘entrust > command’,  ‘let’), (b) state ( ‘remain’,  ‘stand’,  ‘sit’), (c) change of state, ( ‘become’), (d) change of location ( ‘come’, ,  ‘go, walk’), as well as (e) modals (* ‘want’ < ,  ‘must’) as voice markers [§ 3.1.2], and of nominals/personal pronouns [±clitic] as impersonal/indefinite markers and dummy pivot holders [§ 3.1.3].

3.1.1 Reflexive constructions

Alongside its reflexive-middle function (cf. Fr. le prisonnier s’est tué, [the prisoner  is kill..] ‘The prisoner killed himself’, Pt. Carolina se arrependeu [Carolina  repent..3] ‘Carolina repented’), in Spanish (Martín Zorraquino 1979: 27 n. 15; Mendikoetxea 1999a), insular (Madeira and Porto Santo) and central-southern continental European Portuguese (Martins 2009: 196), Romanian (Pană Dindelegan 2013a: 110) and some southern Italo-Romance varieties (Cennamo 1998: 83), SE and its variants – the accusative form in Romanian (e.g., mi s-a urât de singurătate [..1 ..have.3 get-fed-up. . of loneliness] ‘I got fed up with loneliness’ Pană Dindelegan [2013a: 110]) and the first person plural reflexive pronoun ci ‘us’ and its variants in some central-southern Italian dialects (Cennamo 1997a: 158) – have been reanalysed as a marker of the lack of control of the subject with both divalent (Sp. se me ha olvidado la cartera [ I. have..3 forgot.... the bag] ‘I forgot my bag’) and monovalent (Sp. se murió vs *se murió asesinado [ die..3  died killed] ‘he died’) verbs. This pattern reflects the late Latin grammaticalization of the dative reflexive SIBI as an unaccusative marker (Cennamo 1999), in so-called pleonastic uses, occurring mainly with verbs of telic change of state/location, states (most typically predicates denoting location and relation) and a Patient/Theme subject (Cennamo 1999, 2000: 45), a usage well attested in several Romance languages/varieties also with indefinite change verbs (cf. Pt. já se aconteceu [already  happen..3] ‘it has already happened’, Ro. a se întâmpla [ happen.] ‘happen’ Dragomirescu and Nicolae [2017: 409]; Cennamo [2016: 971]). It is also found in Istrian dialects with monovalent/divalent verbs (e.g., me se dorme/magna [I.  sleep..3/eat.3] ‘I


Michela Cennamo

sleep/eat’), apparently to denote either necessity or wishing (Cennamo 1998: 83 and references therein), where, however, the structure might also reflect Slavic influence (Fici Giusti 1994: 169–74). By contrast, in all Romance languages SE/(SIBI in some southern Italian dialects) has been reanalysed as a marker of the passive voice (It. la casa si vende [the house  sell..3] ‘the house is (being) sold/is on sale’; Fr. ce poisson se mange cru [this fish  eat...3 raw] ‘This fish is eaten raw’), except some Grisons varieties (Manzini and Savoia 2005: 57), albeit occurring with different degrees of productivity and constraints (cf. Table 6) (Cennamo 2016 and references therein). The reinterpretation of SE as an impersonal/indefinite marker with monovalent verbs (e.g., It. si parte all’alba [ leave..3 at dawn] ‘one/we (will) leave at dawn’) (cf. the occurrence of ci in the impersonal of reflexives in Italian, as in ci si pentì [one/we  repent..3] ‘one; we repented’), on the other hand, is not attested everywhere, and not in the same range of constructions, reflecting differences in referential scope and its degree of grammaticalization (Cennamo 2014). For instance, it is absent in contemporary French and in several Italian dialects (Cennamo 2016: 973). More specifically, the varieties where SE does not have an existential inclusive interpretation show a more limited use of this morpheme as a marker of impersonality, preferring other strategies: the indefinite (weak) subject pronoun ‘man’ (e.g., Fr. on, Cat. (un) hom, Abr. nome), the collective NP ‘the people’ (It. la gente) and the indefinite third person plural, e.g., Neapolitan (Cennamo 1997a, 2014, 2016: 973, 879–880; cf. also § 3.1.3). 
In Florentine, other Tuscan varieties (e.g., Sienese, Viareggino) and in some Umbrian (Città di Castello, Amelia) and Marchigiano (Arsoli) dialects, SE has been further grammaticalized, displaying affixal status, with the sequence si + third person singular active replacing the traditional first person plural synthetic verb form (e.g., si va [ go..3 = andiamo go.1] ‘we go’, Cennamo [1997a: 158, 1998: 82, 2014: 82–83, 2016: 974]; Manzini and Savoia [2005: ch. 4]). In French, passive se only occurs in the sequence S(ubject) se V (beaucoup de livres se vendent dans cette ville [a lot of books  sell..3 in this city] ‘many books are sold in this city’). If S conveys new information, an impersonal pattern is employed, with the subject filler il ‘he’, due to the non-null subject status of the language, and the verb in the non-agreeing third person singular form (il se vend beaucoup de livres dans cette ville). This structure also obtains in most northern Italian dialects, where the pragmatic given/new distinction is grammaticalized, and the S se V order is ungrammatical if S conveys new information, the pattern thus becoming impersonal, optionally occurring with a non-agreeing subject clitic in some varieties (Cennamo 1997a; 2014; 2016: 972; Manzini and Savoia 2005, vol. II: 17–19). European Portuguese displays an unusual pattern, where impersonal se is doubled by the indefinite pronominal expression a gente (lit. ‘the people’) which has been grammaticalized as a first person plural pronoun, also in Brazilian Portuguese (Martellotta and Cezario [2011: 729]; Duarte, Kato, and Barbosa [2003] for an overview), showing variable word order and agreement patterns (third singular, first


Tab. 6: Passive and impersonal reflexives.

reflexive/middle: Pan-Romance
anticausative: Pan-Romance
passive: + It., Sp., EPt., Fr.; – BPt., some NIDs, some Grisons varieties
impersonal (indefinite): + It., NIDs, EPt./BPt., Sp., Ro.; – Fr., SIDs
impersonal (Ø argument/affix): + It., Sp., Ro., (some) SIDs; affix: Fl., some CIDs

plural, third plural) as well as inclusive/exclusive interpretation of se, depending on the nature of the doubling subject (Martins 2005, 2009), e.g., Não sabem o que a gente se passámos aí [not know.3 the what we = went.through.1 there] ‘You do not know what we have been through’, a pattern also marginally attested in Brazilian Portuguese (Cyrino, p.c.). SE is also widely used as a marker of anticausativization (continuing a Latin pattern), signalling a suppressed causer, most typically associated with telic predicates (achievements and accomplishments) lexicalizing a final state/endpoint (It. rompere/Fr. se briser ‘break’, EPt. rasgarse ‘tear’). Variation in the presence, optionality and lack of SE reflects (i) the aspectual specification of verbs – whereby for instance degree achievements occur in the active intransitive form in Italian, European Portuguese and Spanish (EPt. aumentar, It. aumentare, Sp. aumentar) [Duarte 2003; Legendre and Smolensky 2009; Kailuweit 2011; Cennamo 2016: 971 and references therein] – (ii) the conceptualization of a situation as an event or a result state, and (iii) thematic notions such as internal/external causation (respectively [–] and [+]), depending on the language/variety. These parameters interact with varying diachronic trends, such as the loss of SE and the use of the active intransitive as the sole anticausativization strategy in Brazilian Portuguese, resulting from the demise of SE from the reflexive/middle and passive domains (Cyrino 2007, 2013; Cennamo 2016: 971 and references therein). The variety of uses and functions of the morpheme SE in Romance and the grammaticalization and reanalysis paths involved can be neatly summarised through the semantic map approach, along the lines proposed by Haspelmath (2003), adapted in Table 6.
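As an illustrative aside, the distribution summarised in Table 6 can be held in a small lookup structure and queried by variety. The sketch below is only a transcription of the table under stated assumptions: the names SE_FUNCTIONS and functions_of are hypothetical, the cell "EPt/BPt" is split into two labels, the affixal varieties (Fl., some CIDs) are left out, and "Pan-Romance" is treated as covering every variety.

```python
# Hypothetical encoding of Table 6; labels are the table's own abbreviations.
# "+" lists varieties where the function is attested, "-" where it is not.
SE_FUNCTIONS = {
    "reflexive/middle": {"+": ["Pan-Romance"], "-": []},
    "anticausative": {"+": ["Pan-Romance"], "-": []},
    "passive": {
        "+": ["It.", "Sp.", "EPt.", "Fr."],
        "-": ["BPt.", "some NIDs", "some Grisons varieties"],
    },
    "impersonal (indefinite)": {
        "+": ["It.", "NIDs", "EPt.", "BPt.", "Sp.", "Ro."],
        "-": ["Fr.", "SIDs"],
    },
    "impersonal (Ø argument/affix)": {
        "+": ["It.", "Sp.", "Ro.", "(some) SIDs"],
        "-": [],
    },
}


def functions_of(variety):
    """Return the functions for which the table marks `variety` with '+'.

    Assumption: a 'Pan-Romance' entry applies to every variety.
    """
    result = []
    for function, cells in SE_FUNCTIONS.items():
        attested = cells["+"]
        if "Pan-Romance" in attested or variety in attested:
            result.append(function)
    return result


if __name__ == "__main__":
    for label in ("It.", "Fr.", "Ro."):
        print(label, functions_of(label))
```

Read this way, Italian covers all five functions, whereas French stops at the passive, which matches the statement in the text that impersonal SE is absent in contemporary French.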

3.1.2 Passive and impersonal periphrases

The grammaticalization of lexical verbs as voice markers in contemporary Romance includes the continuants of Latin VENIRE ‘come’, generally used for dynamic passives (e.g., It. La città venne rasa al suolo (dal nemico) [the town be..3 destroy.... by the enemy] ‘The town was destroyed (by the enemy)’, Green [1982: 118 f. 133, n. 25]; Giacalone Ramat [2000: 9]; Giacalone Ramat and Sansò [2014]) and with a weak deontic function (i.e., necessity) in Romanian, where it is found in some southern varieties (south-eastern area of Buzău and southern Oltenia), as well as in Istro-Romanian – where it is viewed as resulting from direct contact with Italian (Dragomirescu and Nicolae 2014) – and only occurring with some aspectual classes of verbs, accomplishments and, marginally, achievements (e.g., celălalt bec vine slăbit [the.other light-bulb come..3 loosen....] ‘the other light bulb gets loosened’ Dragomirescu [2013: 169]; Dragomirescu and Nicolae [2014: 72]). In Ræto-Romance ‘come’ is the canonical passive auxiliary for dynamic passives (Haiman 1988: 364), alternating with ‘be’ in compound tenses in the Engadine dialects (e.g., ε sun ny/stat klam-a lit. ‘I am come ./be. . called’), with aspectual differences in Latin, where ‘be’ occurs with a resultative stative meaning, and ‘come’ when the action is in progress (Haiman and Benincà 1992: 108). In some Italian varieties (e.g., the Pugliese dialect of Volturino), passive ‘come’ also occurs in compound tenses, conveying an aspectual difference, signalling a recent past, as opposed to the ‘be’ auxiliary, which denotes remote past (e.g., è mmenute cundannete [...3 come... sentence..] ‘He has just been sentenced’ vs è state cundannete ‘He has been sentenced’, Cennamo [1997a: 149, 2016: 975–976]). 
‘Go’, the continuant of Latin IRE, AMBULARE ‘go, walk’, is well-attested as a passive auxiliary in Spanish (although only in non-compound tenses, Green [1982: 114, 123]; Yllera [1999: 3432]), Italian (il palazzo è andato distrutto [the building be...3 go.... destroy....] ‘The building has been destroyed.’) and the dialects (Cennamo 1997a: 151; Sansò and Giacalone Ramat 2016), also with a deontic value (la casa va abbattuta [the house goes demolish....] ‘the house must be demolished’). Unlike the auxiliary ‘come’, which can occur with all aspectual classes and in all persons, in Italian passive ‘go’ is confined to a few accomplishments (mainly verbs of loss/destruction) and to the third person (Giacalone Ramat 2000: 129), and may occur in all tenses. In its deontic passive function, on the other hand, Italian ‘go’ occurs with all aspectual verb classes, although only in some tenses (present, future, past imperfect, and conditional, as in [10c], and also Giacalone Ramat [2000: 133]; Sansò and Giacalone Ramat [2016]; Cennamo [2016: 976], for further discussion). The grammaticalization of the continuant of Latin STARE ‘stand’ as a passive auxiliary, in the pattern STARE + past participle, is found not only for compound tenses of the passive (and overt expression of the agent) in Spanish with activity verbs (la asociación está dirigida por un grupo de amigos lit. ‘the association is managed by a group of friends’, ‘the association is run by a group of friends’, Yllera [1999: 3419 f.]), while being ungrammatical with stative psych-verbs (*está temido/amado/odiado ‘he is feared/loved/hated’ [Yllera 1999: 3430]), but also to mark resultative stative passives, as witnessed in Spanish, Catalan, Portuguese, and some southern Italian varieties (Pountain 1982; Ledgeway 2000: 230). Other passive-like


periphrases involve the grammaticalization of the verbs ‘remain’ (It. rimanere/restare, Sp. quedar; Mendikoetxea [1999b: 1625]; Giacalone Ramat [2000]), light verb uses of ‘become/come’ (Pt. ficar ‘be(come), remain’ < *figikare ‘fix’; Parkinson [1988: 162]; Whitlam [2011: 127]) (cinco pessoas ficaram feridas [five people became injure....] ‘five people were injured’), It. venire ‘come’ (il pavimento viene/è venuto pulito con quel detersivo (*da Marco) [the floor comes is come. ... cleaned.... with that cleaning.product by Marco] ‘the floor is/was cleaned with that cleaning product (by Marco)’, Rosen [1991]; La Fauci [2000]; Cennamo [2007]). The grammaticalization chain for these verbs involves the initial equivalence to the copula ESSE ‘be’ (already attested in CL for ‘come’ and ‘go’), i.e., copula expansion (Dik [1987]; Cennamo [2005] for Late Latin) and a subsequent change in the aspectual class of the participial complement in the construction, from accomplishments/achievements (and ambiguity between anticausative/passive reading) to activities (and ensuing passive interpretation only) (e.g., caro cocta fit [meat.. cook..... become..3] vs fiant gubernatas [become...3 govern....] ‘That they be governed …’, Michaelis [1998]; Cennamo [2006, 2016: 969–970]). The verb ‘see’ (< Lat. VIDERE ‘see, perceive’) also occurs in passive-like auxiliary/light verb constructions, in conjunction with the reflexive morpheme SE, followed by the past participle of a transitive or ditransitive verb, and subjectivization of the IO/Beneficiary in the latter case (It. Marco si vide offerta la direzione del reparto dal direttore generale [Mark  see..3 offer.... the direction of the ward by the general manager] ‘Mark was offered (lit. saw himself offered) the direction of the ward by the general manager’). 
The verb ‘see’ is desemanticised; the structure conveys the (pragmatic) nuances of adverseness and unexpectedness and instantiates subjectification in its extension to inanimate subjects (Lehmann, Pinto de Lima, and Soares 2010; Giacalone Ramat 2017). The pattern is found in all the standard Romance languages, albeit with varying degrees of grammaticalization, lower in Catalan (Bartra Kaufmann 2002) and Romanian (apparently occurring in the latter mainly in the written language, Giacalone Ramat [2017: 173]), higher in French where it is a recent and controversial development, occurring with animate, human subjects and with inanimate ones with abstract meaning only (Bat-Zeev Shyldkrot 1981; Giacalone Ramat 2017: 170–171). The grammaticalization of the pattern is most advanced in Spanish – where it is attested also with inanimate, concrete subjects (Sp. los discos … no se verán afectados por la liquidación de Galerías [the disks and the videos not  see..3 affect.... by the liquidation of Galleries] ‘Disks … will not be affected by the liquidation of Galleries’ Yllera [1999: 3431]) – and Portuguese, where the construction – well in use as early as the 14th c. (Lehmann, Pinto de Lima, and Soares 2010) and initially confined to animate, human subjects – has since the 20th c. also been found, albeit rarely, with inanimate subjects (Pt. “esta enorme prosperidade viu-se comprometida pela decadência do império romano” (20th c. Encyclopedia, CPMD) [this enormous prosperity see..3- endanger.... by-the decadence of the Roman Empire] ‘This enormous prosperity found itself endangered by the decadence of the Roman Empire’; Lehmann, Pinto de Lima, and Soares [2010] and discussion of the gradual spread of the construction from the high to the low end of the Individuation hierarchy). In Italian, the structure, already attested in 14th c. Florentine texts, only occurs with animate, most typically human subjects (see Giacalone Ramat 2017: 170–173). Passive-like ‘see’ alternates between the past participle of the lexical verb and the infinitive in Italian, French, and Portuguese, the patterns conveying an aspectual difference – perfective vs imperfective event, respectively – (It. il giovane si vide assalito/assalire dalla folla [the young-man  see..3 assault..../ by the crowd] ‘The young man found himself/was assaulted by the crowd’; Giacalone Ramat [2017]; Lehmann, Pinto de Lima, and Soares [2010] for Portuguese). Romanian and several southern Italian dialects/regional varieties of Italian also display the grammaticalization of the modal auxiliaries ‘must’ and ‘want’ as (noncanonical) passive markers, respectively Romanian a trebui ‘be necessary’, in the default third person singular form followed either by a past participle (trebuie căutată altă explicaţie [must..3 searched.. other.. explanation.] ‘we have to look for another explanation’) or by a form conventionally labelled as supine (trebuie reacţionat cu calm [must..3 react. with calm] ‘we must react calmly’, Maiden [2013: 511–518]) in a deontic function only (Neamţu 1986: 155 f.), the agreeing 3 pattern being a later development (Dragomirescu 2013: 198). A similar type of grammaticalization is found in several southern Italo-Romance dialects, regional varieties of Italian, and Sardinian with the modal auxiliary ‘want’ with deontic function (Sal. 
lu pisce ulia mangiatu stammane [the fish want..3 eaten this.morning] ‘the fish should have been eaten this morning’, Cennamo [1997a: 151]; Ledgeway [2000: 236–81, 2009: 669–71] and references therein). Nuorese Sardinian and Sicilian also exhibit an unusual grammaticalized pattern, a passive-like resultative construction consisting of the sequence ‘be’ + ‘without’ + past participle, with subject agreement and lack of an overt Agent (cf. Nuo. sa petha est kene mandicata (**dae sos pitzinnos) lit. ‘this. meat. be..3 without eat. ... by the children (= the meat has not been eaten (by the children)’; Jones [1993: 125]; Leone [1995: 46]). The typologically unusual grammaticalization of the verb of possession ‘have’ as a passive marker, followed by the past participle of the lexical verb, occurring both in simple and compound tenses, is found in some southern Italian dialects from Puglia (e.g., Altamuran, Minervino Murge) and Lucania (e.g., Irsina, Tolve), alongside ‘be’ and ‘come’ passives. The passive and the active forms of ‘have’ may differ (e.g., Irsina 3 γávenə (passive) vs. annə, Loporcaro [2012b: 179]) and may also be in free alternation with the ‘be’ pattern for dynamic and resultative passives, as in Altamuran, where they are restricted to a human O, and A is optionally expressed (e.g., dʒω'wann e 'stεt vist/a a'wωot vist [John be..3 be..


see../have..3 have.. see.] ‘John has been seen’), occurring also with inanimate subjects in the present, where ‘be’ is banned (l'arve jεv abːrʊšɛi ̯t [the tree have...3 burn..] ‘The tree is being burnt’, Loporcaro [1988: 254–257, 2012b: 179–180]). ‘Have’ as a passive auxiliary is the last step in the grammaticalization of lexical ‘have’ as a passive marker, with an optional intermediate structure, from which passive ‘have’ might have developed, instantiated by resultative, passive-like ‘have’ patterns formed on trivalent verbs with subjectivization of the IO (e.g., Neap.-It. Marco ha avuto regalato/a una bicicletta [Mark have...3 have.... give..../. a bicycle..] lit. ‘Marco has had given a bicycle’). This pattern is characterized by a lower degree of grammaticalization, in that ‘have’ is not fully decategorialized and semantically bleached. It is found in several southern Italian dialects (especially Calabria and Sicily [Ledgeway 2000: 238]), either lacking passive ‘have’, e.g., Neapolitan, or displaying it, e.g., Altamuran (Loporcaro 1988: 29]6–299; 2012b: 178–180; Cennamo 1997a: 150; Ledgeway 2000: 238 f.). The past participle typically agrees with O, but it may also agree with the recipient/beneficiary subject (Ledgeway 2000: 29–31; 238–239). French, Ræto-Romance, and several northern Italian dialects (i.e., non-null subject languages/varieties) have grammaticalized the default third person singular with active and passive (‘be’/‘come’ + past participle) morphology, in patterns defocusing the S/A argument (optionally expressed as a by-phrase in the case of passive), displaying lack of subject and an optional/obligatory dummy pivot holder as well as an optional reflexive marker in Italian and Romanian in some constructions referring to natural events (It. (si) fa buio [ makes dark] ‘it gets dark’). 
These constructions result from the grammaticalization of a pronoun and/or demonstrative (e.g., Fr. il, the unstressed, masculine third person singular subject pronoun) (Grevisse 1980: §§ 1043–1047), in impersonal actives (Fr. il me faut ces livres [ I. is.necessary these books] ‘I need these books’), and passives (Fr. il sera parlé de vous (par tout le monde) [ be..3 speak.. of you.2 (by all the world)] ‘You will be spoken of (by everyone)’, Kayne [1975: 247]; also Culbertson and Legendre [2014]). They show varying definiteness effects and agreement options in relation to the postverbal nominal of a transitive verb, differing also in their distribution with intransitive verbs (see further discussion and references in Cennamo [2016: 977–978]). In non-standard varieties of null subject languages/varieties such as Ibero-Romance (e.g., European Portuguese, Galician, Dominican Spanish and Balearic Catalan) and some Italian dialects (e.g., Neapolitan/Campanian) seemingly expletive pronouns/demonstratives have grammaticalized into discourse markers instead (e.g., EuP esto será tarde ‘that will be late’, Hinzelin [2009]; Ledgeway [2009: 294; 82 f.; 2011b: 287, n. 30] and references therein). The default third person singular also exhibits a grammaticalized use for maximal agent defocusing, to denote the taking place of an event when no participant is involved, as with meteorological verbs and lexico-semantic impersonals (Creissels


2007: 17), monovalent/divalent verbs which do not allow a canonical subject, do not inflect for person and often take a sentential complement (cf. Fr. manquer ‘lack’, falloir ‘be necessary’, Ro. trebui + subjunctive ‘must’; trebuie să citești [must.3 . read..2] ‘You must read’, BrPt. precisar ‘need, must’, It. bisognare ‘need’ (deontic modal), Occ. caler ‘be necessary’ [20b], Sp. caber ‘fit, be possible’; Cornillie et al. [2009: 118]; Whitlam [2011: 175]; Pană Dindelegan [2013a: 108 f.]).

3.1.3 Impersonal/indefinite markers

Other (colloquial) impersonal devices include the grammaticalization of reflexes of Latin UNUS ‘one’ (e.g., Romansh ins ‘one’, similar in its interpretation to French on < Lat. HOMO ‘human being, person’ and German man; Anderson 2016: 182–183), ‘people’ (Cat. la gent, Abr. la gende), and the second person singular pronoun, widely used also in formal contexts in Romanian with a generic (exclusive/inclusive) interpretation (Pană Dindelegan 2013a: 109). In Brazilian Portuguese, the second person singular (você ‘you’ < vossa mercê ‘your mercy’ + third person singular verb) is the most common indefinite strategy (Duarte, Kato, and Barbosa 2003), whilst in European Portuguese se is most frequently employed in this function, doubled by the first person plural pronoun a gente in dialectal varieties (Martins 2009; Cennamo 2016: 972). Another widespread impersonal/indefinite construction is the reflex of Latin HOMO ‘human being, person’6 + third singular verb form, occurring in French (Grevisse 1980: §§ 1287–1291), Catalan (Badia i Margarit 1995; Wheeler, Yates, and Dols 1999: 518 f.), some central (e.g., Marchigiano) and upper southern Italo-Romance varieties (e.g., Abruzzese; Giammarco 1968: 1344; Hastings 1994: 16–31; D’Alessandro and Alexiadou 2003; Giacalone Ramat and Sansò 2011; D’Alessandro 2014), and Sardinian (D’Alessandro and Alexiadou 2003: 188), already attested in old Italian (Giacalone Ramat and Sansò 2007a), old Catalan (De Borja Moll 1952: 283), old Spanish (Barrett Brown 1931), old Portuguese (Said Ali 1964: 116), and old Provençal (Weerenbeck 1943). 
This pattern displays different degrees of grammaticalization, ranging from a pronominal function (with a generic (3a) and/or existential/arbitrary interpretation (3b), or a first person plural reading, as in French (3c)), to a verbal plural affix (the -um/om first person plural endings in some NIDs (Lombard-Modern Bergamasco) (Giacalone Ramat and Sansò 2007a) and nome < HOMO and its variants homme/lome/dome (3d) in modern Abruzzese (Hastings 1994: 17; Cennamo 2016: 979)):

6 The original meaning of the Latin word HOMO is retained in Romanian om ‘human being’: ea este un om fericit [she be..3 a person happy] ‘she is a happy person’. Indeed, the generic reference of this word in Latin might have eased its grammaticalization (example and point raised by Martin Maiden, p.c.).

(3) a. (generic; Abr.)
       Nome va a Marte ma nen z’ ambare a cambà.
       one go..3 to Mars but not = learn..3 to live.
       ‘People can go to Mars but they haven’t learnt how to live yet’
    b. (existential/arbitrary; Fr.)
       On a déjà répondu
       one have..3 already answer..
       ‘That’s already been answered’
    c. (first person plural; Fr.)
       On part en voyage.
       one leave..3 on trip
       ‘We are leaving for a trip’
    d. (affix; Abr.)
       Allore le tre ggiuvunette homme penzètte
       then the three girls one think..3
       ‘Then the three girls thought’

The Romance data are in line with the diachronic grammaticalization paths generally assumed in the literature: (a) lexical DP > impersonal generic pronoun > impersonal arbitrary pronoun > referential pronoun (Van Gelderen 1997; Welton-Lair 1999; Egerland 2003, 2010); (b) species-generic > human non-referential indefinite > (i) human referential indefinite, optionally (ii) first person singular/plural (Giacalone Ramat and Sansò 2007b: 106). The data, however, also point to two unusual paths of development, suggesting a non-linear path: (i) Abruzzese nome does not acquire referential status before turning into a plural affix on the verb, but was previously always either arbitrary or generic (D’Alessandro 2014: 12); (ii) there also occurs a reverse pattern of change, a degrammaticalization whereby, alongside an impersonal pronoun becoming a verbal affix as with nome, a third person plural auxiliary form anne ‘they have’ (largely attested in the eastern Abruzzese dialects of Lanciano and Pescara, probably under the influence of Italian) is changing into a non-referential, arbitrary pronoun (Anne’ve fitte [anne had done] ‘someone had done’), after an intermediate stage as a plural marker (Marje e Pasquale z’anne magne le sagne [Maria and Pasquale =anne eat.3 the lasagne] ‘Mary and Pasquale eat lasagna’; Hastings 1994: 28, n. 12; D’Alessandro 2014: 14; Cennamo 2016: 979–980).
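The two deviations just described can be made explicit with a small sketch that treats a grammaticalization path as an ordered list of stages. Everything here is an illustrative assumption rather than part of the cited analyses: the names CLINE, skipped_stages and direction are hypothetical, and the final stage "verbal plural affix" is appended to path (a) only so that the Abruzzese developments can be placed on a single scale.

```python
# Path (a) from the text, with one extra final stage added for illustration.
CLINE = [
    "lexical DP",
    "impersonal generic pronoun",
    "impersonal arbitrary pronoun",
    "referential pronoun",
    "verbal plural affix",  # assumption: not part of path (a) as cited
]


def skipped_stages(observed, cline=CLINE):
    """Stages lying between the earliest and latest observed stage
    that the item is not reported to have passed through."""
    positions = [cline.index(stage) for stage in observed]
    low, high = min(positions), max(positions)
    return [stage for stage in cline[low:high + 1] if stage not in observed]


def direction(observed, cline=CLINE):
    """Classify a chronological sequence of stages as 'forward',
    'reverse' or 'mixed' relative to the cline.
    A single-stage sequence counts as 'forward' by convention."""
    positions = [cline.index(stage) for stage in observed]
    steps = [later - earlier for earlier, later in zip(positions, positions[1:])]
    if all(step > 0 for step in steps):
        return "forward"
    if all(step < 0 for step in steps):
        return "reverse"
    return "mixed"


# Abruzzese nome: generic/arbitrary pronoun, then affix, never referential.
NOME = [
    "lexical DP",
    "impersonal generic pronoun",
    "impersonal arbitrary pronoun",
    "verbal plural affix",
]
# Abruzzese anne: plural marker, then arbitrary pronoun.
ANNE = ["verbal plural affix", "impersonal arbitrary pronoun"]

if __name__ == "__main__":
    print(skipped_stages(NOME))  # the stage nome bypasses
    print(direction(NOME), direction(ANNE))
```

On this encoding nome moves forward but bypasses the referential stage, while anne moves against the cline, which is one way of stating why the text calls the first development non-linear and the second a degrammaticalization.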

3.2 Tense and aspect

Three main innovations involving grammaticalization can be detected in the Romance tense-aspectual system, generally preserving the Latin tripartite temporal distinction present/future/past and a clear-cut distinction between imperfective/perfective aspects in the past domain: (i) new inflectional (synthetic) futures, unknown to Latin, (ii) the rise of verbal periphrases, with original lexical verbs as tense-aspect-modality markers (i.e., auxiliaries) and light verbs (i.e., event modulators conveying differences in control, aspect, benefaction; Hopper and Traugott [2003: 113]; Cennamo [2019] for Italo-Romance), (iii) the emergence of compound and double compound forms/tenses (both in the active and the passive voice) and shift of the aorist/preterite function of the (active) simple past onto the corresponding compound form(s) (so-called ‘aoristic drift’) (Bertinetto and Squartini 2016: 939 among others and references therein).

3.2.1 Temporal and aspectual periphrases

A widely employed pattern is the grammaticalization of the verb ‘go’, which in west Romance occurs as a future auxiliary, in line with a well-known cross-linguistic tendency, owing to its ‘transitional’ meaning of ‘motion towards a goal’, extended from physical space to time (Bybee, Perkins, and Pagliuca 1994: 268). In French, Portuguese and Spanish, the verb is followed by the bare infinitive (Fr. je vais chanter [go.1 sing.] ‘I am going to sing’).7 By contrast, in Catalan, Italian (where this construction only occurs in non-standard registers/specific genres) and the dialects, ‘go’ as a future marker is headed by the preposition a ‘to’ (coll.It. Domani non so proprio cosa gli vado a dire [tomorrow not know...3 really what he. go...1 to tell.] ‘I really don’t know what I’ll tell him tomorrow’ Amenta and Strudsholm [2002]; Lamiroy and Lahousse [2018]). In the Occitan varieties of Gascon, Guardia Piemontese (Calabria, southern Italy, Jacobs and Kunert [2014]) and Catalan (Jacobs 2011), the verb ‘go’ occurs in the so-called ‘go’-past, a pattern where the present tense of this verb occurs in auxiliary function (with a specialized paradigm, different from the one displayed by the lexical verb in some persons, e.g., Cat. vaig − lexic. ‘I go’/và(re)ig − aux.; Juge 2006; Ledgeway 2011b: 424), followed by the bare infinitive (el seu discurs va causar un gran impacte en l’auditori [the his talk go.3 cause. a great effect on the.audience] ‘his talk produced a great effect on the audience’) (Detges 2004: 211), alongside the structure with the preposition a ‘to’ heading the infinitive, which has instead a future reading (Joan va a cantar [John go.3g to sing.] ‘John is going to sing’). 
Ambiguity can arise between the future and past interpretations of the pattern in the spoken language when the auxiliary form ends in /a/, owing to the imperceptible difference between the structure with and without the preposition a, e.g., Joan va a

7 Interestingly, in French, the construction go+infinitive is currently grammaticalizing into a present-habitual marker (il va me téléphoner trois fois par jour ‘he calls me three times a day’, Bres and Labeau [2012] in Bertinetto and Squartini [2016: 951]).


cantar ‘John is going to sing’ vs Joan va cantar ‘John sang’ (Jacobs 2011: 230, n. 5). The grammaticalization of this verb as a past tense marker in Catalan involves an initial stage where go+infinitive acquires an inchoative meaning, subsequently becoming a narrative past marker, characteristic of the spoken language, and finally, a past tense marker (Detges [2004: 214]; Jacobs [2011: 238–240] for further details and references). The foregrounding function in sequential narratives is viewed as relevant at an intermediate stage in the grammaticalization of Catalan anar + infinitive, from inchoativity to aoristic temporality (Bertinetto and Squartini 2016: 942 and references therein). The verbs ‘come’ in French and Occitan dialects and ‘complete’ in Ibero-Romance (cf. Sp./Pt. acabar ‘finish, end’) have been grammaticalized as markers of temporal distance, signalling recent past (Pt. Um novo príncipe acaba de chegar ‘A new prince has just arrived’, Fr. Je viens/venais d’arriver ‘I have/had just come (at this very moment)’, so-called retrospective aspect; Nicolle [2012: 376–379]; Ramat and Ricca [2016: 55]; Bertinetto and Squartini [2016: 943]). ‘Come’ is found as a future auxiliary in Romansh, in conjunction with the infinitive headed by the preposition a ‘to’ (e.g., Srs., jeu vegnel a lavar [I come.1 to wash.] ‘I shall wash’, Ledgeway [2012: 123]). In transitional dialects (e.g., Putèr, Surmeira, Vallader), there occurs a ‘double’ future pattern, consisting of the auxiliary ‘come’ + the synthetic future suffix -ar/-ir (for the 1) (cf. Putèr, e ɲaro at deklarer [I come..1 to.you explain.] ‘I will explain it to you’) (Haiman and Benincà 1992: 86–88). The verb ‘want’ has also grammaticalized as a future auxiliary, as witnessed by Friulan and Romanian, which employ, respectively, a form of the verbs volê and a vrea ‘want’, followed by the bare infinitive of the lexical verb (cf. Frl. voj parti [want...1] ‘I will leave’, Rom. 
voi pleca [want...1 leave.] ‘I will leave’). Romanian has no synthetic future and exploits in its stead different periphrases with different degrees of grammaticalization, ranging from a regional (phonologically reduced inflected) variant (e.g., oi.1) of the standard form (e.g., voi, 1) (so-called voi/oi type) (e.g., oi pleca want.1 leave.), to a colloquial fossilized invariant marker o (or in the third person plural), probably derived from the 3 impersonal form va ‘wants’ (Zafiu 2013a: 40), plus the subjunctive complementizer să and the subjunctive of the lexical verb, e.g., o să plec [want să leave..1] ‘I will leave’ (Zafiu 2013a: 38–39; Ledgeway 2016b: 768; Maiden 2016a: 109–110), a pattern characteristic of the Balkan languages, where the future consists of a particle plus a subjunctive form (Zafiu 2013a: 38–40). As a future periphrasis there also occurs the verb ‘have’, in the originally deontic structure a avea + subjunctive ‘have to, must’ (am să plec ‘I have to go, I must go’), the pattern being not fully grammaticalized, since it partially preserves its original modal meaning, and is not used in the 1 and 2 with future meaning, but only with deontic meaning (Zafiu 2013a: 39 and discussion in § 3.2.3). ‘Have’ + the preposition de ‘of ’ as a future marker is also found in Ibero-Romance, Sardinian and Italian dialects (e.g., Friulan [Haiman

and Benincà 1992: 93]), Sicilian, Calabrian (cf. Catanzaro ha de nivicara [it has of snow.] ‘it will snow’), whilst the modal verb ‘must’ is employed in Sardinian with a morphophonologically specialized/reduced paradigm (vs. lexical form devet; cf. det próere [must.3 rain.] ‘it will rain’) (Ledgeway 2011b: 422, 2012: 123–124, 2017b: 849; Bertinetto and Squartini 2016: 951). Most typically, however, temporal and aspectual properties/features are strongly intertwined, as in perfective and resultative periphrases, stemming from the extended use of the copula ‘be’ as well as the grammaticalization of stative location verbs such as ‘stay’, ‘remain’, the verbs of possession ‘have’ and ‘hold’, and change of state/location such as ‘become’, ‘come’, ‘go’, occurring in auxiliary and light verb uses, often displaying different types and degrees of grammaticalization (Ledgeway 2012: 118–124; 2016b: 767, among others). Thus, several Romance languages (e.g., Italian, several dialects, French) employ both ‘be’ and ‘have’ as perfective auxiliaries, the latter resulting from the grammaticalization of an original resultative pattern, consisting of the sequence ‘have’ +  (Vincent 1982; Cennamo 2008, among others), and still functioning as such in some languages, for instance in Italian, where the difference between the two constructions is conveyed by word order and agreement, the past participle occurring after the object and agreeing with it in the resultative pattern, as in (4a), but before the object and in the default, non-agreeing masculine singular form in the perfect, as in (4b): (4)

a. Ho la tavola apparecchiata
   have.1 the table laid-out....
   ‘I have the table laid out’
b. Ho apparecchiato la tavola
   have.1 lay-out.... the table
   ‘I laid out the table’

In some languages/varieties (e.g., Portuguese, Piedmontese, Neapolitan, Sicilian and Lombard dialects in Switzerland), instead, the two readings are associated with different participial forms, the adjectival one for resultative patterns (Neap. è muorto [be..3 die....] ‘He is dead’) and the verbal participle for the perfective structure, correlating with a different auxiliary and agreement pattern, as in Sicilian, e.g., ha muruto [have..3 die....] ‘He died’ (Ledgeway 2000: 229–233; Loporcaro, Pescia, and Ramos 2004; Bertinetto and Squartini 2016: 944). Portuguese, on the other hand, has grammaticalized the verb ‘keep’ (ter ‘hold’), employed both in perfective and resultative structures, unlike Spanish, which has generalized the verb ‘have’ in perfective use. The verb tener, on the other hand, is confined in Spanish to the resultative construction, as in southern Italian dialects, e.g., Neapolitan, although it displays incipient grammaticalization of the sequence tener + past participle as a perfective construction, following the same path as ‘have’ + past participle (Day and Zahler 2014). The durative, ‘continuative’ reading

(so-called inclusivity) of the construction (cf. Pt. tenho estudado imenso desde que decidi fazer o exame [have..1 study.. enormous since that decide.1sg. do. the examination] ‘I have been studying a lot since I decided to take the examination’, Bertinetto and Squartini [2016: 945]) might be viewed either as a possible stage in the grammaticalization path of Romance perfects (Harris 1982) or as a characteristic property of Portuguese (Squartini and Bertinetto [2000: 419 f.]; Bertinetto and Squartini [2016: 945] for analogous structures in Galician, Judaeo-Spanish and Piedmontese and relevant references). In some northern Italian (Piedmontese, Ligurian, Lombard and Veneto) dialects the construction ‘keep’ + past participle occurs in continuous – iterative function (Ricca 1998; Vincent 2014: 14–15), with early attestations from sixteenth- and seventeenth-century texts (Ricca 1998) (cf. Cicagna [Liguria] u 'teŋi:a se'ro: a 'porte [he keep..3 close. ... the door] ‘He kept closing the door’, lit. ‘he kept closed the door’, Ricca [1998: 350]), an interpretation developing from the original meaning of the verb tenere ‘keep, hold’ and still available in this variety, whereby the above example can also mean ‘He kept the door closed’ (Ricca 1998: 150) (see also Piedm., Castellinaldo, Cuneo; u ten dić [he keep...3 say..] ‘He keeps saying’, Toppino [1926], in Vincent [2014: 15]). In some languages/varieties the past meaning of the synthetic (past) form has been taken over by the analytic pattern, i.e., the compound form (Harris 1982), in so-called ‘aoristic drift’, the preterital function of the perfect (Squartini and Bertinetto 2000: 422–226), a contact-induced case of grammaticalization (Giacalone Ramat 2008), radiating from twelfth-century Parisian French (Harris [1982: 58–59]; Drinka [2003, 2017: 255–259, 261–262] for a full and recent discussion). 
The change is complete in most spoken varieties of French (Smith [2016: 314–315] for the areal distribution of the preterite and present perfect), Raeto-Romance, Romanian, and northern Italian dialects, whilst being in progress in some Ibero-Romance varieties (e.g., central Spain, Peru, Bolivia). Argentinian Spanish displays the opposite tendency, the current relevance function of the analytic perfect being replaced by the preterite (e.g., todavía no terminaste el examen [yet not finish..2 the exam] ‘You have not finished the exam yet’, Tuten, Pato, and Schwarzwald [2016: 406]). Italian (e.g., Tuscan) varieties instantiate an earlier stage in the grammaticalization chain, the aoristic function of the analytic perfect being pragmatically and semantically constrained (Bertinetto and Squartini [2016: 944–945]; Bossong [2016: 70] for a recent discussion, also Schwenter and Cacoullos [2008] on Ibero-Romance).

3.2.2 Progressive constructions

Several progressive periphrases occur in Romance, differing in their distribution and realization (they are lacking, for instance, in Romanian, probably owing to Slavic influence, with on-going situations expressed lexically (e.g., with adverbs such as tocmai ‘just, precisely’ [Martin Maiden, p.c.]) rather than periphrastically) and

predicate constraints, stemming from the grammaticalization of states (It. stare, BPt., Gal., Sp., Cat., estar + gerund, EPt. estar a ‘lit. stay at’ + infinitive, Gal. ser ‘be’ + infinitive) and change of location verbs (Pt. ir ‘go’/ vir, Sp. ir/venir, It. andare/ venire ‘come’, Sp., Pt. andar ‘walk’, Cat. anar ‘go’ (Fr. aller) + gerund, EPt. andar a ‘walk to’ + infinitive), alongside a number of other patterns (Fr. être en train de ‘be in the course of’/être après ‘be after’/less frequently être à ‘lit. be at’ + infinitive, It. essere dietro a ‘be behind’ + infinitive [Squartini 1998: 32–33; Bertinetto and Squartini 2016: 947–948] and further references therein).

3.2.2.1 State-periphrases

The grammaticalization of state-periphrases, so-called ‘imperfective drift’ (Bertinetto 2000), involves an intermediate, desemanticization stage where the (state) predicate loses its lexical meaning but retains its durative aspectual value, being also compatible with perfective tenses (as in 14th c. Italian stetti sognando [be..1 dream.] ‘I was dreaming’, Bertinetto and Squartini [2016: 949]), unlike in the last stage of the process, where the verb is a mere marker of imperfectivity and predicate constraints no longer hold, as in Salentino Italian (e.g., lu sta pperdu/ pperdi … [it stand lose..1/lose..2] ‘I am going to/you are going to lose it’, Ledgeway [2016c: 1028; 2016d: 165]), where ‘stay’ is an invariant marker cooccurring with the inflected verb (Ledgeway [2016c: 1027]; also Andriani [2017: ch. 5] on the Barese dialect in central Apulia). The universal tendency whereby stative verbs are excluded from the progressive pattern does not always hold in Romance (cf. Pt. João está sabendo a resposta [John stay..3 know. the answer] ‘John knows the answer’, with contingent states, and Sp. están siendo tontos [stay..3 be. silly.] ‘they are being silly’, where the pattern is also used for permanent states, unlike in standard Italian *sto sapendo/conoscendo la risposta [stay..1 know. the answer], lit. ‘I am (stay) knowing the answer’, *stiamo essendo stupidi [stay..1 be. silly..] ‘We are being silly’, Bertinetto and Squartini [2016: 947–948]). 
In French, the progressive periphrasis hardly occurs with achievements; in Italian, on the other hand, although the pattern instantiates the last step of a usage starting with activities and accomplishments and only later spreading to achievements (Squartini 1998: 85–86), only the state-construction is possible in the progressive interpretation with achievements (stava morendo/stava a morì [stay..3 die. stay..3 to die.] ‘he was dying’ with the innovative infinitive in some southern Italian dialects, e.g., Molise). The imperfect tense, instead, would convey a different, ‘narrative imperfect’ meaning in this context (e.g., il ragazzo moriva [the boy die..3] ‘the boy died’, Bertinetto and Squartini [2016: 948] for further details and examples). In addition, in colloquial varieties of American Spanish (Squartini 1998) and some southern Italian dialects, the state-pattern can convey a future time reference (cf. Sic.It. sto venendo subito [stay..1 come. soon] ‘I’m coming soon’, Bertinetto and Squartini [2016: 949]).

3.2.2.2 Motion progressive periphrases

The verbs ‘go’, ‘come’ (and ‘walk’ in Ibero-Romance) also occur in progressive constructions, in patterns displaying varying degrees of grammaticalization – instantiating the process of ‘specialization’, the ‘narrowing of choices’ characterizing ‘the emergence of new constructions’ (Hopper 1991: 75; Hopper and Traugott 2003: 116) – reflected in different predicate constraints (Squartini 1998: 2, ch. 5). The ‘come’ and ‘go’ variants are well attested in Italian, the former being used more rarely and stylistically marked (e.g., la situazione viene chiarendosi/va peggiorando sempre più [the situation come...3 clarify./go...3 worsen. always more] ‘The situation is getting clearer and clearer/is getting worse and worse’, Squartini [1998: 209] for their diatopic and diastratic distribution). Catalan and French instead only display ‘go’, whilst Spanish and Portuguese show two andative verbs, ir ‘go’ and andar ‘walk’ (cf. Sp. Andas hablando y tramando … [walk...2 talk. and plot.] ‘You are talking and plotting …’, Squartini 1998: 263), also occurring with the infinitive (headed by the preposition a ‘to’), a pattern preferred with estar ‘stay’ (see § 3.2.2.1) and andar in European Portuguese, as part and parcel of the change leading to the replacement of the gerundial construction with the infinitive (Squartini 1998: 285–287). In Portuguese, Galician, and Spanish, ‘come’, which partially retains its deictic orientation meaning, is less commonly used than ‘go’ (Squartini 1998: 290). In Italian, it is favoured in iterative contexts alongside ‘go’ (lo vengo/vado ripetendo da mesi [it. come/go...1 repeat. from months] ‘I have been repeating it for months’, lit. 
‘I come/go repeating’) and most typically occurs with accomplishments (e.g., degree achievements like aumentare ‘grow’) or iteratively coerced achievements (e.g., raccogliere ‘collect’) (Bertinetto and Squartini 2016: 950) and is incompatible with non-oriented or non-iterated activities and with states (Squartini 1998: 291, 294). In Spanish, where no actional restrictions occur, owing to the temporal reinterpretation of the deictic meaning/orientation of the verb, this construction is often used for ‘durative situations temporally oriented with respect to a Reference time’ (e.g., Me vienen doliendo las muelas desde que salí de Madrid [I. come...3 ache. the teeth since that leave..1 from Madrid] ‘My teeth have been aching since I left Madrid’, Gómez Torrego [1988: 167], in Squartini [1998: 294]), a pattern resembling analogous structures in some Oceanic languages, where a lexeme related to the verb ‘come’ has grammaticalized as a ‘continuative marker’ (Squartini 1998: 290, 294). The ‘go’-pattern in Italian instantiates a more advanced stage in the grammaticalization process than the ‘come’-periphrasis, since it also occurs with activities. In Spanish (Yllera 1999) and Portuguese, there occur two different auxiliaries for the ‘go’ domain: ir ‘go’ for telic, directed motion (occurring with telic predicates), and andar ‘walk’, referring to non-directed, atelic motion (occurring with non-telic ones), and also allowed with states in Spanish (Squartini 1998: 300). In Portuguese, however, andar hardly undergoes actional constraints (e.g., it also occurs with telic predicates), thus showing a higher degree of grammaticalization (Bertinetto and Squartini 2016: 950).

3.2.3 Inflectional and analytic future forms

For future time reference, alongside the future use of the present, Romance languages, with the exception of Sardinian, Romanian, Dalmatian (Ledgeway 2012: 134; 2016b: 768; Bertinetto and Squartini 2016: 951), developed new inflectional forms, resulting from the grammaticalization of the infinitive of the lexical verb in conjunction with the present tense of the auxiliary ‘have’ (from the gradual desemanticization of an original verb of possession), whose coalescence leads to current future forms (e.g., dar(e) ha(be)s [give. have..2] > daras (Fredegar’s Chronicle, 7th c. AD);   > Fr. chantera, It. canterà, Sp. cantará ‘He will sing’, Ledgeway [2011c: 718]; Ledgeway [2016b: 768]; Bertinetto and Squartini [2016: 951] and references therein). Another future strategy involves the use of an auxiliary derived from lexical verbs denoting motion (‘come’, ‘go’), possession (‘have’), volition (‘want’), obligation (‘must’), followed by the infinitive (either bare or headed by a preposition, depending on the language/variety), or by the subjunctive in Romanian, depending on the pattern, as illustrated in § 3.2.1. In some languages, the inflectional and analytic strategies may alternate, with the former ultimately being replaced by the latter, as in Brazilian Portuguese, where the inflectional form has been lost in the spoken informal varieties, supplanted by (the present of) ‘go’ + infinitive, e.g., a festa vai começar quando … [the party go..3 begin. when] ‘the party will start when …’ (Whitlam 2011: 442; Bertinetto and Squartini 2016: 952). In present or past tense contexts the inflectional future may also convey a conjectural modality meaning (It. 
sarà Marco/starà dormendo [be..3 Mark/ stay..3 sleep.] ‘it must be Mark/he must be asleep’), owing to the nonactual/factual nature of a future event, an interpretation that is also conveyed by analytic structures, as in French, e.g., Il va encore avoir oublié de donner à manger au chien [he go..3 still have. forget.. of give. to eat. to-the dog] ‘He must have forgotten to feed the dog again’ (Larreya 2005: 339; Bertinetto and Squartini 2016: 952). The epistemic modality reading indeed might have been probably part of the grammaticalization chain already at the beginning of the process (Bertinetto and Squartini 2016: 952) rather than being a secondary development from temporality (Fleischman 1982). A development mirroring the coalescence of the infinitive with the present form of the verb ‘have’ involves the future-in-the-past or conditional, where the infinitive of the lexical verb coalesces with the past and imperfect of ‘have’ (respectively in northern/central Italo-Romance and elsewhere), e.g., It. canterei < cantare habui ‘I would sing’; Sp. cantare habebam > cantaría (Bertinetto and Squartini 2016: 952 and references therein). The synthetic future-in-the-past emerged before the (inflectional) future, whilst the irrealis, conditional meaning of the construction developed from the future-in-the-past function, in line with the cross-linguistic tendency

whereby future forms subsequently develop irrealis modal functions (Coleman 1971: 217; Fleischman 1982: 64; Ledgeway 2012: 134–140; 2016a: 769 for further details and references).

3.3 Modality

Modality in Romance is most typically instantiated by grammatical mood (§ 3.3.1), namely the indicative and the subjunctive,8 as well as by a set of modal verbs with different degrees of grammaticalization, denoting ability, necessity, obligation, permission, possibility, volition (§ 3.3.2). Invariable preverbal particles and complementizers mark grammatical mood in some languages (e.g., Romanian că vs să < Lat. si ‘if’) and ca vs cu, mu and their variants in southern Italian dialects (Ledgeway 2011b: 438; Ledgeway and Lombardi 2014: 31; Ledgeway 2016c: 1018; Quer 2016: 956–957 and further references therein), the former most typically associated with realis, assertive propositions, the latter with irrealis/non-assertive ones and different points/shades in between, depending on the language, where the use of either mood is not semantically motivated (cf. Ledgeway 2016d: 1018–1019).

3.3.1 Mood: indicative vs subjunctive

The Romance languages display different degrees of grammaticalization of the subjunctive as a marker of subordination, partly confirming proposals concerning the grammaticalization path of the subjunctive, whereby the spread of the subjunctive from the main to complement clauses gradually results in a weakening of its semantic function and its ensuing reanalysis as a marker of subordination, leading ultimately to its loss (Bybee, Perkins, and Pagliuca 1994: 214). More specifically, a cline has been proposed, with Spanish being more conservative than Italian (here the subjunctive has disappeared in regional varieties and most southern dialects) and French being most innovative, since the subjunctive does not appear to be semantically determined, as in Spanish and Italian, but is largely determined by the governor, either lexical or grammatical (Lamiroy and De Mulder [2011]; Carlier, De Mulder, and Lamiroy [2012]; Poplack et al. [2017] for a critical overview). Among the three types of subjunctives occurring in subordinate/complement clauses, the subjunctive selected by verbs of volition, command and necessity (as well as adjectives and nouns conveying the same meanings), so-called intensional subjunctive (Quer 2016: 957), is most stable across Romance: it never alternates with the indicative, unlike the polarity subjunctive (also referred to as ‘dubitative’ or

8 The discussion will not consider the imperative and the conditional, classified both as a separate mood and as a tense (Quer 2016: 954 and references therein).

‘potential’ subjunctive), licensed by negation in the matrix clause or a yes/no question, that may alternate with the indicative in Catalan (no s’imaginen que marxis/ marxen [not  imagine..3 that leave..2/leave..2] ‘They can’t imagine you leaving /that you are leaving ’), unlike in Italian (non immaginano che sia lui/*? è lui not imagine..3 that be..3 he/* be...3 he, Quer [2016: 957]); and the thematic or factive subjunctive (Cat. m’ha agradat que ha fet poca calor aquest estiu [I. have...3 please.. that make..3 little heat this summer] ‘I liked it that it wasn’t too hot this summer’), licensed by the semantics of the matrix predicate, most typically psych verbs (except for Romanian and the extreme southern Italian dialects) (Quer [2016: 958–959] for a thorough discussion of the various uses of the subjunctive in Romance). Thus, variation in the use of the subjunctive vs the indicative in Romance shows the gradual gaining ground – although to a different extent, according to the language/variety – of the indicative over the subjunctive, that has disappeared from most domains in French and some southern Italian varieties, completely ousted by the indicative in the extreme southern Italian dialects (Ledgeway and Lombardi 2014: 28–31; Ledgeway 2016c: 1015–1019, and references therein). Although no longer morphologically realized, in southern Italian dialects (even those that have formally neutralized the realis/irrealis distinction, since the subjunctive has been lost and one and the same complementizer occurs in both modal functions), the modal distinction has come to be marked through verb movement, i.e., at the syntactic level, through the different position of the verb and the complementizer (Ledgeway and Lombardi 2014: 39). 
Word order, thus, conveys the distinction originally marked by the indicative/subjunctive opposition (Ledgeway 2012: 170 f.; Ledgeway and Lombardi 2014: 31 f.; 39), as shown in (5), with a striking similarity to Romanian (6), where, however, the complementizer că is more freely interpolated (6a), unlike the complementizer să, as shown in (6b) (Maiden 2016a: 119; p.c.; Nicolae 2019: 74–79, 79, note 5). Indeed, Romanian may be regarded as being at an intermediate stage along the grammaticalization chain, ‘the subjunctive morphology being ‘too weak’ to independently license the different modal interpretation/distinction’ and therefore the need arising to make the realis/irrealis distinction overt through verb movement (leading to the loss of this modal distinction) (Ledgeway and Lombardi 2014: 46; also Quer 2016: 956): (5)

a. Realis: (Dicianu ca) Lello sempe fatica (Cos.)
   say...3 that Lello always work...3
   ‘They say that Lello always works.’ (Ledgeway and Lombardi 2014: 37)
b. Irrealis: (Vuonnu) ca Lello fatica sempe
   want...3 that Lello work...3 always
   ‘They want Lello to work continuously’

(6)

a. Realis: (Spun că) mereu munceşte (munceşte mereu)
   say...3 that always work...3
   ‘They say that he always works’
b. Irrealis: Vor să muncească mereu
   want...3 that work..3 always
   ‘They want him to work continuously/always’

3.3.2 Modal verbs

The Romance languages have a well-developed system of modal verbs expressing dynamic (volition, ability, need/necessity), deontic (obligation, permission) and epistemic (knowledge, belief, possibility, probability) modality, resulting from the continuation and expansion in function of the Latin modals ‘can’, ‘must’, ‘want’ and the grammaticalization of other verbs/constructions (e.g., the verbs of possession ‘have’, ‘keep’/‘hold’, and ‘be’, ‘need’ + infinitive/subordinator, ‘go’ +  in [passive] deontic function, see § 3.3.2.1).

3.3.2.1 Dynamic, deontic and epistemic modals

The Romance continuants of Latin  ‘can, be able’ (It. potere, Fr. pouvoir, Sp./ Cat./Pt. poder, Rom. a putea), /* ‘want’,  ‘wish’ (Cat. voler, It. volere, Fr. vouloir, Pt./Sp. querer),  ‘must’ (It. dovere, Sp. deber, Fr. devoir, Cat. deure, Pt. dever, but Rom. a trebui being instead of Slavic origin [Cornillie et al. 2009]), instantiate different shades and degrees of grammaticalization in the dynamic – deontic – epistemic modality conceptual space. They realize the core of the category, due to their high degree of polysemy, most typically occurring with more than one modal function (e.g., dynamic (ability), deontic (permission) and epistemic (likelihood/possibility) for Rom. putea ‘can, be able’, Zafiu 2013b: 576). The usage and morphosyntactic behaviour of modals in Romance confirm general trends pointed out in the literature, such as the fact that the epistemic function is a later and optional development on the grammaticalization cline (dynamic > deontic > epistemic > evidential), with dynamic and deontic meanings being diachronically more basic (Traugott and Dasher 2002: 109–111) and that in their epistemic use modals show a higher degree of grammaticalization (e.g., a ‘higher degree of coalescence and a narrower structural scope’) than dynamic and deontic modals (Hansen and de Haan 2009: 545), a characteristic which they share with Germanic modals (Mortelmans, Boye, and van der Auwera 2009). Thus, for instance, the impersonal use of Romanian a trebui ‘must’ – etymologically meaning ‘need’, displays defective morphology, i.e., the impersonal form, only

occurring in its epistemic function, whilst alternating the personal and impersonal forms in its dynamic and deontic usage (Zafiu 2013b: 578–581) – and Brazilian Portuguese ter que/ter de ‘have to’, dever ‘must’, poder ‘can’, are fully grammaticalized only in their epistemic uses. Elliptical constructions (e.g., the omission of the infinitival complement, which is possible with animate subjects only) are ungrammatical in the epistemic reading in Brazilian Portuguese (eles não querem realizar outro concurso no próximo ano mas devem [they not want..3 organize. another test next year but haveto..3] ‘they don’t want to organize another test next year but they have to’ [*are likely to do it]), unlike in their dynamic/deontic uses (Dall’Aglio Hattnher and Hengeveld 2016: 7). In addition, not all modals develop an epistemic reading. For instance, ‘want’ in Italian can occur with a dynamic and deontic (passive) function only, followed by the past participle of the lexical verb in the latter usage, characteristic of southern Italo-Romance, e.g., il muro vuole aggiustato [the wall want..3 repair. ...] ‘The wall must be repaired’ (reg. Sic. It., Alfieri [1992: 848]; Ledgeway [2000: 245]). In French, Italian and Spanish, alongside deontic and epistemic modality, ‘must’ also conveys reportive evidentiality (7a), developing from its deontic reading and often coexisting with the futural meaning, as shown in (7b) from French, although in some uses ‘must’ + infinitive only functions as a future marker, its obligation interpretation being completely bleached, e.g., même si vous devez ensuite me mépriser ‘even if you’ll despise me for it later’ (Squartini 2004: 877; also Fleischman 1982: 145):

(7)

a. Selon les indications données par Max, nous devons être déjà là
   according-to the indications give... by Max we must...1 be. already there
   ‘According to the instructions given by Max, we should already be there’ (Squartini 2003: 878)
b. Je dois dîner avec Joseph la semaine prochaine
   I must...1g dine. with Joseph the week next
   ‘I’m going to have dinner with Joseph next week’ (Squartini 2004: 877)

Romance modals clearly follow the grammaticalization cline also in showing the dynamic-deontic modality path, as for Italian bisogna + ‘that’/ ‘(it) needs’, originally a transitive verb of necessity in old Florentine (mi bisognan fiorini dugento d’oro [I. need..3 florins two-hundred of-gold] ‘I need two hundred gold florins’, [Decameron VIII, 1, p. 507]), gradually developing a deontic modal function, resulting in the loss of morphosyntactic transitivity, the verb becoming impersonal and governing a complement clause introduced by either the complementizer ‘that’ + subjunctive or the infinitive (bisogna che vada/andare [need...3 that

go...1/go.] ‘I must go’, Benincà and Poletto [1997: 109]). A similar deontic passive function of the verb ‘need’ also obtains in Nuorese Sardinian, the verb kérrere being followed by the past participle (cussa dzente keret tímita [that people.. need..3 fear....] ‘Those people are to be feared’, [Jones 1993: 125]). Three different parameters converge in determining the grammaticalization of verbs of dynamic modality (e.g., ability/necessity) into markers of deontic and, optionally, epistemic and evidential modality, depending on the language/variety: (i) the desemanticization of the lexical verb, (ii) context determined (pragmatic) inferences, (iii) decategorialization (Cornillie et al. 2009: 109 and references therein). The same dimensions are at work for the grammaticalization of constructions involving the verbs ‘be’ (It. essere da, Rom. fi de, Fr. être à), the Italian motion verb andare ‘go’ + (passive) past participle (cf. § 3.1.2), the verbs bisogna ‘need’, occorre ‘must’, conviene ‘ought to’, the French verb falloir ‘miss/lack’, Sp. caber (impersonal) ‘be possible’, the verbs of possession ‘have’/‘hold’/‘keep’ + a/de (Sp., haber de/ que, tener que, Fr. avoir à, Rom. avea de + supine, Pt. ter de/que), in deontic/epistemic patterns, with the prepositions à ‘to’, de ‘of ’ evoking an objective to be attained (Cornillie et al. 2009: 111), depending on the language and on the verb/construction. Among the constructions denoting possession developing a deontic modal usage, Spanish ‘have’ appears to be the most advanced on the grammaticalization cline. Unlike in French and Italian, for instance, the verb no longer denotes possession but is a perfective auxiliary and has developed not only a deontic reading, but also a future (dize que los moros todos han de yr a parayso [tell...3 that the maur.. all.. have...3 to go. to paradise..] ‘One says that all Muslims will go to paradise’) (Cornillie et al. 
2009: 114) and epistemic modal interpretation (creo que han de ser verdaderas [think...1 that have...3 to be.. real..] ‘I think they must be real’), the latter already attested in old Spanish (13th–14th c. texts), the verb also occurring in the impersonal form hay que, although in deontic meaning only (no hay que temer a los perros [not have..3 that fear.. to the dogs] ‘one should not be afraid of dogs’) (Cornillie et al. 2009: 114–115), a fact which contradicts the general pattern found in several European and other Romance languages, whereby the impersonal form usually conveys epistemic modality (Cornillie et al. 2009: 112; Hansen and de Haan 2009: 522). Even within a language/variety, therefore, one and the same pattern may show different degrees of grammaticalization, e.g., of semantic bleaching and decategorialization (Cornillie et al. 2009: 115). Other verbs, on the other hand, appear to be at the edge of the category, only occurring with one modal meaning, like ‘know’ (cf. It. sapere ‘know, can, be able to’ [intrinsic/acquired ability], e.g., sa comprendere le persone [know...3 understand.inf the people] ‘He understands people’ vs. sa nuotare [know...3 swim.] ‘He can swim’; BPt. saber ‘to know how to’ [acquired ability only], Brazil sabe fazer grandes eventos … [Brazil know...3

do.. big events] ‘Brazil is capable of organizing big events …’; see Dall’Aglio Hattner and Hengeveld [2016: 10–11]; Hansen and de Haan [2009: 513] for a general discussion).

3.4 Agreement

Verb agreement is fairly consistent in Romance. The finite verb generally agrees with the A/S/O subject, depending on the active/passive nature of the construction, indexing person, number and, marginally, gender, as in a central Italian dialect, Ripano (§ 3.4.1), whilst the active past participle in compound tenses shows variable agreement with the O/SO argument, exceptionally also the A/SA and R argument, the Indirect Object, as in some central Italian dialects (D’Alessandro and Pescarini 2016; Loporcaro 2016a: 805; and also § 3.4.2 below). Nevertheless, there are interesting diverging patterns reflecting different degrees of grammaticalization and features of verbal agreement controllers (see Ledgeway 2012: 300–305; Loporcaro 2016a for a recent overview; Bentley 2018). By contrast, passive participles never display variation in their agreement possibilities, always agreeing with their head, the O/SO argument (Loporcaro 2016a: 806–812 and further references therein).

3.4.1 Subject agreement

Subject agreement in Romance (Bentley 2018) is generally fully grammaticalized, i.e., unaffected by the pragmatic and semantic characteristics of the controller,9 when the (A/S/O) subject is preverbal, most typically indexing person and number.10 In presentative and existential clauses, the postverbal nominal (i.e., the noncanonical pivot) is not always subjectivized, and finite verb agreement exhibits varying degrees of grammaticalization, instantiated by the presence, optionality and lack of agreement, and the obligatory/optional presence of a locative/non-locative verb initial proform/expletive subject clitic pronoun, depending on the language/variety (e.g., Fr. il [8a–b], Mil. g(he) [8d, f, h])11 (cf. Kaiser and Remberger

9 In some Tuscan and Ligurian varieties, however, discourse-pragmatic factors determine the pre/postverbal position of the subject and related agreement/non-agreement patterns obtain (Adam Ledgeway, p.c.; see also Cennamo 1997a: 152–155, 161).
10 In the Marchigiano dialect of Ripatransone, however, finite verb agreement indexes (syncretically) gender and number, whilst person distinctions are neutralized in the singular (Ledgeway 2012: 300–301).
11 Among the main existential proforms of modern Romance there occur gh(e) < j < ///  (Benincà 2007), bi (Log./Nuo., It. vi < , Wagner [1960–1964]; Blasco Ferrer [2003: 61]), It. ci <  //   (Maiden 1995: 167), Cat. hi <  (Badia i Margarit 1951: 266), Fr. Y, Prv. I < , ,

Mechanisms and paths of grammaticalization and reanalysis in Romance


[2009] for an overview; also Cennamo [2016: 977–978] and references therein). The grammaticalization of a locative adverb into a non-referential existential proform, however, is not complete yet in Romance (Ciconte 2015: 218; Bentley and Ciconte 2016: 856). With existential patterns, optionality of agreement – a subsequent stage on the grammaticalization path gradually leading, in some languages/varieties, to the loss of agreement with the postverbal nominal, both in existential and presentative constructions, as shown in Tables 7 and 8 below – reflects different D(efiniteness) E(ffect)s and especially the Specificity scale, ranging from 1st, 2nd persons to nonspecific NPs, according to the following implicational scale: 1st/2nd person pronoun > 3rd person pronoun > definites > definites+some indefinites > indefinites (Bentley 2013: 684). In Venetian dialects, copula agreement is found only with 1st and 2nd person pivots, whereas in Catalan and some Tuscan dialects only pronominal pivots agree with the existential copula. In Spanish, Galician, and European Portuguese, on the other hand, copula agreement is restricted to specific pivots (Bentley 2013; Bentley, Ciconte, and Cruschina 2015; Bentley and Ciconte 2016: 858). Consistent agreement of the pivot with the copula is found instead in most Friulan dialects (Haiman and Benincà 1992: 185), Italian (where, however, spoken registers display a tendency towards non-agreement, e.g., c’era dei contadini [ be..3 some farmers] ‘there were farmers’, Koch [2003: 158]), central and southern Italo-Romance dialects, Corsican, Campidanese and some Nuorese Sardinian dialects, as well as Romanian (sunt eu [be..1 I] ‘there is me’) (Bentley and Ciconte [2016: 857]). In contrast, lack of agreement characterizes French, Ladin, some Romansh dialects, some far southern Italo-Romance varieties (e.g., in Puglia and Calabria) as well as Brazilian Portuguese, where ter ‘have’ < Lat.
tenere ‘to hold’ is found as an existential copula (tem muitos caroços nessa fruta [have.3 many seeds in.that fruit] ‘There are many seeds in that fruit’, Bentley and Ciconte [2016: 857]). In Spanish, non-canonical agreement of the postverbal NP with existential haber ‘have’ occurs mainly in past tenses and is also found in written genres (… habían también oficiales ingleses … [have..3 also officers English] ‘… there were also English officers …’), reflecting also geographical and sociolinguistic factors (Meulleman 2012: 421, note 6 and references therein). French il y a ‘there is’ has become a fixed formula, an introducer of thetic propositions, occurring in a fixed position at the front of the sentence. Unlike Spanish hay ‘there is’ (< ha [have..3] + the enclitic morpheme y < OSp. ý ‘there’; Sánchez Lancis 2001: 107–109) and Italian c’è ‘there is’, French il y a shows lack of agreement and a high degree of attrition. It is phonologically reduced to a monosyllabic form ya in the spoken language (albeit also attested in newspapers), thereby displaying a higher degree of grammaticalization than Spanish hay and Italian c’è in relation to this parameter, for which the

(Blasco Ferrer 2003: 61), en (Arag.) <  (Blasco Ferrer 2003: 61), Log., Nuo., Cpd. (n)che, (n)ci (Wagner 1960–1964, 1: 624), southern and northern Italo-Romance nd(i), ne <  (Maiden 1995: 167).


Tab. 7: Grammaticalization cline of agreement with existential patterns.

Cline: Consistent agreement [+] (– grammaticalized) > Optionality of agreement [±] > Lack of agreement [–] (+ grammaticalized)

Consistent agreement [+]: Friul., It., CIDs/SIDs, Cors., Cpd., Nuo., Ro.
Optionality of agreement [±] (parameters: DE, Specificity scale): Sp. (some registers/areas), Ven. (+ with 1st/2nd), Cat./EPt. (+ with pronominal pivots)
Lack of agreement [–]: Fr., NIDs, Lad., Romsh., Pugl., Cal., BPt.

Italian existential construction is the least grammaticalized among these three languages (Meulleman 2012: 422; 445; 447). Discourse-pragmatic factors interact with thematic (e.g., agentivity/affectedness) and aspectual notions (e.g., the presence/absence of a state component in the verb’s logical structure) in determining the presence, lack and optionality of agreement with the postverbal nominal with presentatives, organized along the unergative/unaccusative continuum in presentative structures with intransitive verbs (under the gradient view of the divide, as proposed by Sorace [2000, 2004, 2011]). In French the pattern notably occurs with unaccusatives (8a), and is only marginally available with unergatives (8b) and unacceptable with transitives (Legendre 1990; Creissels 2010). In northern Italian dialects the construction is possible with both unaccusatives and unergatives, with different agreement patterns (Cennamo 1997a: 153–155; Parry 2000; Bentley 2018). Agreement obtains with activity verbs (8c), but is instead optional with states (8d–e), achievements (8f–g) and accomplishments (8h–i), depending on the presence/lack of an etymologically locative clitic in sentence initial position, which triggers lack of verbal agreement with the postverbal nominal, as shown in (8f, h) (examples and analysis from Bentley [2018], and further references therein; see Cennamo and Sorace [2007: 95–96] for agreement variability in Paduan; Parry [2000] for Piedmontese; Parry [2010, 2013a]).

(8)

a. il est arrivé trois personnes
    be..3 arrive.... three people
   ‘Three people arrived’
b. Il y nage la côte d’un chou … (sc. soup)
    there swim..3 the stem of-a cabbage
   ‘There swims in it (sc. the soup) the stem of a cabbage’ (Legendre 1990: 87, note 7)
c. an ciamà i tò gent/tanti malà (Mil., Lombardy) (activity)
   have..3 call.. the your people/many.. patients
   ‘Your parents/Many patients called’


d. ier gh’ è astà mal i me fiö/tanti fiö (state)
   yesterday  be..3 be.. unwell the my children/many children.
e. ier in astà mal i me fiö/tanti fiö
   yesterday be..3 be.. unwell the my children/many child.
   ‘Yesterday my children/many children were unwell’
f. gh’ è rivà i tò surei/di pac (achievement)
    be.3 arrive.. the your sister.. /some parcel..
g. in rivà i tò surei/ di pac
   be.3 arrive.. the your sister.. /some parcel..
   ‘There arrived some parcels/your sisters’
h. gh’ è vegnù gió i sciur del pian de sura [-agr]
    be..3 come.... down the people of.the floor of upstairs
i. in vegnì gió i sciur del pian de sura
   be..3 come... down the people of.the floor of upstairs
   ‘There came down the people from upstairs’

In this respect, Italo-Romance dialects instantiate the highest degree of variation and French the highest degree of consistent lack of verbal agreement, i.e., of grammaticalization (Dobrovie-Sorin 1994; Legendre 1990; Legendre and Sorace 2003 and references therein), the finite verb always reverting to the unmarked 3rd person singular and the past participle to the masculine singular in compound tenses, as shown in (8h) for Milanese.

Tab. 8: Grammaticalization cline for agreement with presentatives in French and Italo-Romance.

Cline: Consistent agreement + (– grammaticalized) > Optionality of agreement ± > Lack of agreement – (+ grammaticalized)

Consistent agreement +: + AGR: activity verbs (NIDs)
Optionality of agreement ±: + EXPL: – AGR: states, achievements, accomplishments (NIDs); – EXPL: + AGR, all aspectual classes (NIDs)
Lack of agreement –: all aspectual classes (Fr., NIDs)


In compound tenses in most Romance languages/varieties, the masculine singular ending of the (active) past participle has been reanalysed as a default agreement marker, signalling lack of agreement, with transitives/unergatives and even unaccusatives in Spanish and Catalan (when the auxiliary ‘have’ occurs, see Loporcaro [2016a: 803–804]). The default masculine singular also occurs with a noncanonical (clausal) controller, like the infinitival subject in Italian (e.g., fumare è diventato/*a una abitudine [smoking. be..3 become..../*. a habit.] ‘Smoking has become a habit’), a property taken up by the neuter singular ending/form in Surselvan (e.g., tgei ei succediu? [what be..3 happen...] ‘what happened?’) and some central Italian dialects (Loporcaro 2016a: 804 and references therein). In contrast, in Aromanian, Megleno-Romanian and western Romanian varieties an invariant morphologically feminine form occurs, for all non-finite forms of the verb (past participle, supine, gerund and long infinitive, e.g., am mâncată [have..1 eat....] ‘I (/) have eaten’, Ledgeway [2012: 295, note 12]).

3.4.1.1 Subject clitics and agreement markers

A characteristic feature of some Romance languages from southern France, Switzerland and a part of northern Italy, including Tuscany, is the occurrence of subject clitics (Pescarini 2016; Poletto and Tortora 2016 and references therein), functioning as agreement/agreement-like markers, in (so-called) subject clitic doubling patterns, with different languages/varieties instantiating different points along the well-known clitic-agreement grammaticalization cline (Hopper and Traugott 2003: 142, among others). A variety of Quebec French – where the construction is optional, restricted to some syntactic environments (e.g., ungrammatical with relative clauses and Wh- questions) and semantically constrained, e.g., only possible with definite subjects (la sirène (a) chante chaque matin (*a sirène …) [the mermaid she sing...3 every morning] ‘The mermaid sings every morning’) – represents a less advanced stage on the grammaticalization cline (Cournane 2010), unlike nonstandard French, where the subject clitic pronoun il ‘he’ has become a bound agreement marker, indexing agreement with the lexical subject (ma femme il est venu [my. wife  be...3 come. ...] ‘My wife has come’; Lambrecht [1981: 40]; Hopper and Traugott [2003: 15; 223]). Northern Italian dialects show a highly varied picture, not only as regards the grammaticalization process, but also in relation to the etymological source and current syntactic function of subject clitics. These are reported also for some southern Italian (Sicilian) dialects, where they have been reanalysed as aspectual and modality markers (respectively in Pantelleria [Loporcaro 2012a] and Palermo [Sorrisi and Giorgi 2012; Poletto and Tortora 2016: 783, note 20]), whilst in Paduan the clitic a < Lat. pronoun  ‘I’ carries a pragmatic function, marking a sentence as conveying new information, e.g., a piove [cl rain..3] vs. piove ‘It is raining’ (Benincà 1983, 2014; Poletto and Tortora 2016: 772). Dialects such as Trentino and generally


Friulan, Bolognese, Florentine, as well as Piedmontese and Ligurian varieties (Mair Parry, p.c.) realize a more advanced stage towards the agreement marker status of the clitic, since subject clitic doubling is obligatory in a wider range of syntactic contexts and semantically unconstrained (Tr. la Maria la magna [the Mary  eat..3] ‘Mary is eating’; Poletto and Tortora [2016: 778–779, 783 f.] and further details and references therein).

3.4.2 Object agreement

Object agreement is fully grammaticalized with pronominal third person Os in compound tenses in Italian, French and Occitan, which display consistent past participle agreement, whilst participial agreement with SO arguments/subjects reflects lexico-aspectual and thematic constraints (e.g., agentivity), as in French, where participle agreement is retained with subjects selected by verbs denoting change of state/location and existence (Legendre 1990; Legendre and Sorace 2003; Loporcaro 2016a: 806 and further details and examples). A lower degree of grammaticalization is also found in the eastern Abruzzese dialect of Arielli, where O/SO participial agreement is determined by number, whereby agreement occurs with any plural nominal, thus involving also A/SA arguments (D’Alessandro and Roberts 2010; Ledgeway 2012: 349).

3.4.2.1 Object doubling and agreement markers

The doubling of (most typically human) direct objects is a characteristic feature of Latin American Spanish (e.g., Rioplatense, Buenos Aires Spanish, see [9a], Company-Company [2003]; David [2014], among others), Romanian (9b) (Dobrovie-Sorin 1994; Pană Dindelegan 2013b: 136–139; David 2014) and several Italian dialects (Rohlfs 1968: 168; Manzini and Savoia 2005, II: 502–523), most notably from the south (e.g., Neapolitan) (9c) (Ledgeway 2000: 37 f., 2018; Fiorentino 2003), where it is also possible with inanimate objects (Ledgeway 2009: 354), with examples reported also for some dialects from Romagna, Triestino and several other northern dialects. It is confined to the high positions of the animacy hierarchy, namely first and second persons (Manzini and Savoia 2005, II: 523; Roberts 2016: 801), marked by the preposition a (Lat. a(d) ‘to, towards’) in Spanish (where, however, the preposition a has been reanalysed as a case-marker for all types of direct objects, i.e., as an accusative marker) (Company-Company 2003: 231; Belloro 2007; Leonetti 2008, among others), a/ma/da in Italo-Romance (Manzini and Savoia 2005, II: 502), pe (< Lat. per ‘through’) in Romanian (‘on’ in its lexical function) (Pană Dindelegan 2013b: 128; Mardale 2015: 201). In these languages/varieties the object is doubled by means of a clitic pronoun agreeing with it in number, person, gender (depending on the variety), functioning as a type of object agreement. The construction results from the grammaticalization of an erstwhile clitic as an agreement marker on the


verb, occupying a fixed position, adjacent to the verb, and most typically co-occurring with Differential Object Marking, linked to its licensing conditions (e.g., human referents, specificity, depending on the language/variety). In Spanish, the dative clitic has grammaticalized to an invariant, affix-like form, le, functioning as an object marker, referring to an indirect object [± animate], a process also known as dative clitic doubling. It is initially found with inanimate arguments, attested by the 16th century, and frequently used with inanimate participants by the 20th century (9d) (Company-Company 2003: 237):

(9)

a. La encuentro a la tía
   .. meet..1  the woman
   ‘I meet the woman’ (Quilis et al. 1985: 105; Belloro 2007: 53)
b. Îl văd pe Ion
   ... see.prs.1sg  Ion.
   ‘I see Ion’ (Pană Dindelegan 2013b: 136)
c. l’ hai visto a quello? (Neap.)
   .. have..2 see...  that..
   ‘Did you see that man?’
d. no hay que darle tanta importancia a las apariencias
   not have.3 that give..MRK so-much importance to the appearances..
   ‘Not so much importance should be given to appearances’ (spontaneous speech)

4 Grammaticalization of complex constructions

Five main grammaticalization processes can be detected in the domain of complex constructions and clause linking, partly as a result of the gradual emergence of head-marking coding systems in the passage from Latin to Romance: (i) the fixing of dedicated positions for the lexical verb and the A and O arguments (i.e., the subject and object relations) (§ 4.1), (ii) the emergence of (finite and) non-finite complementizers, with different types and degrees of lexicalization of the realis/irrealis distinction, interacting with the marking of the control relation (§ 4.2), (iii) the pressing into service of grammatical elements such as prepositions, (originally deictic pronominal) complementizers, adverbials, to form complex conjunctions/phrases, introducing different types of adverbial clauses (§ 4.3), (iv) the rise of invariant


relative markers as a type of relativization strategy (§ 4.4), (v) insubordination strategies (§ 4.5).

4.1 Word order

A characteristic feature of the transition from Latin to Romance is the fixing of the position of the verb and its core arguments and adjuncts according to the sequence S V O ADV, and the grammaticalization of the verb medial position (with SVO and OVS being statistically the main word order patterns in Late Latin texts) (Herman 2000: 86; Ledgeway 2011c: 725), resulting in so-called V2 syntax, owing to the raising of the verb to the C(omplementizer) position in the left edge of main clauses, preceded by constituents fronted from the sentence core for pragmatic reasons, generally in topic/focus function (cf. autre chose ne pot li roi trouver [other thing not could the king find], OFr., M. Artu 101; Ledgeway [2011c: 725]; a questo resposse Iasone [to this answered Iason], ONeap., LDT 60.12; Ledgeway [2012: 66]). For a thorough and perceptive description of word order changes in the passage to Romance see Ledgeway (2011c: 725, 2012: 64–71). V2 syntax is a well-attested feature of medieval Romance, especially Gallo-Romance and Raetho-Romance (Benincà 1994; Vincent 1997b, among others), although old Sardinian differs, in that the placing of another constituent before the fronted verb is far less common, resulting in frequent V1 (Lombardi 2007; Wolfe 2015, 2018). V2 syntax is retained in some contemporary varieties where the medium of culture is the German language (Romansh in Grisons and Ladin in the Alto-Adige/Südtirol) (Haiman and Benincà 1992: 167–175; Ledgeway 2012: 65; Salvi 2016: 1009–1010).
However, see Sornicola (2000) for a different view and Sitaridou (2012: 556; 595; 597–598) for a refinement of the V2 typology in old Romance, considered as an ‘epiphenomenon of information structure’, reflecting (degrees of) grammaticalization of functional projections, resulting in restrictions on linearization and on word order options, with old French instantiating a structural V2 and old Portuguese, old Occitan and old Spanish a linear V2; see also the recent discussion in Wolfe (2018). At this initial stage, therefore, word order is grammatically fixed at the level of the sentence core constituents, but it is pragmatically flexible in the left periphery of the clause (Ledgeway 2011c: 725; 2012: 68; Ledgeway 2016b: 771; Cruschina and Ledgeway 2016: 571–572). Subsequently, the frequent fronting of the topic subject in the left edge sentence initial position led to the grammaticalization of the sentence initial position as the subject unmarked position in main clauses, leading to the order S (AUX) V (*ADV) (O) (IO) (*ADV) (see Ledgeway [2011c: 725–726, 2012: 71, 2016b: 770–771] for an in-depth investigation of the issue and related references) and to the reanalysis of weakened (optionally raised) preverbal subject pronouns as obligatory subject clitics, subsequently developing into agreement markers (e.g., Pad. la se verze it (sc. puerta) [ opens] ‘it (sc. the door) opens’) (Poletto 1995; Parry 2010,


2013a; § 3.4.1 above).12 The fixation of word order has resulted in SVO as the main word order type in French (only allowing also VOS and, marginally, OVS), regarded as the most grammaticalized Romance language with respect to word order, as also shown by the constraints on the VOS order, most restrictive in French, where the pattern is possible only when the postverbal S has identificational focus (Kiss 1998) (e.g., paieront une amende tous les automobilistes en infraction [pay..3 a fine all the drivers in breach-of-the-law] ‘all drivers who will infringe the law will be fined’, Lahousse and Lamiroy [2012: 402]) but not when it conveys (new) information focus (*prend le microphone le directeur technique [take..3 the microphone the technical director] ‘the technical director is taking the microphone’, Lahousse and Lamiroy [2012: 402]). This feature stems from a grammaticalization process that is well advanced in French, less prominent in Italian and even less so in Spanish (as independently confirmed by the increasing use of cleft sentences in modern French compared with other Romance languages; cf. discussion in Lahousse and Lamiroy [2012: 395–396] and references therein).13

12 On the controversial related issue whereby undergoer subjects, i.e., SO/O arguments are excluded from the unmarked preverbal subject position in some languages/varieties, e.g., Italian, most typically occurring, in the unmarked VS order, see Ledgeway (2012: 69); Salvi (2016: 1002–1003), also with reference to the unmarked preverbal position of subjects, (Dubert and Galves 2016: 427) on the difference between European and Brazilian Portuguese.
13 French is sometimes regarded as being more advanced in the grammaticalization process than the other Romance languages also in other domains such as auxiliaries (Lamiroy 1999), mood (Loengarov 2005; 2006), determiners (Carlier 2007), prepositions (Lamiroy 2001), existential sentences (Meulleman 2012: 260) (Lahousse and Lamiroy 2012: 408).

4.2 Complement clauses

Finite complements are generally marked by a reflex of Latin  > que/che ‘that’, plus a preceding preposition reflecting the valency of the verb in Ibero-Romance (Sp. me acuerdo de que es … [I. remember..1 of this] ‘I remember that it is …’ < me acuerdo de esta cosa), where in colloquial varieties this complex type of complementizer (valency selected preposition + complementizer que) is spreading to verbs which do not select a prepositional complement (dice de que es el cumpleaños de su hermano [say..3 from comp be..3 art. birthday of his brother] ‘He says that it is his brother’s birthday’, Kabatek and Pusch [2011: 88]; see also Ledgeway [2016c: 1015]). For propositional infinitival complements there occur either Ø, as in Gallo- and Ibero-Romance (cf. Fr. il savait s’être trompé [he know..3  be. be-mistaken.] ‘He knew he had made a mistake’, Ledgeway [2016c: 1015]), or the continuant of the Latin preposition  ‘of, from’ in Italo-Romance (It. sapeva di sbagliare [know..3 of be-mistaken.] ‘He knew he was wrong’), whilst irrealis infinitival complements are often headed by a reflex of Latin  ‘to’. The distinction, however, is not clear-cut and the distribution of these infinitival markers is varied and often arbitrary/non-systematic (Ledgeway 2016c: 1015–1016). Thus, in Italian and French a and de may alternate as markers of irrealis complementation, according to the verb and its valency, whilst in Catalan de has been grammaticalized as the default complementizer, often employed for realis and irrealis complement clauses (Wheeler, Yates, and Dols 1999: 395; Ledgeway 2016c: 1016). The choice between different complementizers introducing an infinitival clause may also convey different control relations, in addition to the realis/irrealis distinction, depending on the verb (e.g., subject control with de and object control with a), as in Italian (e.g., Marco convinse tutti di aver preso la decisione giusta [Mark convince..3 everybody of have. take. the decision right] ‘Mark convinced everybody that he had taken the right decision’ vs. Marco convinse tutti a prendere la decisione giusta [Mark convince..3 everybody to take. the decision right] ‘Mark convinced everybody to take the right decision’), or just control relations (see Ledgeway [2016c: 1016]; Cruschina and Ledgeway [2016: 565–568] for an in-depth investigation of the variation encountered in Romance). Some upper southern Italian dialects instead exhibit a dual finite complementizer system, employing the complementizer ca (< Lat.  ‘because’) for propositional (indicative) complements and che/chi (< Lat.  ‘that’), mu/ma/mi (< Lat. ()), cu (< Lat.  ‘that’) for irrealis complement clauses (Ledgeway 2016c: 1018–1019), surfacing syntactically when the distinction is formally neutralized and both complementizers may occur in either context, although they cannot precede topics/foci in irrealis contexts (Ledgeway and Lombardi 2014; Ledgeway 2016c: 1019 and § 3.3.1).
Some varieties (e.g., Galician and north-western Italian dialects; Paoli 2003; Ledgeway 2016c: 1019–1020) display so-called recomplementation, whereby the complementizer occurs twice, around a fronted topic or focus constituent (10), restricted to the subjunctive in north-western Italian dialects (Paoli 2003; Ledgeway 2016c: 1020), similar to the Romanian recomplementation pattern consisting of the irrealis complementizer să (< *se < Lat.  ‘if ’) preceded by the complementizer ca (< Lat. ()), in the fixed group ca … să ‘in order for, for … to’ (Gheorghe 2013: 467; 469–470; Maiden 2016a: 120), whilst also attested in infinitival clauses introduced by the complementizer de ‘of ’ in old Sardinian (Vincent [2006: 12]; full discussion in Ledgeway [2016c: 1020, note 11]):14

(10) Dixéronme que a ese rapaz que o coñecemos na festa
     say. .3.I.  that to that boy that him meet..3 in.the party
     ‘They told me that we met the guy at the party’ (Galician; Ledgeway 2016c: 1020)

14 The pattern is also found in old Romance varieties, e.g., old French, old Spanish, old Tuscan (Ledgeway 2016c: 1019–1020).

Romance also displays different degrees of interlacing (Lehmann 1988: 204–209) between matrix and dependent infinitival clauses, as with so-called restructuring


verbs (Rizzi 1982), modal, aspectual and motion verbs showing obligatory/optional clitic climbing of the infinitival pronominal DO/IO/locative arguments and auxiliary switch (according to the language/variety), owing to the blurring of surface clausal boundaries, reflecting the degree of semantic integration between the matrix and dependent infinitive. Interlacing is higher in Neapolitan (11a), where clitic climbing is obligatory (auxiliary switch not applying, ‘have’ being the sole auxiliary) (Ledgeway 2000: 83), lower in Italian, displaying also optional auxiliary switch in this syntactic context (11b), the two syntactic properties patterning together when clitic climbing takes place (11c) (Ledgeway 2016c: 1022; Cruschina and Ledgeway 2016: 563–564 for further examples and references):

(11) a. o’ jamm a chiammà
        him go...1 to call.
        ‘We are going to call him’ (Ledgeway 2000: 83)
     b. Marco è/ha voluto tornare a casa/tornarvi
        Mark be...3/have...3 want.... return. to home/return..there
        ‘Mark wanted to return there’
     c. Marco vi è voluto tornare (*vi ha voluto tornare)
        Mark there () be...3 want.... return. there () have...3
        ‘Mark wanted to return there’

In the southern Italian varieties where the subordinate infinitive only occurs in restructuring contexts (Ledgeway 2016c: 1027), there occur paratactic-like complementation patterns with different degrees of grammaticalization of the matrix verb like aspectuals (e.g., ‘stand’), or modals (e.g., ‘want’) and, to a lesser extent, motion verbs in aspectual function (e.g., ‘exit’, ‘go’, ‘come’, ‘return’) and interlacing between the matrix and its complement/dependent verb (Manzini and Savoia 2005, I: 688–701).
Following Ledgeway (2016c: 1027–1028 and extensive discussion in Ledgeway 1997), three subtypes can be identified: (i) the juxtaposition of two inflected verbs, with clitics either climbing to the matrix/aspectual verb or occurring between the latter and the governed/dependent verb (12a–b), (ii) the sequence of the matrix and the dependent verb linked by the conjunction a (< Lat. ), homophonous with the complementizer a ‘to’ (Rohlfs [1969: § 76], but see Manzini and Savoia [2005, I: 688–689] for the proposal of regarding it as the complementizer a) (12c), (iii) juxtaposition of the aspectual and the dependent verb (with initial consonantal lengthening signalling the original dropped conjunction a), the aspectual occurring in an invariant form, functioning as a marker of imminental aspect (12d). This pattern is a recent development, confined to Salentino and the verbs ‘stand’ and ‘go’ (Ledgeway 2016c: 1027–1028; 2016d: 159):

(12) a. Væ u cæmǝ
        go..2 and call..1
        ‘You are going to call him’ (Minervino Murge; Manzini and Savoia 2005, I: 689)
     b. Ni veníanu pigliávanu a ra stazione
        us come..3 take..3 at the station
        ‘They would come and fetch us at the station’ (Cos.; Ledgeway 2016c: 1027)
     c. u stok a f’fatsǝ
        it stand..1 and do..1
        ‘I am doing it’ (Putignano; Manzini and Savoia 2005, I: 689)
     d. lu va/sta pperdu/pperdi …
        it go/stand. lose..1/lose..2 …
        ‘I am going to lose it’/‘I am losing it’ (Ledgeway 2016c: 1028)

Tab. 9: Grammaticalization of complementation markers/devices.

Finite complementation:
que (Fr., Pt., Cat., Ast., …)/che (It.) ‘that’; valency selected preposition + que ‘that’ in Ibero-Romance (de ‘of, from’ (Sp.), a ‘to’ (Ast.) …): propositional/irrealis complements
Dual complementizer system (some USIDs): ca (< Lat.  ‘because’): propositional; che/chi (< Lat.  ‘that’), mu/ma/mi (< Lat. ()), cu (Lat.  ‘that’): irrealis

Non-finite complementation:
propositional infinitival complements: Ø (Gallo-, Ibero-Romance); reflexes of Lat.  ‘of, from’, (It., Fr.); irrealis: a ‘to’: (It., Fr.); Cat. de: default complementizer (propositional/irrealis)
Interlacing matrix V-infinitive with restructuring verbs (modals, aspectuals, motion verbs): higher: Neap., lower: It.

Paratactic complementation: degrees of grammaticalization. Matrix V – dependent Verb (southern Italo-Romance): (i) juxtaposition two inflected verbs > (ii) sequence matrix V + conjunction a (Lat. ) + dependent V > (iii) juxtaposition aspectual V – dependent V (conjunction dropped) > (iv) aspectual in invariant form = aspectual marker


. Relative clauses Two grammaticalization paths can be detected in Romance as regards the relative clause domain: (i) the gradual loss of case, number and gender distinctions associated with the Latin ancestors of Romance relativizers qui/quae/quid, resulting in invariant markers – che/qui/que and variants, the French possessive relative adverb dont < Lat. + ‘from where’ (Grevisse 1980: §§ 1217–1236), originally marking source (e.g., Fr. qui/que, It. che, Sp./Cat. que, Pt. que < Lat. qui/quae/quid, Ro. care < Lat.  ‘which’/ce < Lat.  ‘what’ (Gheorghe 2013: 489; 492; 2016: 475; 479) – instantiating the [–case], gapping strategy, and reanalysed as complementizers once they can occur in all syntactic functions, like their homophonous complementizer que ‘that’ and its variants (cf. It./Fr./Sp./Cat./Pt. che/que < late Lat.  ‘that’). See also Stark (2016: 1037); Giacalone Ramat (2005: 4, note 4) for a discussion of the controversial pronoun-complementizer status of the markers que/qui and variants (e.g., It. la ragazza che lavorava con me [the girl who work..3 with me] ‘the girl who used to work with me’); (ii) the rise of new relative pronouns, encoding gender, number distinctions and the syntactic function of the relativized element, depending on the language/variety (cf. It. la quale [the.. which] ‘the which’), based on the grammaticalization of the Latin determiner  ‘that’ in conjunction with the adjective  ‘which’, a Romance innovation well attested in the early Romance texts (e.g., the 12th century in French, Spanish and Italian), built on the Latin ‘connecting relative’ (Giacalone Ramat 2005: 124 f. and further references therein), a non-restrictive relative clause with a sentence initial relative pronoun referring back to an NP in the preceding clause (e.g., Lat. id oppidum Lentulus … tenebat; qui … profugit ex oppido [this town Lentulus held; who fled from town] ‘Lentulus … held this town: who … left the town’ [Caesar. 
De bello civili I, 15]; Giacalone Ramat [2005: 123]). The relative pronoun il quale, [+ case-coding] strategy, was originally introduced as a textual device, characteristic of the written language, and occurring in non-restrictive clauses in subject and object function in old Italian, subsequently spreading to restrictive clauses and merging with the invariant relativizers paradigms (Giacalone Ramat 2005, 2008 and further references therein). Type (i), the [–case], gapping strategy occurs, most typically, for core arguments, namely subjects (A/S) and (non-differentially marked) objects (O), although also available with IOs and generally obliques for human antecedents (e.g., Fr. qui, Cat./Sp./Pt. que ‘who’) (Stark 2016). Type (ii) the [+ case-coding], relative pronoun strategy is employed with obliques and adjuncts (e.g., It. il quale, Fr. lequel, Sp. el cual, Pt. o qual ‘the which’). Both invariant markers (e.g., Fr./Cat. qui ‘who’) and relative pronouns (Pt. quem, Sp. quien ‘who’) are employed for human antecedents (Stark 2016: 1031–1032) (Cat. la noia a qui vaig donar les claus ‘The girl that I gave the keys to’, Wheeler, Yates, and Dols [1999: 536]), whilst Catalan has a special accented form què ‘that’ for inanimate referents, only used with obliques (Wheeler, Yates, and Dols 1999: 538–539; Stark 2016: 1033).

Mechanisms and paths of grammaticalization and reanalysis in Romance


A third type of strategy, characteristic of informal speech and of non-standard varieties, except in Romanian – where it is characteristic of the standard language and compulsory, also with DO and IOs, since the dummy preposition pe is often omitted in the spoken language (Gheorghe 2013: 490) – involves a mixed type, the use of an invariant marker and a resumptive pronoun specifying the syntactic function of the head NP. It occurs most typically with non-core arguments/adjuncts (cf. Pad. [Venetan, north-eastern Italo-Romance] il ragazzo che ci ho viaggiato insieme [the boy that him have..1 travel.... together] ‘the boy with whom I travelled’, with the stranded preposition insieme ‘together’, heading the comitative phrase [Cennamo 1997b: 195]). This strategy is also possible with DOs in (non-standard) Catalan (és una dona que la veiem cada dia [be..3 a woman that her see..1 every day] ‘she is a woman that we see every day’; Wheeler, Yates, and Dols 1999: 536), for subjects in Brazilian Portuguese (Mioto and Lobo 2016: 282) and in some northern Italian dialects (e.g., Paduan) in non-restrictive clauses (Cennamo 1997b: 191). For IOs, obliques and adjuncts, alongside an invariant marker (cf. also Table 10), there also occurs a relative pronoun, coding number and gender (Fr. lequel, It. il quale, Sp. el cual, Pt. o qual, Romanian care ‘[the] which’), as shown in Table 10, which summarizes the various strategies and their distribution on the Noun Phrase A(ccessibility) H(ierarchy). Despite the differences in the type and distribution of relativizers – not always conforming to the Noun Phrase AH to relativization proposed by Keenan and Comrie (1977), Comrie (1989: 156), Subject > Direct Object > Indirect Object > Oblique > Genitive/Possessor (> Object of Comparison), reflecting the ease of relativization of syntactic functions, with subjects more easily relativized than direct objects etc.
– two parameters appear to determine the type of strategy employed in Romance: (i) the syntactic function of the antecedent in the matrix clause, (ii) its animacy. Thus, gapping for the functions higher on the AH, subject and direct object, and a [+ case-coding] strategy for obliques and adverbials, i.e., along a continuous segment of the AH, appear to hold for standard Italian but not for non-standard varieties, the Italian dialects and Spanish, Catalan and French, where the [–case] strategy occurs for higher and lower positions on the AH (Cennamo 1997b; Stark 2016: 1032 and further references therein). In addition, French further differentiates subjects from objects, for which a different relativizer occurs, qui, the so-called ‘que → qui’ rule (Kayne 1976; Jones 1996: 507; Stark 2016: 1032), the same form found also for animate, human, personified antecedents with obliques (à qui, de qui, sur qui) (Grevisse 1980: § 189). Catalan too employs a different form, the stressed relativizer qui for human obliques (en qui, amb qui, a qui), vs the stressed form què for non-human and inanimate ones (Wheeler, Yates, and Dols 1999: 538–539).15 In Romanian, on the other hand,

15 A similar distinction obtains in early Italian varieties, with the forms ke/que/che most typically associated with non-human subject antecedents and inactive states of affairs (i.e., unaccusatives), the choice of relativizer in subject relatives thus reflecting the active-inactive typological realignment in the transition from Latin to Romance (Parry 2007: 215 and references therein).


Tab. 10: Distribution of relativizing strategies in Romance on the Accessibility Hierarchy.

| | Catalan | Spanish | Portuguese | French | Italian | Romanian |
| Subject | | | | | | (care) |
| Direct Object | | | | (qui) | | (care) |
| Indirect Object | | | | (qui) | | (care) |
| Oblique | + Human antecedent (qui), – Human antecedent (què) | | | (qui) | | |
| Possessor (Genitive) | | | | | | |

Legend: 1 = invariant marker; 2 = relative pronoun; 3 = invariant marker + resumptive pronoun

the primary/main strategy is instantiated by the relative pronoun care (< Lat. qualis ‘which’), invariant for case, gender and number in the nominative/accusative but inflected for number and gender in the genitive/dative (e.g., cărora-GEN/DAT.PL), with gender distinctions only overtly marked in the genitive/dative singular (e.g., căruia-M.SG, căreia-F.SG) (Gheorghe 2013: 489; Stark 2016: 1033), occurring for all syntactic functions, alternating with ce ‘that’ (< Lat. quid) for subject and object positions (see Stark [2016] for a pan-Romance overview).

4.4 Adverbial clauses

In Romance, interrogative words occur not only as markers of complement clauses and (restrictive) relative clauses (§§ 4.2–4.3), but of adverbial clauses as well (Heine and Kuteva 2006: 205 f.), a polysemy regarded by Haspelmath (1998b: 281–282) as a typical S(tandard) A(verage) E(uropean) feature, since it is shared with several other European languages (e.g., Germanic, Slavic, Modern Greek, Hungarian, Georgian), resulting from a contact-induced grammaticalization process of extension (of question markers to subordinators), with Romance and the Slavic languages as the innovating foci/centres of the change (Heine and Kuteva 2006: 225). Among the different conjunctions introducing adverbial clauses, conveying temporal, locative, causal, conditional/concessive information about the situations described by the main clause, only a few originate directly from Latin interrogative words (e.g., It. se, Fr. si ‘if’ < Lat. si ‘if’; Sp. cuando, Pt./It. quando ‘when’ < Lat. quando ‘when’; Fr./Cat. car ‘because’ < Lat. quare < qua re ‘which thing’; Fr. où, Cat. on ‘where’ < Lat.  ‘where’). Most typically, one finds instead complex conjunctions/phrases based on nominal, adverbial, prepositional, and, more rarely, verbal elements plus the complementizer ‘that’ (que and its variants), showing a high degree of coalescence (e.g., Fr. afin que lit. ‘to end that’, ‘in order that’; Cat. tot i que lit. ‘all and


Tab. 11: Grammaticalization in adverbial clauses.

| Type | Examples |
| Simplex conjunctions (< Lat. interrogative words) | It. se, Fr. si ‘if’ < Lat. si ‘if’; Sp. cuando, Pt./It. quando ‘when’ < Lat. quando; Fr., Cat. car ‘because’ < Lat. quare < qua re ‘which thing’; Fr. où, Cat. on ‘where’ < Lat.  ‘where’ |
| Complex conjunctions (nominal/adverbial/prepositional phrases, verbal element + ‘that’ (< que and variants)); high degree of coalescence | Fr. afin que ‘in order that’ (lit. ‘to end that’); Cat. tot i que ‘although’ (lit. ‘all and that’); It. allorché ‘when’ (lit. ‘at the hour that’) |
| Conjunction + simple adverbials | Fr. quand que, où que; NIDs |
| Differences in mood conveying differences in clause linking | When clauses + indicative: past, present, generic/habitual eventualities (It., Fr., Sp., Pt., Cat. etc.); When clauses + subjunctive: future eventualities (Sp./Pt.); + indicative (It., Fr.); Purpose clauses: subjunctive; Result clauses: indicative (Fr.) |

that’, ‘although’; It. allorché lit. ‘at the hour that’, ‘when’; Sp. siquiera lit. ‘if you wish’, ‘even if’) (see Harris [1988: 82 f.] for a discussion of the semantic shifts involved in the rise of some concessive markers in Romance). In some varieties of Canadian French and most northern Italian dialects, the conjunction ‘that’ (Fr. que) also tends to be added to simple adverbials, as in quand que, où que (Kabatek and Pusch 2011: 89). In some languages/varieties the difference between types of adverbial clauses is not conveyed by the conjunction but by a difference in mood, pointing to the grammaticalization of mood to convey differences in clause linking. This is the case in temporal clauses introduced by ‘when’ in Spanish/Portuguese and Italian, which select, for future eventualities, the subjunctive (Sp. te llamaré cuando llegue [you call..1 when arrive..1] ‘I will telephone you when I arrive’) and the indicative (It. ti telefonerò quando arriverò/arrivo [you call..1 when arrive..1/arrive...1] ‘I will telephone you when I arrive’), respectively. It is also the case in French, where the subjunctive vs the indicative differentiates purpose from result clauses introduced by one and the same conjunction (J’étais gentil de façon qu’il veuille me revoir [I-be..1 nice of way that-he want..3 me-re.see.] ‘I was nice so that he would want to see me again’ vs J’étais gentil de façon qu’il veut me revoir [I-be..1 nice of way that-he want..3 me re-see.] ‘I was nice and so he wants to see me again’, Quer [2016: 964]). The variation illustrated above is summarized in Table 11.

4.5 Insubordination

A different type of grammaticalization is insubordination, leading to the ‘conventionalised’, independent, main clause use of originally subordinate/coordinate


clauses and other constructions within this divide, resulting from three different mechanisms: (i) ellipsis, the omission of the main clause within the complex sentence (Evans 2007; Evans and Watanabe 2016: 3 and references therein), (ii) extension of a subordinate clause to an independent use on the basis of a perceived similarity with its original contexts of occurrence as a subordinate clause (Mithun 2008), and (iii) clause disengagement, whereby subordinate clauses displaying independent-like properties are freed from occurrence with their matrix clause and come to be used as independent clauses (Cristofaro 2016). These constructions are often compatible with different processes and sources, interacting at different stages of the grammaticalization process (Cristofaro 2016: 25). The phenomenon is very widespread among the Romance languages, with Latin ancestors for some of them, such as sequences of independent infinitives for indirect speech and the independent subjunctive (Lakoff 1968), also with an epistemic interpretation (non venias [neg come..2] ‘maybe you will not come’, Evans [2007: 388, 395]), these patterns being also fairly common in several different languages (e.g., Lithuanian), although with different markers of subordination (Evans 2007: 395 and references therein). The structural features associated with clause linking occurring in insubordinated patterns in Romance are, most typically, conjunctions or complementizers (e.g., if, because, that) and verbal morphology (e.g., main clause use of the infinitive or of the subjunctive in Italian, conditional in Spanish) (Evans 2007: 367; Cristofaro 2016; Schwenter 2016: 199). The functions covered by insubordinated patterns vary from commands, warnings, threats, modality (e.g., deontic/evidential uses) to negation, contrastive statements, reiteration (Evans 2007: 368). 
The insubordination of conditional clauses, very common crosslinguistically, is often associated with requests and suggestions, as in French (e.g., si on allait se promener? [if one go..3  walk.] ‘what if we went for a walk?’ (Evans 2007). In Spanish, ‘free conditionals’ are most typically associated with refutation (Schwenter 2016: 89 and references therein), being also available for requests, as in Italian, where both meanings are available (13a–c) (Lombardi Vallauri 2016: 146, 149), and in other languages, such as English (Evans 2007). These patterns may be regarded as stemming from ellipsis of the main clause, whose ease in recoverability may be viewed as reflecting the degree of conventionalisation, i.e., of grammaticalization of the insubordinated structure, low if the elided (main) clause is easily recoverable from the wider discourse context, high if only the general pragmatic content is inferable, leading to the intended meaning of the insubordinated clause (e.g., refutation vs request). In Spanish, ‘suspended/free conditionals’ may also occur in paratactic patterns in which the si clause conveys a discourse-pragmatic ‘causal explanation for a preceding assertion’ (the so-called ‘causal’ usage) (Schwenter 2016: 105), as in (13d), where the clause introduced by si ‘if ’ confirms the content of the preceding main clause (Juan está enfermo).


(13) a. Sp.
        ¡Si es horrible!
        if be..3 horrible
        ‘If it is horrible!’
     b. It. (refutation)
        (Ma) se è orribile!
        but if be..3 horrible
        ‘(But) if it is horrible!’
     c. It. (request)
        Se mi dice la pagina
        if I. tell..3 the page
        ‘If you could tell me what page’ (Lombardi Vallauri 2016: 146)
     d. Juan está enfermo, si lo he visto hoy en el médico
        Juan stay..3 ill if he. have..1 see.... today at the doctor
        ‘Juan is ill if I saw him at the doctor’s today’ (liter.) (if = because, since)

The grammaticalization path and mechanisms involved in these different uses include an initial change, leading to the main clause assertive declarative (refutation, request) use, with ellipsis of the main clause and reanalysis of the conditional clause as an independent clause; a subsequent step consists of the extension of insubordinated si-clauses to contexts with causal meaning, and ensuing ‘emancipation’ (Mithun 2008) of the original conditional marker si. In Spanish the high degree of grammaticalization of independent conditionals is also shown by their changed prosody, characteristic of declarative clauses, a feature that is not shared with the corresponding Italian constructions, which retain the intonational pattern of conditional clauses, a property which is in line with the lower degree of grammaticalization of the insubordinated pattern in Italian, where the construction is confined to informal, dialogic situations and excluded from the written language (Lombardi Vallauri 2016). Italian also shows examples of clause disengagement, as testified by self-standing that and because clauses (14a–b) (Cristofaro 2016):

(14) a. Che poi io continuo a pensare a lui
        that then I continue..1 to think. to he
        ‘That then I continue to think of him’
     b. No, perché poi questo workshop sembra interessante
        no because then this workshop seem..3 interesting
        ‘Besides, this workshop looks interesting’ (Cristofaro 2016: 6)


In French, alongside insubordinate conditional clauses, there also occur insubordinate concessives introduced by the conjunctions bien que ‘even though’, quoique ‘although’, puisque ‘since’ (Debaisieux 2007; example [15] below), with the verb in the indicative rather than the subjunctive, as in canonical concessives, and characterized by a different syntactic behaviour (Debaisieux 2007: 5). These patterns too appear to be a case of clausal disengagement (Cristofaro 2016: 7), the insubordinate clause being similar in meaning to the preceding one introduced by pourquoi ‘why’, focusing on a discourse topic related to shared knowledge/background between speaker and hearer, as in (15) (from Debaisieux 2007: 7):

(15) pourquoi moi je me suis mariée avec euh – avec
     why I.dat I I. be..1 marry.... with uhm with
     Ralph puisque – mon mari s’appelle Ralph – il est
     Ralph since my husband .call..3 Ralph he be..3
     allemande a trente euh trente-six ans et
     German have..3 thirty uhm thirty-six years and
     ‘Why I married Ralph, since – my husband’s name is Ralph – he is German and he is thirty, thirty-six years old’

Similar examples of insubordinated clauses arising from clause disentanglement are provided by the grammaticalization of the Gascon complementizer que ‘that’ as an assertive marker, the so-called ‘enunciative particle’, in independent clauses (Que cresi que benlhèu qu’ei ua chança finalament [ think..1 that perhaps  be..3 a chance finally] ‘I think that it is perhaps a chance after all’) (Giurgea and Remberger 2016: 864; Oliviéri and Sauzet 2016: 342; also Cruschina and Ledgeway 2016: 568 and references therein), and by Italian ‘that/because’ and ‘when’ clauses, quite common also in non-Romance languages (e.g., German, English, Japanese) (Cristofaro 2016 and references therein).
In these patterns the complementizer che and the causal conjunction perché introduce a new discourse topic, related to shared knowledge with the hearer (14a–b), and a new unexpected event in the case of when clauses (16), encoding new information rather than resuming shared knowledge between speaker and hearer (Cristofaro 2016: 6–7, 9, 11):

(16) Eravamo sulla spiaggia quando all’improvviso un bambino cominciò a urlare
     be..3 on-the beach when all-of-a-sudden a child start..3 to cry.
     ‘We were on the beach when a child suddenly started crying’

A common pattern is also the use of the coordinating conjunction and in Italian to introduce a new discourse topic, in an insubordinated-like use, a phenomenon quite common in several languages (Cristofaro 2016 and references therein) (17):


(17) E i tuoi hanno degli inviti per Natale?
     and the your.parents have..3 any invitations for Christmas
     ‘And do your parents have any invitations for Christmas?’ (Abrupt beginning of a conversation) (Cristofaro 2016: 20)

5 Other patterns of grammaticalization and reanalysis

5.1 Causative constructions

Analytic causative constructions – consisting of the verbs ‘make’, ‘let’, ‘order’, denoting different types of causation, followed by a dependent infinitive, which may be inflected in Portuguese and Galician (cf. Ledgeway [2016c: 1018]; Sheenan [2016] for a full description, also Vincent [2016a] for a general discussion of the diachrony of the construction in the transition from Latin to Romance) – display different degrees of grammaticalization and even degrammaticalization for Spanish and Portuguese, witnessed by their different morphosyntactic behaviour in relation to a number of parameters, among which (i) the realization of the argument structure of the dependent infinitive (e.g., the case-marking of the causee, encoded as O(bject) if in the accusative/dative (as in Italian, French, Spanish) and as S(ubject) if in the nominative, as in Portuguese), (ii) the order of the matrix verb, the dependent verb and the (nominal) causee and its interpolation, according to three sequences: VV(O), VOV and VSV, (iii) differential behaviour with reflexive verbs: obligatory presence of the reflexive morpheme if the pattern is biclausal, lack of the reflexive if the matrix verb and its dependent infinitive form a complex predicate (Davies 1995; Soares da Silva 2012). Confining the discussion to coercive causation, as expressed by the verb ‘make’, Italian fare and French faire show a high degree of semantic bleaching, with Italian fare displaying higher syntactic coalescence with its dependent infinitive, always forming a monoclausal pattern, i.e., a complex predicate, with both transitive and intransitive dependent infinitives. The semantic bleaching of the verb is shown by its high degree of generality, whereby It. fare and Fr. faire, for instance, overlap in some of their uses with the permissive causative verbs lasciare/laisser ‘let’ (It.
fammi vedere [make..2.I. see.] ‘let me see’), unlike Spanish hacer and Portuguese fazer,16 which only have the original coercive causation meaning (cf. Pt. fazer ver/Sp. hacer ver [make see] ‘make [someone] see’) (Soares da Silva 2012: 520). The syntactic coalescence of the matrix and dependent verb, i.e., the monoclausality of the sequence ‘make’ + infinitive in Italian, is revealed by the case-marking of the causee, in the accusative/dative with an intransitive/transitive dependent infinitive, respectively (18a–b), the lack of interpolation (18b), and the lack of se in the dependent reflexive infinitive (18c):

(18) a. Marco fece partire i ragazzi
        Mark make..3 leave. the boys
        ‘Mark made the boys leave’
     b. Maria fa studiare molto la matematica a Marco/gli
        Mary make..3 study. much the maths to Mark/he.
        fa studiare … /*gli fa molto studiare
        make..3 study. he. make..3 much study.
     c. il rumore fece svegliare i ragazzi/li fece svegliare/*svegliarsi
        the noise make..3 wake-up. the boys/they. make..3 wake-up./*wake-up..
        ‘The noise woke up the children’

16 In European Portuguese, causative constructions are most productively formed with mandar, the continuant of Lat. mandare ‘entrust’ > ‘command’ (Sheenan 2016: 984, among others), albeit in the dialects (except in Madeira) (CORDIAL-SIN corpus, Pereira 2015: 64–70) causative constructions with fazer are widespread. Fazer and mandar instead are both employed in Brazilian Portuguese (Soares da Silva 2012; Sheenan 2016; Sheenan and Cyrino 2016 and references therein).

In French, on the other hand, se may occur with the dependent reflexive infinitive and can be interpolated so as to be close to the infinitive (19) (see Soares da Silva [2012: 529–530] for a full discussion of the various parameters revealing the different degrees of grammaticalization of causative constructions):

(19) Le bruit les fait se lever
     the noise them make..3  get.up.
     ‘The noise wakes them up’ (Soares da Silva 2012: 530)

Spanish and Portuguese also display a different degree of structural cohesion between the matrix (causative) verb and its dependent infinitive, clearly detectable through the position and case-marking of the causee in the sequence matrix-dependent verb. This is higher in Spanish, which most typically displays the VV(O) order (hizo traer un regalo a su mujer [make..3 bring. a present to his wife] ‘He made his wife bring a gift’), characteristic of the monoclausal pattern, although allowing the order VOV with transitive dependent verbs and ambiguity of interpretation (Juan hizo a su mujer traer un regalo [John make..3 to his wife bring. a present] ‘John had his wife bring a gift/John made someone bring a present to his wife’) (Cano Aguílar 1981: 243 in Soares da Silva 2012: 531–532). Spanish also variably encodes the causee in the accusative/dative, depending on the variety, with both transitive and intransitive verbs and different constraints on clitic doubling (Juan le hizo saltar (**a María) [Juan she. make..3 jump. to María] ‘Juan


made her jump’, Peninsular leísta Spanish vs. Juan la hizo saltar a Maria [Juan she. make..3 Mary] ‘Juan made Mary jump’/le hizo comprar el auto a María [(sc. Juan) he. make..3 buy. the car Mary] ‘Juan made Mary buy the car’, Rioplatense Spanish) (Sheenan 2016: 986). By contrast, in (European and Brazilian) Portuguese, the most common construction with a transitive dependent verb is VOV (Soares da Silva 2012: 531, n. 11), when the causee is [+]. The latter is encoded in the accusative (a Maria fez/mandou os miúdos (-os) ler esse livro [Mary make/ order..3 the children (them) read. that book] ‘Mary made the children read that book’, Soares da Silva [2012: 526]), also with intransitive verbs, the dative being less acceptable when the causee is a clitic (?? fiz-lhe corer [make..3 –he. run.] ‘I made him run’, Soares da Silva [2012: 530–531]). The causee is encoded as nominative with the inflected infinitive (a Maria fez/mandou os miúdos (eles/ *-os) correrem/(eles) lerem esse livro [Mary make/order..3 the children (they/ *them) read...3 that book] ‘Mary made the children read that book’), a pattern that arose in the 16th century (Soares da Silva 2012: 526). Thus, the degree of syntactic cohesion between the causative matrix verb and its dependent infinitive is lower in Portuguese than in Spanish. 
The synchronic variation partially illustrated above has recently been interpreted as revealing relative degrammaticalization in Spanish and Portuguese in this grammatical domain, and the following general cline has been proposed, whereby the Italian causative construction is more grammaticalized than the corresponding French, Spanish and Portuguese patterns, with Italian and Portuguese instantiating the highest and lowest degrees of grammaticalization, respectively (Soares da Silva 2012: 532–534; 546–547):

(20) –  fazer + Inf  >  hacer + Inf  >  faire + Inf  >  fare + Inf  +
        Portuguese      Spanish         French          Italian

The degree of variation for coercive causation and some of the parameters determining it in the Romance languages discussed are schematized in Table 12:

Tab. 12: Grammaticalization clines in the coercive causation domain.

Semantic bleaching: Coercive meaning (Sp., Pt.) | Coercive and permissive meaning (Fr., It.)

| Syntactic coalescence | Realization of argument structure of dependent infinitive (encoding of causee) | Order matrix-dependent V, nominal causee and its interpolation | Presence/absence of se with reflexive verbs |
| monoclausality | accusative/dative (It., Fr., Sp.); agentive/instrumental | VV(O) (It., Fr., Sp., Pt.) | – se |
| intermediate | (Sp., Pt.) | VOV (Sp., Pt.) | + se |
| biclausality | (marginally . with intr.) (Pt.) | VSV (Pt., inflected infinitive) | + se |


5.2 Negation

Three types of negation are found in Romance: (i) preverbal negation (It. non vedo bene [not see..1 well] ‘I cannot see well’), (ii) discontinuous negation, with a pre- and post-verbal negative marker, the latter being either optional, as in Italian (mica in non è (mica) stupido [not be..3 not stupid] ‘He is not stupid’) and the other Romance languages, generally conveying the negation of a conversational implicature (so-called ‘presuppositional negation’), or obligatory, as in French (pas in je n’ai pas faim [I not have..1 hunger] ‘I’m not hungry’), and (iii) postverbal negation (coll. Fr. j’ai pas faim [I have..1 not hunger] ‘I’m not hungry’) (De Clercq 2017: 50) (cf. Bernini and Ramat [1996: 17]; Kabatek and Pusch [2011: 86–87]; Hansen and Visconti [2012] for French and Italian; Poletto [2016]). Type (i) continues a Latin and early medieval Romance pattern, and typically occurs in Italian, Portuguese, Romanian and Spanish; type (ii), already attested in early and Classical Latin (Zamboni 2000: 121), stems, in several languages/varieties, from the grammaticalization of so-called n-words (nominal and adverbial elements occurring in negative concord structures) (e.g., Piedm. ren, Cat. res ‘thing’, Occ. ges ‘kind’, It. niente ‘nothing’), and of elements denoting a small quantity, i.e., minimizers (Poletto 2016: 838) (e.g., It./Cat. mica, Emil./Lomb. miga/mia, OFr. mie ‘crumb’, Fr./Cat./Occ. pas, Piedm. pä ‘step’, Fr. point ‘point’) (see Kabatek and Pusch [2011: 87]; Poletto [2016] for a recent thorough analysis). The discontinuous element can also be a negative marker, such as não in spoken Brazilian Portuguese (não quer-o não [not want..1 not] ‘I don’t want (it)’), and is also common in several Romance creoles (Kabatek and Pusch 2011: ibid.) (cf. It. Non piove mica [not rain..3 not] ‘It is not raining’ (NIt.); Cinque 1976; Poletto 2016: 835 and references therein).
Tab. 13: The negation cycle and its instantiations.

| | Preverbal: n V | Discontinuous: n V (n ) | Postverbal: V n (after inflected V/clause final) |
| Types of negation | It., Pt., Rom., Sp. | Fr. (postverbal negator oblig.), It. (postverbal negator opt.), Cat., Pt., NIDs (postverbal negator only occurs in some emphatic contexts) | Spoken Fr., Queb. Fr., Hait. Creole/other Fr.-based creoles, NWID (oblig.), Spoken BPt., It. (opt.); colloquial variant of type (ii), with dropping of the first negative marker (Fr., BPt., It.) |
| Types of n-words | It. non, Fr. ne, Sp./Cat. no, Pt. não, Ro. nu, Fl./Sic. un ‘not’ < Lat. non ‘not’; manco < ‘not even’ (Bas., CIDs, SIDs, coll. It.); adverb: neca < nu(n) è ca ‘lit. not is that’ (Sic.) | i. nominals/adverbials (Piedm. ren, Cat. res ‘thing’, Occ. ges ‘kind’/It. niente ‘nothing’); ii. minimizers (It./Cat. mica, Emil./Lomb. miga/mia, OFr. mie ‘crumb’); iii. negative marker (BPt. não ‘not’) | i. minimizers (Fr. pas < ‘step’, Eml. brisa < ‘crumb’, Fl. punto < ‘stitch’); ii. n-word ‘nothing’ Fr. (nen, Prv. ren, RaeR. nia); iii. negative marker (não ‘not’) |

Type (iii) in some languages (e.g., French, Brazilian Portuguese, Italian) is a colloquial variant of type (ii), with the dropping of the first negative marker (e.g., BPt. sei não [know..1 not] ‘I don’t know’, restricted to northeastern speakers and some southeastern areas and occurring typically in replies to direct questions) (Furtado da Cunha 2007: 1642; 1649–1650) and colloquial Italian c’era niente da fare [there.be..3 nothing to do.] ‘there was nothing to do’ (Bernini and Ramat 1996: 21). Postverbal negators arise from the grammaticalization of three different types of constituents/elements: minimizers (following the path lexical element > polarity item > negative polarity item > negative marker) (Poletto 2016: 838–839), n-words such as It. niente ‘nothing’ (coll. It. fa niente [does nothing] ‘It doesn’t matter’), and pro-sentence negators such as BPt. não and Lombard no (el lup el va no [the wolf he goes not] ‘the wolf does not go’) (Poletto 2016: 839). The three types of negative strategies correspond to different stages on the so-called negative cycle (Jespersen 1917). Type (i) realizes the initial stage on the cycle, whereas type (ii) represents a subsequent (optional) stage, stemming from the phonological and semantic weakening of the original negative element, leading to the (optional) introduction of a new negative marker, originally used for emphasis and placed in Romance after the finite lexical verb/auxiliary (Poletto 2016: 836). This initial stage in the grammaticalization of a discontinuous negator is exemplified by contemporary Catalan, Portuguese and colloquial northern Italian and some northern Italian dialects (e.g., in Veneto and Liguria) (Parry [2013b], and Manzini and Savoia [2005, III: ch. 6] for dialectal data), where the postverbal negator only occurs in some emphatic contexts (Poletto 2016: 836), whereas French, where the discontinuous negator is obligatory, realizes the second stage in the grammaticalization path on the negative cycle, as do some northern Italian dialects (e.g., Emilian, Ticinese and border Piedmontese-Ligurian varieties) (Parry 2013b: 78). In the third and last stage of the negation cycle (five stages in more recent discussion; van der Auwera [2009]; van Gelderen [2011]; Hansen [2011]),17 the preverbal negator has been lost and only the

17 van der Auwera (2009, 2010) and van Gelderen (2011) underline, instead, the role played by semantico-pragmatic notions such as emphasis and the economy principle as the triggers of the changes, respectively.

228

Michela Cennamo

postverbal negative marker occurs, as in spoken French, Quebécois French, Haitian creole and other French-based creoles and some northwestern Italian dialects (e.g., Piedmontese and Lombard) (Parry 2013b; Poletto 2016: 836; Miola 2017: 143–149). After this stage, the cycle starts again, as shown in Piedmontese, for instance, where the postverbal negator nen cooccurs with a preverbal negative marker pa with an emphatic function (e.g., to mark presuppositional negation), pa nen (Zanuttini 1997: 75). Unlike colloquial French, Occitan, Québécois French, Piedmontese, where the cycle has been completed, Italian and Spanish appear to be at the initial stage of the cycle, whereas southern Italian dialects and Romanian are unaffected by it. Thus, the n-word manco and its variants may grammaticalize from an original emphatic negator into a plain preverbal negator in southern Italian dialects (Ledgeway [2017a: 107] for Calabrian), replacing the original preverbal negator in Rionero in Vulture (Basilicata) (manc o piglià [not it take.] ‘don’t take it’) (Poletto 2016: 839). In addition, in Sicilian a new preverbal negative marker, neca ‘not’, has developed from the grammaticalization of a cleft sentence, nu(n) è ca [not is that] ‘it is not that’ (from Ledgeway 2017a: 107), although it has not replaced yet the original plain preverbal negator (nun/un) (Cruschina 2011; Poletto 2016: 839). The main aspects of the variation found in Romance are illustrated in Table 13.

Conclusions

The grammaticalization chains and the reanalysis changes witnessed in Romance generally fall within well-known paths and principles, reflecting crosslinguistically well-attested changes, albeit also displaying typologically unusual developments. They mostly instantiate the following main processes: semantic bleaching, decategorialization, reanalysis, interlacing, layering and subjectification, which interact in different and complex ways in determining the rise and implementation of grammaticalization and, marginally, also degrammaticalization phenomena.

In the domain of nominal syntax, the reduction of the Latin inflectional classes brought about significant changes in the assignment of gender and number, with an original gender distinction reanalysed as a number difference, as in the case of a neuter plural reinterpreted as a feminine plural, alternating with a corresponding masculine singular, the so-called genus alternans, a typologically rare phenomenon of IE origin. A crosslinguistically widespread grammaticalization change, instead, is the development of articles from demonstratives, a new category and a characteristic feature of the core European languages (SAE) (Haspelmath 1998b; 2001; Heine and Kuteva 2006; Giacalone Ramat 2008: 19). This is part and parcel of a wider phenomenon, the grammaticalization of Definiteness/Individuation, which also includes the grammaticalization of prepositions as partitive articles and as markers of non-canonical Os, so-called Differential Object marking, which in some Latin American varieties of Spanish correlates with the specificity and definiteness of the O argument.

In the verbal domain, four main grammaticalization chains can be detected. The rise of several active and passive verbal periphrases proceeds through the semantic bleaching of lexical verbs of motion, state, change of state and activity, which leads to their copular function and subsequent auxiliarization: they become TAM markers with A/S and even O orientation (so-called voice markers) and/or light verbs (instantiating different types and degrees of decategorialization), as in their periphrastic future, past, progressive, passive and light verb function(s). This is in line with well-known crosslinguistic tendencies (Heine 1993; Bybee, Perkins, and Pagliuca 1994; Giacalone Ramat 2008; Hansen and de Haan 2009; Haspelmath 2004, among others), alongside typologically unusual developments such as the auxiliarization of verbs of possession and apparently unique patterns in Romance such as the ‘go’-past in Catalan and in some Occitan and Calabrian varieties, and the ‘have’-passive in some southern Italian varieties. A characteristic Romance feature is the reanalysis of reflexives as (in)transitivity and voice modulators, optionally acquiring the function of indefinite markers, which reflects the widening of their referential domain and their degree of grammaticalization, as in their so-called inclusive interpretation. This process is shared with other originally nominal elements, the latter apparently instantiating a characteristic Romance phenomenon (e.g., a gente ‘we’ in Brazilian Portuguese).
Whereas the reanalysis of reflexives as markers of the lack of control of the A/S argument (and A in the dative if the verb is bivalent) is well attested crosslinguistically (Dixon 1994: 26–27), as are reflexive passives (Siewierska 1984; Kemmer 1993, among others), an unusual grammaticalization/degrammaticalization path is attested in the development of impersonal/indefinite expressions. It is instantiated by Abruzzese nome ‘one’, which does not acquire referential (indefinite) or first person (singular/plural) status in its change to a plural affix, contra the commonly assumed grammaticalization cline: generic > human non-referential indefinite > referential indefinite/1st person singular/plural > plural marker (Giacalone Ramat and Sansò 2007a: 106; D’Alessandro 2014). The same area also exhibits the opposite phenomenon, the degrammaticalization of a third person plural marker anne ‘they have’ (originally a plural auxiliary), which turns into a non-referential, arbitrary pronoun, following the reverse of the diachronic path observed for nome (D’Alessandro 2014) (§ 3.1.3).

A major Romance innovation is also the rise of clitics, in pronominal and expletive-existential/presentative functions, as part of the general move towards head-marking, and their grammaticalization to affixal status, most prominently reached in spoken, colloquial French for subject clitics, and in Spanish, especially Latin American varieties, for object clitics, as in clitic doubling constructions.

As for the clause and clause linkage domains, interesting grammaticalization paths are exemplified by (i) the dual complementizer system restricted to southern Italian dialects and Romanian, conveying the realis/irrealis distinction, (ii) the varying degrees of grammaticalization of the subjunctive as a marker of subordination, with French exemplifying the most advanced stage, since the subjunctive, highly restricted in its use, is selected by the governing element and is no longer semantically determined, (iii) the degrees of interlacing between the verb and its complement/dependent verb, occurring in southern Italian dialects, with the matrix verb becoming in some cases an aspectual marker, a recent and ongoing change. Insubordination phenomena too are well attested, and follow common crosslinguistic characteristics/paths and typology. They show a higher degree of grammaticalization in Spanish, where for instance insubordinated (i.e., independent) conditionals have the characteristic prosody of declarative clauses, unlike in Italian, which displays a lower degree of grammaticalization of independent conditionals.

Other interesting grammaticalization chains involve causative coercive constructions and negation. The former display varying degrees of grammaticalization, higher in French than in Italian, whilst Portuguese and Spanish witness instead different stages in a process of degrammaticalization, more prominent in Portuguese, which exhibits a lower degree of syntactic cohesion between the matrix causative verb and its dependent infinitive. The negation cycle, by contrast, whose initial, intermediate and final stages are realized by Catalan, standard French and spoken French (among other languages/varieties), respectively, also displays unusual ongoing grammaticalization processes, such as the development of a negative marker from a cleft sentence, as in some southern Italian dialects.

Some processes also appear to exemplify contact-induced cases of grammaticalization and reanalysis.
Thus, the replacement of simple preterites with ‘have’-perfects, witnessed in spoken French, Romanian and northern Italian, among other languages/varieties, albeit reflecting a universal process following universal principles (e.g., unidirectionality) (Bybee, Perkins and Pagliuca 1994), results from centuries-long crosscultural contacts (Giacalone Ramat 2008; Drinka 2017), spreading from twelfth-century Parisian French, where this innovation has developed most.

As can be seen from the above discussion, the analysis of variation in various grammaticalization domains in Romance does not support the hypothesis that Romance languages can be organized along a grammaticalization cline, with languages like French being more ‘grammaticalized’, i.e., more advanced on the grammaticalization cline with respect to several phenomena (auxiliaries, mood, determiners, prepositions, existential sentences) than Italian, Spanish and Portuguese, and with Italian being more ‘grammaticalized’ than Spanish and Portuguese (Lamiroy and De Mulder [2011]; De Mulder and Lamiroy [2012], among others, and Vincent’s [2017: 307–308] perceptive criticism of this view). The variational data reveal instead that a grammaticalization phenomenon may be more or less advanced in relation to specific constructions in a language/variety, which may therefore be more ‘innovative’ for some constructions in specific grammatical domains but more ‘conservative’ for others. For instance, French is more advanced than other Romance languages as far as the grammaticalization of (the distribution of) the pragmatic notions/features of Definiteness and Newness is concerned, patterning with northern Italian dialects, whilst being more conservative in other domains: it shows, for instance, only incipient grammaticalization of the preposition a as a differential object marker (together with object doubling) in the spoken language and no such phenomenon in the standard language. Also, when grammaticalization clines appear to hold, as with coercive causation, they are construction-specific. Further in-depth investigation might reveal grammaticalization stages and tendencies departing from the overall trend(s) detectable for the standard varieties, bringing out the complex interplay of bundles of parameters. This is the case, for instance, with negation, for which spoken French, unlike the standard language, appears to instantiate a more advanced stage of grammaticalization.

Acknowledgements

I thank Andrej Malchukov and an anonymous reviewer for commenting on an earlier draft of the article. I am also very grateful and deeply indebted to Adam Ledgeway, Martin Maiden and Mair Parry for their detailed and insightful comments on an earlier version of the article, as well as for lengthy and informative discussions, on different occasions, on grammaticalization in Romance. I also wish to thank Nigel Vincent for interesting exchanges of views on the notions of grammaticalization chains and grammaticalization ‘pace’, as well as Francesco Ciconte, Alexandru Nicolae and Anna Giacalone Ramat. Their remarks and suggestions enhanced my understanding of several issues and contributed greatly to refining my arguments. The usual disclaimers apply.

Abbreviations

Abbreviations follow the Leipzig glossing rules. Additional abbreviations include  – ablative,  – accusative, Aux. – auxiliary,  – active, CIDs – central Italian dialect(s),  – clitic,  – dative,  – Direct Object,  – expletive,  – genitive,  – gerundive/gerund,  – imperfect (tense), . – imperative,  – impersonal,  – Indirect Object, . – intransitive, . – lexical (verb),  – marker,  – nominative,  – preposition,  – existential/locative proform,  – pluperfect,  – past participle,  – preterite,  – past,  – participle,  – subject clitic,  – subjunctive, . – transitive,  – vocative.

Abr. – Abruzzese, Arag. – Aragonese, Ast. – Asturian, Bal. – Balearic (Catalan), Bas. – Basilicata, BPt. – Brazilian Portuguese, Cal. – Calabrian, Cat. – Catalan, CID(s) – Central Italian Dialect(s), CL – Classical Latin, coll. – colloquial, Cors. – Corsican, Cos. – Cosentino, Cpd. – Campidanese, Emil. – Emilian, EPt. – European Portuguese, Fl. – Florentine, Fr. – French, Friul. – Friulan, Gal. – Galician, Hait. – Haitian, IE – Indo-European, It. – Italian, Lad. – Ladin, Lat. – Latin, Log. – Logudorese, Lomb. – Lombard, Mil. – Milanese, Molf. – Molfettese, Neap. – Neapolitan, NID(s) – Northern Italian Dialect(s), NIt. – northern Italian, NWID(s) – North Western Italian Dialect(s), Nuo. – Nuorese,


O – old, Occ. – Occitan, Piedm. – Piedmontese, Pt. – Portuguese, Prv. – Provençal, Pugl. – Pugliese, Raer. – Raeto-Romance, Ro. – Romanian, Rom. – Romanesco, Romsh. – Romansh (dialects spoken in the southeastern Swiss Canton of Graubünden/Grisons/Grigioni/Grischun), Sic. – Sicilian, Srd. – Sardinian, SID(s) – Southern Italian Dialect(s), SItR. – Southern Italo-Romance, Sp. – Spanish, Srs. – Surselvan, Tsc. – Tuscan, USID(s) – Upper Southern Italian Dialect(s), Ven. – Venetian.

References

Adams, James N. 2013. Social variation and the Latin language. Cambridge: Cambridge University Press.
Alfieri, Gabriella. 1992. La Sicilia. In Francesco Bruni (ed.), L’italiano nelle regioni. Lingua nazionale e identità regionali, 798–860. Turin: UTET.
Amenta, Luisa & Erling Strudsholm. 2002. «Andare a + infinito» in italiano. Parametri di variazione sincronici e diacronici. Cuadernos de Filología Italiana 9. 11–29.
Anderson, Stephen R. 2016. Romansh (Rumantsch). In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 169–184. Oxford: Oxford University Press.
Andriani, Luigi. 2017. The syntax of the dialect of Bari. Cambridge: University of Cambridge PhD thesis.
Auwera, Johan van der. 2009. The Jespersen Cycles. In Elly van Gelderen (ed.), Cyclical change, 35–71. Amsterdam & Philadelphia: John Benjamins.
Auwera, Johan van der. 2010. On the diachrony of negation. In Lawrence R. Horn (ed.), The expression of negation, 73–109. Berlin: de Gruyter.
Badia i Margarit, Antoni. 1951. Gramática histórica catalana. Madrid: Gredos.
Badia i Margarit, Antoni. 1995. Gramàtica de la Llengua Catalana: Descriptiva, normativa, diatòpica, diastràtica. Barcelona: Biblioteca Universitària Edicions Proa.
Barrett Brown, Charles. 1931. The disappearance of the indefinite hombre from Spanish. Language 7. 265–277.
Bartoli, Matteo. 1929. La norma linguistica dell’area maggiore. Rivista di filologia e d’istruzione classica 57. 333–345.
Bartoli, Matteo. 1933. Le norme neolinguistiche e la loro utilità per la storia dei linguaggi e dei costumi. Atti della Società Italiana per il Progresso delle Scienze 21. 157–167.
Bartra Kaufmann, Anna. 2002. La passiva i les construccions que s’hi relacionen. In Joan Solà, Maria-Rosa Lloret, Joan Mascaró & Manuel Pérez Saldanya (eds.), Gramàtica del català contemporani, vol. 2, 2111–2179. Barcelona: Editorial Empúries.
Bat-Zeev Shyldkrot, Hava. 1981. A propos de la forme passive ‘se voir + Vinf’. Folia Linguistica 15(3–4). 387–407.
Bauer, Brigitte. 2003. The adverbial formation mente in Vulgar and Late Latin: a problem in grammaticalization. In Heikki Solin, Martti Leiwo & Hilla Halla-aho (eds.), Latin Vulgaire – Latin Tardif VI, 439–457. Hildesheim: Olms.
Belloro, Valeria. 2007. Spanish clitic doubling: A study of the syntax-pragmatics interface. Buffalo: University of New York at Buffalo PhD thesis.
Benincà, Paola. 1983. Il clitico ‘a’ nel dialetto padovano. In Paola Benincà, Manlio Cortelazzo, Aldo Prosdocimi, Laura Vanelli & Alberto Zamboni (eds.), Scritti linguistici in onore di Giovan Battista Pellegrini, 25–35. Pisa: Pacini.
Benincà, Paola. 2007. Clitici e ausiliari: gh ò, z è. In Delia Bentley & Adam Ledgeway (eds.), Sui dialetti Italoromanzi. Saggi in onore di Nigel B. Vincent, The Italianist, Special supplement 1. 27–47. Norfolk: Biddles.
Benincà, Paola. 2014. Subject clitics and particles in Provençal. Probus 26. 183–215.


Benincà, Paola & Cecilia Poletto. 1997. The diachronic development of a modal verb of necessity. In Ans van Kemenade & Nigel B. Vincent (eds.), Parameters of morphosyntactic change, 94–118. Cambridge: Cambridge University Press.
Bentley, Delia. 2004. Definiteness effects: evidence from Sardinian. Transactions of the Philological Society 102(1). 57–101.
Bentley, Delia. 2006. Split intransitivity in Italian. Berlin: Mouton de Gruyter.
Bentley, Delia. 2013. Subject canonicality and definiteness effects in Romance there-sentences. Language 89(4). 675–712.
Bentley, Delia. 2018. Grammaticalization of subject agreement on evidence from Italo-Romance. Linguistics 56(6). 1246–1301.
Bentley, Delia. 2020. Active-middle alignment and the aoristic drift. In Sam Wolfe & Martin Maiden (eds.), Variation and Change in Gallo-Romance Grammar, 191–212. Oxford: Oxford University Press.
Bentley, Delia & Adam Ledgeway. 2014. Manciati siti? Les constructions moyennes avec les participes résultatifs-statifs en italien et dans les variétés italo-romanes méridionales. Langages 194. 63–80.
Bentley, Delia & Adam Ledgeway. 2015. Autour de la question des participes résultatifs-statifs dans les variétés romanes. In Ignazio Mirto (ed.), Le relazioni irresistibili. Scritti in onore di Nunzio La Fauci per il suo sessantesimo compleanno, 61–91. Pisa: ETS.
Bentley, Delia, Francesco Ciconte & Silvio Cruschina. 2015. Existentials and locatives in Romance dialects of Italy. Oxford: Oxford University Press.
Bentley, Delia & Francesco Ciconte. 2016. Copular and existential constructions. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 847–862. Oxford: Oxford University Press.
Bernini, Giuliano & Paolo Ramat. 1996. Negative sentences in the languages of Europe. Berlin: Mouton de Gruyter.
Berretta, Monica. 1989. Sulla presenza dell’accusativo preposizionale in italiano settentrionale: note tipologiche. Vox Romanica 48. 13–37.
Bertinetto, Pier Marco & Mario Squartini. 2016. Tense and aspect. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 939–953. Oxford: Oxford University Press.
Blasco Ferrer, Edoardo. 2003. Tipologia delle presentative romanze e morfosintassi storica: fr. C’est e prov. –I (estai, fai, plai). Zeitschrift für romanische Philologie 119. 51–90.
Börjars, Kersti & Nigel Vincent. 2011. Grammaticalization and directionality. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 163–176. Oxford: Oxford University Press.
Bossong, Georg. 1998. Le marquage différentiel de l’objet dans les langues d’Europe. In Jack Feuillet (ed.), Actance et Valence dans les langues de l’Europe, 259–295. Berlin: Mouton de Gruyter.
Bossong, Georg. 2016. Classifications. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 63–72. Oxford: Oxford University Press.
Branca, Vittore. 1976. Tutte le opere di G. Boccaccio. Milan: Mondadori.
Bres, Jacques & Emmanuelle Labeau. 2012. De la grammaticalisation des formes itive (aller) et ventive (venir): valeur en langue et emplois en discours. In Louis de Saussure & Alain Rihs (eds.), Études de sémantique et pragmatique françaises, 143–166. Bern: Lang.
Bresnan, Joan. 2001. Lexical-functional syntax. Oxford: Blackwell.
Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect, and modality in the languages of the world. Chicago & London: University of Chicago Press.
Cano Aguilar, Rafael. 1981. Estructuras sintácticas transitivas en el español actual. Madrid: Gredos.


Cappellaro, Chiara. 2016. Tonic pronominal system: morphophonology. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 722–741. Oxford: Oxford University Press.
Carlier, Anne. 2007. From preposition to article: the development of the French partitive. Studies in Language 31. 1–49.
Carlier, Anne, Walter De Mulder & Béatrice Lamiroy. 2012. Introduction: the pace of grammaticalization in a typological perspective. Folia Linguistica 46(2). 287–301.
Carlier, Anne & Béatrice Lamiroy. 2014. The grammaticalization of the prepositional partitive in Romance. In Silvia Luraghi & Tuomas Huumo (eds.), Partitive cases and related categories, 477–519. Berlin: de Gruyter.
Cennamo, Michela. 1997a. Passive and impersonal constructions. In Martin Maiden & M. Mair Parry (eds.), Dialects of Italy, 145–161. London: Routledge.
Cennamo, Michela. 1997b. Relative clauses. In Martin Maiden & M. Mair Parry (eds.), Dialects of Italy, 190–201. London: Routledge.
Cennamo, Michela. 1998. Transitivity in the Italian dialects: synchronic aspects and diachronic implications. In Hans Geisler & Daniel Jacob (eds.), Transitivität und Diathese in romanischen Sprachen, 73–87. Tübingen: Niemeyer.
Cennamo, Michela. 1999. Late Latin pleonastic reflexives and the Unaccusative hypothesis. Transactions of the Philological Society 97(1). 103–150.
Cennamo, Michela. 2000. Patterns of active syntax in Late Latin pleonastic reflexives. In John Charles Smith & Delia Bentley (eds.), Historical linguistics 1995, 33–55. Amsterdam & Philadelphia: John Benjamins.
Cennamo, Michela. 2005. Passive auxiliaries in Late Latin. In Sándor Kiss, Luca Mondin & Giampaolo Salvi (eds.), Études de linguistique offertes à József Herman à l’occasion de son 80ème anniversaire, 179–196. Tübingen: Niemeyer.
Cennamo, Michela. 2006. The rise and grammaticalization paths of Latin fieri and facere as passive auxiliaries. In Werner Abraham & Larisa Leisiö (eds.), Passivization and typology, 311–336. Amsterdam & Philadelphia: John Benjamins.
Cennamo, Michela. 2007. Auxiliaries and serials between Late Latin and early Romance. In Delia Bentley & Adam Ledgeway (eds.), Sui dialetti Italoromanzi. Saggi in onore di Nigel B. Vincent, The Italianist, Special supplement 1. 63–87. Norfolk: Biddles.
Cennamo, Michela. 2008. The rise and development of analytic perfects in Italo-Romance. In Thórhallur Eythórsson (ed.), Grammatical change and linguistic theory, 115–142. Amsterdam & Philadelphia: John Benjamins.
Cennamo, Michela. 2009. Argument structure and alignment variations and changes in Late Latin. In Johanna Barðdal & Shobana Chelliah (eds.), The role of semantics and pragmatics in the development of case, 307–346. Amsterdam & Philadelphia: John Benjamins.
Cennamo, Michela. 2014. Passive and impersonal reflexives in the Italian dialects. Synchronic and diachronic aspects. In Paola Benincà, Adam Ledgeway & Nigel Vincent (eds.), Diachrony and dialects. Grammatical change in the dialects of Italy, 71–95. Oxford: Oxford University Press.
Cennamo, Michela. 2016. Voice. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 967–980. Oxford: Oxford University Press.
Cennamo, Michela. 2019. Aspects of grammaticalization and reanalysis in the voice domain in the transition from Latin to early Italo-Romance. In Iván Igartua, Brian Joseph, Lars Heltoft, Kirsten Jeppesen Kragh & Lene Schøsler (eds.), Perspectives on language structure and language change, 205–231. Amsterdam & Philadelphia: John Benjamins.
Cennamo, Michela & Antonella Sorace. 2007. Auxiliary selection and split intransitivity in Paduan. Variation and lexical-aspectual constraints. In Raúl Aranovich (ed.), Split auxiliary systems: A cross-linguistic perspective, 65–99. Amsterdam & Philadelphia: John Benjamins.


Ciconte, Francesco. 2015. Historical context. In Delia Bentley, Francesco Ciconte & Silvio Cruschina (eds.), Existentials and locatives in Romance dialects of Italy, 217–260. Oxford: Oxford University Press.
Cinque, Guglielmo. 1976. Mica. Annali della Facoltà di Lettere e Filosofia dell’Università di Padova 1. 101–112.
Clackson, James. 2016. Latin as a source for the Romance languages. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 3–13. Oxford: Oxford University Press.
Coleman, Robert. 1971. The origin and development of Latin habeo + infinitive. Classical Quarterly 21. 215–232.
Company-Company, Concepción. 2003. Transitivity and grammaticalization of object. The diachronic struggle of direct and indirect object in Spanish. In Giuliana Fiorentino (ed.), Romance objects. Transitivity in the Romance languages, 217–260. Berlin: Mouton de Gruyter.
Comrie, Bernard. 1989. Language universals and linguistic typology (2nd edn). London: Blackwell.
Cornillie, Bert, Walter De Mulder, Tine Van Hecke & Dieter Vermandere. 2009. Modals in the Romance languages. In Björn Hansen & Ferdinand de Haan (eds.), Modals in the languages of Europe. A reference work, 107–138. Berlin: Mouton de Gruyter.
Cournane, Ailis. 2010. Using synchronic microvariation to understand pathways of change: subject clitic doubling in Romance dialects. Proceedings of the 2010 annual conference of the Canadian Linguistic Association. 1–13.
Creissels, Denis. 2007. Impersonal and anti-impersonal constructions: a typological approach. MS: University of Lyon 2 (http://deniscreissels.fr/).
Creissels, Denis. 2010. Fluid intransitivity in Romance languages: a typological approach. Archivio Glottologico Italiano 95(2). 117–151.
Cristofaro, Sonia. 2016. Routes to insubordination: a cross-linguistic perspective. In Nicholas Evans & Honoré Watanabe (eds.), Insubordination, 393–422. Amsterdam & Philadelphia: John Benjamins.
Cruschina, Silvio. 2011. Tra dire e pensare: casi di grammaticalizzazione in italiano e siciliano. La Lingua Italiana: Storia, Strutture, Testi 7. 105–125.
Cruschina, Silvio & Adam Ledgeway. 2016. The structure of the clause. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 556–574. Oxford: Oxford University Press.
Culbertson, Jennifer & Geraldine Legendre. 2014. Prefixal agreement and impersonal il in Spoken French: Experimental evidence. Journal of French Language Studies 24(1). 83–105.
Cyrino, Sonia. 2007. Construções com se e promoção de argumento no Português Brasileiro: uma investigação diacrônica. Revista de Abralin 6(2). 85–116.
Cyrino, Sonia. 2013. Argument promotion and se-constructions in Brazilian Portuguese. In Elly van Gelderen, Michela Cennamo & Johanna Barðdal (eds.), Argument structure in flux. The Naples-Capri papers, 284–306. Amsterdam & Philadelphia: John Benjamins.
D’Alessandro, Roberta. 2014. Death and contact-induced rebirth of impersonal pronouns. A case study. Probus 26(2). 249–274.
D’Alessandro, Roberta & Artemis Alexiadou. 2003. Nome: a subject clitic in a southern Italian dialect. In Martine Coene & Yves D’Hulst (eds.), Current studies in comparative Romance linguistics, 165–192. Antwerp: Antwerp Papers in Linguistics.
D’Alessandro, Roberta & Ian Roberts. 2010. Past participle agreement in Abruzzese: split auxiliary selection. Natural Language and Linguistic Theory 28. 41–72.
D’Alessandro, Roberta & Diego Pescarini. 2016. Agreement restrictions and agreement oddities in Romance. In Susann Fischer & Christoph Gabriel (eds.), Manual of grammatical interfaces in Romance, 267–294. Berlin: De Gruyter.


Dall’Aglio Hattner, Marise M. & Kees Hengeveld. 2016. The grammaticalization of modal verbs in Brazilian Portuguese: a synchronic approach. Journal of Portuguese Linguistics 15(1). 1–14.
Dalrymple, Mary & Irina Nikolaeva. 2011. Objects and information structure. Cambridge: Cambridge University Press.
David, Oana. 2014. Subjectification in the development of clitic doubling: a diachronic study of Romanian and Spanish. Proceedings of the 40th Annual Meeting of the Berkeley Linguistics Society. 42–61.
Davies, Mark E. 1995. The evolution of causative constructions in Spanish and Portuguese. In Jon Amastae, Grant Goodall, Mario Montalbetti & Marianne Phinney (eds.), Contemporary research in Romance linguistics, 105–122. Amsterdam & Philadelphia: John Benjamins.
Day, Meagan & Sara Zahler. 2014. The continuous path of grammaticalization in Modern Peninsular Spanish. University of Pennsylvania Working Papers in Linguistics 20(1). 70–80.
De Borja Moll, Francesc. 1952. Gramática Histórica Catalana. Madrid: Gredos.
Debaisieux, Jeanne-Marie. 2007. La distinction entre dépendance grammaticale et dépendance macrosyntaxique comme moyen de résoudre les paradoxes de la subordination. Faits de Langues 28. 119–132.
De Clercq, Karen. 2017. The nanosyntax of French negation: A diachronic perspective. In Silvio Cruschina, Katharina Hartmann & Eva-Maria Remberger (eds.), Negation: Syntax, semantics, and variation, 49–80. Göttingen: Vandenhoeck & Ruprecht/Vienna University Press.
De Mulder, Walter & Béatrice Lamiroy. 2012. Gradualness of grammaticalization in Romance: the position of French, Spanish and Italian. In Kristin Davidse, Tine Breban, Lieselotte Brems & Tanja Mortelmans (eds.), Grammaticalization and language change: New reflections, 199–226. Amsterdam & Philadelphia: John Benjamins.
Detges, Ulrich. 2004. How cognitive is grammaticalization? The history of the Catalan perfet perifràstic. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline. The nature of grammaticalization, 211–227. Amsterdam & Philadelphia: John Benjamins.
Detges, Ulrich. 2015. The Romance adverbs in -mente: a case study in grammaticalization. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation (vol. 3), 1824–1841. Berlin: de Gruyter.
Dik, Simon. 1987. Copula auxiliarization: how and why. In Martin Harris & Paolo Ramat (eds.), Historical development of auxiliaries, 54–84. Berlin: Mouton de Gruyter.
Dixon, Robert M. W. 1979. Ergativity. Language 55. 59–138.
Dixon, Robert M. W. 1994. Ergativity. Cambridge: Cambridge University Press.
Dobrovie-Sorin, Carmen. 1994. The syntax of Romanian. Berlin: Mouton de Gruyter.
Dragomirescu, Adina. 2013. Passive and impersonal constructions. By-phrases. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 169–173. Oxford: Oxford University Press.
Dragomirescu, Adina & Alexandru Nicolae. 2014. The multiple grammaticalization of Romanian veni ‘come’. Focusing on the passive construction. In Maud Devos & Jenneke van der Wal (eds.), COME and GO off the beaten grammaticalization path, 69–100. Berlin: Mouton de Gruyter.
Dragomirescu, Adina & Alexandru Nicolae. 2016. Case. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 911–923. Oxford: Oxford University Press.
Dragomirescu, Adina & Alexandru Nicolae. 2017. Semantic constraints on the reflexive/nonreflexive alternation of Romanian unaccusatives. In Lars Hellan, Andrej L. Malchukov & Michela Cennamo (eds.), Contrastive studies in verbal valency, 408–430. Amsterdam & Philadelphia: John Benjamins.
Drinka, Bridget. 2003. Areal factors in the development of the European periphrastic perfect. Word 54. 1–38.
Drinka, Bridget. 2017. Language contact in Europe. Cambridge: Cambridge University Press.


Drinka, Bridget. 2020. Motivating the North-South continuum: evidence from the perfects of Gallo-Romance. In Sam Wolfe & Martin Maiden (eds.), Variation and Change in Gallo-Romance Grammar. Oxford: Oxford University Press.
Duarte, Inês. 2003. A família das construções inacusativas. In Maria Helena M. Mateus, Ana Maria Brito, Inês Duarte & Isabel Hub Faria (eds.), Gramática da Língua Portuguesa (5th edn), 507–548. Lisboa: Caminho.
Duarte, Maria E. L., Mary A. Kato & Pilar Barbosa. 2003. Sujeitos indeterminados em PE e PB. Associação Brasileira de Linguística 26. 405–409.
Dubert, Francisco & Charlotte Galves. 2016. Galician and Portuguese. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 411–446. Oxford: Oxford University Press.
Egerland, Verner. 2003. Impersonal pronouns in Scandinavian and Romance. Working Papers in Scandinavian Syntax 71. 75–102.
Egerland, Verner. 2010. On old Italian uomo and the classification of indefinite expressions. In Roberta D’Alessandro, Adam Ledgeway & Ian Roberts (eds.), Syntactic variation. The dialects of Italy, 71–85. Cambridge: Cambridge University Press.
Escandell-Vidal, Victoria. 2009. Differential object marking and topicality. The case of Balearic Catalan. Studies in Language 33(4). 832–885.
Evans, Nicholas. 2007. Insubordination and its uses. In Irina Nikolaeva (ed.), Finiteness: Theoretical and empirical foundations, 366–431. Oxford: Oxford University Press.
Evans, Nicholas & Honoré Watanabe. 2016. The dynamics of insubordination. An overview. In Nicholas Evans & Honoré Watanabe (eds.), Insubordination, 1–38. Amsterdam & Philadelphia: John Benjamins.
Fagard, Benjamin & Alexandru Mardale. 2012. The pace of grammaticalization and the evolution of prepositional systems: Data from Romance. Folia Linguistica 46(2). 303–340.
Fagard, Benjamin & Alexandru Mardale. 2014. Non, mais tu l’as vu à lui? Analyse(s) du marquage différentiel de l’objet en français. Verbum 26. 145–170.
Fici Giusti, Francesca. 1994. Il Passivo nelle Lingue Slave. Milano: Franco Angeli.
Fiorentino, Giuliana. 2003. Prepositional objects in Neapolitan. In Giuliana Fiorentino (ed.), Romance objects. Transitivity in Romance languages, 117–151. Berlin: Mouton de Gruyter.
Fleischman, Suzanne. 1982. The future in thought and language. Diachronic evidence from Romance. Cambridge: Cambridge University Press.
Frank-Job, Barbara & Maria Selig. 2016. Early evidence and sources. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 24–34. Oxford: Oxford University Press.
Furtado da Cunha, Maria Angélica. 2007. Grammaticalization of the strategies of negation in Brazilian Portuguese. Journal of Pragmatics 39. 1638–1653.
Fuß, Eric & Carola Trips. 2004. Diachronic clues to synchronic grammar. Amsterdam & Philadelphia: John Benjamins.
Gelderen, Elly van. 1997. Verbal agreement and the grammar behind its breakdown. Minimalist feature checking. Tübingen: Niemeyer.
Gelderen, Elly van. 2011. The grammaticalization of agreement. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 491–501. Oxford: Oxford University Press.
Gheorghe, Mihaela. 2013. Argument clauses. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 466–473. Oxford: Oxford University Press.
Giacalone Ramat, Anna. 2000. On some grammaticalization patterns for auxiliaries. In John Charles Smith & Delia Bentley (eds.), Historical Linguistics 1995 (vol. 1), 125–154. Amsterdam & Philadelphia: John Benjamins.
Giacalone Ramat, Anna. 2005. Persistence and renewal in the relative pronoun paradigm: the case of Italian. Folia Linguistica Historica 26. 115–138.


Michela Cennamo

Giacalone Ramat, Anna. 2008. Areal convergence in grammaticalization processes. In María López-Couso & Elena Seoane (eds.), Rethinking grammaticalization: New perspectives, 129–167. Amsterdam & Philadelphia: John Benjamins.
Giacalone Ramat, Anna. 2017. Passives and constructions that resemble passives. Folia Linguistica Historica 38. 149–176.
Giacalone Ramat, Anna & Andrea Sansò. 2007a. The indefinite usage of uomo (‘man’) in early Italo-Romance. Grammaticalization and areality. Archivio Glottologico Italiano 92(1). 65–111.
Giacalone Ramat, Anna & Andrea Sansò. 2007b. The spread and decline of indefinite man-constructions in European languages. In Paolo Ramat & Elisa Roma (eds.), Europe and the Mediterranean as linguistic areas: Convergences from a historical and typological perspective, 95–131. Amsterdam & Philadelphia: John Benjamins.
Giacalone Ramat, Anna & Andrea Sansò. 2011. L’emploi indéfini de homo en latin tardif: aux origines d’un «européanisme». In Michele Fruyt & Olga Spevak (eds.), La quantification en latin, 93–116. Paris: L’Harmattan.
Giacalone Ramat, Anna & Andrea Sansò. 2014. Venire (‘come’) as a passive auxiliary in Italian. In Maud Devos & Jenneke van der Wal (eds.), COME and GO off the beaten grammaticalization path, 21–44. Berlin & New York: Mouton de Gruyter.
Giammarco, Ernesto. 1968. Dizionario Abruzzese e Molisano (Vol 1). Rome: Ateneo.
Giurgea, Ion & Eva-Maria Remberger. 2016. Illocutionary force. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 863–878. Oxford: Oxford University Press.
Giusti, Giuliana. 2016. The structure of the nominal group. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 541–555. Oxford: Oxford University Press.
Gómez Torrego, Leonardo. 1988. Perífrasis verbales. Madrid: Arco Libros.
Green, John N. 1982. The status of Romance auxiliaries of voice. In Nigel Vincent & Martin Harris (eds.), Studies in the Romance verb, 97–138. London: Croom Helm.
Grevisse, Maurice. 1980. Le bon usage (11th edn.). Gembloux: Duculot.
Haiman, John. 1988. Rhaeto-Romance. In Martin Harris & Nigel Vincent (eds.), The Romance languages, 351–390. London: Routledge.
Haiman, John & Paola Benincà. 1992. The Rhaeto-Romance languages. London: Routledge.
Hansen, Björn & Ferdinand de Haan. 2009. Concluding chapter: modal constructions in the languages of Europe. In Björn Hansen & Ferdinand de Haan (eds.), Modals in the languages of Europe. A reference work, 511–559. Berlin: Mouton de Gruyter.
Hansen, Maj-Britt Mosegaard. 2011. Negative cycles and grammaticalization. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of grammaticalization, 570–579. Oxford: Oxford University Press.
Hansen, Maj-Britt Mosegaard & Jacqueline Visconti. 2012. The evolution of negation in French and Italian: similarities and differences. Folia Linguistica 46(2). 417–452.
Harris, Martin B. 1982. The ‘past simple’ and the ‘present perfect’ in Romance. In Nigel Vincent & Martin Harris (eds.), Studies in the Romance verb: Essays offered to Joe Cremona on the occasion of his 60th Birthday, 42–70. London: Croom Helm.
Harris, Martin B. 1988. French. In Martin Harris & Nigel Vincent (eds.), The Romance languages, 209–243. London: Croom Helm.
Haspelmath, Martin. 1998a. Does grammaticalization need reanalysis? Studies in Language 22. 315–351.
Haspelmath, Martin. 1998b. How young is Standard Average European? Language Sciences 20. 271–287.

Mechanisms and paths of grammaticalization and reanalysis in Romance


Haspelmath, Martin. 2001. The European linguistic area: Standard Average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals (vol. II), 1492–1510. Berlin/New York: de Gruyter.
Haspelmath, Martin. 2003. The geometry of grammatical meaning: semantic maps and cross-linguistic comparison. In Michael Tomasello (ed.), The new psychology of language, vol. 2, 211–242. Mahwah, NJ: Erlbaum.
Haspelmath, Martin. 2004. On directionality in language change with particular reference to grammaticalization. In Olga Fischer, Muriel Norde & Harry Perridon (eds.), Up and down the cline – The nature of grammaticalization, 17–44. Amsterdam & Philadelphia: John Benjamins.
Haspelmath, Martin. 2011. On S, A, P, T and R as comparative concepts for alignment typology. Linguistic Typology 15. 535–567.
Hastings, Robert. 1994. L’espressione del soggetto indefinito in un dialetto abruzzese. L’Italia Dialettale 57. 9–33.
Heine, Bernd. 1993. Auxiliaries. Cognitive forces and grammaticalization. Oxford: Oxford University Press.
Heine, Bernd & Tania Kuteva. 2006. The changing languages of Europe. Oxford: Oxford University Press.
Herman, József. 2000. Vulgar Latin. University Park: Pennsylvania State University Press.
Hinzelin, Marc-Olivier. 2009. Neuter pronouns in Ibero-Romance: discourse reference, expletives and beyond. In Georg A. Kaiser & Eva M. Remberger (eds.), Proceedings of the Workshop ‘Null-subjects, expletives and locatives in Romance’, 1–25. Konstanz: Universität Konstanz.
Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticization (vol I), 17–35. Amsterdam & Philadelphia: John Benjamins.
Hopper, Paul J. & Elizabeth Closs Traugott. 2003. Grammaticalization (2nd edn.). Cambridge: Cambridge University Press.
Iemmolo, Giorgio. 2010. Topicality and differential object marking: Evidence from Romance and beyond. Studies in Language 34. 239–272.
Igartua, Ivan. 2006. Genus alternans in Indo-European. Indogermanische Forschungen 3. 56–70.
Jacobs, Bart. 2011. Present and historical perspectives on the Catalan go-past. Zeitschrift für Katalanistik 24. 227–255.
Jacobs, Bart & Hans P. Kunert. 2014. What happened to the Occitan go-past? Insights from the dialects of Gascony and Guardia Piemontese. Revue Romane 49(2). 177–203.
Jespersen, Otto. 1917. Negation in English and other languages. København: Høst.
Jones, Michael. 1993. Sardinian syntax. London: Routledge.
Jones, Michael. 1996. Foundations of French syntax. Cambridge: Cambridge University Press.
Juge, Matthew L. 2006. Morphological factors in the grammaticalization of the Catalan “go” past. Diachronica 23(2). 313–339.
Kabatek, Johannes. 2016. Diglossia. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 624–633. Oxford: Oxford University Press.
Kabatek, Johannes & Claus D. Pusch. 2011. The Romance languages. In Bernd Kortmann & Johan van der Auwera (eds.), The languages and linguistics of Europe: A comprehensive guide, 69–96. Berlin: de Gruyter.
Kailuweit, Rolf. 2011. Romance anticausatives: a constructionist RRG approach. In Wataru Nakamura (ed.), New perspectives in Role and Reference Grammar, 104–133. Newcastle upon Tyne: Cambridge Scholars.
Kaiser, Georg A. & Eva-Maria Remberger (eds.). 2009. Proceedings of the Workshop ‘Null-subjects, expletives and locatives in Romance’. Konstanz: Universität Konstanz.
Kayne, Richard. 1975. French syntax. The transformational cycle. Cambridge, Mass.: The MIT Press.


Kayne, Richard. 1976. French relative ‘que’. In Marta Lujan & Fritz Hensey (eds.), Current studies in Romance linguistics, 255–299. Washington, DC: Georgetown University Press.
Keenan, Edward & Bernard Comrie. 1977. Noun phrase accessibility and universal grammar. Linguistic Inquiry 8. 63–99.
Kemmer, S. 1993. The middle voice. Amsterdam & Philadelphia: John Benjamins.
Kiss, Katalin E. 1998. Identificational focus versus information focus. Language 74. 254–273.
Koch, Peter. 2003. From subject to object and from object to subject: (de)personalization, floating and reanalysis in presentative verbs. In Giuliana Fiorentino (ed.), Romance objects: Transitivity in Romance languages, 153–185. Berlin: Mouton de Gruyter.
La Fauci, Nunzio. 1994. Objects and subjects in the formation of Romance morphosyntax. Bloomington: Indiana University Linguistics Club Publications.
La Fauci, Nunzio. 2000. Modularità della diatesi: convergenze e divergenze grammaticali nel passivo. In Fabiana Fusco, Vincenzo Orioles & Alice Parmeggiani (eds.), Processi di convergenza e differenziazione nelle lingue dell’Europa medievale e moderna, 73–97. Udine: Forum.
Lahousse, Karen & Béatrice Lamiroy. 2012. Word order in French, Spanish and Italian: a grammaticalization account. Folia Linguistica 46(2). 387–414.
Lakoff, Robin. 1968. Abstract syntax and Latin complementation. Cambridge, MA: The MIT Press.
Lambrecht, Knud. 1981. Topic, antitopic and verb-agreement in non-standard French. Amsterdam & Philadelphia: John Benjamins.
Lamiroy, Béatrice & Walter De Mulder. 2011. Degrees of grammaticalization across languages. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 302–317. Oxford: Oxford University Press.
Lamiroy, Béatrice & Anna Pineda. 2018. Grammaticalization across Romance languages and the pace of language change: The position of Catalan. Lingvisticæ Investigationes 40(2). 304–331.
Larreya, Paul. 2005. Sur les emplois de la périphrase aller + infinitif. In Hava Bat-Zeev Shyldkrot & Nicole Le Querler (eds.), Les périphrases verbales, 337–360. Amsterdam & Philadelphia: John Benjamins.
Ledgeway, Adam. 1997. Asyndetic complementation in Neapolitan dialect. The Italianist 17(1). 231–273.
Ledgeway, Adam. 2000. A comparative syntax of the dialects of Southern Italy: A minimalist approach. Oxford: Blackwell.
Ledgeway, Adam. 2009. Grammatica diacronica del Napoletano. Beihefte zur Zeitschrift für romanische Philologie. Tübingen: Niemeyer.
Ledgeway, Adam. 2011a. Adverb agreement and split intransitivity: evidence from southern Italy. Archivio Glottologico Italiano 96. 31–66.
Ledgeway, Adam. 2011b. Morphosyntactic typology and change in Latin and Romance. In Martin Maiden, John Charles Smith & Adam Ledgeway (eds.), The Cambridge history of the Romance languages. Volume 1: Structures, 382–471, 724–734. Cambridge: Cambridge University Press.
Ledgeway, Adam. 2011c. Grammaticalization from Latin to Romance. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of grammaticalization, 719–728. Oxford: Oxford University Press.
Ledgeway, Adam. 2012. From Latin to Romance. Morphosyntactic typology and change. Oxford: Oxford University Press.
Ledgeway, Adam. 2014. Romance auxiliary selection in light of Romanian evidence. In Gabriela Pană Dindelegan, Rodica Zafiu, Adina Dragomirescu, Irina Nicula & Alexandru Nicolae (eds.), Diachronic variation in Romanian, 3–35. Newcastle upon Tyne: Cambridge Scholars.


Ledgeway, Adam. 2016a. The dialects of southern Italy. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 246–269. Oxford: Oxford University Press.
Ledgeway, Adam. 2016b. Functional categories. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 761–771. Oxford: Oxford University Press.
Ledgeway, Adam. 2016c. Complementation. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 1013–1028. Oxford: Oxford University Press.
Ledgeway, Adam. 2016d. From coordination to subordination: The grammaticalisation of progressive and andative aspect in the dialects of Salento. In Fernanda Pratas, Sandra Pereira & Clara Pinto (eds.), Coordination and subordination. Form and meaning. Selected papers from CSI Lisbon 2014, 157–184. Newcastle upon Tyne: Cambridge Scholars.
Ledgeway, Adam. 2017a. Marking presuppositional negation in the dialects of southern Italy. In Silvio Cruschina, Katharina Hartmann & Eva-Maria Remberger (eds.), Negation: Syntax, semantics, and variation, 105–130. Göttingen: Vandenhoeck & Ruprecht/Vienna University Press.
Ledgeway, Adam. 2017b. Syntheticity and analyticity. In A. Dufter & Elisabeth Stark (eds.), Manual of Romance morphosyntax and syntax, 837–84. Berlin: De Gruyter.
Ledgeway, Adam. 2018. Parametric variation in DOM in Italo-Romance. Differential object marking in Romance – Towards microvariation. Paris, 9–10 November 2018, Handout.
Ledgeway, Adam. 2019. Parameters in the development of Romance perfective auxiliary selection. In Michela Cennamo & Claudia Fabrizio (eds.), Selected Papers from the 22nd International Conference on Historical Linguistics, 347–388. Amsterdam & Philadelphia: John Benjamins.
Ledgeway, Adam & Alessandra Lombardi. 2014. The development of the Southern subjunctive: morphological loss and syntactic gain. In Paola Benincà, Adam Ledgeway & Nigel Vincent (eds.), Diachrony and dialects. Grammatical change in the dialects of Italy, 25–47. Oxford: Oxford University Press.
Ledgeway, Adam N. & Martin Maiden (eds.). 2016. The Oxford guide to the Romance languages. Oxford: Oxford University Press.
Ledgeway, Adam & John Ch. Smith. 2016. Deixis. In Adam Ledgeway & Martin Maiden (eds.), 879–896. Oxford: Oxford University Press.
Legendre, Geraldine. 1990. French impersonal constructions. Natural Language and Linguistic Theory 8. 81–128.
Legendre, Geraldine & Paul Smolensky. 2009. French inchoatives and the unaccusativity hypothesis. In Donna B. Gerdts, John C. Moore & Maria Polinsky (eds.), Hypothesis A/Hypothesis B: Linguistic explorations in honor of David M. Perlmutter, 229–46. Cambridge, MA: The MIT Press.
Legendre, Géraldine & Antonella Sorace. 2003. Auxiliaires et intransitivité en français et dans les langues romanes. In Danièle Godard & Anne Abeillé (eds.), Les langues romanes; problèmes de la phrase simple, 185–234. Paris: Editions du CNRS.
Lehmann, Christian. 1988. Predicate classes and participation. In Studies in general comparative linguistics, 33–77. Köln: Institut für Sprachwissenschaft Universität zu Köln.
Lehmann, Christian. 1992. Word order change by grammaticalization. In Marinel Gerritsen & Dieter Stein (eds.), Internal and external factors in syntactic change, 395–416. Berlin/New York: Mouton de Gruyter.
Lehmann, Christian. 2015. Thoughts on grammaticalization (3rd edn.). Berlin: Language Science Press.
Lehmann, Christian, José Pinto de Lima & Rute Soares. 2010. Periphrastic voice with ‘see’ in Portuguese. In Gabriele Diewald & Elena Smirnova (eds.), Paradigmaticity and obligatoriness, 75–100. London: Routledge (Acta Linguistica Hafniensia, special issue, 42/1).
Leone, Alfonso. 1995. Profilo di sintassi Siciliana. Palermo: Centro di Studi Filologici e Linguistici Siciliani.


Leonetti, Manuel. 2008. Definiteness effects and the role of the coda in existential constructions. In Høeg Müller & Alex Klinge (eds.), Essays on Nominal Determination, 131–162. Amsterdam & Philadelphia: John Benjamins.
Lombardi, Alessandra. 2007. Posizione dei clitici e ordine dei costituenti nella lingua sarda medievale. In Delia Bentley & Adam N. Ledgeway (eds.), Sui dialetti Italo-Romanzi. Saggi in onore di Nigel B. Vincent. The Italianist 27. 133–147 (Special Supplement 1).
Lombardi Vallauri, Edoardo. 2016. Insubordinated conditionals in spoken and non-spoken Italian. In Nicholas Evans & Honoré Watanabe (eds.), Insubordination, 145–170. Amsterdam & Philadelphia: John Benjamins.
Longobardi, Giuseppe. 2001. Formal syntax, diachronic minimalism, and etymology: the history of French chez. Linguistic Inquiry 32(2). 275–302.
Loporcaro, Michele. 1988. Grammatica storica del dialetto di Altamura. Pisa: Giardini.
Loporcaro, Michele. 2012a. A new strategy for progressive marking and its implications for grammaticalization theory: the subject clitic construction of Pantiscu. Studies in Language 36. 747–784.
Loporcaro, Michele. 2012b. Per lo studio della morfosintassi dei dialetti lucani: acquisizioni recenti e nuove prospettive. In Patrizia Del Puente (ed.), Atti del II convegno internazionale di dialettologia. – Progetto A.L.B.A., 176–198. Rionero in Vulture: Calice Editore.
Loporcaro, Michele. 2016a. Auxiliary selection and participial agreement. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 802–819. Oxford: Oxford University Press.
Loporcaro, Michele. 2016b. Gender. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 924–936. Oxford: Oxford University Press.
Loporcaro, Michele. 2018. Gender from Latin to Romance. History, geography, typology. Oxford: Oxford University Press.
Loporcaro, Michele & Tania Paciaroni. 2011. Four-gender systems in Indo-European. Folia Linguistica 45(2). 435–464.
Loporcaro, Michele, Lorenza Pescia & Maria A. Ramos. 2004. Costrutti dipendenti participiali e participi doppi in portoghese. Revue de Linguistique Romane 68. 15–46.
Lyons, Christopher. 1986. On the origin of the Old French strong-weak possessive distinction. Transactions of the Philological Society 84. 1–41.
Luraghi, Silvia. 2017. Partitives and differential marking of core arguments: a cross-linguistic survey. MS: University of Pavia.
Luraghi, Silvia & Seppo Kittilä. 2014. Typology and diachrony of partitive case markers. In Silvia Luraghi & Tuomas Huumo (eds.), Partitive cases and related categories, 17–62. Berlin: de Gruyter.
Maiden, Martin. 1995. A linguistic history of Italian. London: Longman.
Maiden, Martin. 2011. Morphological persistence from Latin to Romance. In Martin Maiden, John Charles Smith & Adam N. Ledgeway (eds.), The Cambridge history of the Romance languages, vol. 1: Structures, 155–215, 699–706. Cambridge: Cambridge University Press.
Maiden, Martin. 2013. The Latin ‘third stem’ and its Romance descendants. Diachronica 30(4). 492–530.
Maiden, Martin. 2016a. Romanian, Istro-Romanian, Megleno-Romanian, and Aromanian. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 91–126. Oxford: Oxford University Press.
Maiden, Martin. 2016b. Inflectional morphology. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 497–523. Oxford: Oxford University Press.
Maiden, Martin. 2016c. Number. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 697–707. Oxford: Oxford University Press.
Maiden, Martin. 2016d. The Romanian alternating gender in diachrony and synchrony. Folia Linguistica Historica 37(1). 111–144.


Maiden, Martin & Mair Parry (eds.). 1997. Dialects of Italy. London: Routledge.
Manzini, Maria Rita & Leonardo Savoia. 2005. I dialetti italiani e romanci. Morfosintassi generativa (Vols. II–III). Alessandria: Edizioni dell’Orso.
Mardale, A. 2008. Microvariation within differential object marking: data from Romance. Revue Roumaine de Linguistique 4. 448–467.
Mardale, Alexandru. 2015. Differential object marking in the first original Romanian texts. In Virginia Hill (ed.), Formal approaches to DPs in Old Romanian, 200–245. Leiden & Boston: Brill.
Martellotta, Mario E. & Maria M. Cezario. 2011. Grammaticalization in Brazilian Portuguese. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 729–739. Oxford: Oxford University Press.
Martín Zorraquino, Maria Antonia. 1979. Las construcciones pronominales en Español. Paradigma y desviaciones. Madrid: Gredos.
Martins, Ana. 2005. Passive and impersonal se in the history of Portuguese. In Claus D. Pusch, Johannes Kabatek & Wolfgang Raible (eds.), Romance corpus linguistics and language change, 411–430. Tübingen: Gunter Narr.
Martins, Ana. 2009. Subject doubling in European Portuguese dialects: the role of impersonal se. In Enoch O. Aboh, Elisabeth van der Linden, Josep Quer & Petra Sleeman (eds.), Romance languages and linguistic theory: Selected papers from ‘Going Romance’, 179–200. Amsterdam & Philadelphia: John Benjamins.
Mendikoetxea, Amaya. 1999a. Construcciones inacusativas y pasivas. In Ignacio Bosque & Violeta Demonte (eds.), Gramática descriptiva de la lengua Española, Vol. 2, 1575–1630. Madrid: Espasa Calpe.
Mendikoetxea, Amaya. 1999b. Construcciones con se: medias, pasivas e impersonales. In Ignacio Bosque & Violeta Demonte (eds.), Gramática descriptiva de la lengua Española, Vol. 2, 1631–1723. Madrid: Espasa Calpe.
Meulleman, Machteld. 2012. Degrees of grammaticalization in three Romance languages: a comparative analysis of existential constructions. Folia Linguistica 46(2). 417–452.
Michaelis, Susanne. 1998. Antikausativ als Brücke zum Passiv: fieri, venire und se im Vulgärlateinischen und Altitalienischen. In Wolfgang Dahmen, Günter Holtus, Johannes Kramer, Michael Metzeltin, Wolfgang Schweickard & Otto Winkelmann (eds.), Neuere Beschreibungsmethoden der Syntax romanischen Sprachen, 69–98. Tübingen: Narr.
Miola, Emanuele. 2017. The position of Piedmontese on the Romance grammaticalization cline. Folia Linguistica 51(1). 133–167.
Mioto, Carlos & Maria Lobo. 2016. Wh-movement: interrogatives, relatives and clefts. In W. Leo Wetzels, João Costa & Sergio Menuzzi (eds.), The handbook of Portuguese linguistics, 275–293. Oxford: Wiley.
Mithun, Marianne. 2008. The extension of dependency beyond the sentence. Language 83. 69–119.
Mithun, Marianne & Wallace Chafe. 1999. What are S, A and O? Studies in Language 23. 569–596.
Mortelmans, Tanja, Kasper Boye & Johan van der Auwera. 2009. Modals in the Germanic languages. In Björn Hansen & Ferdinand de Haan (eds.), Modals in the languages of Europe, 11–70. Berlin & New York: Mouton de Gruyter.
Neamţu, G. 1986. Predicatul în limba română. Bucharest: Editura Științifică și Enciclopedică.
Nedelcu, Isabela. 2013. Prepositions and prepositional phrases. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 451–465. Oxford: Oxford University Press.
Nichols, Johanna. 1986. Head-marking and dependent-marking grammar. Language 62. 56–119.
Nicolae, Alexandru. 2013a. Demonstratives. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 295–300. Oxford: Oxford University Press.
Nicolae, Alexandru. 2013b. The determiner cel. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 309–318. Oxford: Oxford University Press.


Nicolae, Alexandru. 2019. Word order and parameter change in Romanian. Oxford: Oxford University Press.
Nicolle, Steve. 2012. Diachrony and grammaticalization. In Robert I. Binnick (ed.), The Oxford handbook of tense and aspect, 370–397. Oxford: Oxford University Press.
Niculescu, Dana. 2013. The possessive dative structure. The possessive object. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 183–190. Oxford: Oxford University Press.
Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press.
Oliviéri, Michèle & Patrik Sauzet. 2016. Southern Gallo-Romance (Occitan). In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 319–349. Oxford: Oxford University Press.
Onea, Edgar & Alexandru Mardale. 2018. From topic to object. Grammaticalization of differential object marking in Romanian. MS: University of Göttingen/INALCO-Paris.
Pană Dindelegan, Gabriela. 2013a. The structure of the clause. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 100–124. Oxford: Oxford University Press.
Pană Dindelegan, Gabriela. 2013b. Objects. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 125–157. Oxford: Oxford University Press.
Paoli, Sandra. 2003. Mapping out the left periphery of the clause: evidence from North-Western Italian varieties. In Josep Quer, Jan Schroten, Mauro Scorretti, Petra Sleeman & Els Verheugd-Daatzelaar (eds.), Romance languages and linguistic theory 2001, 263–277. Amsterdam & Philadelphia: John Benjamins.
Parkinson, Stephen. 1988. Portuguese. In Martin Harris & Nigel Vincent (eds.), The Romance languages, 131–69. London: Routledge.
Parry, M. Mair. 2000. Accordo e soggetti postverbali in piemontese. In Annick Englebert, Michel Pierrard, Laurence Rosier & Dan van Raemdonck (eds.), Actes du XXIIe Congrès International de Linguistique et Philologie Romanes, Bruxelles 1998, VI. De la grammaire des formes à la grammaire du sens, 391–402. Tübingen: Niemeyer.
Parry, M. Mair. 2003. L’oggetto preposizionale nel ligure medievale. Verbum 5. 113–126.
Parry, M. Mair. 2007. The interaction of semantics and syntax in the spread of relative che in the early vernaculars of Italy. In Delia Bentley & Adam Ledgeway (eds.), Sui dialetti italo-romanzi. Saggi in onore di Nigel B. Vincent. The Italianist 27. 200–19 (Special Supplement 1).
Parry, M. Mair. 2010. Non-canonical subjects in the early Italian vernaculars. Archivio Glottologico Italiano 95(2). 190–226.
Parry, M. Mair. 2013a. Variation and change in the presentational constructions of north-western Italo-Romance varieties. In Elly van Gelderen, Michela Cennamo & Johanna Barðdal (eds.), Argument structure in flux. The Naples-Capri papers, 511–48. Amsterdam & Philadelphia: John Benjamins.
Parry, M. Mair. 2013b. Negation in Italo-Romance. In David Willis, Christopher Lucas & Anne Breitbarth (eds.), The history of negation in the languages of Europe and the Mediterranean, Vol. I: Case Studies, 77–118. Oxford: Oxford University Press.
Pereira, Sandra. 2015. Causative and perception constructions in European Portuguese: the dialectal data. Dialectologica 5. 53–80.
Pescarini, Diego. 2016. Clitic pronominal systems: morphophonology. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 742–760. Oxford: Oxford University Press.
Poletto, Cecilia. 1995. The diachronic development of subject clitics in north-eastern Italian dialects. In Adrian Battye & Ian Roberts (eds.), Clause structure and language change, 295–324. Oxford: Oxford University Press.
Poletto, Cecilia. 2016. Negation. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 833–846. Oxford: Oxford University Press.


Poletto, Cecilia & Cristina Tortora. 2016. Subject clitics. Syntax. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 786–801. Oxford: Oxford University Press.
Poplack, Shana, Rena Torres Cacoullos, Nathalie Dion, Rosane de Andrade Berlinck, Salvatore Digesto, Dora Lacasse & Jonathan Steuck. 2017. Variation and grammaticalization in Romance: a cross-linguistic study of the subjunctive. In Wendy Ayres-Bennett & Janice Carruthers (eds.), Manual of Romance sociolinguistics, 217–252. Berlin & New York: Mouton de Gruyter.
Pountain, Christopher. 1982. *Essere/Stare as a Romance Phenomenon. In Nigel Vincent & Martin Harris (eds.), Studies in the Romance verb, 139–60. London: Croom Helm.
Pountain, Christopher. 2016. Standardization. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 634–643. Oxford: Oxford University Press.
Quer, Josep. 2016. Mood. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 954–966. Oxford: Oxford University Press.
Ramat, Paolo & Davide Ricca. 2016. Romance: a typological approach. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 50–63. Oxford: Oxford University Press.
Ricca, Davide. 1998. Una perifrasi continua-iterativa nei testi piemontesi dal Cinquecento all’Ottocento: tenere + participio passato. In Paolo Ramat & Elisa Roma (eds.), Sintassi Storica. Atti del XXX Congresso Internazionale della Società di linguistica Italiana, 345–368. Rome: Bulzoni.
Rizzi, Luigi. 1982. Issues in Italian syntax. Dordrecht: Foris.
Roberts, Ian. 2016. Object clitics. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 786–801. Oxford: Oxford University Press.
Rohlfs, Gerhard. 1968. Grammatica storica della lingua italiana e dei suoi dialetti: morfologia. Turin: Einaudi.
Rohlfs, Gerhard. 1969. Grammatica storica dell’italiano e dei suoi dialetti: sintassi e formazione delle parole. Turin: Einaudi.
Rosen, Carol. 1991. Auxiliation and serialization: on discerning the difference. In Alex Alsina, Joan Bresnan & Peter Sells (eds.), Complex predicates, 175–202. Stanford: CSLI.
Said Ali, Manoel. 1964. Gramática histórica da língua Portuguesa. São Paulo: Melhoramentos.
Salvi, Giampaolo. 2011. Morphosyntactic persistence. In Martin Maiden, John Ch. Smith & Adam Ledgeway (eds.), The Cambridge history of the Romance languages, I: Structures, 318–381. Cambridge: Cambridge University Press.
Salvi, Giampaolo. 2016. Word order. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 997–1012. Oxford: Oxford University Press.
Salvi, Giampaolo & Laura Vanelli. 2004. Nuova grammatica Italiana. Bologna: Il Mulino.
Sánchez Lancis, Carlos. 2001. The evolutions of the old Spanish adverbs ende and ý: a case of grammaticalization. Catalan Working Papers in Linguistics 9. 101–118.
Sansò, Andrea & Anna Giacalone Ramat. 2016. Deictic motion verbs as passive auxiliaries. The case of Italian andare ‘go’ (and venire ‘come’). Transactions of the Philological Society 114(1). 1–24.
Schwenter, Scott A. 2016. Independent si-clauses in Spanish: functions and consequences for insubordination. In Nicholas Evans & Honoré Watanabe (eds.), Insubordination, 89–112. Amsterdam & Philadelphia: John Benjamins.
Schwenter, Scott A. & Rena Torres Cacoullos. 2008. Defaults and indeterminacy in temporal grammaticalization: the ‘perfect’ road to perfectivity. Journal of Language Variation and Change 20(1). 1–39.
Sheehan, Michelle. 2016. Complex predicates. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 981–993. Oxford: Oxford University Press.
Sheehan, Michelle & Sonia Cyrino. 2016. Variation and change in the Romance faire-par causative. In Ernestina Carrilho, Alexandra Fiéis, Maria Lobo & Sandra Pereira (eds.), Romance languages and linguistic theory 10. Selected papers from Going Romance 28. Lisbon, 279–304. Amsterdam & Philadelphia: John Benjamins.
Siewierska, Anna. 1984. The passive: A comparative linguistic analysis. London: Croom Helm.
Sitaridou, Ioanna. 2012. A comparative study of word order in Old Romance. Folia Linguistica Historica 46(2). 553–604.
Smith, John Charles. 2016. French and northern Gallo-Romance. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 292–318. Oxford: Oxford University Press.
Soares da Silva, Augusto. 2012. Stages of grammaticalization of causative verbs and constructions in Portuguese, Spanish, French and Italian. Folia Linguistica 46(2). 513–552.
Sorace, Antonella. 2000. Gradients in auxiliary selection with intransitive verbs. Language 76. 859–890.
Sorace, Antonella. 2004. Gradience at the lexicon-syntax interface: evidence from auxiliary selection and implications for unaccusativity. In Artemis Alexiadou, Elena Anagnostopoulou & Martin Everaert (eds.), The Unaccusativity puzzle. Explorations of the syntax–lexicon interface, 243–287. Oxford: Blackwell.
Sorace, Antonella. 2011. Gradience in split intransitivity: the end of the unaccusativity hypothesis? Archivio Glottologico Italiano 96. 67–86.
Sornicola, Rosanna. 2000. Stability, variation and change in word order: some evidence from the Romance languages. In Rosanna Sornicola, Erich Poppe & Ariel Shisha-Halevy (eds.), Stability, variation and word order patterns over time, 101–118. Amsterdam & Philadelphia: John Benjamins.
Sornicola, Rosanna. 2011. Romance linguistics and historical linguistics: reflections on synchrony and diachrony. In Martin Maiden, John Charles Smith & Adam Ledgeway (eds.), The Cambridge history of the Romance languages. Vol. 1. Structures, 1–49. Cambridge: Cambridge University Press.
Sorrisi, Fabrizio & Alessandra Giorgi. 2012. Forme verbali valutative: un caso dal palermitano. Quaderni di Lavoro dell’Atlante Sintattico d’Italia 14. 123–140.
Squartini, Mario. 1998. Verbal periphrases in Romance: Aspect, actionality and grammaticalization. Berlin: Mouton de Gruyter.
Squartini, Mario. 2004. Disentangling evidentiality and epistemic modality in Romance. Lingua 114. 873–895.
Squartini, Mario & Pier Marco Bertinetto. 2000. The simple and compound past in Romance languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe. Berlin: Mouton de Gruyter.
Stan, Camelia. 2013. Genitive and dative case-marking. In Gabriela Pană Dindelegan (ed.), The grammar of Romanian, 262–272. Oxford: Oxford University Press.
Stark, Elisabeth. 2016. Relative clauses. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 1029–1040. Oxford: Oxford University Press.
Teyssier, Paul. 1984. Manuel de Langue Portugaise (Portugal-Brésil). Paris: Klincksieck.
Thomas, Earl W. 1969. The syntax of spoken Brazilian Portuguese. Nashville: Vanderbilt University Press.
Toppino, Giuseppe. 1926. Il dialetto di Castellinaldo. Italia Dialettale 1. 114–160.
Traugott, Elisabeth C. 2011. Grammaticalization and mechanisms of change. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 19–30. Oxford: Oxford University Press.
Traugott, Elisabeth & Richard Dasher. 2002. Regularity in semantic change. Cambridge: Cambridge University Press.
Tuten, Donald N., Enrique Pato & Ora R. Schwarzwald. 2016. Spanish, Astur-Leonese, Navarro-Aragonese, Judaeo-Spanish. In Adam Ledgeway & Martin Maiden (eds.), The Oxford guide to the Romance languages, 382–410. Oxford: Oxford University Press.





Björn Wiemer

5 Grammaticalization in Slavic

1 Introduction: General information about the languages, including their typological characteristics

The Slavic (BEngl. Slavonic) languages are the largest language group of Europe, with a total of around 290 million native speakers, and they spread over most of Eastern Europe, large parts of Central Europe and the Balkans.1 The largest language by far, Russian (approx. 167 million speakers), is also spoken in northern Asia (Siberia and some successor states of the USSR). The group is traditionally divided into East, West and South Slavic,2 mostly on the basis of phonological criteria, but this split turns out to be justified to some extent even for some grammaticalization phenomena. The group consists of 16 recognized languages with a written standard; to some extent, the diatopic and diastratic homogeneity of the Slavic languages varies considerably. The reconstructed ancestor of the whole group is Common Slavic (CS),3 an IE dialectal continuum which can be assumed to have been quite homogeneous up to the time of the Great Migrations (4th–6th centuries AD), but which split up rather quickly afterwards. The oldest written form, created in the 9th century, was Old Church Slavonic (OCS), still quite similar to CS, but classified as a South Slavic idiom. It had an enormous cultural and linguistic impact, in particular on the development of modern standard Russian (East Slavic). This impact, however, did not bring about any conceivable consequences for grammaticalization processes. From a typological point of view, if anything general can be said about the morphosyntax of the Slavic languages that applies not only to their contemporary stage, but also to properties persistent in time, one should mention the consistent exploitation of inflectional morphology for purposes of agreement (Comrie and Corbett 1993: 14–17).
However, the same set of inflectional suffixes (usually realized as portmanteau morphemes and with a lot of allomorphy) has basically remained unaltered (if not repartitioned or simplified) since CS times, and there is practically nothing to “gain” for grammaticalization (probably, because these paradigmatic sets are so old). No

1 For more detailed comprehensive overviews cf. Comrie and Corbett (1993), Sussex and Cubberley (2006) and Hansen (2011).
2 This textbook division should not be confused with merely geographic distinctions into eastern and western “halves”, which will prove appropriate in the following. As we will see, in other respects a salient division into North (= East+West) Slavic vs. South Slavic turns out most suitable.
3 ‘Proto-Slavic’ usually refers to a much longer period BC. Problems with establishing chronological layers of Slavic previous to written documentation are discussed, among many others, by Andersen (1985) and Holzer (2014).

https://doi.org/10.1515/9783110563146-005


less persistent in time and across the whole of Slavic is the rich derivational morphology in all major parts of speech; this pertains both to transpositional derivation, which changes the syntactic class (e.g., participles, verbal nouns), and to non-transpositional derivation, i.e., derivation without a change of the syntactic class (e.g., diminutives). Although derivational morphology has been largely excluded from grammaticalization processes, it is the productivity of stem-derivational non-transpositional patterns which has led to the evolution of the Slavic opposition of perfective : imperfective aspect. Ubiquitous and pervasive as this system is for all Slavic languages, it is highly “inconvenient” in terms of standard parameters of grammaticalization (see § 3.2). Another salient feature of Slavic through time and space is the use of an inherited, originally clitic reflexive marker. It is highly multifunctional and used predominantly (though not exclusively) for argument reduction, as in anticausatives (e.g., Pol. łamać → łamać się ‘break. → ’); cf. Geniušienė (1987) and Plungjan (2011: 276–298). The languages differ in detail, in part quite considerably; but Slavic languages are typical representatives of languages with an ‘anticausative prominence’, which is considered by Haspelmath (2001) as an SAE-feature. However, the formation of verbs with the reflexive marker is relevant for grammaticalization only insofar as they participate in the expression of passives (§ 3.1.1). In East Slavic the former enclitic has been agglutinated as a postfix (compare Russ. smejat’sja vs. Czech smát se or Croat. smijati se ‘laugh’), but this fact holds true for all other reflexive formations as well and cannot be regarded as a reliable indicator of grammaticalization (Wiemer 2004: 280–281).
Participles are another salient “pan-Slavic” morphosyntactic feature (at least in the standard varieties), which have undergone considerable and areally unevenly distributed restructuring in historical time (Wiemer 2014a). Participles have participated in the formation of different analytical tenses (§ 3.4.2), of passives (§ 3.1.1) and provided the input to the rise of gerunds (§ 5.4). Furthermore, Slavic languages are characterized by a sparsely developed mood system, since after the CS period an older paradigm of conditional/subjunctive had declined and the general subjunctive marker by (South Slavic bi), combined with the l-suffixed anteriority participle, spread throughout the entire territory. Various analytical grams with hortative, permissive or optative functions have been emerging since then (§ 3.1.2, § 3.5). Finally, the whole group has always represented SVO-languages, but since word order predominantly follows communicative principles, it has not been subject to serious “syntacticization”. A slight exception is (standard and colloquial) Upper Sorbian, where the order SOV occurs strikingly often in both dependent and main clauses. This frequency phenomenon has most probably been established mirroring SOV-order typical in German dependent clauses (Scholze 2008: 321–323). An uneven areal distribution of conservative and innovative features can furthermore be observed for the following phenomena:4 (i) South Slavic as well as

4 On the inner-Slavic spread of relativization strategies on a European background cf. Danylenko (2014).


Czech are conservative in retaining pronominal clitics and integrating newly arisen auxiliary clitics (in Zwicky’s sense of ‘special clitics’), while East Slavic has lost all such clitics, either by dropping them (e.g., during the perfect > past shift) or by agglutinating them (e.g., the reflexive marker {sja} in East Slavic has become a postfix); Polish is transitional in this respect. One would therefore assume that South Slavic might have better preconditions for typical grammaticalization paths (running through coalescence); this, however, has happened only to a very limited extent. (ii) All in all, the western half of Slavic is richer in auxiliaries of different domains (TAM, modality, voice, causatives); ‘inexpectative ’ (§ 3.3.2) might be a remarkable exception. (iii) East Slavic is most conservative in the use of the infinitive on the level of simple and complex sentences. By contrast, Polish demonstrates a certain preference for verbal nouns. This might speak in favour of a repeatedly claimed predilection of Polish for nominalizations (to the detriment of the infinitive). In turn, it has been claimed that Czech shows a preference for finite structures in the combination of clauses (Weiss 1983; Lotko 1997: 10–11, 61). Concomitant with “infinitive prominence” is the relative frequency of non-canonical subjects (mostly in the dative) in East Slavic compared to West Slavic (this shows up particularly with modal auxiliaries; § 3.3.1). (iv) The eastern part of South Slavic (= Balkan Slavic, i.e., Bulgarian, Macedonian and the East Serbian Torlak dialects) has lost the infinitive (with relics in Bulgarian and the Torlak dialects). This has had fundamental consequences for the syntax (§ 3.7.1, § 4.1.1). (v) It is only these languages in which the system of past tenses inherited from CS has been entirely preserved, and inasmuch as the perfect has become extended into indirect evidentiality (§ 5.1) it has even been paradigmatically strengthened there.
(vi) Moreover, it is only these languages which have altogether lost morphological markers of cases in nouns (not in pronouns). As with the loss of the infinitive and the retention of the old past tenses, loss of cases finds its equivalence in an inner-South Slavic cline: these features decrease from Bulgarian (in the Southeast) via Serbian-Croatian up to Slovene (in the Northwest). Whether these features and the corresponding processes are causally connected or not remains an open question; however, all three belong to known features of the Balkan League, (vii) as do definite articles. However, articles (definite and indefinite ones) either have been emerging or have already been fully established also in regions where Slavic languages show the full inventory of morphological cases (and differences in the systems of tense, including perfects and resultatives). We may generalize that articles have developed only in regions where Slavic speakers came into contact with speakers of languages that already had article systems (Greek, Albanian, Italian, German); see § 2.2. (viii) Animacy distinctions have resulted in differential object and subject marking (DOM, DSM) to varying degrees; it is much better established in North Slavic (with much internal differentiation not caused by any perceivable influence from non-Slavic neighbors), while Balkan Slavic has developed clitic doubling as another strategy of DOM, based on the specificity of referents (§ 3.6; see also ex. 5), in accordance with other Balkan languages. This


latter type of DOM represents a rather typical case of grammaticalization, whereas animacy-based DOM (and DSM) cannot reasonably be related to grammaticalization parameters, being rather an example of the restructuring of available case(-gender-number) markers. The same, mutatis mutandis, applies to quantification-based DOM and DSM ( vs.  of indeterminate quantity), which is rather unevenly scattered over Slavic.5 In addition, Russian (and, to a large extent, the rest of East Slavic) is known for its avoidance of a HAVE-verb in predicative possession, although such a verb was inherited from CS (Russ. imet’ < iměti). This avoidance is transferred to all alignment patterns and the formation of analytic tense-aspect paradigms (resultatives and perfects; § 3.4.2). The adessive pattern with u ‘at’ + GEN for the possessor, byt’ ‘be’ and the possessum in the nominative (genitive if negated) is another pattern of predicative possession known as ancient in Slavic (Grković-Major 2011: 42–45). Its prominence in Russian can most plausibly be explained as a Finnic substratum (Veenker 1967), which enhanced the frequency of this pattern. The remainder of Slavic makes ample use of HAVE-verbs for all aforementioned grammatical purposes. Together with this, Russian represents the extreme northeast on an inner-Slavic cline on which the BE-verb (both as copula and as existential verb) is avoided in the present tense indicative, where Russian omits it altogether; only one form of the paradigm is left (est’ < 3.) and used only for marked focus. Although language contact as a catalyst of change and spread of innovations (or of minor usage patterns) is not in the focus of this volume, I will account for possibly contact-induced grammaticalization. If appropriate, I will use the distinction of PAT- vs. MAT-borrowing following the terminology and framework of Matras and Sakel (cf. Sakel 2007).

2 Grammaticalization of nominal categories

From time immemorial, Slavic has had paradigms of portmanteau morphemes marking gender and number (distributed over different declension classes). This includes case, with the exception of Balkan Slavic, which has lost case altogether. These morpheme sets have always belonged to the core inventory of Slavic grammar. As a rule they can be clearly separated from stems, but it is their highly fusional nature and old age (reaching into PIE) which preclude any reconstruction of their source expressions and, consequently, an analysis with the aid of grammaticalization parameters.

5 An uneven distribution can be observed according to the following criteria: (a) whether DOM is accompanied by DSM (as in West Slavic), (b) the degree to which  vs  choice has become a lexicalized feature of verbs, (c) in general, the range of verbs affected by an – alternation, and (d) the reliability with which  replaces  (if there is a choice at all).


2.1 Possession

In South Slavic, we observe how enclitic dative pronouns marking external possessors have been turning into NP-internal markers of inalienable possession. The phenomenon is well established in Balkan Slavic, to the extent that the same dative clitic can, or even must, be used twice if it also happens to mark a dative-argument. See an example from Nitsolova (2014: 34):

(1) Bulgarian
[[Majk-a]N mu]NP mu prigotv-i zakuska-ta.
mother- him. him. prepare[]-.3 breakfast-..
‘His mother prepared him breakfast.’

Identical examples could be supplied for Macedonian and the East Serbian Torlak dialect (Mitkovska 2011; B. Stanković, forthcoming). The rise of dative possessive clitics is closely connected (both in chronology and causally) to the merger of dative and genitive in Balkan languages and to the establishment of clitic clusters (apart from shifts in word order). Obviously, this rise was favoured by bridging contexts in which pronominal dative clitics occurred in the immediate post-verbal position and could be interpreted either as markers of affected participants or as inalienable possessors, or simultaneously as both (doubly bound datives). In Old Church Slavonic, such contexts were supplied by verbs that implied affected participants rather frequently; see structures such as in (2a). One could even find post-verbal dative enclitics with object-NPs marked with possessive pronouns (as in 2b). (2)

Old Church Slavonic (cit. from Krapova and Dimitrova 2015: 187)
a. otъdad-ętъ ti sę grěs-i
   forgive[].-3 2.  sin-.
   (Mt. 9:2, Cod. Sab.)
b. ōtpuštaj-ǫtъ ti sę grěs-i tvo-i
   forgive[].-3 2.  sin-. 2.-.
   ‘Your sins will be forgiven.’ (Mt. 9:2, Cod. Zogr., Assem.)

The grammaticalization process consisted in the attachment of the enclitic pronoun to the object-NP (see examples 1, 3b, 4b), its increase in predictability and an extension to possessum types down an inalienability hierarchy (on which cf. Nichols [1992: 160–162]): body parts > kinship terms > extended kinship (e.g., shirt, house, bread) > abstract properties. Simultaneously, the dative enclitic has kept its function as an external possessor (or rather as a marker of an affected participant), to an extent that the position may now disambiguate these two functions if the predicate


implies an affected participant. Thus, in modern Bulgarian post-verbal use of the dative enclitic is interpreted as an external possessor, whereas post-nominal use can only be understood as an NP-internal possessor (compare ex. 3a–b). With “nonaffective” predicates either position is possible, but since the external possessor reading makes no sense, the dative clitic invariably marks the NP-internal possessor even if it is extraposed in immediate post-verbal position (compare 4a–b); cf. Krapova and Dimitrova (2015). This comparison shows that syntactic tightening has occurred only to a limited extent 6 and that variation is meaningful depending on the semantic verb type. (3)

Bulgarian
a. otkradna-xa mi portmone-to
   steal[]-.3 1. purse-
   ‘They stole my purse on me.’
b. otkradna-xa portmone-to mi
   steal[]-.3 purse- 1.
   ‘They stole my purse.’

(4)
a. zna-m mu adres-a
   know[]-.1 him. address-
=
b. zna-m adres-a mu
   know[]-.1 address- him.
   ‘I know his address.’ (cit. from Krapova and Dimitrova 2015: 183)

Moreover, the post-nominal position of the NP-internal dative enclitic can be overridden by the requirements of clitic clusters that apply pre-verbally. See the following Macedonian example in which the reflexive clitic si in relational terms “belongs” to the object-NP greški-te ‘sins-’, although its linear position places it in a cluster of verbal proclitics. (5)

Macedonian
Isto taka si gi zna-m grešk-i-te
also thus . 3- know[]-.1 sin--.
i kolku učam od niv. (Antena, 12. 05. 2000: 29)
‘Also, I know my errors and how much I learn from them.’ (cit. from Mitkovska 2011: 94)

6 Alternatively, one may say that the Wackernagel-properties of the enclitic vary between NP- and clause (or VP) level. This distinguishes them from the fixed NP-internal Wackernagel position of the definite article (see § 2.2.1).


In fact, as such examples demonstrate, the tendency toward clitic clusters7 can create effects that run counter to grammaticalization, since they inhibit NP-internal syntactic strengthening (i.e., increase of bondedness). Apart from that, the inalienability hierarchy (see above) is not always followed; for instance, in Macedonian and East Serbian Torlak dialects post-nominal dative enclitics are used obligatorily with names of close kinship, but not with names of body parts (Mitkovska 2011: 85; B. Stanković, p.c.). In the remainder of South Slavic this way of marking inalienable possession is less systematic, unproductive and lexically limited, and it has been considered archaic (Mološnaja [1986: 180–181] on Croatian/Serbian), inasmuch as NP-internal dative possessors were quite common already in OCS (see above), as they were in other ancient IE languages (Pancheva 2004: 184–186). However, among these dative possessors one could also find non-clitic pronouns and nouns, so that, to some extent, the grammaticalization process in Balkan Slavic meant ousting these forms in NP-internal position. This inner-(South) Slavic areal cline neatly corresponds to the rise of object-doubling by enclitics (for which see § 3.6). This is to be expected, since “possessor-marked” NPs are inherently definite; they are, thus, endowed with the definite article in Balkan Slavic (see ex. 3–4). Both clines – NP-internal dative enclitics and object-doubling by clitics – fit well into general areal clusters in Europe (cf. Dimitrova-Vulchanova 2000; Friedman 2008), while in languages further to the northwest these features are unknown or have been in retreat (for a recent account of West Germanic and Romance concerning possessor marking cf. Van de Velde and Lamiroy [2017]).
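The claim that the inalienability hierarchy "is not always followed" can be made concrete with a small sketch. It assumes, purely for illustration, that a variety follows the hierarchy when the possessum types it marks obligatorily form an unbroken top segment of the scale; the function name and the second, "consistent" pattern are hypothetical and not taken from the sources cited above.

```python
# Inalienability hierarchy as given in the text, from highest to lowest rank.
HIERARCHY = ["body parts", "kinship terms", "extended kinship", "abstract properties"]


def follows_hierarchy(obligatorily_marked):
    """Return True if the marked types form an unbroken top segment of HIERARCHY.

    Assumption (illustrative only): a variety "follows" the hierarchy when
    no type is marked unless every higher-ranked type is marked as well.
    """
    flags = [possessum in obligatorily_marked for possessum in HIERARCHY]
    # For each adjacent pair: a lower type may be marked only if the higher one is.
    return all(higher or not lower for higher, lower in zip(flags, flags[1:]))


# Pattern reported in the text for Macedonian and the Torlak dialects:
# obligatory with close kinship terms, but not with body parts.
macedonian = {"kinship terms"}

# Hypothetical pattern, invented here only as a contrast case.
hypothetical_consistent = {"body parts", "kinship terms"}

print(follows_hierarchy(macedonian))               # False
print(follows_hierarchy(hypothetical_consistent))  # True
```

On this reading the Macedonian pattern skips the top of the scale, which is the sense in which it departs from the hierarchy.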

2.2 Determiners

Demonstrative pronouns behave like 1st/2nd person or reflexive possessive pronouns, both as concerns their morphological paradigms and in that they function as attributive agreement markers in relation to their heads. If indefinite articles have been developing in Slavic, the source expression has always been the numeral ‘1’. However, the numeral had first to acquire properties of indefinite pronouns (§ 2.2.2). Wherever articles or article-like usage of pronouns or of the numeral ‘1’ have evolved, they inflect for number and gender (outside Balkan Slavic additionally for case). Definite articles in South Slavic are enclitics that abide by the Wackernagel rule NP-internally (see § 2.2.1); indefinite articles are usually non-clitics placed before the entire NP. Exceptions are those isolated Slavic dialects that have been under all-embracing contact with Italian dialects for centuries, Resian (NE-Italy, Friuli) and Molisean (Appulian mountains); cf. Steenwijk (1992), Benacchio (2014) and Breu

7 Clusters of preverbal proclitics are a prominent feature of Balkan Slavic, which developed in historical time.


(2012), respectively. These dialects have a proclitic indefinite article. Resian also has a definite article (obviously proclitic as well), whereas Molisean lacks a definite article (for an explanation cf. Breu [2012: 307–309]).

2.2.1 Definite articles

All languages of the western half of Slavic show a salient use of the demonstrative pronoun (< *t-, deictically neutral) in recognitional function, which is unusual in East Slavic.8 This usage type is (like anaphoric, deictic use) conditioned by the pragmatics of discourse, or text coherence, and can be considered as the last stage before definiteness starts being marked on nouns that are definite by virtue of their semantics (e.g., all unica and relative nouns) or because of associative links based on metonymic or frame relations (as, e.g., in We entered a/the restaurant, the servant came at once and gave us the menu …). Definiteness marking of generic NPs seems to develop last (if at all).9 Against the backdrop of this global split of functions on the way to definite articles, even in Colloquial Upper Sorbian (CUS) the demonstrative pronoun () is used obligatorily only in contexts of pragmatic definiteness, while with semantically definite NPs this pronoun is largely optional, although it tends to be used. This also concerns different subtypes of generic use both in CUS (Scholze 2008: 157–181; 2012) and in Resian (Benacchio 2014: 210). In Balkan Slavic, the definite article always attaches enclitically to the first word of the phrase. In this respect it abides by the general clitic rules of these languages; however, contrary to enclitic dative pronouns used as NP-internal possessive markers (see § 2.1), the definite article cannot be “extracted” from its NP and, thus, shows a lower degree of syntagmatic variability. Macedonian has a threefold system of articles based on the close–distal–neutral distinction of demonstrative pronouns (-ov, -on, -ot), which is used in narrative perspectivization (Topolińska 2006; Sonnenhauser 2010).
The preservation of this contrast (whether spatial or non-deictic in narrative contexts) testifies to a retention of intraparadigmatic variability motivated by the original deictic distinction and therefore diminishes the degree of grammaticalization. For other details of Balkan Slavic cf. Mišeska Tomić (2006).

8 Cf. Berger (1993), Wiemer (1997: 194–202), Trovesi (2004): This function includes proper names, but it can also be found in emphatic comparative predicative use as in Pol. Pracuje jak ten wół lit. ‘S/He is working / works like this ox’ (Mendoza 2004: 272). This usage reminds us of the exploitation of ‘1’ in predicative (usually pejorative) comparisons (see § 2.2.2).
9 Cf. Himmelmann (2001) for a typology and stages. Breu (2012) and Scholze (2008, 2012) basically followed him in their accounts on Molisean and Colloquial Upper Sorbian.


2.2.2 Indefinite articles

The development of indefinite articles out of ‘1’ has been accepted to run through the following functional stages (adapted on the basis of Givón [1981] and Heine [1997]):

(6) (i) numeral ‘1’ > (ii) presentative > (iii) specific-indefinite > (iv) nonspecific-indefinite > (v) generic.

This cline interferes with functions of ‘1’ in predicative and comparative use (‘X is / acts like (a typical representative of) Y’), the latter being akin to stages (iv–v). In both usage types, ‘1’ is employed to emphasize or contrast some referent against the background of a larger set of likes. A prescriptive or evaluative function (A British citizen admires the Queen; He behaves like a fool) proves particularly conducive to the development of ‘1’ into an indefinite article. Moreover, discourse-pragmatic notions like noteworthiness and referential continuity (Wright and Givón 1987) have been important factors triggering this development (see below). The degree of predictability of ‘1’ as an NP-modifier differs from language to language. The same applies to the decrease of discourse-pragmatic emphasis (i.e., rhetorical devaluation; Dahl 2004). A high degree of grammaticalization is associated with the extension of ‘1’ as a modifier of mass and plural nouns. Among all Slavic varieties, CUS seems to have reached the highest degree of grammaticalization, inasmuch as ‘1’ is practically obligatory in generic NPs and even in non-emphatic (or: non-prescriptive) predicative usage. The next example demonstrates both usage types: (7)

Colloquial Upper Sorbian
Jen tigor jo jene wulke zwěrjo.
one.. tiger.[]-() be..3 one.. big animal.[]-
‘A tiger is a big animal.’

This also applies to ‘1’ in NPs with comparative function. ‘1’ is excluded only from mass nouns and does not occur with inalienably possessed parts, e.g., of the body (Scholze 2008: 150–155; 2012). More incipient stages of an indefinite article on the basis of the numeral ‘1’ have been pointed out for the remaining West as well as for South Slavic languages (Belaj and Matovac 2015). In all these cases the development of ‘1’ stops short at stage (iii); compare, for instance: (8)

Croatian (Belaj and Matovac 2015: 4)
Ču-l-a sam to od jedn-e prijateljic-e.
hear[]-_-. be..1 this. from one-.. friend[]-.
‘I heard it from a [lit. one] (female) friend.’


It seems stage (iv) has been reached only in South and West Slavic varieties in close contact with Italian and/or German. Consider, for instance, Istrian Čakavian (more se, ja, reć jenemu Žminjcu ‘yes, you can say that to a person from Žminj’ [Kalsbeek 2011: 144]), Molise Slavic (Breu 2012) and Resian (Benacchio 2014). As a common denominator, stages (iv–v) are reached, but the use of ‘1’ is still not obligatory. The seemingly neat picture presented by the cline in (6) is “spoilt” by some recurrent observations. Thus, Macedonian and Bulgarian ‘1’ shows advanced stages toward the indefinite article, as it functions as a predictable marker of specificity. Bg. edin can be used even with generic subject NPs, while Mac. eden is prominent in emphatic comparative use (Weiss [2004] for Macedonian, Geist [2011: 142–143] for Bulgarian). The latter, however, can also be encountered in languages with an otherwise less advanced development of ‘1’, such as Croatian/Serbian or Polish. Moreover, ‘1’ has been observed in predicative NPs even in a language for which stages (iv–v) in referential (argument) NPs exclude use of ‘1’. Compare (9) from Croatian with (7) from CUS and (10) from Bulgarian (similarly for Polish in Mendoza [2004: 318]): (9)

Croatian (Belaj and Matovac 2015: 9–11)
(*Jedan) Lev je jedn-a velik-a životinj-a
one... lion[]-(.) be..3 one-.. big-.. animal[]-(.)
‘A/The lion is a big animal.’

(10) Bulgarian (Geist 2011: 139, with reference to Ivanova and Koval’ 1994: 59)
Petăr e (?edin) učitel
be..3 one... teacher[]-()
‘Petăr is a teacher.’

In general, the use of ‘1’ becomes more likely if the predicative noun implies a (usually pejorative) evaluation (e.g., Bg. Ivan e edin glupak / e edin učitel! ‘Ivan is such a fool / such a [sc. bad] teacher!’). The same holds true for Pol. jeden (normally postposed: Co za idiota jeden! ‘What an idiot!’). Moreover, ‘1’ has been observed to share a property with indefinite pronouns, which might be responsible for nascent stages toward an indefinite article: these determiners often serve as discourse-pragmatic means of introducing referents which will become salient in the subsequent discourse.

Other phenomena

In contrast to the case-gender-number portmanteau morphemes inherited from PIE (and repeatedly redistributed since then), Slavic innovated a separate inflection type for adjectives, which can be considered a paradigm example of grammaticalization. The new inflection resulted from coalescence: enclitic third-person pronouns of the inherited IE *j-class agglutinated and then, to some extent, fused with former endings of the nominal inflection (Pohl 1980; Wiemer 2011b: 741–742). For this reason, this phenomenon is called pronominal inflection (a.k.a. “long adjectives”, as a shortcut). See an example in a simplified schema (with intermediate steps omitted):

(11) CS *dobr-ě jeji (žen-ě) > dobr-ěji (žen-ě) ‘[the good (woman/wife)]...’ > Russ. dobr-oj žen-e, Pol. dobr-ej żon-ie, Czech dobr-é žen-ě, etc.

Since this process has affected the entire group uniformly, it must have started no later than the CS period. It led to a clear distinction between the declensional paradigms of adjectives and those of nouns, from which adjectives had previously been only weakly set apart morphologically (originally most forms coincided). Initially, the distinction was associated with (in)definiteness (or rather: (non)specificity), but since the old nominal (“short”) paradigm has been lost in Northern Slavic and Slovene (with lexicalized relics), this opposition as such has survived only in South Slavic (except Slovene).10 Notable exceptions in North Slavic are standard Russian and standard Czech (spisovná čeština), where nominal forms still occur in predicative use. However, this use is largely idiosyncratic, and its maintenance is due only to South Slavic (namely OCS) influence “from above” (as for Russian) or to a deliberate revival of more ancient stages of the language (as for Czech).

10 The opposition still exists in Baltic, with the nominal forms being the default (conversely to the development in Slavic). The rise of the pronominal adjective declension in Baltic can be explained as a parallel, but independent development, based on the same IE premises (Pohl 1980: 77–78). In modern Baltic pronominal adjectives have been undergoing further functional expansion (Holvoet 2012).

3 Grammaticalization of verbal categories

3.1 Voice/valency

Since CS, Slavic languages have never had productive applicative or causative morphology, but analytical causatives have evolved (§ 3.1.3). The reflexive marker, an enclitic in West and South Slavic and agglutinated in East Slavic, has always served as a device in the reduction of arguments or their rearrangement on the syntactic level; it acquired the status of a passive marker and has retained it in modern Slavic languages (except Polish). An alternative device of marking the passive is participles with the n/t-suffix oriented toward the lowest-ranking semantic argument.11 The grammaticalization of passives may thus partially be characterized by the distributional relation of these alternative marking devices in the particular language (§ 3.1.1–2). In general, passives should be understood as grammatical converses of unmarked (‘active’) voice, which means that a full-fledged passive does not change the argument hierarchy of a predicate, but manipulates the coding of its arguments in the (morpho)syntax (Haspelmath 1990: 26–27; Kazenin 2001: 899, 903). The assessment of passives in terms of grammaticalization is a particularly complex task, since passives are a paradigm example of a parasitic category: their rise involves several components each of which usually has its own history as part of separate diachronic changes. Thus, only the convergence of various individual changes on verb phrase and clause level leads to a full-fledged passive (Wiemer 2004, 2011a: 535–537). Consequently, the structure and rise of passives in individual languages have to be looked at from different angles at once: (i) morphological changes in the lexical verb (which are intertwined with, or restrained by, other verbal categories); (ii) auxiliaries; (iii) oblique coding of the highest-ranking (= most agentlike) argument, a.k.a. ‘(oblique) Actor phrase’. In addition, one should account for possible grammatical or semantic restrictions indicative of an expansion after the first steps toward grammaticalization are accomplished. This survey will address only passives with a nominatival subject (triggering number and, in the past, gender agreement in the finite verb; Foregrounding Passives), while passives without such a privileged syntactic argument (Backgrounding Passives, a.k.a. ‘subject impersonals’, e.g., Plungjan [2011: 293–298]) will be left aside (see however § 5.3).

3.1.1 Passives

Slavic Foregrounding Passives (FPs) closely resemble German passives in that the promotion to a subject-NP is strictly ruled by morphological case. Promotion is practically restricted to the most patient-like argument, which is coded in the active with the accusative. I will call this type ‘direct passive’ (§ 3.1.1.1) to distinguish it from another FP which typically occurs with ditransitives whose second object argument (in the active) is marked with dative and can normally be characterized as recipient, beneficiary/maleficiary or goal. This type is known as ‘recipient passive’ (§ 3.1.1.2). It occurs only in analytic formations with participles.

11 Participles with the m-suffix (“present passive”) lost productivity early and, if they still occur, they are either lexicalized (e.g., Pol. wid-o-m-i ‘those who are able to see’) or belong to registers influenced by OCS (particularly in standard Russian). Cf. Wiemer (2014a: 1629–1631), also for a discussion of the role of voice-orientation of originally voice-indifferent deverbal adjectives.


3.1.1.1 ‘Direct passives’

In practically all Slavic languages reflexive-marked (= RM) passives and participial passives (n/t-participle + BE) co-occur, but with uneven distribution over perfective vs. imperfective aspect (on which see § 3.2.2). RM-passives tend to occur with imperfective verbs, while the participial passive is more evenly distributed, but if anything, it shows more propensity to perfective verbs. The details of the diachronic development have remained understudied.12 Practically complementary distribution (imperfective ⊃ RM-passive, perfective ⊃ participial passive) is reached only in standard Russian; it must have stabilized only recently: we can find RM-passives with perfective verbs during the 19th century (see ex. 12), and there have been debates on the status of utterances like (13), in which RM-marked perfective verbs seem to read like passives (e.g., Percov 2003). However, such utterances are either obsolete, or they do not allow an agent-NP (in the instrumental) to be added anyway, and there is an additional nuance of capability or dynamic possibility behind the factive statement.

(12) Rubašk-a tvoj-a takže k tebe prišle-t-sja.
shirt[]-. ... also to 2. send[]-.3-
‘Your shirt will also be sent to you.’ (N. I. Turgenev, letter to S. I. Turgenev, 4/24/1820)

(13) Pis’m-o raspečata-l-o-s’ (*printer-om).
letter[]-. print_out[]--- printer-
(≈ uda-l-o-s’ raspečata-t’)
succeed[]--- print_out[]-
‘The letter has got printed out (*with/by a printer).’ (≈ ‘The letter was successfully printed out.’)

Actually, in this case we are dealing with a factitive equivalent to the so-called modal passive (as in Germ. Dieses Fahrrad läßt sich leicht fahren ‘This bike rides/can be ridden easily’). These constructions may be called ‘RM-marked facilitatives’, following Holvoet, Grzybowska, and Rembiałkowska (2015), who have supplied a fine-grained semantic map on the basis of Baltic and Polish.
In fact, such voice operations are rather widespread, although to a different extent, in Slavic languages. They belong to a larger area in Europe in which facilitative RM-constructions are common and which seems to coincide with the area of anticausative prominence (see § 1).

12 Cf. Wiemer (2004: 279–286) for an outline concerning Northern Slavic. RM-passives have been associated with OCS influence (Borkovskij 1968: 34; Vlasto 1988: 180, among others). The classical work by Havránek ([1928] 1937) contains a wealth of data yet to be considered thoroughly in a stricter theoretical framework.


For modern Russian we may conclude: allomorphic variability has become practically entirely predictable and connected to aspect and has thus been paradigmatically tightened to a maximum. Polish ousted the RM-passive at the turn from the 19th to the 20th century (Bajerowa 2000: 56). Allomorphic variation was tightened as well, but via a complex interaction between an auxiliary contrast and the aspect of the n/t-participles (see below). Regardless of other factors that might strengthen the passive in the grammar of particular Slavic languages, the RM-passive systematically intersects with anticausative readings (in either aspect), while the participial passive often proves indistinguishable from the object-oriented resultative (especially with perfective stems). These ambiguities can usually be resolved only on the basis of the particular context. Let us turn to passive auxiliaries. Their spread is clearly skewed. We find them only in West Slavic, for which different varieties of German supplied the most likely models.13 Only Polish, Kashubian and Sorbian have auxiliaries of direct passives that are comparable to Germ. werden used in the so-called dynamic (or eventive) passive. Such PAT-borrowings can be noted for verbs which initially meant ‘remain’, like Pol. zostać or Kash. ostac (Nomachi 2012b: 122–123). We encounter MAT-borrowed passive auxiliaries in non-standard varieties of Upper and Lower Sorbian (Wiemer and Giger 2005: 101–102; Scholze 2008: 197–200). Extinct Polabian (a.k.a. Slovincian) and Pomeranian supply a probable case of polysemy copying: apart from its functions as a future auxiliary, the initially inchoative BE-verb *bǫdǫ was also used as a present tense passive marker; this passive–future polysemy of the auxiliary is unique in Slavic, and it corresponds to standard German, in which the passive and the future auxiliaries are homophonous (werden); cf. Wiemer and Giger (2005: 83) for references.
This polysemy was noticed also after World War II in Kashubian and in Polish dialects of Eastern Mazuria (Urbańczyk ⁷1984: 52). Thus, varieties of German are likely to have supplied a model, although it needs to be clarified from which varieties and, moreover, which mode of communication prompted polysemy copying of this kind.14 Pol. zostać has developed from a meaning ‘remain’. It is most likely that the process was initiated by a PAT-borrowing in the 15th–16th century from Low German blivan ‘remain > become’, before it eventually established itself as an auxiliary of the passive during the 19th century, most probably under influence of New High German. Part of this process, in particular a trigger for the aspectual shift ‘remain > become’, must be regarded as an extension of an initial copular and existential use (with nouns, adjectives or PPs) into collocations with unnegated n/t-participles (Weiss 1982; Wiemer 2004: 298–304; Wiemer and Hansen 2012: 91–95). A combination of grammatical properties of this auxiliary makes standard Polish unique within Slavic and in larger areal terms: as an auxiliary, perfective zostać is fully integrated into the tense-aspect system (just like its lexical source verb, which has persisted), and it has an imperfective counterpart (zostawać), admittedly rare and used only in non-actual present tense (Górski 2008: 50, 68). In terms of admissible lexical input zostać proves less grammaticalized than Germ. werden: the restriction to narrowly telic verbs and speech act verbs has remained rather strict (Wiemer 2004: 298–304, also Weiss 2009: 132–133). These restrictions virtually do not exist for CUS hodwać (< wordować, MAT-borrowed from Germ. werden). On the other hand, CUS hodwać does not enter into aspectual pairedness and can also be used in impersonal passives (Scholze 2008: 198–199). In comparison to Pol. zostać this shows that criteria for the degree of grammaticalization need not correlate (Wiemer 2017a: 136–138). Finally, obliquely marked Actors in Slavic are of different origins: (i) bare instrumental, (ii) ablative PP (ot/od ‘from’ + GEN), (iii) adessive PP (Russ. u ‘at’ + GEN), (iv) perlative (Pol. przez ‘through’ + ACC). Types (i–ii) are much older, and they still co-occur as oblique Actor markers in Czech and Croatian/Serbian, whereas (iii) has evolved into a genuine oblique Actor marker only in NW-Russian dialects,15 and (iv) became one only in standard Polish. No marking device is specialized for the oblique Actor; each of them is part of a larger network of meanings. In sum: intraparadigmatic variability varies between the languages (it is lowest in standard Polish, CUS and standard Russian, since there is only one possibility), but each marker remains highly polyfunctional.

13 Remarkably, inchoative copula verbs have persisted throughout, but they have not served as sources of passive auxiliaries (Wiemer [1998] on Polish and Russian).

14 Despite the clear areal bias and historical circumstances, which speak in favor of German as a model, the problem remains that in dialectal and spoken German the werden-future plays only a marginal role.

15 Seržant (2012) argues that North Russian dialects conventionalized the adessive agent phrase (u+GEN) directly in the new perfect that arose there with n/t-participles (§ 3.4.2), i.e., not as part of (or via) a passive construction.

3.1.1.2 Recipient passives

Recipient passives are amply attested in West Slavic, but largely absent in the remainder of the Slavic languages. Thus, the vicinity to German does not seem to be accidental in this regard. However, Polish shows a different pattern compared to the other West Slavic languages, and there it is not obvious that German exerted an influence (see below). Unambiguous recipient passives based on a GET-verb are apparently a rare phenomenon, but they are a salient feature of German, where bekommen and kriegen ‘get, receive’ have gone some way toward auxiliation in combination with passive participles. All West Slavic languages except Polish have replicated this model, together with perfective n/t-participles, either as PAT-borrowing (Czech, Slovak, Upper and Lower Sorbian, Kashubian, † Slovincian) or as MAT-borrowing from kriegen (Upper and Lower Sorbian).16 However, PAT-borrowings are likely to have occurred also in those South Slavic varieties that have experienced most intensive contact with German, namely in Slovene and Burgenland Croatian (Nomachi 2012b: 110–111, 119–120). The recipient passive entered and spread in Sorbian, Czech and Slovak probably in different ways, which correlates with their relative age and intensity of contact with varieties of German (Giger 2012). As for Kashubian, Nomachi (2012b: 134) concludes that the recipient-passive construction with GET (dostac) was PAT-borrowed from German “as a ready-made product”, i.e., polysemy copying seems to be the most plausible explanation. For many cases, the complicated relationship between (probably contact-induced) grammaticalization and polysemy copying still needs to be determined. Until recently, recipient passives with auxiliaries other than GET have been neglected, although in all West Slavic languages there is a construction with a HAVE auxiliary, and it is by no means a marginal phenomenon (Bunčić 2015; Wiemer 2017a: 153–157, with further references). See an example (cit. from Sawicki 2011: 71):

(14) Polish
Ząb-k-i mia-ł ogląda-n-e w czerwc-u.
tooth-- have[]-(-3.) look_at[]-- in June-
‘He had his teeth examined in June.’

Similar readings are possible with a cognate construction in Serbian, where its use seems to be subject to more severe lexical restrictions (Nomachi 2012a: 94). In the other West Slavic languages HAVE-based recipient passives are claimed to exist in parallel with GET-based recipient passives. Note that (at least in Polish) the HAVE-based recipient passive allows for imperfective n/t-participles (see last example), in contrast to HAVE-resultatives in which imperfective n/t-participles (if they occur at all) just “copy” the function of their perfective counterpart (Wiemer 2017a: 153–157); see § 3.4.2. HAVE-based recipient passives and HAVE-resultatives have often been confused.

16 E.g., krónć/krynć < krydnuć in CUS (Scholze 2008: 201–202). Cf. Nomachi (2012b) on Kashubian, Giger (2003b, 2004, 2012) on the other West Slavic languages.
The semantic link between HAVE-resultative and HAVE-recipient passive consists of the fact that the subject of the clause does not denote the agent,17 and in either case the HAVE-verb has been auxiliarized to a very low degree. However, if the participle of the HAVE-recipient passive is of imperfective aspect, temporal adverbials can focus on the time of event (see the last example); this construction thus fully equals the range of temporal functions of the active.18 From this comparison we can also infer that the consolidation of recipient passives (against HAVE-resultatives) has been conditioned by the interaction with the aspect opposition (on which see § 3.2), i.e., by categorial restrictions and, thus, paradigmatic tightening. Bunčić (2015: 421–423) discusses some parallels as well as areal and diachronic connections and plausible models of PAT-borrowing of the Polish HAVE-based recipient passive, whose first attestations reach back to the early 15th century (Mendoza 2013).

17 A similar case can be observed with Engl. get (whose “homonymy” is even vaster, as it often includes a causative and active reading as well; compare, e.g., I’ve got the homework done).

18 Otherwise ipfv. n/t-participles and temporal adverbials can highlight habitual events (or activities). This parallels the system of the Polish passive, which is even more differentiated than the active voice, as aspectual distinctions are ruled not only by the aspect of the participle, but also by the choice of the auxiliary (zostać vs. być); cf. Lehmann (1992) and Górski (2008: 48–50, 64–65).

Analytical causatives

Analytical causatives and related constructions have developed in practically all of the Slavic standard languages on the basis of the ditransitive verb ‘give’, but there is a pronounced cline, both in functional range and token-frequency, with a decrease from West to East. First, consider the functional gradient in (15), which includes causative subtypes in the middle and a “post-causative” stage at the end:19

(15) lexical ‘give’ > permissive > reflexive-permissive > modal passive > passive

Russian has stopped short at the reflexive-permissive stage (Podlesskaja ), as has Bulgarian, while in the western part of Slavic we encounter more general causative functions. Czech has run through the entire cline and given up the permissive function (replaced by nechat ‘let, discharge’). Polish is intermediate between the rest of West Slavic and East Slavic in that it amply uses modal passives based on da(wa)ć+INF (both in agreeing and non-agreeing patterns), but has not developed a genuine passive (in the sense defined in § ..). For details cf. von Waldenfels (, a). The degree to which functions on the right part of the cline in (15) have become more prominent seems to correlate with the possibility of introducing oblique Actor-phrases that are canonical for the respective language (see § ...). For instance, in Polish the dative is being replaced by przez+ACC in the modal passive: Dał się przekonać swoim przyjaciołom.. > przez swoich przyjaciół.. lit. ‘He let himself be convinced by his friends’. This testifies to something like analogical levelling in accordance with a more strongly grammaticalized pattern (here: the direct passive). Second, the aforementioned type-based trends are corroborated by token-based frequencies. Thus, a usage-based account shows that, for instance, with Bg. davam+da+Vfin the permissive function predominates (as it does with da(va)t’+INF in Russian), whereas in Czech, Polish and Slovene the modal passive is the most frequent on the text level. Moreover, on a European scale, Slavic languages belong to the greatest “splitters” of subfunctions in the causative domain (while the Germanic ones, especially German and Dutch, are the greatest “lumpers”); cf. Levshina (). German lassen+INF has been considered as the source of the model, in particular with respect to factitive causation (‘X makes / has Y do Z’), but also for the widespread modal passive in West Slavic and Slovene (von Waldenfels a: ).

19 This gradient does not include curative causation (‘have sb do sth’) and direct causation (manipulation, ‘make sb do sth’). Likewise, these are weakly developed at least in the Eastern part of Slavic (Levshina 2015: 509–510). Both functions can be conceived of as subtypes of factitive causatives (von Waldenfels 2012: 18–19; 2015a: 116). It is not clear whether factitive causatives belong to the cline in (15), i.e., what their relation to passive-like functions looked like in diachrony. Conceivable is a split to passive-like, on the one hand, and to factitive, on the other, after the permissive stage has been reached (cf. Lord, Yap, and Iwasaki [2002: 231–232] for a similar assumption for languages in Asia and Africa).

20 The permissive function is skewed for polarity as it occurs predominantly under negation. Presumably, this is an old feature (von Waldenfels 2012: 247; 2015a: 113).

21 Competition with other lexical sources of analytical permissives occurs also in other Slavic languages; compare, first of all, Bg. ostavam ‘leave, let’, dopuskam ‘allow’, Slv. pustiti, Pol. dozwolić ‘allow’, Russ. pozvolit’ ‘allow’ (von Waldenfels 2015a: 114).

22 The non-agreeing pattern has given rise to a lexicalized “formula” (nie) da się / (nie) dało się+INF. If the infinitive is transitive the object shows the normal ACC → GEN switch after negation. The construction is based on the pfv. verb (dać), although it has become indifferent for aspect, since it can even be combined with the future auxiliary (e.g., (nie) będzie się dało ‘It will (won’t) be possible’); cf. von Waldenfels (2012: 171–174; 2015b). Correspondingly, it is restricted to circumstantial (a.k.a. dynamic) (non-)possibility (compare with Pol. potrafić and Cz. (ne)dovést ‘(not) be able to’).

3.2 Aspect

After some background (3.2.1) I will summarize the basics about the most pervasive verbal category, common to all Slavic languages, i.e., stem-derivational aspect (3.2.2), and then give a few comments on presumably nascent stages of progressives (3.2.3).

3.2.1 The past domain

In the past domain, the CS tense-aspect system resembled conservative Romance: there was an aspectual opposition of aorist and imperfect, accompanied by a perfect (see § 3.4.2). The imperfect was an innovation unknown even to closely related IE branches. Its rise can most probably be explained from analogical extension of derivational suffixes to the aorist stem. Its inflectional character is secondary and relies on a distinct set of personal desinences as the result of morphonological alternations at the end of extended stems (Wiemer and Seržant 2017: § 3.2.4–5). This innovation decayed early in the predominant part of Slavic and has persisted only in its most southeastern region, i.e., in Balkan Slavic.


3.2.2 The stem derivational perfective : imperfective opposition

In stark contrast to this, the “proprietary label” of Slavic as a whole, the opposition of perfective (pfv.) and imperfective (ipfv.) aspect, started in CS as well, but it has persisted and is still developing in the modern stages all over Slavic. The reason is that the pfv.: ipfv. opposition interferes with every level of verbal morphosyntax (see below). It is not an inflectional category, but, in its morphological core, built on regular and productive patterns of stem derivation. That is, almost every stem – regardless of whether it is inflected for tense and person-number or occurs in a nonfinite form – is in toto allotted to either the class of pfv. stems or the class of ipfv. stems. These classes, in turn, are constituted by functional oppositions among which we find the core properties of any system of grammatical (a.k.a. viewpoint) aspect – i.e., an opposition between eventualities presented as limited vs. unlimited in time, or an opposition between single and repeated events. These oppositions are consistently expressed by different, though morphologically related verb stems; new stems are derived by an additional suffix or prefix. See examples from modern Polish, for infinitives (16a–b) and inflected forms (17a–b); * marks off reconstructed forms:23

(16) Perfective/imperfective derivation with infinitives
a. simplex imperfective → perfective by prefixation
łowi-ć → z-łowi-ć
catch- → -catch-
patrze-ć → po-patrze-ć
observe- → -observe-
podoba-ć się → s-podoba-ć się
please- → -please-
b. perfective stem by prefixation → secondary imperfective by suffixation
na-mówi-ć → na-mawi-a-ć
-talk_into- → -talk_into--
prze-kona-ć → prze-kon-ywa-ć
-persuade- → -persuade--
s-po-strze-c (< *s-po-streg-ti) → s-po-strzeg-a-ć
--take_notice- → --take_notice--

23 Here only the most productive and salient patterns are illustrated. In some cases, suffixes are not added, but replaced. However, with one exception, replacement relations have become unproductive. The exception is the nasal suffix. For example, -ną- in Polish replaces -a-, but only for semelfactive (pfv.) vis-à-vis multiplicative (ipfv.) verbs; compare mach-a-ć vs. mach-ną-ć ‘wave’, dźg-a-ć vs. dźg-ną-ć ‘prod, stab’, etc. These suffixes are older than the suffixes used in productive additive patterns of prefixation and suffixation.


(17) Perfective/imperfective derivation with finite forms of past and present
a. simplex imperfective → perfective by prefixation
pis-a-ł-a ‘she wrote, was writing’ → na-pis-a-ł-a ‘she wrote (up)’
write---. → -write---.
pisz-ę ‘I write, am writing’ → na-pisz-ę ‘I will write’ (< *(na-)pis-jǫ)
write.-.1 → -write.-.1
b. perfective stem by prefixation → secondary imperfective by suffixation
roz-wiąz-a-l-i → roz-wiąz-ywa-l-i ‘they tied/were tying off’
-bind---. → -bind---.
→ roz-wiąz-uj-ą ‘they tie/are tying off’
-bind--.3

In Slavic verbal morphology, this principle is pervasive. Both prefixation and suffixation can be combined and are able to focus solely on aspectual features without restrictions of tense stems24 or changes related to argument structure or valency (in contrast to practically all other European languages). Most of these prefixes and suffixes are clearly segmentable both from the original stem and desinences marking other categories, despite systematic morphonological alternations between stem and inflectional ending (see [17a] for the present tense stem) or allomorphy of suffixes (see [17b] for past/infinitive vs. present tense stem). This is why we end up with a binary classificatory system in which the morphological relations among the absolute majority of stems remain transparent.25 A large part of stems (if not their majority) is organized in pairs which are distributed over complementary functions so that they can, or must, substitute for one another as a sort of “lexical copies” in clearly defined functions (see below). Even if stems are not organized in such pairs, their membership in either the pfv. or the ipfv. class is determined by restrictions to functions of different sets. That is, either class (pfv. : ipfv.) is characterized by sets of functions which have become, or tend toward becoming, complementary. Moreover, these sets have been increasing, as the opposition between pfv. and ipfv. verbs has been encroaching from actionality (as the core domain of aspect) into other domains such as modality or illocutionary force. In addition, the pfv. : ipfv. opposition started interacting more or less tightly with other grammatical categories, such as tense, the imperative, the passive, or certain types of dependent clauses, sometimes to the extent that no choice is left or ipfv. verbs obligatorily replace pfv. verbs (or vice versa) in clearly defined contexts. For instance,

– in modern standard Russian the reflexive-marked and the participial passive show complementary distribution over ipfv. and pfv. stems, respectively (see § 3.1.1.1);
– in East Slavic and Polish ipfv. verbs must replace pfv. ones in the narrative present tense;
– in practically all Slavic languages the pfv. imperative is replaced by an ipfv. verb if the imperative is negated (and refers to the same single situation), e.g., Russ. Otkroj (pfv.) dver’! ‘Open the door!’ vs. Ne otkryvaj (ipfv.) dver’! ‘Don’t open the door!’. By contrast, preventive speech acts are regularly expressed by negation with pfv. stems (e.g., Russ. Smotri, ne upadi!, Bg. Pazi se da ne padneš! ‘Be careful and don’t fall!’), cf. Wiemer (submitted: § 3) on the prohibitive–preventive split ruled by aspect choice and its varying consistency across Slavic languages;
– in South Slavic, conditional and complement clauses with the ‘irreal’ complementation device da (if relating to single events) show a strong bias toward pfv. stems (for this complementizer see § 4.1.1).

The more reliably the choice of pfv. vs. ipfv. stem marks off contrasting values of stable functional oppositions – not only within actionality, but also beyond it – the more the stem-derivational aspect opposition becomes entrenched in the grammar of Slavic languages. We may regard this increase in entrenchment as indicative of an advanced stage of grammaticalization of the pfv. : ipfv. opposition. It is crucial to understand that productive stem-derivation and the classificatory properties mutually condition the grammatical character of this system.

24 Since CS times, Slavic has been characterized by a consistent opposition between present tense and infinitive (or aorist) stems.

25 Slavic is the best-known system of this type, although it is by no means unique. For a first attempt at a typology of classificatory aspect systems cf. Arkadiev and Shluinsky (2015).
Prefixal perfectivation and consistent imperfectivation by suffixes go tightly hand in hand.26 The diachronic development of this system can hardly be captured in terms of commonly applied criteria of grammaticalization. A main reason is that the morphological inventory involved does not originate in lexical items: suffixes have partly been created from inherited PIE suffixes, and partly new suffixes have come into use after various morphological reanalyses; in turn, the rise of prefixes from lexical items (spatial adverbs, etc.) in principle corresponds to standard examples of grammaticalization, but this rise clearly predates the emergence of aspect. Thus, it can by no means be considered a sufficient condition of grammaticalization; it supplies only one among many premises, and the ultimately lexical origin of prefixes should not be overestimated as a factor in the evolution of the Slavic aspect system. Moreover, the net result of this development – from PIE over CS to modern times –

26 As for the role of prefixal perfectivation in the rise of aspect systems and the central position of Slavic in areal terms within Europe and its eastern periphery cf. the comprehensive investigation by Arkadiev (2014, 2015).


Björn Wiemer

is a decrease, rather than an increase, of morpho(no)logical opacity, which is quite unusual for grammaticalization processes (Wiemer and Seržant 2017). In any case, the diachronic story of Slavic aspect is a very untypical example of grammaticalization. It can better be described by parameters based on context extension and expansion within the lexicon (V. Lehmann 1999: 208). From among the known parameters in Chr. Lehmann (1995), only paradigmatic tightening can really be made a case for, in three respects: (i) the number of suffixes used for the derivation of ipfv. stems (see ex. 16b, 17b) has decreased and (ii) aspect choice has become much more conditioned by other categories (see preceding paragraph). Finally, the latter process has to be assessed in connection with the fact that (iii) the choice between pfv. and ipfv. stem has become obligatory (i.e., the category as such cannot be avoided), and this means that transparadigmatic variability has been reduced to a minimum. The functional expansion of the pfv. : ipfv. opposition into different domains has not come to a halt, and Slavic languages differ among each other as for the consistency and range with which this expansion has established itself. Many of these functional and distributional differences have been highlighted recently, to the extent that a West–East cline has been argued for (cf., first of all, Dickey [2000] and the critical evaluation in Fortuin and Kamphuis [2015]), which should be critically explored in its manifold details. A comprehensive treatment of the first stages of this evolution is provided in Wiemer and Seržant (2017), other details are discussed in Wiemer (2001, 2008a, 2014b, 2015a); see there also for further references.

3.2.3 Nascent progressives?

It is striking that, throughout their diversification, Slavic languages have not shown any considerable tendency toward developing progressives. Only two cases in statu nascendi can be made out, which are located at opposite ends of the historically attested evolution of Slavic languages. The first case consists of rather occasional combinations of the past copula with the present active participle in OCS, e.g., bě uč-ę (= 3rd person past form of ‘be’ + present active participle of ‘teach’), lit. ‘he was (a) teaching (one)’. It is disputable whether such combinations showed any degree of auxiliation, but at least one cannot just dismiss them as PAT-borrowing from Greek (Růžička 1963: 202–221), since they are attested in ancient stages of other Slavic languages (e.g., Old Polish, Old Czech, Čakavian Croatian) as well (for a thorough discussion cf. Večerka [1961: 70–87]). The other, much more recent case is the Polish equivalent of the French être en train de faire-construction: być w trakcie + VerbNOM, e.g.:


(18) Polish (NKJP; 2005, Onet.pl – Rozmowy)
Jest-em w trakci-e pisa-n-ia kolejn-ej powieśc-i.
be[]-1 in course- write[]--. next-.. novel[]-.
‘I am (currently) writing my next novel.’

In principle this construction can be freely formed from any dynamic ipfv. verb, but it has very low token frequency as a VP (i.e., in predicative use). One wonders whether French might be considered as a model of PAT-borrowing for this construction; at least register restrictions (literary speech) speak in favor of such a provenance. On the other hand, this construction fits well into the often claimed proneness of standard Polish for action nouns. No research has been conducted so far, but it is striking that practically no other Slavic standard language has such a progressive construction. Moreover, a similar construction – BE-verb plus action noun with locative v ‘in’ – was identified in (now extinct) Slovincian by Piotrowski (1981: 40–41), who indicated the structural and geographic closeness to a regional German construction (am/beim -en sein ‘be at -ing’).

3.3 Modality

Modality as a conceptual domain in a narrow sense, constrained by the two cognitive primitives NEC(essity) and POSS(ibility) (Van der Auwera and Plungian 1998), is dealt with in § 3.3.1, while another domain commonly associated with a wide notion of modality is treated in § 3.3.2.

3.3.1 Modality in the narrow sense (constrained by necessity vs. possibility)

Modal auxiliaries constitute a young category all over Slavic; the only modal shared by all Slavic languages is the successor of CS *mošti ‘can’ (< ‘be mighty’), which today is also the one most generalized over modal subdomains (dispositional, circumstantial, deontic, epistemic) as a POSS-operator. In the earliest genuinely East Slavic documents (the Novgorodian birch bark letters, 11th–15th centuries) moči is attested only for circumstantial (= dynamic) possibility. OCS mošti occurred only in circumstantial and deontic contexts, but already allowed for inanimate subjects (Hansen 2001: 248–250, 2003: 68, 72–73). The earliest attested Slavic varieties lacked dedicated necessity modals; in Russian they started developing only after moč’ in the 16th century (Hansen 2003: 73–76), and Germ. müssen ‘must’ was MAT-borrowed into West Slavic (Hansen 2000). In contrast to Germanic, Slavic modal auxiliaries do not show any particular morphology, but the category can internally be graded along a cluster of semantic,


syntactic and categorial features (Hansen 2001, 2004). A modal cannot be used alone and combines only with the infinitive or, where the infinitive has been lost (in Balkan Slavic), with da as a generalized irrealis-connective (§ 3.7.1, § 4.1.1) and a restricted set of tense-aspect forms in the verb of the complement (cf. Wiemer 2014b on Mac. mora da ‘must’). Central members of the inventory of modals are used in more than one modal subdomain and they have lost their lexical source meanings (i.e., no layering has been preserved), e.g., Pol. musieć ‘must’, SCr. trebati ‘should’, Mac. mora (da), whereas more peripheral ones are still used in independent lexical meanings, e.g., Russ. dolžen ‘1. owe (money), 2. must’, Pol. mieć ‘1. have, 2. have to, should, 3. ’ (see 5.2), Cz. dovést ‘1. bring (up) to, 2. be (cap)able’.27 Most modals do not have a pfv. counterpart, and they regularly lack certain grammatical categories typical of “normal” verbs, such as the passive or person-number agreement. There is an inner-Slavic cline concerning agreement with the subject (= most agentlike argument): modals which lack nominatival subjects are more prominent in the eastern part of Slavic (e.g., Russ. nado ‘necessary’), while agreeing modals play a more important role in Slavic languages of the western half (e.g., Pol. potrafić ‘be able to’). In this respect, this half “fits” well with adjacent languages in western Europe. In turn, among modals without nominatival subjects we can distinguish those that do not allow for any overt subject (e.g., Pol. można ‘’, trzeba ‘’) and those which do not block oblique subjects (e.g., Russ. možno, nado ‘ditto’); compare Pol. Można (*nam 1.) szybko wypić kawę vs Russ. Možno (nam 1.) bystro vypit‘ kofe ‘We can have a quick coffee’. Finally, another parameter of grammaticalization is the loss of ontological restrictions for the arguments of the modal complex. 
At the most extreme point, even zero-place verbs can be combined with modals (in epistemic use, e.g., Pol. Może padać ‘It can [= it is possible that it will] rain’). For comprehensive treatments cf. Besters-Dilger, Drobnjaković, and Hansen (2009), Hansen (2014), for the evolution of Polish modals cf. Hansen (1999). Various Slavic languages belong to areal clusters reaching beyond Slavic. Prominent cases are NEC-modals like Russ. prixoditsja (Hansen 2001: 388–390) and its somewhat obsolete cognate in Polish (przychodzi), which have retained lexical source meanings related to ‘come’ or ‘befall upon, have to do with’. Their lexical source and the modality subdomain are affine to acquisitive modals, which are prominent in Northern Europe (Van der Auwera, Kehayov, and Vittrant 2009). They are attested in the adjacent Baltic-Finnic region, where they are based either on the same lexical source as Slavic ‘come’ (with or without reflexive marker) or on its lexical converses (e.g., Latv. nākties ‘come’, Lith. tekti ‘be gotten’ vs. gauti ‘re-

27 Slv. lahko ‘1. lightly, easily, 2. POSS’ is a special case, as it is a maximally polyfunctional modal and “does not impose any restrictions on the first argument of the verb”, but it “retains its original lexical meaning” (Roeder and Hansen 2006: 156).


ceive’).28 Russian and Polish are peripheral to this spread zone,29 inasmuch as their ‘come’-based modals are restricted to circumstantial necessity, whereas in Lithuanian they have both  and  circumstantial readings, and in the larger area (over Northern Europe) -readings seem to predominate (occasionally for dispositional meanings as well). Polish (to a lesser extent also Czech) shows striking parallels in crosslinguistically (and in the larger area) less frequent meaning extensions of modals and specific properties of particular modals (Weiss 1987a; 2009: 134–143). Most prominent is the deontic-reportive polyfunctionality of Pol. mieć (and its cognates Cz. mít, Slk. máť), which very much resembles German sollen (Wiemer [2010: 79–85] for the European background); see § 5.2. As for Polish, striking parallels with German extend into modal constructions with być ‘be’ or mieć ‘have’ and deverbal nouns (nomina actionis) governed by the preposition do ‘for’ (analogous to Germ. zu+); these constructions have deontic or circumstantial meanings, and they are likely candidates of PAT-borrowings from German (Weiss 1987b). To a lesser extent this applies to Croatian/Serbian (with the preposition za ‘for’), used with the infinitive in circumstantial or dispositional meaning (Wiemer and Hansen 2012: 129–131).

3.3.2 Modality outside of necessity and possibility

Interesting against an areal background is a construction that bears a certain resemblance to serial verbs. Russ. vzjat’ ‘take’ has started on a road toward auxiliation: a construction of vzjat’ + inflected lexical verb conveys the meaning of unexpectedness, or arbitrariness, of an event30 (dubbed ‘inexpectative vzjat’’ in Weiss [2008b] etc.). The lexical verb must be telic or punctual. TAKE and the lexical verb must share all grammatical features (tense, aspect, mood, grammatical voice, person-number, finite–non-finite forms) and the same subject; TAKE itself occurs in the ipfv. aspect as well (brat’), but only in iterative present tense (even with generic subjects) or for narrative purposes (also with the imperative), not in the past.31 Moreover,

28 Cf. Heine and Kuteva (2005: 206), based on Stolz (1991: 79–81), and Majsak (2005: 212), Usonienė and Jasionytė (2010).
29 Simultaneously, the mentioned auxiliaries are peripheral in terms of the category of modals (see preceding paragraph).
30 These two functions may seem contradictory, at first sight. However, the specific semantics depends largely on the viewpoint: is the event unexpected for the speaker or for the agent (subject referent) or any other participant of the scene set by the lexical verb? This pragmatically dependent variability attests to a rather low degree of grammaticalization.
31 In this respect the aspectual pair vzjat’–brat’ behaves very much like the Polish passive auxiliary zostać–zostawać (§ 3.1.1.1): the ipfv. auxiliary is just a “copy” of the same event used in contexts of iteration and the inactual present. These pairs are, thus, well entrenched in the pfv. : ipfv. opposition (for which see § 3.2.2).


TAKE’s loss of argument structure manifests itself in its ability to combine with intransitive verbs. It can furthermore occur with inanimate subjects; negation never occurs with TAKE and only very rarely with lexical verbs – which implies that the predication is factive – and the two verbs cannot permute their order (*lexical verb + TAKE).32 These properties speak in favor of grammaticalization. The degree of grammaticalization is, in turn, lowered by lexical restrictions (see above) and by variation in the internal make-up of the construction: TAKE and the lexical verb can be simply juxtaposed (see ex. 19) or be divided by a coordinative conjunction (i, da, da i); see (20). Moreover, the construction is still felt to be “expressive”, i.e., its rhetorical potential has not bleached out (see also fn. 30):33

(19) Da, Sanja, ja ženščina kapriznaja i opasnaja,
vot voz’m-u zakolduj-u tebja, prevrašč-u v čern-ogo lebedj-a…
take[]-.1 bewitch[]-.1 2. turn[]-.1 into (black swan)-
‘Yes, Sanja, I’m a capricious and dangerous woman, I just will take and bewitch you, I will turn you into a black swan.’ (P. Proskurin, Černye pticy)

(20) Galkin, v svoju očered’, vzja-l i zagovori-l na kabardinskom jazyke!
. in_turn take[]--(.) and .speak[]--(-)
‘Galkin, in turn, all of a sudden started speaking Kabardinian!’ (Komsomol’skaja pravda, 06/08/2008)

On the one hand, the inexpectative TAKE-construction shows structural and semantic affinity to pseudo-coordination for implicative verbs (Boguslavskij 1988), which is very marginal even in Russian, and to the pseudo-imperative in the protasis of counterfactual conditionals (§ 4.3). On the other hand, to some extent it resembles so-called ‘double verbs’, which are quite productive in colloquial Russian (much less in other Slavic languages).
Double verbs consist of two (sometimes three, rarely four) juxtaposed lexical verbs none of which is dependent on the other, and they denote one and the same event, activity or state. They share the same grammatical features (as do the verbs in the -construction), but they must also share the

32 Cf. Kuznetsova (2006), Kor Chahine (2007), Weiss (2007; 2008b), Leonova (2011). Kor Chahine discusses occasional examples in which the lexical verb is negated, and one example with inversion of TAKE+lexical verb, for which however discourse deictic vot as a conjunctive element is required (cf. also Weiss 2007: 428; 2008b: 478).
33 Kor Chahine (2007) proposes to explain this synchronic variability of the conjunctive element (and its absence) as reflexes of a diachronic chain, however without analysing data from language history. Cf. also Weiss (2008b: 501–502) on possible diachronic scenarios.


same valency properties, can be permuted, but normally do not allow non-affixal grammatical morphemes to be repeated (Weiss 1993, 2000, 2008a, 2008b; Leonova 2011). Most probably their productivity has been enhanced by Finnic substrata in Russia (Weiss 2003, 2012). It is however debatable whether this is a case of grammaticalization. Rather, we are dealing with the formation of lexical classes, comparable to the productive formation of binominals (dvandva co-compounds), whose productivity increases from Russia into continental Asia (cf. Wälchli [2005: ch. 5] for an elaborate discussion of co-compounds). Semi-auxiliarized TAKE is attested also in other Slavic languages, certainly to different degrees. Actually, such constructions are known for practically all language groups over Europe (as for Germanic compare Kempf and Nübling, this volume) as well as in Turkish and Farsi (Coseriu 1966; Ekberg 1993), but, in contrast to Russian (beside Finnish and Finnic minority languages in Russia), they seem to be more specifically associated with ingressive or instantaneous events (Weiss 2008b: 502). Moreover, they differ in details of their distribution over grammatical contexts and in the form of the lexical verb (for a recent areal comparison cf. Nau et al. [2019]). All in all, the areal distribution of TAKE+lexical verb differs strikingly from the spread of double verbs: the latter are hardly attested west and south of Russian (Weiss 2007, 2008a: 162).

3.4 Tense

Common Slavic did not have any dedicated future tense. It is thus not surprising that future tenses in modern Slavic languages vary considerably (§ 3.4.1); for a survey cf. Wiemer and Hansen (2012: 104–112). Apart from future grams, we observe many innovations in the resultative-perfect domain, multi-layered both in terms of their age and their functional differentiation and scattered over different (often unconnected) “edges” of the Slavic continuum (§ 3.4.2). After these two gram families I will add a remark on the absentive (§ 3.4.3).

3.4.1 Future

Slavic future grams are of very different shape and origin, and they belong to those features which serve as the most basic grammatical isoglosses within the Slavic landscape, as they divide the whole territory into a northern part with a future marker of inchoative origin (*bǫd-) and a southern part in which the future gram has developed from the verb CS xotěti ‘want, wish’. This split almost entirely coincides with the restrictions of the future marker as regards aspect: in the south the devolitional gram can be combined with both ipfv. and pfv. verbs, while the future gram in the north cannot be combined with pfv. verbs (with the exception of CUS;


[Scholze 2008: 192–193]). Instead, here the pfv. future is marked with pfv. present tense stems, whose grammatical default function has shifted into the future domain (and can be used in the inactual and narrative present to various degrees).34 The exception to this coincidence of form and distribution is Slovene (and Kajkavian dialects in adjacent Croatia; [cf. Lončarić 1996: 109]), which has the future gram of inchoative origin, but shows the same lack of restrictions regarding aspect as do the other South Slavic languages. Furthermore, both future grams inscribe into larger areal biases with a range beyond Slavic: the ‘devolitional’ gram belongs among well-known core features of the so-called Balkan League, whereas the ‘inchoative’ gram belongs to a large area in northern Europe basically stretching from the Alpes into Scandinavia and Finnic. In this area -verbs have been showing a remarkable propensity toward future usage (Dahl 2000b), which however only in North Slavic has become conventionalized. The same region is known for its lacking or only weakly developed futures (Dahl 2000a: 325–326), but neither in Germanic nor in Finnic has a stem-derivational aspect system evolved. So, it seems rather obvious that the North Slavic conventionalization of *bǫd- as future marker with ipfv. verbs is connected to the interaction with a strengthening opposition of pfv. vs. ipfv. verb stems (see § 3.2.2). The marker *bǫd- is remotely cognate to the existential-copular verb byti ‘be’ (< PIE *bhū-), we can thus say that in modern Slavic languages it serves as the latter’s semi-suppletive future. Its inchoative flavor must have been lost already by the first written attestations (Krämer 2005: 75–81). The infinitive was lost and only present tense forms remained. Probably *bǫd- was first used as a futurum exactum (= perfect in the future) with the l-participle marking anteriority. 
While still the only one in Slovene, in Polish it is optional beside the predominant use with the ipfv. infinitive, which is the only choice in the remainder of North Slavic. Beside the restriction to ipfv. aspect *bǫd- has not undergone any other changes in terms of Lehmann’s parameters. It has not even decategorialized, but strengthened its position as (analytical) part of the tense-aspect paradigm in North Slavic. The devolitional future gram, by contrast, is a paradigm example of grammaticalization according to mainstream criteria. Its development has been detailed in countless descriptions (cf. Heine and Kuteva [2005: 188–192] for a summary). Here I indicate only some facts relevant for determining the degree of grammaticalization. In general, the development of CS xoštu, xošte(t) ‘I want, s/he wants’ toward a phonologically reduced future marker runs along a cline from free to bound forms. But

34 In contrast to Germanic, this shift is strict inasmuch as (with the exception of CUS) no Slavic pfv. verb can be used, for instance, in actual present function (compare, e.g., German: Schau, er zerreißt (gerade) das Buch ‘Look, he is tearing apart the book’, where the telic verb describes an ongoing process), and perfectivity is not restricted by telicity. Moreover, the fact that in South Slavic the future gram can be freely combined with pfv. stems means, conversely, that there is also a regular pfv. present. This, in turn, is largely restricted to dependent clauses and independent clauses with a suspension of reality (factuality); see § 4.1.1 and Wiemer (2014b, 2017b).


while in Croatian/Serbian this form has ended up in a Wackernagel enclitic which itself carries agreement marking (and combines with the infinitive), see ex. (21), in Bulgarian and Macedonian this marker has become an uninflected proclitic (Bg. šte / Mac. ќe < 3SG xošte ‘wants’) which attaches to the present tense form of the lexical verb in the appropriate person-number forms (ex. 22). Therefore, Croatian/Serbian, on the one hand, and Bulgarian and Macedonian, on the other, demonstrate a largely inverse distribution of grammatical and lexical information, and of a part of the grammatical information (tense vs. agreement), over the predicate complex:

(21) Croatian/Serbian
a. U subot-u će sti-ći Ivo.
in Saturday- .3 arrive[]- Ivo.
‘Ivo will arrive on Saturday.’
b. U subot-u ću sti-ći.
in Saturday- .1 arrive[]-
‘I will arrive on Saturday.’

(22) Bulgarian
a. Skoro šte dojd-e poštenskij razdavač.
soon FUT come[]-.3 postman
‘The postman will come soon.’
b. Skoro šte dojd-a.
soon FUT come[]-.1
‘I will come soon.’

Which kind of marking attests to a higher degree of grammaticalization? Autonomy is lost in either case inasmuch as the phonologically eroded, semantically bleached, paradigmatically reduced and syntagmatically highly constrained “trunc” of the verb (xotěti ‘want’ > future marker) needs a lexical host to combine into a complex predicate. However, in Croatian/Serbian the form has retained signs of the former morphological categories (person, number); in this respect it can be regarded as less grammaticalized than its Bulgarian/Macedonian cognates. Moreover, the source expression (xotěti > hteti, 1SG hoću ‘I want’) continues to be used as lexical verb in Croatian/Serbian, while in Balkan Slavic WANT was replaced by other verbs generally used in the meaning of ‘want, wish’ (Bg. iskam, Mac. sakam).
However, in Bulgarian some inflected relics of WANT are left (as in Pravi kakvoto šteš ‘Do as you like’), whereas Macedonian does not show any remnants at all. Moreover, both Bulgarian and Macedonian use a suppletive uninflected form in negation (Bg. njama da, Mac. nema da lit. ‘doesn’t have that’); cf. Nitsolova (2014: 35), but in Macedonian nonsuppletive ne ќe proves to be less restricted over grammatical contexts (Kramer


1986: 96–97), whereas in unnegated contexts the 3SG-form of HAVE can be used instead of ќe as future marker (e.g., Mac. Utre ima da učam ‘Tomorrow I will learn’). These few details (there are more) already show that within South Slavic (even when it comes to comparing only Bulgarian and Macedonian) the integration of devolitional future markers into the paradigmatic structure of analytic verbal morphology is highly differentiated and does not lend itself to trivial answers about degrees of grammaticalization. The only HAVE-based future in Slavic exists in Molisean Slavic, spoken in Southern Italy and for 500 years isolated from its ancestor (probably from the Dalmatian outbacks). HAVE-based complex predicates (e.g., Mam.1. po:. dom. ‘I have go home’) are only half-way toward future, since they still convey a flavour of root (i.e., non-epistemic) modality. In this respect, they do “sharework” with the general South Slavic devolitional future (Ču.1. po:. dom. ‘I will (probably) to go home’), which is epistemically motivated (Breu 1998: 352, 2017: 50). Clearly, the future-like use of the HAVE-auxiliary is reminiscent of early stages of the Romance future, in particular of surrounding Italian dialects. However, the onset of a shift (obligation > deontically marked future) has independently also been noticed for Macedonian ima da ‘has to; is necessary’ (see above and Kramer 1986: 96). Thus, in the case of Molisean Slavic, Italian dialects have probably just strengthened some more general tendency already inherent to the Slavic contact idiom. Moreover, we find the onset of future periphrases based on ingressive verbs since OCS (Večerka 1993: 181–182), and they are well attested for the middle East Slavic period (Vlasto 1988: 163). All these ingressive verbs were derived from the root *-čę-ti ‘begin’, prefixed by po-, za-, u- (compare Russ. načat’, Pol. zacząć ‘begin’).
These periphrases remained nascent and were ultimately ousted by one of the other grams discussed above (depending on the territory). We can only find an established “ingressive” future in West Ukrainian (Ruthenian) dialects from the Carpathian region, but this one is based on the CS verb *jęti ‘take’ (present tense jm-u.1, jm-eš.2, etc.). After this verb underwent a semantic shift ‘take > begin’ (probably triggered by the semantic opposition to static im-a-ti or im-ě-ti ‘have’, which are suffixed present stem extensions of ję-ti)35 its present tense forms became encliticized and nearly agglutinated to the infinitive of another verb; e.g., čytatymu, čytaty-meš, čytaty-me(t‘) ‘I, you(), s/he will read’. As with bud- (< *bǫd-), the infinitive can be only of ipfv. aspect; this ingressive gram has, thus, been entrenched into categorial restrictions typical for the aspect system in North Slavic. Both budu and {mu} have been existing in parallel with much regional differentiation (obviously depending on influence from Russian, which only knows bud-). By

35 These suffixed extensions have often erroneously been considered the basis for an alleged HAVE-based future in Ukrainian. Cf. Danylenko (2010, 2011, 2012: 16–21) for the dialectal and diachronic facts and a convincing argument in favour of jęti and its cliticization as a future marker via ingressive meaning.


now, {mu} seems to have been almost ousted even in Carpathian dialects. Its quite restricted regional spread in this mountain area might have been supported by a parallel emergent future marker in East Hungarian dialects based on the preverb fog ‘take’. This issue has remained entirely uninvestigated.

3.4.2 Resultatives and perfects

The following rough outline is based on Wiemer and Giger (2005), Wiemer (2017a) and Arkadiev and Wiemer (forthcoming) and takes its starting point from Nedjalkov ([1983] 1988). All Slavic resultatives and perfects are based on participles, and perfects evolve from resultatives. That is, the input of resultatives is restricted to verbs denoting a perceivable change of state (i.e., they are telic in the narrow sense), while with perfects no such restrictions apply; instead, perfects allow for participles of verbs that are not telic (e.g., see, run, taste, laugh, be) as well as of intransitive verbs. This expansion of the lexical input conditions the transition from resultative to perfect, which is characterized by a shift from a focus on the resultant state (presupposing a change of situation) to a focus on the event (change of situation) itself (Nedjalkov and Jaxontov 1983; Breu 1988). In short, the expansion of lexical input goes hand in hand with the appearance of current relevance (or an experiential function) as the central function of the construction; this expansion should thus be treated as a main indicator of the degree of grammaticalization of erstwhile resultatives. The semantic and distributional properties of perfects are the most complex ones, probably together with passives with which they partially interfere (§ 3.1.1). New resultatives have “cropped up” repeatedly, particularly in non-standard varieties on the Northwestern, Western and Southern peripheries of the Slavic-speaking area. The old CS perfect was based on an active anteriority participle with an l-suffix inflected for gender and number, and used only in predicative function, plus a BE-auxiliary (CS *byti). In the majority of Slavic languages, it has turned into a general past tense, a process which inscribes into a general cline in the middle of the European continent (Breu 1994: 56–58; Thieroff 2000: 282–285; Drinka 2017).
Hardly any parameters regarded as indicators of grammaticalization have accompanied this process, except the fact that the auxiliary was lost in East Slavic, while in South and West Slavic it has been cliticized or shows signs of agglutination (Polish); see § 3.6.36 Newer resultatives have spread on the basis of anteriority participles suffixed with {n/t}, {(v)ši} or {l}. The first (n/t-participles) can be found all over Slavic and predominate in resultatives and perfects, whereas {(v)ši}- and case-inflected {l}-participles early became extinct in South Slavic. Resultatives with {(v)ši}-participles are prominently used in East Slavic regions that are close to (or overlap with) Baltic, while case-inflected {l}-participles have survived only in West Slavic resultatives. Object-oriented resultatives that have not developed into perfects are systematically ambiguous with static passives. However, all participle types underwent changes in their voice-orientation (probably several times; [Wiemer 2014a: 1629–1631 et passim]). Only few of the more recent resultatives have evolved into perfects; in particular, HAVE-based constructions, as a rule, seem to have resisted a switch to perfect since the time of their earliest appearance (see below).37 Usually, newer resultatives are restricted to participles of pfv. verbs; exceptions are found in some Macedonian, Kashubian and NW-Russian dialects. Ipfv. n/t-participles which have been attested in West Slavic HAVE-based resultatives are often lexicalized (probably archaic) instances and they usually do not differ in their temporal interpretation from constructions with pfv. n/t-participles (e.g., Cz. Máme vařeno = uvařeno / placeno = zaplaceno ‘We have cooked / paid’).38 Among the exceptions we find southern dialects of Macedonian (spoken in Greece and Albania), in which ipfv. n/t-participles in the HAVE-construction can be exploited to mark habitual or even durative actions in the past (Topolińska 1995: 209–210). On NW-Russian dialects cf. Wiemer and Giger (2005: 33–38). Otherwise, wherever n/t-participles of ipfv. verbs are part of productively used constructions, these constructions are to be characterized as (foregrounding) passives; e.g., in standard Polish (§ 3.1.1). New resultatives divide into BE-based and HAVE-based constructions, and this split corresponds to predominant patterns in predicative possession (see § 1).

36 Tommola (2000: 443–445) provides a brief areal overview. Cf. also Meermann and Sonnenhauser (2016) with a focus on South Slavic and Drinka (2017: ch. 13) for the discussion of some debatable issues.
The western part of Slavic with HAVE-based resultatives continues the vast area of HAVE-based (“possessive”) perfects in Germanic and Romance. Resultatives with a HAVE auxiliary are typical for all West Slavic languages and the adjacent part of Belarusian and Ukrainian (East Slavic); they are also attested almost all over South Slavic, in particular as a southwestern Macedonian innovation first attested in the 17th century and spreading from there eastwards along the border with Greece into certain varieties of Bulgarian39 (Friedman 1976). This innovation is the third of three perfect types for which Macedonian has been mentioned (Graves 2000); see Table 1. These three types are distributed unevenly in diatopic and diastratic terms, and the differences correspond to their age and the center of irradiation of the younger types (see Table 1); types B and C probably developed hand in hand in contact with Arumanian (see below). In those dialects which have all three types, functional differentiation has occurred, and the oldest type has been entirely reinterpreted as a marker of indirect evidentiality (‘nonconfirmative’ in Friedman’s terminology; § 5.1). Remarkably, despite being restricted to resultant states, both type B and C allow for both transitive and intransitive verbs as lexical input, and they can occur without objects (e.g., B-type Jas sum jadena ‘I have eaten [⊃ I am not hungry now]’); cf. Friedman (1976), Drinka (2012: 537–541).

Tab. 1: Macedonian perfects.

A | Sum došo-l / doš-l-a. / Sme doš-l-i. ‘I / We have come ( /  / ).’ | BE + l-participle (inherited from CS) | nonconfirmative (indirect evidential)
B | Sum dojden(a). / Sme dojden-i. ‘I / We have come ( /  / ).’ | BE + n/t-participle (inflected for gender and number) | only resultative
C | Ima-m dojdeno. ‘I have come.’ | HAVE + n/t-participle (indeclinable) | only resultative

HAVE-based resultatives are attested in Slovene, Croatian/Serbian and Bulgarian as well, but, apart from dialects close to Macedonian, they have basically remained resultative (Vasilev 1968; Breu 1994: 54–55). It should be emphasized that absence vs. presence of object (or subject) agreement is not a reliable indicator of a possible shift toward perfects of Slavic resultatives40 (Giger 2003a: 283–290, 404–407; 2016), nor is objectless use conclusive,41 as long as the construction does not show other signs of grammaticalization (first of all, expansion of lexical input). In turn, proclitic negation attaches to the HAVE-verb in many cases, although semantically it refers to the participle. Compare a Serbian example (cited from Nomachi 2012a: 93):

(23) Vi ovde ne-ma-te urađe-n-o sve ono što dogovore-n-o.
     you[] here -have-.2 do[]-- all that. what agree[]--
     ‘You haven’t done all that was [lit. is] promised.’

The behaviour of clitics cannot be considered a really reliable indicator of grammaticalization, either, since clitic rules tend to be very language-specific. They may be

37 Cf. Giger (2003a: 355–452) on Czech, Bunčić (2015: 416), following Mendoza (2013), on Polish, Mirčev ([1973] 1976) on Bulgarian.
38 Wiemer and Giger (2005: 87) with further references and Nomachi (2012a: 91–92, 96) on Macedonian following Topolińska (1995). For a systematic investigation cf. Wiemer (2017a).
39 Xaralampiev (2001: 144) claims that in these varieties (the south-western and Thracian dialects) the HAVE-construction is used practically synonymously with the old perfect (BE + l-form).

40 NW-Russian dialects do not show agreement at all, but this was plausibly caused by the lack of number distinctions, and of gender altogether, in Finnic substrates.
41 For instance, one can come across occasional examples of the Serbian HAVE-based resultative without an object-NP for which, however, “the omission is not necessarily straightforward” (Nomachi 2012a: 93), among other things because it can be considered as semantically incorporated (e.g., Već imam položeno za vožnju ‘I have already passed the driving test’, lit. for driving).


promoters of grammaticalization (if they lead to coalescence with lexical stems of the relevant syntactic class), but one must check carefully whether their behaviour converges with other symptoms of grammaticalization merely by accident, and they may even run counter to other parameters (see § 2.1). Another striking thing is that, despite their age, HAVE-based resultatives with n/t-participles in most extant West Slavic varieties do not seem to have changed for at least six centuries: both in Old Czech and Old Polish we find the same range of lexical input and the same lack of systematic shift toward an eventive reading (= perfect) as today.42 German influence seems to have had an effect only in those (mostly meanwhile extinct) varieties in which *bǫd- apparently was slavishly PAT-copied as a passive auxiliary from Germ. werden based on the latter’s homophony with the future marker (§ 3.1.1.1). As a rule, the conservative behaviour of resultatives correlates with lack of reanalysis and of changes on accepted parameters of grammaticalization. By the same token, auxiliation can be observed only to a very limited degree (cf. Arkadiev and Wiemer, forthcoming: § 3.5.2). Perfects based on a BE-verb (incl. its “zero realization”) and the n/t-participle are typically encountered in peripheral dialect regions of East Slavic, close to Baltic and Finnic, or, again, in the extreme south, namely Macedonian. This areal distribution makes contact-induced grammaticalization likely: on the western “edge” of Slavic, replication of models from German and/or from Italian (as for Slovene) is very likely, while in Macedonian we seem to be dealing with mutual copying with Greek, Albanian and/or Arumanian (cf. Makarova 2016: 231–232, following Gołąb 1984). On the northern periphery (i.e., on the western edge of the northern half of East Slavic) the rise of BE-based resultatives with n/t-participles was most likely supported by contact with Finnic.
In the southern part of this area and, in particular, in the tiny Slavic-Baltic border region southwest of the East Slavic-Finnic contact area we find BE-based resultatives with participles of the {(v)ši}-type, e.g., Belarusian near Braslav: fs’a ulica byla zγare-ṷšy ‘The whole street had burned up’, lit. was burned up (cit. from Erker 2015: 94). The spread and persistence of this resultative type can be explained with “conserving support” from Baltic (Wiemer and Giger 2005: 40–41; Arkadiev and Wiemer, forthcoming). Finally, a marginal type of resultative has (very occasionally) been attested in Belarusian and Polish varieties in a small zone of overlap (or border) between East Slavic and Lithuanian. It combines the active anteriority {(v)ši}-participle with a HAVE-auxiliary. In light of the fact that this construction is crosslinguistically extremely rare, but occurs in colloquial Lithuanian (Wiemer 2012), it is plausible to assume that this type of resultative was PAT-borrowed from local Lithuanian into contiguous Slavic nonstandard varieties (Erker 2015: 96).

42 On the lack of a resultative > perfect shift in Polish cf. Łaziński (2001) and Bunčić (2015), in Czech and Slovak cf. Giger (2003a: 355–452), in general Wiemer and Giger (2005), Wiemer (2017a).


3.4.3 Absentive

Czech belongs to a spread zone in which an absentive gram has become more salient (cf. De Groot 2000). The absentive conveys that the referent of the subject NP went away to do something and (if used in the past tense) came back after an expected time span. It is expressed by a BE-copula (Cz. být) and an infinitive. In Czech the copula is predominantly in the past tense, which can be considered indicative of its being a less grammaticalized PAT-borrowing from German (Berger 2008, 2009: 24).

3.5 Mood

In Croatian/Serbian the negated imperative form of moći ‘can’ has evolved into a prohibitive marker (with , e.g., nemoj (2), nemojmo (1), nemojte pevati! (2) ‘Don’t sing!’). Today this form competes with the regular negated imperative; its onset is attested already in OCS (Hansen 2004: 258). In different Slavic languages we observe the emergence of analytic hortative markers, whose history is closely related to the rise of analytical causatives (§ 3.1.2). In East Slavic (in particular, Russian) the / imperative of ipfv. davat’ (davaj!., davajte!.) is used as a hortative marker with either the (inclusive) 1.. or the . of the lexical verb (e.g., Davaj poobedaem!, Davaj obedat’! ‘Let us have lunch (together)!’); cf. Xrakovskij and Volodin (1986). In South Slavic the pfv. equivalent (daj!, dajte!) is used; cf. von Waldenfels (2015a: 122) for Bulgarian and Slovene. However, it may be argued that these units are rather instances of discourse particles with a low (or no) degree of morphosyntactic coalescence with the lexical verb.43 At least as for Russ. davaj(te), its status as a grammatical marker can be justified because it does not only reinforce the illocutionary force of the utterance, but, in fact, without davaj(te) the directive illocution would be lost (see von Waldenfels [2015a: 121–122] for discussion). Apart from ()’ ‘give’, other verbs have served as sources of hortative or permissive ‘particles’ partaking in respective analytical moods: Russ. puskat’. / pustit’. ‘let, release’ > puskaj, pust’ (= .), e.g.:

(24) Russian
     Pust’ pere-moj-ut vsj-u posud-u!
      (-wash)[].-3 all-.. dishes[]-.
     ‘May they do all the dishes!’, or ‘Let them do all the dishes!’

43 The same would apply to the Turkish loan ajde, hajde used throughout South Slavic except Slovene.


Hortative and permissive meanings may turn out to be difficult to distinguish. Apart from that, the originally singular form of the imperative is used regardless of the number of addressees (contrary to ()); see ex. (24). An exact West Slavic equivalent as a hortative-permissive particle derived from a -verb is Pol. niech(aj) (< † niechać). This petrified (and truncated) form continues the earlier singular imperative of the pfv. verb. Slv. naj (< nehaj = . of nehati ‘let’) is another case in point. It has expanded further from directive use via interpretive deontics into hearsay (Holvoet and Konickaja 2011: 11–13). Its functional development is comparable to the extension of the Polish deontic auxiliary mieć ‘have to, should’ into reportive evidentiality (see § 5.2).

3.6 Agreement (subject/object agreement)

Agreement within NPs and between nominatival subject and finite verb belongs to the most persistent features of Slavic in general, and this system has been renewed by redistribution of available morphemes (see § 1). There is thus not much to tell about agreement, except for three issues. The first of them is the rise of a new form of adjectival inflection (NP-internal; see § 2.3). The second is the (incomplete) grammaticalization of subject agreement on past tense verb forms in Polish. During the last few centuries, Polish has been showing a tendency toward the agglutination of enclitic person-number markers to the gender-number inflected l-participles = past tense markers (-m.1, -ś.2, -śmy.1, -ście.2, -Ø.3/, e.g., widzia-ł-a-m see[]---1 ‘I saw [said by a woman]’, etc., widzie-l-i-śmy see[]-..-1 ‘we saw’, etc.). These person-number markers, in turn, derive from clitic present-tense forms of być ‘be’ used as an auxiliary or copula (Decaux 1955; Rittel 1975; Andersen 1987). The enclitic character of these to-be-desinences is still visible in the other West Slavic and in the South Slavic languages. The third case has to do with agreement on clause level. Balkan Slavic (like other Balkan languages), on the basis of clitic pronouns (still marked for  and ), has developed rather strict rules of clitic doubling, although the specific rules differ from variety to variety (even within the same language). Macedonian is claimed to be most grammaticalized in this respect inasmuch as on all implicational hierarchies established for clitic doubling in that region it proves most advanced and consistent. This carries over from specific into generic reference, also in colloquial Bulgarian. Cf. Friedman (2008), in particular for the areal complexity and intricacy of rules, and Wiemer and Hansen (2012: 122–128) for a survey and the general diachronic background of South Slavic.
It should be emphasized that in most cases we are still dealing with clitics, i.e., the marking of specific object-NPs by non-affixal morphemes. Again, Macedonian seems to be most advanced in terms of affix-like behaviour (cf. also Cyxun 1968). Clitic doubling again becomes more frequent in the extreme northwest of South Slavic, for instance, in Istrian Čakavian (Kalsbeek 1998, 2011: 143–144) or Resian44 (Benacchio 2009: 186–187). Here, obviously, Italian dialects have been supporting this tendency.

3.7 Other

To complete the picture with categories relevant for the VP or the clause, some brief remarks on a particular type of complex predicate formation (§ 3.7.1) and on an existential construction (§ 3.7.2) follow.

3.7.1 Complex predicates with da (South Slavic)

The clause-initial particle da has undergone one of the greatest “careers” in Slavic. It developed in CS and already in OCS was used to mark suspension of assertiveness (which has been associated with irrealis marking in the typological literature) on various levels of syntactic structure. However, as a clause-linking device it has been expanding only in South Slavic and is part and parcel of constructions that replace infinitival complements (§ 4.1.1). In short, da has undergone all stages of syntactic tightening, from juxtaposition via use as complementizer and conjunction to a morpheme “knitting together” an auxiliary or phasal verb with a lexical verb (Wiemer 2014b: 154–162; 2017b: 300–303, 329–330; 2018: 295–303);45 e.g., Bg. moga..1 da piša..1 ‘I can write’, počvam..1 da piša..1 ‘I begin to write’ (lit. ‘I can / begin that I write’). Da has thus become part of the wider verbal paradigm. Concomitantly, it has been losing its restriction to suspended assertiveness. This becomes clear from the combination with phasal verbs (see last example), but also from the western part of South Slavic where this sort of semantic bleaching has affected da as a complementizer as well (§ 4.1.1). In Macedonian and Bulgarian, in turn, da has turned into a strict proclitic to the lexical verb and, thus, lost promiscuous attachment as a typical clitic property (cf. Spencer and Luís 2012), similarly to the person-number to-be-desinences in Polish (see § 3.6).

3.7.2 Existential constructions and predicative possession

Apart from the general split into BE- and HAVE-based patterns (see § 1), we find an isolated case in close contact with German. Various Sorbian varieties can express existential clauses on the basis of ‘give’ (compare Germ. es gibt + ACC) or of ‘set, make sit (down)’ (e.g., standard Upper Sorbian dawać and sydać, respectively). Existential constructions with these source expressions are an exotic type of grammaticalization, and they are lacking in other (West) Slavic languages. For these reasons, German can be safely assumed as having supplied the model, at least as for ‘give’. The PAT-borrowing differs somewhat more from the German model in CUS where an agreeing nominatival subject NP is more usual than the accusative (Scholze 2008: 320–321; Vykypěl and Rabus 2011: 184–185; von Waldenfels 2015a: 123–124). The morphosyntactic properties of the German model seem to be more strictly followed even in a Kashubian equivalent of this construction (to dô, lit. ‘that has’), in which, however, only the pfv. verb is used despite the static semantics (M. Nomachi, p.c.).

44 The migration history of this Slovene dialect markedly differs from all other Slovene dialects. For almost a millennium, Resian has been isolated from the rest of Slovene and it has been entirely surrounded by Italian (Friulian) dialects.
45 This is tantamount to saying that da has “worked through” Semantic Integration Hierarchies assumed for complementation patterns since Givón’s (1980) proposal of a binding hierarchy, which included a claim about an iconic relation between semantic and morphosyntactic tightness (one or two events?) in complementation.

4 Grammaticalization of complex constructions

4.1 Complement clauses

The following two subsections again demonstrate a clearly uneven inner-Slavic distribution: da-complements are a ubiquitous property of South Slavic (§ 4.1.1), whereas ‘as if’-complements occur only on the Northeastern periphery of Slavic and are highly specialized (§ 4.1.2). For details cf. Wiemer (2018).

4.1.1 Da-complements (South Slavic)

The evolution of South Slavic da into a connective element of complex predicates (§ 3.7.1) is only the extreme end on a cline of syntactic tightening. For a reconstruction of the steps the historical varieties of the štokavian (Croatian/Serbian) dialect continuum (from the 12th century) appear most suitable (Grickat 1975; Grković-Major 2004). Probably, da started developing into a complementizer as a clause-initial particle of directive utterances that were posed right after speech-act verbs (13th century). During the 15th century, such da-clauses started occurring after verbs that allowed for a hearsay interpretation (‘X heard da=that P’). This process, however, might have lasted for centuries already (see Wiemer [2017b: 327–329] for a summary). In hindsight, we can say that reanalysis prepared the ground for grammaticalization, which can however be “measured” only in terms of syntactic tightening and the increase of categorial restrictions on the verb in the complement.46 Already during the 14th century clauses with initial da started encroaching into same-subject constructions (J. Grković-Major, p.c.). It is visible in the contemporary stage of Croatian/Serbian where the complement of ‘want’ requires a verb in the present tense and only a same-subject interpretation is possible. Compare (25) for older Serbian: the translation on the left of the arrow corresponds to older Serbian and the translation on the right to the modern stage:

(25) xote-hu_i … da skaž-u_i/k(?)
     want-.3  say[]-.3
     ‘They_i wanted that they_i∨k say’ > ‘they_i wanted that they_i/*k say’ (= ‘They wanted to say’).

It was, in turn, the same-subject interpretation which paved the way for modal and phasal verbs to appear (and, thus, for da to encroach into monoclausal patterns and take part in complex predicate formation); see § 3.7.1. Certainly, the loss of the irrealis requirement of da followed later; in part it was a consequence of an increase in the admissible lexical input of the complement-taking predicate (from directive and manipulative to representative and other verbs with “realis” entailments).

46 For the latter ones cf. Wiemer (2014b) concerning Macedonian and Mišeska Tomić (2006), Friedman (2011) for a general survey over Balkan languages.

4.1.2 ‘as if’-complements (northeastern part of Slavic)

Clausal complements introduced by a complementizer that derives from an ‘as if, as though’ connective of irreal comparison prove to be a rare phenomenon, at least if this complementizer further specializes in evidential meanings, first of all a reportive one.47 Against this backdrop, Pol. jakoby and units centering around Russ. budto look rather exotic, and their occurrence in a restricted region in Europe’s northeast may not be accidental (Wiemer 2018: § 4.3). See two examples of jakoby as a complementizer in modern Polish to illustrate verbal vs. nominal heads and epistemic-inferential vs. reportive usage:

47 Complementizers with an ‘as if’ source expression which simultaneously are somehow connected to evidentiality are restricted to - and perception predicates (López-Couso and Méndez-Naya 2012, 2015: 196). Other candidates mentioned in the literature are presented as void of evidential functions; see, for instance, Heine and Kuteva (2002: 257–258). Boye, van Lier, and Theilgaard Brink (2015: 8–10), who examined 89 languages of a genealogically balanced worldwide sample, reported only on Turkish and West Greenlandic as having complementizers with a reportive function. In turn, in evidentiality research patterns of complementation have figured quite prominently, but, as a rule, only the morphosyntactic realization of clausal complements has attracted attention, e.g., finiteness distinctions or the limited availability of TAM-forms in comparison to main declarative clauses (cf. Aikhenvald 2004: 120–123, 253–256, among others). The development of say/sayC-units into reportive markers in Romance varieties (e.g., Span. dizque), at least according to the analyses in Olbertz (2007) and Cruschina and Remberger (2008), has ended up in particles, not in complementizers.


(26) Kapłan nie da-ł posłuch-u plotk-om, jakoby
     priest-(.)  give[]--(.) listening- gossip-. as though
     Najwyższy Zebon był obdarowany niezwykłą mocą (…).
     ‘The priest did not lend his ear to gossip, [saying] as if the Highest Zebon was endowed with extraordinary power.’
     → reportive, nominal head (NKJP; I. Surmik: Ostatni smok. Warszawa, 2005)

(27) By-ł-o-by błęd-em sądzi-ć, jakoby
     be--- mistake- think[]- as though
     młodzież w wieku 18–17 lat nie kochała swego miasta …
     ‘It would be an error to assume that the young people aged 17–18 don’t love their town.’
     → inferential/epistemic, verbal head (NKJP; Stolica 10/1962)

As a lexical unit (function word) jakoby resulted from the univerbation of the general comparison marker jako ‘how’ and the ‘irrealis particle’ by (used also as subjunctive marker). This new unit concomitantly entered into a grammaticalization process inasmuch as its use was tightened syntactically: in clause-initial position it was reanalyzed as a complementizer after specific sets of verbs or nouns. This is tantamount to saying that jakoby started filling a slot as a connective that cannot be left empty and which marks off clausal arguments. An analogous remark applies to Russ. budto (< bud’ = . of byt’ ‘be’ + deictic particle to). Many details concerning various stages of this process remain to be investigated,48 but we may say that the shift from irreal comparison to the marking of clauses with propositional content quite obviously was mediated by an effect of epistemic distance associated with irreal comparison. As concerns Pol. jakoby, the rise of this unit – which conditioned its exploitation in clausal subordination – was favoured by the Wackernagel rule, which still applied consistently in older stages of Polish, so that the irrealis marker by could coalesce with clause-initial jako. Moreover, nominal heads appear to have been preferred attachment sites of jakoby as a complementizer (see ex. 26) after the 16th century and, thus, provided a locus of spread.

48 For Pol. jakoby cf. Wiemer (2015b), including references concerning the contemporary usage. For Russ. -units cf. Zaitseva (1995), Letuchiy (2010). For some corpus-based remarks concerning both items during the last 200 years cf. Wiemer (2008b).


4.2 Adverbial clauses

A remarkable case of grammaticalization in the formation of complex sentences is counterfactual (occasionally potential) conditionals in Russian, in which the protasis is introduced by an imperative form (Fortuin 2008: ch. 6; Kuznecova 2009). See a constructed example:

(28) Prid-i my vovremja na xolm, solnc-e by ešče ne
     come[]-. 1. in_time on hill sun[]-  yet 
     zaš-l-o za gorizont.
     go_down[]-- behind horizon
     ‘If we had come to (the top of) the hill in time, the sun would not have set yet behind the horizon.’

The following properties apply: (i) the protasis has to begin with the imperative form of the predicate (fixed order within protasis), (ii) the protasis always has to precede the apodosis (fixed order of clauses), (iii) the imperative form has “bleached” (no control by an agent) and (iv) it does not distinguish number, but (v) can be combined with any person. In sum, we observe a complex of changes including not only decategorialization, but also semantic and paradigmatic extension of the imperative form;49 moreover, the whole clause construction has syntagmatically tightened. Within Slavic, the counterfactual “pseudo-imperative” use seems to be restricted to East Slavic (maybe even to Russian). Some of its aforementioned properties are characteristic of other pseudo-imperative constructions in Russian with equivalents not only in other Slavic, but also in many other languages worldwide, with meanings mainly from the optative (as for simple sentences) and from the concessive domain (as for complex sentences).50 Against this backdrop, the extension of Russian pseudo-imperatives into the counterfactual-potential domain looks more outstanding. Note that the conditional construction illustrated in (28) is neither future-directed, nor does it convey any predictable illocutionary effects (e.g., a reproach). Such effects arise only as particularized conversational implicatures.51

49 An alternative hypothesis has been to explain this form as an exaptated 3-aorist form, but there is no unambiguous evidence in its favour.
50 For a survey over Russian “pseudo-imperatives” cf. Xrakovskij and Volodin (1986: 226–246), Percov (2001: 231–244), on a large crosslinguistic basis cf. Aikhenvald (2010: 235–253), Gusev (2013: 245–281).
51 Compare something in English like Make one mistake and there’ll be trouble, which is future-oriented and carries a more conventionalized illocutionary effect (warning or similar).


5 Other (prominent) patterns of grammaticalization and reanalysis

Some of the cases discussed above might be regarded as debatable instances of grammaticalization, inasmuch as they crucially hinge upon reanalysis and “hard core” parameters of grammaticalization are rather weakly developed. In this section, I comment on selected additional phenomena which either depend on grammaticalization or which are basically characterized by changes that are concomitant to grammaticalization, but can also be explained simply by reanalysis, analogical extension, functional expansion (or shift) via the conventionalization of implicatures, exaptation or any combination thereof.

5.1 Extensions of TAM-grams

The pluperfect and the perfect-in-the-future (futurum exactum), which were inherited from CS (past copula  / *bǫd- + l-participle), have basically survived only in Balkan Slavic and bookish Upper Sorbian. In the remainder of Slavic, these grams have become extinct, or at best obsolete and restricted to a narrow set of lexemes and often in the conditional.52 New pluperfects have developed on the basis of the new resultatives which were discussed in § 3.4.2. Less widespread are cases in which remnants of the pluperfect, namely the uninflected (formerly neuter) l-participle of the BE-verb (e.g., Russ. bylo), have been reinterpreted as markers of an antiresultative or related meanings. This gram type denotes either a situation that was planned or already about to occur, but not realized, or which was realized, but then its result was annulled (Plungjan 2001). Standard Russian is very prominent in this respect (Šošitajšvili 1998); cp. Pošel bylo k domu, no ostanovilsja ‘He wanted/was about to go to the house, but (then) stopped’ (goal not realized); Pojavilsja bylo v dome, no tut že snova isčez ‘He (had) arrived in the house, but immediately disappeared again’ (result annulled). Essentially, this is an instance of exaptation, since relic forms, fossilized or deprived of their paradigms, have been “recycled”. Apart from that, the antiresultative function is conceptually linked to counterfactual meanings in which (former) pluperfects happen to be exploited quite often (Plungjan 2004). Evidential extensions of the “old” perfect (BE + l-participle) can be found only in Balkan Slavic. The Bulgarian and Macedonian present perfect has extended to mark inferential and reportive (plus mirative) functions; the particular subfunction within this domain is determined by discourse context. Similar shifts (with vague meaning ranges) are well-known from many languages with present perfects whose

52 For a comprehensive and thorough cross-Slavic account cf. Sičinava (2013). Cf. also Arkadiev and Wiemer (forthcoming: § 3.6).


functions are sufficiently distinct from a past tense. The conceptual mechanism behind this extension is a shift of fore- and background: a focus on some anterior event (i.e., the core function of the perfect) implies relevance for the speech time interval; this relation is inverted: an inference about an anterior event is drawn from an observed state (simultaneous to speech time). Evidential extensions of perfects are a widely known phenomenon, and they belong among core Balkanisms. In Balkan Slavic the decisive formal criterion between perfect (≈ ‘current relevance’) and evidential functions has been claimed to be presence vs. lack of the auxiliary verb in the third person singular. However, the empirical situation is more complicated (cf. Sonnenhauser 2012, 2013). Apart from that, the evidential extension of the Balkan Slavic perfect has been connected with new morphology. L-forms in exclusively indirect evidential reading have appeared that were not inherited from CS; they are based on the imperfect (or present tense) stem (e.g., piše-l ‘He is said to write / He must have written’ alongside pisa-l based on the aorist). This morphological consequence of a functional extension from the perfect has resulted from the analogical expansion of patterns of morphonological alternations in stem-final position for a new function. In SW-Macedonian evidential extensions of the l-participles have even affected the latest perfect based on ima ‘has’ (§ 3.4.2; cf. Friedman 1976).

5.2 Reportive extension of deontic auxiliaries: West Slavic

In Polish (less prominently also in Czech and Slovak) we observe a meaning extension of a deontic modal into functions associated with reportive evidentiality. Apparently, this pattern is typologically infrequent. Pol. mieć, a modal of weak obligation (which continues functioning as the basic HAVE-verb, just like Germ. haben), has undergone a change from obligation to reportive function, which strikingly resembles the development of Germ. sollen ‘have to > ’. This extension was the outcome of a switch from the mediation of a directive speech act (see ex. 29; A and B mark different speakers) to a merely reportive function, without any volitional component (see ex. 30). This shift implies that the original directive illocution could be backgrounded and eventually be lost altogether.

(29) B: Ma-sz jej_k przynieś-ć wazon.
        have-.2 her. bring[]- vase.
        (⊂ A: Przynieś mi_k wazon!)
        (⊂ Bring me the vase!)
     ‘You have to bring her a vase’, or: ‘Bring her a vase (as you have been told)!’

(30) Jutro ma pada-ć.
     tomorrow have-(3.) rain[]-
     ‘It says / People say that it will rain tomorrow.’


Notice that for modern Pol. mieć (as well as for modern Germ. sollen), contextual ambiguity between the (diachronically primary) directive illocution and its transmission by a mediator is nothing unusual.53

5.3 Participial subject impersonal (a.k.a. Backgrounding Passive)

Polish is known for a specific type of Backgrounding Passive, i.e., a construction which demotes the most agent-like argument of the verb from the syntax, while implying it in the argument structure, but does not promote any other argument as a subject in the nominative. Because of this property it has also been called subject impersonal. It is based on the petrified neuter form of the obsolete nominal (“short”) declension of n/t-participles (-no/-to) and can be freely formed from any verb (except być and auxiliaries), provided its most agent-like argument is human. It is not restricted by aspect, but has only past tense reference and does not allow for an oblique expression of the demoted argument in modern Polish (it did allow this more or less until the end of the 18th century). Compare an example:

(31) W czasie uroczystośc-i składa-no bog-u ofiar-y.
     during ceremony-. lay.down[]- god- sacrifice-.
     ‘During the ceremony gifts were given to the god.’ [more lit. one gave gifts]
     (NKJP; powieść, 2005)

We are dealing with an exact functional equivalent of German man-constructions in the past domain. Its conventionalization as a subject impersonal resulted from a pending reanalysis of n/t-participles as markers of a foregrounding passive which, however, broke down “halfway”. That is, with transitive verbs the syntactic interpretation of NPs as subjects or objects wavered to and fro for at least 400 years, and intransitive verbs in this construction were frequent, until (obviously by the end of the 18th century) the switch toward a subject-interpretation of such NPs (and thus toward a foregrounding passive) ultimately failed and object-interpretation became entrenched.
This development was possible because, among other factors, after the decay of the nominal (“short form”) declension type (for which see § 2.3) the neuter form of this declension ended up in paradigmatic isolation and was then “recycled”.

53 On the diachronic development of modal mieć cf. Hansen (1999: 122–128); for a brief diachronic comparison with Germ. sollen cf. Wiemer and Hansen (2012: 75–78); on the mechanism of the switch between two communicative situations cf. Weiss (2009: 136–139), based on earlier unpublished work. In addition, cf. Holvoet (2005, 2012) and Holvoet and Konickaja (2011) for an account in terms of interpretive deontics.

Grammaticalization in Slavic

The pending process of (ultimately failed) reanalysis was thus dependent on exaptation. The participial subject impersonal probably served as a model for an RM-marked subject impersonal (compare with ex. 31: … składa-ł-o się ofiar-y …): the diachronic onset of the latter was later than the onset of the participial construction (Wiemer, forthcoming). We encounter cognates of the Polish participial subject impersonal in certain (west) Ukrainian varieties (with fewer grammatical and more lexical restrictions) and in other adjacent Slavic idioms (Wiemer and Giger 2005: 61–66; Danylenko 2006: 258–264). This fact is indicative of a former small spread zone, irrespective of the fact that subject impersonals are well attested in other regions of Europe (outside Slavic) as well (cf. Sansò [2006] and Wiemer [2006: 281–284] for surveys).

5.4 Adverbial participles

Like modal auxiliaries (see § 3.3.1), gerunds (‘adverbial participles’) appeared in Slavic as a new category. However, apart from being largely a property of the higher registers of standard languages, their rise resulted from exaptation and reanalysis rather than from grammaticalization proper, although syntactic strengthening took place as well.⁵⁴ After the nominal paradigms of adjectives (and thus also of participles) had largely decayed, remnant forms were reinterpreted as adjuncts; those suffixed with {(v)ši} are cognate to the nuclei of resultatives in the zone close to Baltic (§ 3.4.2). Only in older stages of some Slavic languages do we still find rare attestations in which gerunds were used in clausal complementation (namely, in an AcP-construction); such uses are entirely ungrammatical today. This distinguishes Slavic gerunds from their closest etymological “cousins” in Baltic (cf. Wiemer [2014a: 1634–1638], [2014c: 202–205], with further differences between Slavic and Baltic and references).

54 For a partially different view cf. Birzer (2010).

6 Summary and comparative outlook

We may conclude that, in Slavic, grammaticalization processes have taken place primarily in the verbal domain and, to some extent, on the level of the clause and of clause combining. Outside the verbal domain, practically no new inflectional paradigms have arisen, the only exception being the pronominal declension of adjectives. Nominal categories, or categories relevant for the NP, have been affected only to a small extent, the big exception being articles. Derivational morphology is also conservative; in productive patterns of derivation, particularly non-transpositional ones, we can see one of the factors favoring the rise of the specific Slavic stem-derivational perfective : imperfective opposition (§ 3.2.2). At least against a European background, this sparseness of changes that qualify as grammaticalization might be interpreted as another confirmation of the rather conservative behaviour of Slavic languages in the domain of morphology, in particular of nominal, or NP-related, categories. This concerns both the “slots” of paradigms and their phonological realizations; for instance, there are no new case(-gender-number) endings whose appearance could be described as grammaticalization. Moreover, only in Polish can we observe a certain stage of agglutination of enclitic past tense auxiliaries which is typical for the rise of agreement markers (see § 3.6), whereas Balkan Slavic shows only a very low (if any) degree of coalescence between nouns and enclitic dative pronouns as NP-internal markers of inalienable possession (see § 2.1). The overall picture is similar in the verbal domain; object agreement via clitic doubling in Balkan Slavic provides but a faint step toward morphologization. In sum, the South Slavic volitional future and the West Ukrainian (or Ruthenian) ingressive future are the only paradigm examples of grammaticalization, but the ingressive future (in the Carpathian region) has remained at best a minor gram. Slavic languages do, however, supply a good deal of auxiliation, i.e., the rise of complex predicate formation, which we have observed in various domains and at different stages of grammaticalization: modal, causative, passive, resultative-perfect, inexpectative.
However, apart from the definitional property that through auxiliation a lexically autonomous unit loses its argument structure, the appearance of auxiliaries in Slavic has not, as a rule, been accompanied by erosion, a decrease in syntagmatic variability or an increase in bondedness (i.e., coalescence). More exotic phenomena are (i) existential ‘give’ in the Sorbian languages (PAT-borrowed from German), (ii) recipient passives either with GET-auxiliaries (PAT- or MAT-borrowed on German models) in all West Slavic languages except Polish or with a HAVE-auxiliary (with unclear provenance of a model) particularly prominent in Polish, (iii) acquisitive modals (with restriction to circumstantial NEC) in the northeastern periphery (Polish, Russian), clustering with other languages in northern Europe, (iv) polyfunctionality of weak obligation and hearsay in West Slavic (first of all Polish) HAVE-modals (evidently PAT-borrowed from Germ. sollen), (v) ‘as if’-complementizers with evidential functions in Polish and Russian (without any discernible “foreign” model). Judged in areal terms, the great majority of grammaticalization phenomena in Slavic (irrespective of the degree of grammaticalization) integrate well into larger areal clines over Europe, sometimes extending even further into northern Eurasia; see the comments given in the relevant subsections. Polish stands out somewhat within Slavic for its HAVE-based recipient passive (§ 3.1.1.2) and the development of highly productive subject impersonals (§ 5.3). There are no contiguous (Slavic or non-Slavic) languages which demonstrate an equally consistent usage of these patterns, although the preconditions for the rise of these constructions in Polish are certainly as much internally Slavic as they are for those constructions and patterns for which support by contact with non-Slavic languages is much more plausible. Also outstanding, in areal and typological terms, is the rise of the stem-derivational aspect opposition. However, apart from inner-Slavic differentiation (mostly along an East-West cline) with respect to this grammatical category, Slavic as a whole, seen from a macro-areal perspective, looks like a convergence zone between Eurasian languages from the East (in which actionality-changing suffixation is a prominent derivational technique and no preverbs occur) and languages in the western part of Europe (where actionality-modifying suffixes are virtually absent, but many languages make ample use of preverbs and/or prefixes); cf. Wiemer and Seržant (2017: §§ 4–5).

Acknowledgments

I am obliged to Andrej Malchukov, Motoki Nomachi and Damaris Nübling for helpful remarks on a first version of this contribution, to Jasmina Grković-Major also for supplying me with very useful literature on a couple of topics dealt with here. I also thank Eleni Bužarovska, Victor Friedman, Markus Giger, Liljana Mitkovska, Lenka Scholze and Branko Stanović for their remarks and help in the interpretation of some specific data as well as Anke Lensch for her thorough proofreading. The usual disclaimers apply.

Abbreviations

Bg. = Bulgarian, Croat. = Croatian, CS = Common Slavic, CUS = Colloquial Upper Sorbian, Germ. = German, IE = Indo-European, Kash. = Kashubian, Latv. = Latvian, Lith. = Lithuanian, Mac. = Macedonian, OCS = Old Church Slavonic, PIE = Proto-Indo-European, Pol. = Polish, Russ. = Russian, SCr. = Croatian/Serbian, Slk. = Slovak, Slv. = Slovene (Slovenian);  = connective (da in auxiliary complexes),  = hortative,  = imperative,  = impersonal participle (Polish),  = infinitive,  = ingressive, _ = l-participle,  = nominalizer,  = non-virile (Polish),  = prefix,  = proper noun,  = past passive participle,  = past active participle,  = particle,  = suffix,  = thematic vowel,  = virile (Polish)

References

Aikhenvald, Alexandra Y. 2004. Evidentiality. Oxford: Oxford University Press.
Aikhenvald, Alexandra Y. 2010. Imperatives and commands. Oxford: Oxford University Press.
Andersen, Henning. 1985. Protoslavic and Common Slavic – questions of periodization and terminology. International Journal of Slavic Linguistics and Poetics 31–32. 67–82.
Andersen, Henning. 1987. From auxiliary to desinence. In Martin Harris & Paolo Ramat (eds.), Historical Development of Auxiliaries, 21–52. Berlin & New York: Mouton de Gruyter.
Arkadiev, Peter. 2014. Towards an areal typology of prefixal perfectivation. Scando-Slavica 60(2). 384–405.
Arkadiev [Arkad’ev], Petr M. 2015. Areal’naja tipologija prefiksal’nogo perfektiva (na materiale jazykov Evropy i Kavkaza) [Areal typology of the prefixed perfective aspect (on the basis of languages in Europe and the Caucasus)]. Moskva: Jazyki slavjanskoj kul’tury.
Arkadiev [Arkad’ev], Petr M. & Andrej B. Shluinsky [Šluinskij]. 2015. Slovoklassificirujuščie aspektual’nye sistemy: Opyt tipologii [Classifying aspect systems: Approaching their typology]. Vestnik SPbGU 2015 (3). 4–24.
Arkadiev, Peter & Björn Wiemer. Forthcoming. Perfects in Baltic and Slavic. In Robert Crellin & Thomas Jügel (eds.), Perfects in Indo-European languages, vol. 2: The later history of perfects in IE languages. Amsterdam & Philadelphia: Benjamins.
Bajerowa, Irena. 2000. Polski język ogólny XIX wieku. Stan i ewolucja, t. III: Składnia. Synteza [Superregional Polish of the 19th century. State and evolution, vol. III: Syntax. Synthesis]. Katowice: Wydawnictwo Uniwersytetu Śląskiego.
Belaj, Branimir & Darko Matovac. 2015. On the article-like use of the indefinite determiners jedan and neki in Croatian and other Slavic languages. Suvremena lingvistika 79. 1–20.
Benacchio, Rosanna. 2009. Il contatto slavo-romanzo nel croato del Molise e nei dialetti sloveni del Friuli. In Lenka Scholze & Björn Wiemer (eds.), Von Zuständen, Dynamik und Veränderung bei Pygmäen und Giganten (Festschrift für Walter Breu zu seinem 60. Geburtstag), 177–191. Bochum: Brockmeyer.
Benacchio [Benakk’o], Rosanna. 2014. Grammatikalizacija v situacijax jazykovogo kontakta: Razvitie artiklja v rez’janskom dialekte [Grammaticalization in situations of language contact: The development of the article in a Resian dialect]. In Motoki Nomachi, Andrii Danylenko & Predrag Piper (eds.), Grammaticalization and lexicalization in the Slavic languages, 205–217. München: Sagner.
Berger, Tilman. 1993. Das System der tschechischen Demonstrativpronomina. Textgrammatische und stilspezifische Gebrauchsbedingungen. München (unpubl. postdoctoral thesis).
Berger, Tilman. 2008. Deutsche Einflüsse auf das grammatische System des Tschechischen. In Tilman Berger (ed.), Studien zur historischen Grammatik des Tschechischen, 57–71. München: Lincom.
Berger, Tilman. 2009. Einige Bemerkungen zum tschechischen Absentiv. In Tilman Berger, Markus Giger, Sibylle Kurt & Imke Mendoza (eds.), Von grammatischen Kategorien und sprachlichen Weltbildern – Die Slavia von der Sprachgeschichte bis zur Politsprache (Festschrift für Daniel Weiss zum 60. Geburtstag), 9–28. München-Wien: Sagner.
Besters-Dilger, Juliane, Ana Drobnjaković & Björn Hansen. 2009. Modals in the Slavonic languages. In Björn Hansen & Ferdinand de Haan (eds.), Modals in the Languages of Europe (A Reference Work), 167–197. Berlin & New York: Mouton de Gruyter.
Birzer, Sandra. 2010. Russkoe deepričastie: Processy grammatikalizacii i leksikalizacii [The Russian adverbial participle: Processes of grammaticalization and lexicalization]. München: Sagner.
Boguslavskij, Igor’ M. 1988. O nekotoryx tipax nekanoničeskix sočinitel’nyx konstrukcij [On some types of non-canonical coordinative constructions]. Problemy razrabotki formal’noj modeli jazyka, 5–18. Moskva.
Borkovskij, Valentin I. (ed.) 1968. Sravnitel’no-istoričeskij sintaksis vostočnoslavjanskix jazykov (členy predloženija) [Comparative-historical syntax of East Slavic languages (clause members)]. Moskva: Nauka.
Boye, Kasper, Eva van Lier & Eva Theilgaard Brink. 2015. Epistemic complementizers: a crosslinguistic survey. Language Sciences 51. 1–17.

Breu, Walter. 1988. Resultativität, Perfekt und die Gliederung der Aspektdimension. In Jochen Raecke (ed.), Slavistische Linguistik 1987, 44–74. München: Sagner.
Breu, Walter. 1994. Der Faktor Sprachkontakt in einer dynamischen Typologie des Slavischen. In Hans-Robert Mehlig (ed.), Slavistische Linguistik 1993, 41–64. München: Sagner.
Breu, Walter. 1998. Romanisches Adstrat im Moliseslavischen. Die Welt der Slaven XLIII. 339–354.
Breu, Walter. 2012. The grammaticalization of an indefinite article in Slavic micro-languages. In Björn Wiemer, Bernhard Wälchli & Björn Hansen (eds.), Grammatical replication and borrowability in language contact, 275–322. Berlin & New York: Mouton de Gruyter.
Breu, Walter. 2017. Moliseslavische Texte aus Acquaviva Collecroce, Montemitro und San Felice del Molise. (Slavische Mikrosprachen im absoluten Sprachkontakt I). Wiesbaden: Harrassowitz.
Bunčić, Daniel. 2015. “To mamy wpajane od dziecka” – a recipient passive in Polish? Zeitschrift für Slawistik 60(3). 411–431.
Comrie, Bernard & Greville G. Corbett. 1993. Introduction. In Bernard Comrie & Greville G. Corbett (eds.), The Slavonic Languages, 1–19. London & New York: Routledge.
Coseriu, Eugenio. 1966. Tomo y me voy. Ein Problem vergleichender europäischer Syntax. Vox Romanica 25. 13–55.
Cruschina, Silvio & Eva-Maria Remberger. 2008. Hearsay and reported speech: evidentiality in Romance. Rivista di Grammatica Generativa 33. 95–116.
Cyxun, Gennadij A. 1968. Sintaksis mestoimennyx klitik v južnoslavjanskix jazykax [Syntax of pronominal clitics in South Slavic languages]. Minsk: Izdatel’stvo “Nauka i texnika”.
Dahl, Östen. 2000a. The grammar of future time reference in European languages. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 309–328. Berlin & New York: Mouton de Gruyter.
Dahl, Östen. 2000b. Verbs of becoming as future copulas. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 351–361. Berlin & New York: Mouton de Gruyter.
Dahl, Östen. 2004. The growth and maintenance of linguistic complexity. Amsterdam & Philadelphia: Benjamins.
Danylenko, Andrii. 2006. Slavica et Islamica. Ukrainian in Context. München: Sagner.
Danylenko, Andrii I. 2010. Naskil’ky ukrajins’kyj syntetyčnyj majbutnij čas e syntetyčnym? [To which extent is the Ukrainian future tense synthetic?] Movoznavstvo 2010 (4–5). 113–121.
Danylenko, Andrii. 2011. Is there any inflectional future in East Slavic? A case of Ukrainian against Romance reopened. In Motoki Nomachi (ed.), Grammaticalization in Slavic languages, 147–177. Sapporo: Slavic Research Center / Hokkaido University.
Danylenko, Andrii. 2012. Auxiliary clitics in Southwest Ukrainian: Questions of chronology, areal distribution, and grammaticalization. Journal of Slavic Linguistics 20(1). 3–34.
Danylenko, Andrii. 2014. On the relativization strategies in East Slavic. In Motoki Nomachi, Andrii Danylenko & Predrag Piper (eds.), Grammaticalization and lexicalization in the Slavic languages, 183–204. München: Sagner.
Decaux, Étienne. 1955. Morphologie des enclitiques polonais. Paris: Institut d’études slaves.
De Groot, Casper. 2000. The absentive. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 693–719. Berlin & New York: Mouton de Gruyter.
Dickey, Stephen M. 2000. Parameters of Slavic aspect (A cognitive approach). Stanford, CA: CSLI Publications.
Dimitrova-Vulchanova, Mila. 2000. Clitics in the Slavic languages. In Henk Van Riemsdijk (ed.), Clitics in the Languages of Europe, 83–122. Berlin & New York: Mouton de Gruyter.
Drinka, Bridget. 2012. The Balkan perfects: Grammaticalization and contact. In Björn Wiemer, Bernhard Wälchli & Björn Hansen (eds.), Grammatical replication and borrowability in language contact, 511–558. Berlin & New York: Mouton de Gruyter.
Drinka, Bridget. 2017. Language contact in Europe: The periphrastic perfect through history. Cambridge: Cambridge University Press.

Ekberg, Lena. 1993. The cognitive basis of the meaning and function of cross-linguistic TAKE and V. Belgian Journal of Linguistics 8. 21–42.
Erker [Ėrker], Aksana. 2015. Strukturnye čerty smešannyx belorusskix govorov na baltoslavjanskom pogranič’e [Structural properties of Belarusian dialects in the Baltic-Slavic border zone]. Leipzig: Biblion Media.
Fortuin, Egbert L. 2008. Polisemija imperativa v russkom jazyke [The polysemy of the imperative in Russian]. Voprosy jazykoznanija 2008 (1). 3–24.
Fortuin, Egbert & Jaap Kamphuis. 2015. The typology of Slavic aspect: a review of the East-West Theory of Slavic aspect. Russian Linguistics 39. 163–208.
Friedman, Victor A. 1976. Dialectal synchrony and diachronic syntax. Chicago Linguistic Society: Papers from the parasession of diachronic syntax, 96–103.
Friedman, Victor A. 2008. Balkan object reduplication in areal and dialectological perspective. In Dalina Kallulli & Liliane Tasmowski (eds.), Clitic doubling in the Balkan languages, 35–63. Amsterdam & Philadelphia: Benjamins.
Friedman, Victor A. 2011. Tipologijata na upotrebata na da vo balkanskite jazici [The typology of the use of da in Balkan languages]. In Victor A. Friedman (ed.), Makedonistički studii, 43–52. Skopje: MANU.
Geist, Ljudmila. 2011. Bulgarian edin: The rise of an indefinite article. In Uwe Junghanns, Dorothee Fehrmann, Denisa Lenertová & Hagen Pitsch (eds.), Formal description of Slavic languages: The ninth conference, 125–148. Frankfurt/M.: Lang.
Geniušienė, Emma. 1987. The typology of reflexives. Berlin & New York: Mouton de Gruyter.
Giger, Markus. 2003a. Resultativkonstruktionen im modernen Tschechischen (unter Berücksichtigung der Sprachgeschichte und der übrigen slavischen Sprachen). Bern: Lang.
Giger, Markus. 2003b. Die Grammatikalisierung des Rezipientenpassivs im Tschechischen, Slovakischen und Sorbischen. In Patrik Sériot (ed.), Contributions suisses au XIIIe congrès mondial des slavistes à Ljubljana, août 2003, 79–102. Bern: Lang.
Giger, Markus. 2004. Recipientné pasívum v slovenčine [The recipient passive in Slovak]. Slovenská reč 69. 37–43.
Giger, Markus. 2012. The “recipient passive” in West Slavic. A calque from German and its grammaticalization. In Björn Wiemer, Bernhard Wälchli & Björn Hansen (eds.), Grammatical replication and borrowability in language contact, 559–588. Berlin & New York: Mouton de Gruyter.
Giger, Markus. 2016. Kongruenzbrüche in slovakischen possessiven Resultativa (Evidenz aus dem slovakischen Nationalkorpus). Jazykovedný časopis 67(3). 283–294.
Givón, Talmy. 1980. The binding hierarchy and the typology of complements. Studies in Language 4. 333–377.
Givón, Talmy. 1981. On the development of the numeral ‘one’ as an indefinite marker. Folia Linguistica Historica 2(1). 35–53.
Gołąb, Zbigniew. 1984. The Arumanian dialect of Kruševo in SR Macedonia, SFR Yougoslavia. Skopje: MANU.
Górski, Rafał. 2008. Diateza nacechowana w polszczyźnie (Studium korpusowe) [Marked diathesis in Polish (A corpus-based study)]. Kraków: Lexis.
Graves, Nina. 2000. Macedonian – a language with three perfects? In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 479–494. Berlin & New York: Mouton de Gruyter.
Grickat, Irena. 1975. Studije iz istorije srpskohrvatskog jezika. Beograd.
Grković-Major [Grković-Mejdžor], Jasmina. 2004. Razvoj hipotaktičkog da u starosrpskom jeziku [The evolution of the hypotactic da in Serbian]. Zbornik Matice Srpske za filologiju i lingvistiku 47(1–2). 185–203.
Grković-Major, Jasmina. 2011. The development of predicative possession in Slavic languages. In Motoki Nomachi (ed.), The grammar of possessivity in South Slavic languages: Synchronic and diachronic perspectives, 35–54. Sapporo: Slavic Research Center (Hokkaido University).

Gusev, Valentin Ju. 2013. Tipologija imperativa [A typology of the imperative]. Moskva: Jazyki slavjanskoj kul’tury.
Hansen, Björn. 1999. Die Herausbildung und Entwicklung der Modalauxiliare im Polnischen. In Tanja Anstatt (ed.), Entwicklungen in slavischen Sprachen, 83–167. München: Sagner.
Hansen, Björn. 2000. The German modal ‘müssen’ and the Slavonic Languages – Reconstruction of a success story. Scando-Slavica 46. 77–93.
Hansen, Björn. 2001. Das slavische Modalauxiliar: Semantik und Grammatikalisierung im Russischen, Polnischen, Serbischen/Kroatischen und Altkirchenslavischen. München: Sagner.
Hansen, Björn. 2003. a namъ ospodine nemočьno žitь. Untersuchungen der Modalität in Birkenrindentexten als Beitrag zur historischen Linguistik des Russischen. In Tanja Anstatt & Björn Hansen (eds.), Entwicklungen in slavischen Sprachen 2 (Für Volkmar Lehmann zum 60. Geburtstag von seinen Schülerinnen und Schülern), 63–82. München: Sagner.
Hansen, Björn. 2004. Modals and the boundaries of grammaticalization: The case of Russian, Polish and Serbian-Croatian. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 245–270. Berlin & New York: Mouton de Gruyter.
Hansen, Björn. 2011. Slavonic languages. In Bernd Kortmann & Johan van der Auwera (eds.), The languages and linguistics of Europe (A comprehensive guide), 97–123. Berlin & Boston: Mouton de Gruyter.
Hansen, Björn. 2014. The syntax of modal polyfunctionality revisited: evidence from the languages of Europe. In Elisabeth Leiss & Werner Abraham (eds.), Modes of modality. Modality, typology, and Universal Grammar, 89–126. Amsterdam & Philadelphia: Benjamins.
Haspelmath, Martin. 1990. The grammaticization of passive morphology. Studies in Language 14(1). 25–72.
Haspelmath, Martin. 2001. The European linguistic area: Standard average European. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and universals: An international handbook, vol. 2, 1492–1510. Berlin & New York: De Gruyter.
Havránek, Bohuslav. 1937 [1928]. Genera verbi v slovanských jazycích [Genus verbi (voice) in Slavic languages] I–II. Praha.
Heine, Bernd. 1997. Cognitive foundations of grammar. Oxford: Oxford University Press.
Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: Cambridge University Press.
Himmelmann, Nikolaus P. 2001. Articles. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and language universals. An international handbook, vol. 1, 831–841. Berlin & New York: Mouton de Gruyter.
Holvoet, Axel. 2005. Evidentialität, Modalität und interpretative Verwendung. In Björn Hansen & Petr Karlík (eds.), Modality in Slavonic languages. New perspectives, 95–105. München: Sagner.
Holvoet, Axel. 2012. Towards a semantic map for definite adjectives in Baltic. Baltic Linguistics 3. 65–99.
Holvoet, Axel & Jelena Konickaja. 2011. Interpretive deontics. A definition and a semantic map based on Slavonic and Baltic data. Acta Linguistica Hafniensia 43(1). 1–20.
Holvoet, Axel, Marta Grzybowska & Agnieszka Rembiałkowska. 2015. Middle voice reflexives and argument structure in Baltic. In Axel Holvoet & Nicole Nau (eds.), Voice and argument structure in Baltic, 181–209. Amsterdam & Philadelphia: Benjamins.
Holzer, Georg. 2014. Vorhistorische Periode. In Karl Gutschmidt, Sebastian Kempgen, Tilman Berger & Peter Kosta (eds.), Die slavischen Sprachen: ein internationales Handbuch zu ihrer Struktur, ihrer Geschichte und ihrer Erforschung, vol. 2, 1117–1131. Berlin & Boston: Mouton de Gruyter.
Ivanova, E. Ju. & S. Koval’. 1994. Bolgarskoe EDIN s točki zrenija referencial’nogo analiza [Bulgarian edin ‘one’ from the point of view of referentiality]. Vestnik Sankt-Peterburgskogo Universiteta 16. 112–115.
Kalsbeek, Janneke. 1998. The Čakavian dialect of Orbanići near Žminj in Istria. Amsterdam & Atlanta: Rodopi.
Kalsbeek, Janneke. 2011. Contact-induced innovations in Istrian Čakavian dialects. In Cornelius Hasselblatt, Peter Houtzagers & Remco van Pareren (eds.), Language contact in times of globalization, 133–154. Amsterdam & New York: Rodopi.
Kazenin, Konstantin I. 2001. The passive voice. In Martin Haspelmath, Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and universals: An international handbook, vol. 2, 899–916. Berlin & New York: De Gruyter.
Kempf, Luise & Damaris Nübling. This volume. Grammaticalization in the Germanic Languages. In Walter Bisang & Andrej Malchukov (eds.), Grammaticalization Scenarios. Areal Patterns and Cross-Linguistic Variation. A comparative handbook.
Kor Chahine [Kor Šaxin], Irina. 2007. O vozmožnom puti grammatikalizacii russkogo vzjat’ [On a possible grammaticalization path of Russ. vzjat’ ‘take’]. Russian Linguistics 31(3). 231–248.
Kramer, Christina Elizabeth. 1986. Analytic modality in Macedonian. München: Sagner.
Krämer, Sabine. 2005. Synchrone Analyse als Fenster zur Diachronie: Die Grammatikalisierung von werden + Infinitiv. München: Lincom Europa.
Krapova, Iliana & Tsvetana Dimitrova. 2015. Genitive-dative syncretism in the history of the Bulgarian language. Towards an analysis. Studi Slavistici XII. 181–208.
Kuznetsova [Kuznecova], Julia. 2006. The first verb of pseudocoordination as an auxiliary. Ms. Yale University.
Kuznecova, Julia L. 2009. Semantičeskie i strukturnye svojstva uslovnoj kvaziimperativnoj konstrukcii [Semantic and structural properties of a quasi-imperative conditional construction]. In Ksenija L. Kiseleva, Vladimir A. Plungjan, Ekaterina V. Raxilina & Sergej G. Tatevosov (eds.), Korpusnye issledovanija po russkoj grammatike, 314–334. Moskva: Probel-2000.
Łaziński, Marek. 2001. Was für ein Perfekt gibt es im modernen Polnisch? Linguistik online 8(1–01).
Lehmann, Christian. 1995. Thoughts on grammaticalization. München & Newcastle: Lincom.
Lehmann, Volkmar. 1992. Le prétérit déictique et le prétérit narratif en polonais moderne. In Marguerite Guiraud-Weber & Charles Zaremba (eds.), Linguistique et slavistique. Mélanges offerts à Paul Garde, 545–557. Aix-en-Provence, Paris.
Lehmann, Volkmar. 1999. Sprachliche Entwicklung als Expansion und Reduktion. In Tanja Anstatt (ed.), Entwicklungen in slavischen Sprachen, 169–254. München: Sagner.
Leonova, Anna. 2011. Semantičeskaja dominanta i modal’nye ottenki konstrukcii vzjat’ sdelat’ [The semantic dominant and modal nuances of the construction vzjat’ sdelat’]. Ms. (Handout of talk given at the conference Russkij jazyk: konstrukcionnye i leksiko-semantičeskie podxody, Sankt-Peterburg, 24–26 marta 2011 g.) https://iling.spb.ru/confs/rusconstr2011/materials/Leonova_handout.pdf
Letuchiy, Alexander. 2010. Syntactic change and shifts in evidential meanings: five Russian units. In Björn Wiemer & Katerina Stathi (eds.), Database on evidentiality markers in European languages. (= STUF 63–4), 358–369. Berlin: Akademie-Verlag.
Levshina, Natalia. 2015. European analytic causatives as a comparative concept. Evidence from a parallel corpus of film subtitles. Folia Linguistica 49(2). 487–520.
Lončarić, Mijo. 1996. Kajkavsko narječje [The Kajkavian dialect]. Zagreb: Školska knjiga.

López-Couso, María & Belén Méndez-Naya. 2012. On the use of as if, as though, and like in Present-day English complementation structures. Journal of English Linguistics 40(2). 172–195.
López-Couso, María & Belén Méndez-Naya. 2015. Secondary grammaticalization in clause combining: from adverbial subordination to complementation in English. Language Sciences 47. 188–198.
Lord, Carol, Foong Ha Yap & Shoichi Iwasaki. 2002. Grammaticalization of ‘give’: African and Asian perspectives. In Ilse Wischer & Gabriele Diewald (eds.), New reflections on grammaticalization, 217–235. Amsterdam & Philadelphia: Benjamins.
Lotko, Edvard. 1997. Synchronní konfrontace češtiny a polštiny (Soubor statí) [Synchronic comparison of Czech and Polish (Collection of articles)]. Olomouc.
Majsak, Timur A. 2005. Tipologija grammatikalizacii konstrukcij s glagolami dviženija i glagolami pozicii [A typology of the grammaticalization of constructions with movement and position verbs]. Moskva: Jazyki slavjanskix kul’tur.
Makarova, Anastasija L. 2016. O formax i funkcijax perfekta v zapadnomakedonskix dialektax [On the forms and functions of the perfect in the western dialects of Macedonian]. In Timur A. Majsak, Vladimir A. Plungjan & Ksenija P. Semënova (eds.), Issledovanija po teorii grammatiki. Vyp. 7. Tipologija perfekta [Studies in the Theory of Grammar. Typology of the Perfect], 217–234. Acta Linguistica Petropolitana 12(2).
Meermann, Anastasia & Barbara Sonnenhauser. 2016. Das Perfekt im Serbischen zwischen slavischer und balkanslavischer Entwicklung. In Alena Bazhutkina & Barbara Sonnenhauser (eds.), Linguistische Beiträge zur Slavistik. XXII. JungslavistInnen-Treffen in München. 12. bis 14. September 2013, 83–110. München: Sagner.
Mendoza, Imke. 2004. Nominaldetermination im Polnischen: Die primären Ausdrucksmittel. München. Unpubl. postdoctoral thesis. http://www.uni-salzburg.at/index.php?id=31213&MP=44700-200607%2C200731-200747%2C118-44805
Mendoza, Imke. 2013. Verhinderte Grammatikalisierung? Zur Diachronie von Resultativkonstruktionen mit mieć ‘haben’ im Polnischen. Wiener Slawistischer Almanach 72. 77–102.
Mirčev, Kiril. 1976. Za săčetanijata na glagol imam + minalo stradatelno pričastie v bălgarski ezik [On the collocations of the verb imam ‘have’ + past passive participle in Bulgarian]. In Petăr Pašov & Ruselina Nicolova (eds.), Pomagalo po bălgarska morfologija. Glagol, 565–567. Sofia: Nauka i izkustvo. [First published in Bălgarski ezik 23–6. 1973]
Mišeska Tomić, Olga. 2006. Balkan Sprachbund morpho-syntactic features. Dordrecht: Springer.
Mitkovska, Liljana. 2011. Competition between Nominal Possessive Constructions and the Possessive Dative in Macedonian. In Motoki Nomachi (ed.), The grammar of possessivity in South Slavic languages: Synchronic and diachronic perspectives, 83–109. Sapporo: Slavic Research Center (Hokkaido University).
Mološnaja, Tat’jana N. 1986. Posessivnye sintaksičeskie konstrukcii v serbskoxorvatskom jazyke [Syntactic possessive constructions in Serbo-Croatian]. In Ljudmila Ė. Kalnyn’ & Tat’jana N. Mološnaja (eds.), Slavjanskoe i balkanskoe jazykoznanie (Problemy dialektologii. Kategorija posessivnosti), 179–188. Moskva: Nauka.
Nau, Nicole, Kirill Kozhanov, Liina Lindström, Asta Laugalienė & Paweł Brudzyński. 2019. Pseudocoordination with ‘take’ in Baltic and its neighbours. Baltic Linguistics 10. 237–306.
Nedjalkov, Vladimir P. (ed.). 1988 [1983]. Tipologija rezul’tativnyx konstrukcij (rezul’tativ, stativ, passiv, perfekt) [A typology of resultative constructions (resultative, stative, passive, perfect)]. Leningrad: Nauka. [English translation, with additions, 1988 at Benjamins.]
Nedjalkov, Vladimir P. & Sergej E. Jaxontov. 1983. Tipologija rezul’tativnyx konstrukcij [A typology of resultative constructions]. In Vladimir P. Nedjalkov (ed.), Tipologija rezul’tativnyx konstrukcij (rezul’tativ, stativ, passiv, perfekt), 5–41. Leningrad: Nauka.

302

Björn Wiemer

Nichols, Johanna. 1992. Linguistic diversity in space and time. Chicago: Chicago University Press. Nitsolova [Nicolova], Ruselina. 2014. Correlation between formal and semantic changes in grammaticalization. In Motoki Nomachi, Andrii Danylenko & Predrag Piper (eds.), Grammaticalization and lexicalization in the Slavic languages, 31–50. München: Sagner. NKJP: Narodowy korpus języka polskiego [Polish national corpus]. http://nkjp.pl/ Nomachi, Motoki. 2012a. On the so-called possessive perfect in standard Serbian language (with a glance at other Slavic languages). Leptir Mašna: The literary journal of Balkan Studies 9(1). 89–97. Nomachi, Motoki. 2012b. The Recipient-Passive construction and its grammaticalization in Kashubian. In Andrii Danylenko & Serhii Vakulenko (eds.), Studien zu Sprache, Literatur und Kultur bei den Slaven (Gedenkschrift für George Y. Shevelov aus Anlass seines 100. Geburtstages und 10. Todestages), 109–135. München, Berlin: Sagner. Olbertz, Hella 2007. Dizque in Mexican Spanish: the subjectification of reportative meaning. In Mario Squartini (ed.), Special issues: Evidentiality between lexicon and grammar. Rivista di linguistica 19(1). 151–172. Pancheva, Roumyana. 2004. Balkan possessive clitics. The problem of case and category. In Olga Mišeska Tomić (ed.), Balkan syntax and semantics, 175–219. Amsterdam & Philadelphia: Benjamins. Percov, Nikolaj V. 2001. Invarianty v russkom slovoizmenenii [Invariants in Russian inflection]. Moskva: Jazyki russkoj kul’tury. Percov, Nikolaj V. 2003. Vozvratnye stradatel’nye formy russkogo glagola v svjazi s problemoj suščestvovanija v morfologii [Reflexive passive forms of the Russian verb in connection with the problem of existence in morphology]. Voprosy jazykoznanija 2003(4). 43–71. Piotrowski, Jan. 1981. Składnia słowińska wobec wpływów języka niemieckiego [Slovincian syntax in the light of German influences]. Wrocław: Wrocławskie Towarzystwo Naukowe. Plungjan, Vladimir A. 2001. 
Antirezul’tativ: do i posle rezul’tata [The antiresultative: before and after the result]. In Vladimir A. Plungjan (ed.), Issledovanija po teorii grammatiki 1: Glagol’nye kategorii, 50–88. Moskva: «Russkie slovari». Plungjan, Vladimir A. 2004. O kontrafaktičeskix upotreblenijax pljuskvamperfekta [On counterfactual usage types of the pluperfect]. In Vladimir A. Plungjan (ed.), Issledovanija po teorii grammatiki 3: Irrealis i irreal‘nost‘, 273–291. Moskva: Gnozis. Plungjan, Vladimir A. 2011. Vvedenie v grammatičeskuju semantiku: grammatičeskie značenija i grammatičeskie sistemy jazykov mira [Introduction into grammatical semantics: grammatical meanings and grammatical systems in languages of the world]. Moskva: Izdatel’stvo RGGU. Podlesskaja, Vera I. 2005. ‘Give’-verbs as permissive auxiliaries in Russian. Language typology and universals (STUF) 58(1). 124–138. Pohl, H. D. 1980. Baltisch und Slavisch. Die Fiktion von der baltisch-slavischen Spracheinheit (Erster Teil). Klagenfurter Beiträge zur Sprachwissenschaft 6. 58–101. Rittel, Teodozja. 1975. Szyk członów w obrębie form czasu przeszłego i trybu przypuszczającego [The sequence of sentence parts in the past tense and the conditional]. Wrocław: Ossolineum. Roeder, Carolin F. & Björn Hansen. 2006. Modals in contemporary Slovene. Wiener Slavistisches Jahrbuch 52. 153–170. Růžička, Rudolf. 1963. Das System der altslavischen Partizipien und sein Verhältnis zum Griechischen. Berlin: Akademie-Verlag. Sakel, Jeanette. 2007. Types of loans: Matter and pattern. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 15–31. Berlin & New York: Mouton de Gruyter. Sansò, Andrea. 2006. ‘Agent defocusing’ revisited: Passive and impersonal constructions in some European languages. In Werner Abraham & Larisa Leisiö (eds.), Passivization and typology (form and function), 232–273. Amsterdam & Philadelphia: Benjamins.

Grammaticalization in Slavic

303

Sawicki, Lea. 2011. The perfect-like construction in colloquial Polish. Zeitschrift für Slawistik 56. 66–83. Scholze, Lenka. 2008. Das grammatische System der obersorbischen Umgangssprache im Sprachkontakt. Bautzen: Domowina. Seržant, Ilja A. 2012. The so-called possessive perfect in North Russian and the Circum-Baltic area. A diachronic and areal account. Lingua 122. 356–385. Sičinava, Dmitrij. V. 2013. Tipologija pljuskvamperfekta. Slavjanskij pljuskvamperfekt [A typology of the pluperfect. The Slavic pluperfect]. Moskva: Ast-Press. Sonnenhauser, Barbara. 2010. Die Diskursfunktionen des ‘dreifachen Artikels’ im Makedonischen: Perspektivität und Polyphonie. Die Welt der Slaven LV. 334–359. Sonnenhauser, Barbara. 2012. Auxiliar-Variation und Textstruktur im Bulgarischen. Die Welt der Slaven 57(2). 351–379. Sonnenhauser, Barbara. 2013. ‘Evidentiality’ and point of view in Bulgarian. Săpostavitelno ezikoznanie 2–3. 110–130. Spencer, Andrew & Ana R. Luís. 2012. Clitics: an introduction. Cambridge: Cambridge University Press. Stanković, Branimir. Forthcoming. Posesivne klitike unutar DP-a u torlačkom dijalektu i makedonskom jeziku [DP-internal possessive clitics in the Torlak dialect and in Macedonian]. Godišnjak za srpski jezik 17. Niš: Faculty of Philosophy. Steenwijk, Johannes. 1992. The Slovene dialect of Resia. San Giorgio. Amsterdam & Atlanta: Rodopi. Stolz, Thomas. 1991. Sprachbund im Baltikum? Estnisch und Lettisch im Zentrum einer sprachlichen Konvergenzlandschaft. Bochum: Brockmeyer. Sussex, Roland & Paul Cubberley. 2006. The Slavic languages. Cambridge: Cambridge University Press. Šošitajšvili, Igor’ A. 1998. Russkoe bylo: put’ grammatikalizacii [Russian bylo: grammaticalization paths]. Rusistika segodnja 3–4. 59–78. Thieroff, Rolf. 2000. On the areal distribution of tense-aspect categories in Europe. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 265–305. Berlin & New York: Mouton de Gruyter. Tommola, Hannu. 2000. 
On the perfect in North Slavic. In Östen Dahl (ed.), Tense and aspect in the languages of Europe, 441–478. Berlin & New York: Mouton de Gruyter. Topolińska, Zuzanna [Topolinjska, Zuzana]. 1995. Makedonskite dialekti vo egejska Makedonija, kn. 1: Sintaksa (del I) [Clitics of Macedonian in the Aegean region, vol. 1: Syntax (part I)]. Skopje: MANU. Topolińska, Zuzanna [Topolinjska, Zuzana]. 2006. Trojniot člen – da ili ne? [A threefold article – yes or no?] Južnoslovenski filolog 62. 7–15. Trovesi, Andrea. 2004. La genesi degli articoli determinativi: modalità di espressione della definitezza in ceco, serbo-lusaziano e sloveno. Milano. Urbańczyk, St. 71984. Zarys dialektologii polskiej [A sketch of Polish dialectology]. Warszawa: Państwowe Wydawnictwo Naukowe. Usonienė, Aurelija & Erika Jasionytė. 2010. Towards grammaticalization: Lithuanian acquisitive verbs gauti (‘get’) and tekti (‘be gotten’). Acta Linguistica Hafniensia 42(2). 199–220. Van de Velde, Freek & Béatrice Lamiroy. 2017. External possessors in West Germanic and Romance: Differential speed in the drift toward NP configurationality. In Daniel Van Olmen, Hubert Cuyckens & Lobke Ghesquière (eds.): Aspects of grammaticalization. (Inter)subjectification and directionality, 353–399. Berlin & Boston: De Gruyter Mouton. Van der Auwera, Johan & Vladimir A. Plungian. 1998. Modality’s semantic map. Linguistic Typology 2(1). 79–124.

304

Björn Wiemer

Van der Auwera, Johan, Petar Kehayov & Alice Vittrant. 2009. Acquisitive modals. In Lotte Hogeweg, Helen de Hoop & Andrej Malchukov (eds.), Cross-linguistic semantics of tense, aspect and modality, 271–302. Amsterdam & Philadelphia: Benjamins. Vasilev, Christo. 1968. Der romanische Perfekttyp im Slavischen. In Еrwin Koschmieder & Maximilian Braun (eds.), Slavistische Studien zum VI. Internationalen Slavistenkongress in Prag 1968, 215–230. München: Trofenik. Večerka, Radoslav. 1961. Syntax aktivních participií v staroslověnštině [The syntax of active participles in Old Church Slavonic]. Praha: Státní pedagogické nakladatelství. Večerka, Radoslav. 1993. Altkirchenslavische (altbulgarische) Syntax II: Die innere Satzstruktur. Freiburg/Br.: Weiher. Veenker, Wolfgang. 1967. Die Frage des finnougrischen Substrats in der russischen Sprache. Indiana, Bloomington. Vlasto, Alexis P. 1988. A linguistic history of Russia (To the end of the eighteenth century. Oxford: Clarendon Press. Vykypěl, Bohumil & Achim Rabus. 2011. From giving to existence: on one remarkable grammmaticalization pathway. Linguistica Brunensia 59. 183–187. Waldenfels, Ruprecht von. 2012. The grammaticalization of ‘give’+infinitive (A comparative study of Russian, Polish, and Czech). Berlin & Boston: De Gruyter Mouton. Waldenfels, Ruprecht von. 2015a. Grammaticalization of ‘give’ in Slavic between drift and contact: Causative, modal, imperative, existential, optative and volative constructions. In Brian Nolan, Gudrun Rawoens & Elke Diedrichsen (eds.), Causation, permission, and transfer (Argument realisation in , , ,  and  verbs), 107–127. Amsterdam & Philadelphia: Benjamins. Waldenfels, Ruprecht von. 2015b. Czasownik nie do końca (nie)dokonany: dać się w użyciu modalnym. [A not completely (im)perfective verb: dać się in modal use]. Język Polski XCV (4). 316–324. Wälchli, Bernhard. 2005. Co-compounds and natural coordination. Oxford: Oxford University Press. Weiss, Daniel. 1982. 
Deutsch-polnische Lehnbeziehungen im Bereich der Passivbildung. In Eberhard Reissner (ed.), Literatur und Sprachentwicklung in Osteuropa im 20. Jahrhundert (Ausgewählte Beiträge zum 2. Weltkongreß für Sowjet- und Osteuropastudien), 197–218. Berlin: Berlin-Verlag. Weiss, Daniel. 1983. Zur typologischen Stellung des Polnischen (ein Vergleich mit dem Tschechischen und Russischen). In Peter Brang, G. Nivat & R. Zett (eds.), Schweizerische Beiträge zum IX. Internationalen Slavistenkongreß in Kiev, September 1983, 229–261. Bern: Lang. Weiss, Daniel. 1987a: Polsko-niemieckie paralele w zakresie czasowników modalnych (na tle innych języków zachodniosłowiańskich) [Temporal relations within double verbs in the perfective aspect (on the background of other West Slavic languages)]. In Gerd Hentschel, Gustav Ineichen & Alek Pohl (eds.), Sprach- und Kulturkontakte im Polnischen, 131–156. München: Sagner. Weiss, Daniel. 1987b. Funkcjonowanie i pochodzenie polskich konstrukcji typu “mam coś do załatwienia”, “coś jest do załatwienia” [The functioning and origin of the Polish constructions of the type “mam coś do załatwienia” ‘I have sth. to do’, “coś jest do załatwienia” ‘sth. is to do’]. In André de Vincenz & Alek Pohl (eds.), Deutsch-polnische Sprachkontakte (Beiträge zur gleichnamigen Tagung, 10.–13. April 1984 in Göttingen), 265– 286. Köln & Wien: Böhlau. Weiss, Daniel. 1993. Aus zwei mach eins. Polyprädikative Strukturen zum Ausdruck eines einzigen Sachverhalts im modernen Russischen. In Karen Ebert (ed.), Studies in clause linkage. Papers from the First Köln-Zürich Workshop, 219–238. Zürich: ASAS.

Grammaticalization in Slavic

305

Weiss, Daniel. 2000. Russkie dvojnye glagoly: kto xozjain, a kto sluga? [Russian double verbs: who is the master, who the servant?] In Leonid L. Iomdin & Leonid P. Krysin (eds.), Slovo v tekste i v slovare (Sbornik statej k semidesjatiletiju akademika Ju.D. Apresjana), 354–378. Moskva: Jazyki russkoj kul’tury. Weiss, Daniel. 2003. Russkie dvojnye glagoly i ix sootvetstvija v finno-ugorskix jazykax [Russian double verbs and their equivalents in Finno-Ugric languages]. Russkij jazyk v naučnom osveščenii 6(2). 37–59. Weiss, Daniel. 2004. The rise of an indefinite article: The case of Macedonian eden. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 139–165. Berlin & New York: Mouton de Gruyter. Weiss, Daniel. 2007. The grammar of surprise: the Russian construction of the type Koška vzjala da umerla ‘Suddenly, the cat died’. In Tilmann Reuther, Leo Wanner & Kim Gerdes (eds.), Proceedings of the 3rd International Conference on Meaning – Text – Theory (Klagenfurt, May 21–24), 427–436. München, Wien: Sagner. (= Wiener Slawistischer Almanach, Linguistische Reihe, Sonderband 69.) Weiss, Daniel. 2008a. Vremennaja sootnesennost’ dvojnyx glagolov soveršennogo vida [Temporal cross-reference of double verbs in the perfective aspect]. In Aleksandr V. Bondarko, Galina I. Kustova & Raja I. Rozina (eds.), Dinamičeskie modeli: slovo, predloženie, tekst (Sbornik statej v čest‘ E. V. Padučevoj), 154–176. Moskva: Jazyki slavjanskix kul’tur. Weiss, Daniel. 2008b. Voz'mu i ne budu! Zum Inexspektativ im modernen Russischen. In Peter Kosta & Daniel Weiss (eds.), Slavistische Linguistik 2006–2007, 473–504. München: Sagner. Weiss, Daniel. 2009. Mögliche Argumentationen zum Nachweis von Calques am Beispiel der polnischen Modalverben. In Lenka Scholze & Björn Wiemer (eds.), Von Zuständen, Dynamik und Veränderung bei Pygmäen und Giganten (Festschrift für Walter Breu zu seinem 60. Geburtstag), 129–153. 
Bochum: Brockmeyer. Weiss, Daniel. 2012. Verb serialization in northeast Europe: the case of Russian and its FinnoUgric neighbors. In Björn Wiemer, Bernhard Wälchli & Björn Hansen (eds.), Grammatical replication and borrowability in language contact, 611–646. Berlin & New York: Mouton de Gruyter. Wiemer, Björn. 1997. Diskursreferenz im Polnischen und Deutschen – aufgezeigt an der narrativen Rede ein- und zweisprachiger Schüler. München: Sagner. Wiemer, Björn. 1998. Puti grammatikalizacii inxoativnyx svjazok (na primere russkogo, pol’skogo i litovskogo jazykov) [Grammaticalization paths of inchoative copula verbs (the example of Russian, Polish, and Lithuanian)]. In Markus Giger, Thomas Menzel & Björn Wiemer (eds.), Lexikologie und Sprachveränderung in der Slavia, 165–212. Oldenburg: BIS. Wiemer, Björn. 2001. Aspektual’nye paradigmy i leksičeskoe značenie russkix i litovskix glagolov (Opyt sopostavlenija s točki zrenija leksikalizacii i grammatikalizacii) [Aspectual paradigms and the lexical meaning of Russian and Lithuanian verbs (A contrastive approach from the point of view of lexicalization and grammaticalization)]. Voprosy jazykoznanija 2001(2). 26–58. Wiemer, Björn. 2004. The evolution of passives as grammatical constructions in Northern Slavic and Baltic languages. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 271–331. Berlin & New York: Mouton de Gruyter. Wiemer, Björn. 2006. Relations between Actor-demoting devices in Lithuanian. In Werner Abraham & Larisa Leisiö (eds.), Passivization and Typology (Form and Function), 274–309. Amsterdam & Philadelphia: Benjamins. Wiemer, Björn. 2008a. Zur innerslavischen Variation bei der Aspektwahl und der Gewichtung ihrer Faktoren. In Karl Gutschmidt, Ulrike Jekutsch, Sebastian Kempgen & Ludger Udolph (eds.), Deutsche Beiträge zum 14. Internationalen Slavistenkongreß, Ohrid 2008, 383–409. München: Sagner.

306

Björn Wiemer

Wiemer, Björn. 2008b. Pokazateli s citativnoj i inferentivnoj funkcijami v russkom i pol’skom jazykax – kommunikativnye mexanizmy semantičeskogo sdviga [Markers with reportive and inferential functions in Russian and Polish – communicative mechanisms of a semantic shift]. In Björn Wiemer & Vladimir A. Plungjan (eds.), Lexikalische Evidenzialitätsmarker im Slavischen, 337–378. München: Sagner. (= Wiener Slawistischer Almanach, Sonderband 72.) Wiemer, Björn. 2010. Hearsay in European languages: toward an integrative account of grammatical and lexical marking. In Gabriele Diewald & Elena Smirnova (eds.), Linguistic realization of evidentiality in European languages, 59–129. Berlin & New York: De Gruyter Mouton. Wiemer, Björn. 2011a. The grammaticalization of passives. In Bernd Heine & Heiko Narrog (eds.), Handbook of grammaticalization, 535–546. Oxford: Oxford University Press. Wiemer, Björn. 2011b. Grammaticalization in Slavic languages. In Bernd Heine & Heiko Narrog (eds.), Handbook of grammaticalization, 740–753. Oxford: Oxford University Press. Wiemer, Björn. 2012. The Lithuanian -resultative – a typological curiosum? In Nicole Nau & Krzysztof Stroński (eds.), Lingua Posnansiensis 54(2), 69–81. [Special issue on resultatives]. Wiemer, Björn. 2014a. Umbau des Partizipialsystems. In Tilman Berger, Karl Gutschmidt, Sebastian Kempgen & Peter Kosta (eds.), Slavische Sprachen (Ein internationales Handbuch zu ihrer Struktur, ihrer Geschichte und ihrer Erforschung), 2. Halbband, 1625–1652. Berlin & Boston: Mounton de Gruyter. Wiemer, Björn. 2014b. Mora da as a marker of modal meanings in Macedonian: on correlations between categorial restrictions and morphosyntactic behaviour. In Elisabeth Leiss & Werner Abraham (eds.), Modes of modality. Modality, typology, and Universal Grammar, 127–166. Amsterdam & Philadelphia: Benjamins. Wiemer, Björn. 2014c. Sprachwandeltypen im litauisch-slavischen Kontakt: ein Überblick. 
In Tat’jana Civ’jan, Marija Zav’jalova & Artūras Judžentis (eds.), Baltai ir slavai: dvasinių kultūrų sankirtos / Balty i slavjane: peresečenija duxovnyx kul’tur, 196–217. Vilnius: Versmė. Wiemer, Björn. 2015a. O roli vida v oblasti kratnosti i pragmatičeskix funkcij (ėskiz s točki zrenija xronotopii) [O the role of aspect in pluractionality and for pragmatic functions (a sketch from the perspective of chronotopy)]. In Rosanna Benacchio [Bennak’o]. (ed.), Glagol’nyj vid: grammatičeskoe značenie i kontekst / Verbal aspect: Grammatical meaning and context (Sbornik dokladov III Konferencii Aspektologičeskoj Komissii, sostojavšejsja v Padue s 30. 9. po 4. 10. 2011), 585–609. München: Sagner. Wiemer, Björn. 2015b. An outline of the development of Pol. jakoby in 14th–16th century documents (based on dictionaries). In Björn Wiemer (ed.), Studies on evidentiality marking in West and South Slavic, 217–302. München: Sagner. Wiemer, Björn. 2017a. Slavic resultatives and their extensions: integration into the aspect system and the role of telicity. Slavia 86(2–3). 124–168. Wiemer, Björn. 2017b. Main clause infinitival predicates and their equivalents in Slavic – Why they are not instances of insubordination. In Łukasz Jędrzejowski & Ulrike Demske (eds.), Infinitives at the syntax-semantics interface: A diachronic perspective, 265–338. Berlin & Boston: Mouton de Gruyter. Wiemer, Björn. 2018. On triangulation in the domain of clause linkage and propositional marking. In Jasmina Grković-Major, Björn Hansen & Barbara Sonnenhauser (eds.), Diachronic Slavonic syntax: The interplay between internal development, language contact and metalinguistic factors, 285–338. Berlin, Boston: De Gruyter Mouton. Wiemer, Björn. Submitted. Major apprehensional strategies in Slavic: a survey of their areal and grammatical distribution. In Eva Schultze-Berndt, Marine Vuillermet & Martina Faller (eds.), [title to be established]. Berlin: Language Science Press. Wiemer, Björn. Forthcoming. 
On the rise, establishment and continued development of subject impersonals in Polish, East Slavic and Baltic. In Seppo Kittilä & Leonid Kulikov (eds.),

Grammaticalization in Slavic

307

Diachronic typology of voice and valency-changing categories. Amsterdam, Philadelphia: Benjamins. Wiemer, Björn & Markus Giger. 2005. Resultativa in den nordslavischen und baltischen Sprachen (Bestandsaufnahme unter arealen und grammatikalisierungstheoretischen Gesichtspunkten). München & Newcastle: Lincom Europa. Wiemer, Björn & Björn Hansen. 2012. Assessing the range of contact-induced grammaticalization in Slavonic. In Björn Wiemer, Bernhard Wälchli & Björn Hansen (eds.), Grammatical replication and borrowability in language contact, 67–155. Berlin & New York: Mouton de Gruyter. Wiemer, Björn & Ilja Seržant. 2017. Diachrony and typology of Slavic aspect: What does morphology tell us? In Walter Bisang & Andrej Malchukov (eds.), Unity and diversity in grammaticalization scenarios, 230–307. Berlin: Language Science Press. Wright, S. & Talmy Givón. 1987. The pragmatics of indefinite reference. Quantified text-based studies. Studies in Language 11. 1–33. Xaralampiev, Ivan. 2001. Istoričeska gramatika na bălgarskija ezik [A historical grammar of Bulgarian]. VelikoTărnovo: Faber. Xrakovskij, Viktor S. & Aleksandr P. Volodin. 1986. Semantika i tipologija imperativa: russkij imperativ [The semantics and typology of the imperative: the Russian imperative]. Leningrad: Nauka. Zaitseva, Valentina. 1995. The speaker’s perspective in grammar and lexicon (The case of Russian). New York: Lang.

Timur Maisak

6 Grammaticalization in Lezgic (East Caucasian)

1 Introduction

1.1 Overview of the Lezgic languages

Lezgic, or the Lezgian languages, constitute a group of the East Caucasian (Nakh-Daghestanian) family, one of the three large indigenous language families spoken in the Caucasus, together with West Caucasian (Abkhaz-Adyghean) and South Caucasian (Kartvelian). All three families share important linguistic properties (e.g., rich consonantism, agglutination, morphological ergativity, SOV word order, etc.), which is sometimes taken as evidence of the existence of a Caucasian Sprachbund, although this position is debatable; cf. Tuite (1999) vs. Chirikba (2008). East Caucasian and West Caucasian languages are sometimes considered to be related as two branches within the North Caucasian macrofamily (as argued by Nikolayev and Starostin [1994]); however, this approach has not gained universal acceptance, although it has been supported by a number of specialists in the field (see, e.g., Alekseev and Testelets 1996).

Languages of the East Caucasian family are mainly spoken in the south of the Russian Federation (in the republics of Daghestan, Chechnya and Ingushetia), as well as in the adjacent areas of Azerbaijan and Georgia. Apart from the Lezgic group, the family includes the Nakh, Avar-Andic, Tsezic and Dargwa groups, with Lak and Khinalug constituting independent branches. Altogether, there are more than forty languages in the East Caucasian family, spoken by ca. 5 million people.¹ For general overviews of the family, see Van den Berg (2005) and Ganenkov and Maisak (2020).

The Lezgic languages, spoken in southern Daghestan and northern Azerbaijan, are the southernmost branch of the family. There are nine languages in this group, with seven ‘core’ languages and two outliers. The core languages, also known as the ‘Samur languages’, are located on both sides of the Great Caucasus range in the area surrounding the Samur river: Lezgian, Tabasaran and Agul (Aghul) belong to the eastern subbranch, Tsakhur and Rutul belong to the western subbranch, and Kryz (Kryts) and Budugh (Budukh) constitute the southern subbranch. Archi is the northwestern outlier, spoken in the Avar-dominated area of central Daghestan, and Udi is the southernmost outlier, spoken in the Azeri-dominated area of northern Azerbaijan. Udi is also the closest living relative of the extinct Caucasian Albanian language, which can be considered the tenth Lezgic language, and which is the only East Caucasian language with an ancient written tradition.²

¹ The details of the internal classification of the languages of the family, as well as the exact number of languages, are subject to debate (especially with regard to the Dargwa group). See Nichols (2003) and Korjakov (2006: 26–40) for approaches to the classification issues, and Kassian (2015) for a recent assessment of the Lezgic genealogical tree.

https://doi.org/10.1515/9783110563146-006

1.2 The typological profile

Lezgic languages are predominantly agglutinative, with ergative case alignment, extraordinarily rich case systems, nominal gender agreement, and elaborate consonant inventories. Among the phonological distinctions, most languages employ a four-term contrast between voiced, aspirated, intensive (non-aspirated) and ejective stops, for example g ~ k ~ kː ~ k’. In Udi, intensives and ejectives have merged into one set, which lacks strong glottalization. Post-velar (uvular, pharyngeal, epiglottal) consonants are very common, as is pharyngealization or epiglottalization as a secondary articulation on vowels or certain consonants. Archi is the only language that preserves a set of laterals (ɬ, tɬ, tɬ’, etc.), which is considered to be an archaic feature going back to the Proto-Lezgic stage and which has been lost in all the other Lezgic languages.

Morphologically, the languages of the group are predominantly suffixing, with inflectional and derivational prefixation represented only within the verbal system (see sections 3.6 and 3.7 below on locative and repetitive prefixes). Nominal inflection is rich, with up to several dozen case-number forms. The singular is unmarked, plural suffixes follow the root, and case affixes follow number markers. It is important to note that while the absolutive case is normally identical to the noun root, all other (“oblique”) cases are attached to a derived oblique stem whose derivational marker is partly phonologically, partly lexically determined. Case markers follow the oblique stem extensions, although the ergative case is sometimes identical to the oblique stem. Table 1 shows the two stems and a selection of case forms for two Agul nouns with different oblique stem markers. The oblique stem extensions, being semantically empty, are separated by a period.
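The two-stem pattern just described can be sketched procedurally. The Python snippet below is an illustration only and not part of the original description: the function name `decline` and the suffix table are hypothetical, and the coverage is restricted to the Agul cases listed in Table 1. It assumes that every oblique case form is built as root + oblique stem extension + case suffix, and that the ergative is identical to the bare oblique stem (which, as noted above, holds only for some nouns).

```python
from typing import Dict

# Case suffixes attached to the oblique stem (Agul forms as in Table 1).
# The empty ergative suffix encodes the assumption that the ergative
# equals the bare oblique stem for these two nouns.
CASE_SUFFIXES: Dict[str, str] = {
    "ergative": "",
    "dative": "-s",
    "genitive": "-n",
    "comitative": "-qaj",
    "superessive": "-l",
    "superelative": "-l-as",
    "superlative": "-l-di",
}


def decline(root: str, oblique_ext: str) -> Dict[str, str]:
    """Return a partial case paradigm for a noun.

    root        -- the direct stem, identical to the absolutive form
    oblique_ext -- the lexically determined oblique stem extension,
                   separated from the root by a period
    """
    oblique_stem = f"{root}.{oblique_ext}"
    forms = {"absolutive": root}
    for case, suffix in CASE_SUFFIXES.items():
        forms[case] = oblique_stem + suffix
    return forms


boy = decline("gada", "ji")   # gada 'boy', oblique extension -ji
girl = decline("ruš", "a")    # ruš 'girl', oblique extension -a

for case in boy:
    print(f"{case:13} {boy[case]:14} {girl[case]}")
```

The oblique extension is passed in as an argument rather than computed, reflecting the statement that its choice is partly phonologically and partly lexically determined.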
² The Caucasian Albanian alphabet was created in the 5th century A.D. but fell into disuse several centuries later, only to be rediscovered in the 1930s. Gippert et al. (2008) present an edition of the only two remaining manuscripts preserved in lower layers of palimpsests of St. Catherine’s Monastery on Mt. Sinai.

Tab. 1: Agul: direct and oblique stems of two nouns.

                              a. gada ‘boy’    b. ruš ‘girl’
absolutive (= direct stem)    gada             ruš
oblique stem                  gada.ji-         ruš.a-
ergative                      gada.ji          ruš.a
dative                        gada.ji-s        ruš.a-s
genitive                      gada.ji-n        ruš.a-n
comitative                    gada.ji-qaj      ruš.a-qaj
superessive                   gada.ji-l        ruš.a-l
superelative                  gada.ji-l-as     ruš.a-l-as
superlative                   gada.ji-l-di     ruš.a-l-di

Although case alignment in the East Caucasian languages is ergative-absolutive, they are not generally considered to be syntactically ergative, but rather morphologically and semantically ergative: case marking is determined by the semantic roles of arguments, and not by their syntactic status (Kibrik 1997); see also Forker (2017). Nonetheless, nominal gender agreement follows the ergative-absolutive pattern. Those words that have prefixal, infixal or suffixal slots for gender agreement (first of all, verbs and adjectives, but also adverbs or even some pronouns and particles) agree with the absolutive noun phrase, which is the subject of an intransitive verb (S) or the patient of a transitive verb (P). Thus, in (1) the third gender of the absolutive noun ɢaje ‘stone’ is indexed in the two verb forms and in the copula. Gender is not marked on the noun itself, although historically gender affixes can be recognized in some nouns referring to humans (see section 2.1).

(1) Tsakhur: gender agreement (Kibrik and Testelets 1999: 719)
    adam-ē   alʲa‹p’›t’-u   ɢaje     ɨˁ‹w›χ-ɨ   wo-b   χoče.j-s
    man-     take-          stone.   beat-      -      snake-
    ‘The man took a stone and threw (lit. beat) it at the snake.’
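The controller rule for gender agreement illustrated in (1) can be stated as a short procedure. The following Python sketch is only a schematic rendering under stated assumptions: the `Argument` class and the function `gender_controller` are hypothetical names, a clause is reduced to a flat list of case-marked arguments, and gender I for adam ‘man’ is assumed on the grounds that it is a masculine human noun. The rule it implements is the one given in the text: gender agreement targets index the absolutive argument, whether it is S or P.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Argument:
    noun: str
    case: str    # e.g. "absolutive", "ergative"
    gender: int  # gender class, 1 to 4


def gender_controller(arguments: List[Argument]) -> Optional[Argument]:
    """Return the argument that controls gender agreement.

    Gender agreement follows the ergative-absolutive pattern, so the
    controller is the absolutive argument (S or P), never the ergative one.
    """
    for arg in arguments:
        if arg.case == "absolutive":
            return arg
    return None  # no absolutive argument in this simplified clause


# Core arguments of example (1): 'man' is the ergative agent,
# 'stone' is the absolutive patient of gender III.
clause = [
    Argument("adam", "ergative", 1),    # gender I is an assumption
    Argument("ɢaje", "absolutive", 3),
]

controller = gender_controller(clause)
print(controller.noun, controller.gender)
```

On this input the function picks ɢaje, so the verb forms and the copula are expected to carry gender III marking, as the text states for (1).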

On the other hand, person agreement, which is an innovation in Tabasaran and Udi (though it must have developed independently in the two languages), follows the accusative alignment pattern: the agreement is with the subject, i.e., the S or A argument (see section 3.9 for details). In terms of case alignment, Udi has switched from a purely morphologically ergative system to a mixed one: only indefinite patients are encoded by the absolutive in this language, while definite ones take the dative case (the rise of Differential Object Marking is most likely contact-induced, as it is present in important dominant languages of the area such as Azeri, Armenian and Persian).

Apart from agreement, the tense and aspect systems of the Lezgic languages are rich and include dozens of both synthetic and periphrastic forms (the latter are discussed in more detail in section 3.2).³ The most important categories of the verb are tense, aspect, mood/modality and evidentiality. The perfective vs. imperfective aspectual opposition is expressed by inflection within verb stems and is quite archaic (see Alekseev [1985: 75–89] for a summary and a reconstruction of the protolanguage).

The verb lexicon includes a restricted number of non-derived verb stems: e.g., Kibrik (1977b: 233–243) lists only 163 simplex verbs for Archi. In Udi the number of simplex verbs, including historically prefixal verbs, is even smaller. Given that locative prefixation is not synchronically productive, and there are no regular morphological verbal derivations, compounding plays the leading role in creating new verb lexemes (see section 5 on the intermediate status of such complex verbs between syntax and morphology).

Like other East Caucasian languages, Lezgic languages are predominantly left-branching with dominant SOV word order. In spoken discourse, word order is not strict and the order of major constituents is subject to variation, depending on the information structure. Subordinate clauses are usually headed by non-finite forms (participles, converbs, infinitives, and various case forms of action nominals), although it is not infrequent that one and the same form can be used both in a dependent clause (e.g., as a converb) and as a finite tense. In Udi, there are more finite subordinate structures than in the other Lezgic languages, which may be the result of long-term contacts with the Indo-European languages of the region where Udi is spoken. In complex clauses, borrowed conjunctions of mostly Arabic and/or Turkic origin are used, such as eger / ägär ‘if’, wa ‘and’, amma ‘but’ or ja… ja ‘both… and’, ‘neither… nor’, etc.

³ According to a famous estimate by Kibrik (1977b: 37), as many as 1,502,839 inflectional forms, both synthetic and periphrastic, can be derived from a single verb root in Archi. This number should be taken with caution, however, as it includes all possible gender-agreeing forms, all case-number forms of regular nominalizations and also forms that can be alternatively interpreted as combinations of verb forms with quotative and modal particles.

1.3 Data sources and diachronic evidence

Among the nine Lezgic languages, Lezgian and Tabasaran belong to the major languages of Daghestan, with developed literary standards and a Cyrillic writing system that has been in use since the late 1930s. Three smaller languages (Agul, Tsakhur and Rutul) became written only in the 1990s and still do not have much published literature, besides folklore editions, some poetry, newspaper articles and pieces of Bible translation. Udi can also be included in the latter group, as several attempts to create a modern alphabet have been undertaken since the 1990s. The language is now taught at schools in the village of Nizh, and the body of literature (mostly translated) is gradually growing. The three smallest languages (Kryz, Budugh and Archi) remain unwritten.⁴

⁴ Budugh is also the only moribund Lezgic language: according to Authier and Haciyev (2016: 3546), fewer than one hundred fluent speakers of this language remain, almost all of them adults. All speakers, including elders, are bilingual, using Azeri as a native language.


The first accounts of the grammatical structure of Lezgic languages appeared in the second half of the 19th century: in 1863, the first monographic description of a Lezgic language, a grammar of Udi, was published in German by Franz Anton von Schiefner, a member of the Russian Imperial Academy. In the 1860s and 1870s, a prominent Russian researcher, Peter von Uslar, studied Lezgian and Tabasaran (although his grammar of the latter remained unpublished for almost a century). Later, in the early 1900s, the German linguist and ethnographer Adolf Dirr published grammatical sketches of Udi, Tabasaran, Agul, Archi, Rutul and Tsakhur (with text samples and word lists appended to the grammatical descriptions).

Apart from these early sources and the data from the recently published Caucasian Albanian manuscripts, not much is known about the history of the Lezgic languages before the 20th century. The origin and the path of evolution of many old and prominent grammatical phenomena, such as locative cases or gender agreement systems, are not clear, although attempts to discover their history have been undertaken in works on comparative reconstruction (see especially Alekseev [1985]). Still, there are plenty of examples of more recent and transparent grammaticalization phenomena, which are not necessarily common to all languages of the group. In what follows, I will present an overview of the clearest and most interesting cases, which comprise both typologically well-established types of development and rara or even rarissima among the world’s languages.

I mostly rely on published sources, especially comprehensive modern grammars that by the 2010s have appeared for most of the languages of the group, including those written in languages other than Russian. For Agul, Tsakhur and Udi, I also use my own field data, both published and unpublished, as well as available texts.⁵ The most important published sources used for the present study include Haspelmath (1993) for Lezgian, Magometov (1965) and Babaliyeva (2013) for Tabasaran, Magometov (1970) and Merdanova (2004) for Agul, Kibrik and Testelets (1999) for Tsakhur, Alekseev (1994a) and Makhmudova (2001) for Rutul, Authier (2009) for Kryz, Alekseev (1994b) for Budugh, Kibrik (1977a, 1977b) for Archi, Schulze-Fürhoff (1994) and Harris (2002) for Udi, and Gippert et al. (2008) for Caucasian Albanian. Previous shorter overviews of selected grammaticalization paths in Lezgic can be found in Maisak (2016a) and Arkadiev and Maisak (2018: 132–144).

2 Grammaticalization of nominal categories

2.1 Nominal class (gender)

Alekseev (1985: 89–95) reconstructs a system of four genders for the Proto-Lezgic stage, comprising so-called ‘strong’ and ‘weak’ sets of gender agreement markers

5 If not indicated otherwise, the Agul data provided in this paper come from the Huppuq’ dialect and were mainly collected in Daghestan since the early 2000s during the Agul Documentation


Timur Maisak

(*r ‘gender I’, *r ‘gender II’, *pː ‘gender III’, *tː ‘gender IV’ in the former and *w ‘gender I’, *r ‘gender II’, *v ‘gender III’, *j ‘gender IV’ in the latter). The 1st and the 2nd genders are masculine and feminine, respectively, while the 3rd and the 4th are neuter. Gender agreement is still retained in the majority of modern Lezgic languages, although there is a trend towards decline. Thus, in Tabasaran, gender agreement has reduced to a binary opposition between neutral and non-neutral, while in Lezgian, Agul and Udi gender agreement has disappeared altogether.6 This does not mean, however, that the Proto-Lezgic gender markers have disappeared without a trace: in many cases, they remain as lexicalized (“petrified”) parts of lexemes and grammatical markers. For example, a great number of /b/-initial verb stems in Udi can be traced back to stems with a 3rd gender prefix b-, which is supported by the comparison with other Lezgic languages: e.g., the Udi root botː- ‘cut, cut off’ corresponds to -at’- in Agul, Lezgian and Archi, hat’ʷ- in Rutul, etc. (Nikolayev and Starostin 1994: 272; Schulze 2001: 260). Some of the nouns where historical gender prefixes can be reconstructed probably have verbal origin, such as certain kinship terms, e.g., ‘brother’: cf. Udi viči with the putative masculine gender prefix v-, which left only labialization in such cognates as Tabasaran čʷi or Agul ču (Alekseev 1985: 61–62). See also Schulze (1992) on the history of gender marking in East Caucasian languages.

2.2 Number

In Lezgic languages, unmarked singular nominal forms are opposed to marked plurals, e.g., gade ‘boy’ > gade-bɨ ‘boys’ or eč ‘apple’ > eč-ēr ‘apples’ in Tsakhur. As a rule, there is a set of plural suffixes which (like oblique stem extensions) are partly phonologically and partly lexically distributed. Affixes *-ar, *-pːər, *-əm and a few others are reconstructed by Alekseev (1985: 55–60) for the Proto-Lezgic level. In some languages, double plural marking is attested, usually without a clear semantic difference, e.g., naq’ʷ-ar-ar ‘graves’ in Agul with the ‘doubling’ of one and the same suffix, or kːož-ur-χo ‘houses’ in Udi where -ur and -χo are two different plural markers. Nouns with fossilized plural markers are also quite common, especially in Udi, e.g., uluχ ‘tooth’, buruχ ‘mountain’, aruχ ‘fire’, imuχ ‘ear’, elmuχ ‘soul’

Project (conducted by Dmitry Ganenkov, Solmaz Merdanova and the present author). The Udi data come mainly from the Nizh dialect and were collected in Azerbaijan since the mid-2000s.

6 Thus, in general, among the Lezgic languages there are two languages which do not possess any kind of agreement (Lezgian and Agul), one language with person agreement only (Udi), and one language with both person agreement and the remnants of gender agreement (Tabasaran). All other Lezgic idioms preserve the system of gender agreement and do not have person agreement (but see section 3.9 on pronominal doubling strategies).


etc. Historically, these nouns included the plural affixes -uχ, -ruχ, -muχ, which have now become part of the root (Alekseev 1985: 56).

2.3 Case

Among the cases, a distinction is usually drawn between non-locative (also called ‘grammatical’) and locative case forms. The former are represented by such ‘core’ cases as absolutive, ergative, dative and genitive, which can be reconstructed for the Proto-Lezgic stage (Alekseev 1985: 39–46). The non-locative set can also include other cases like comitative (‘with’) or a specialized possessive case encoding the possessor in predicative possessive constructions. Although experiencers of predicates like ‘see’ or ‘know’ are usually encoded by the dative, some languages possess a dedicated case form for this purpose (e.g., ‘affective’ in Tsakhur). The number of locative cases in all languages except Udi is much higher. Case forms of this set typically include two separately coded categories, localization and orientation. The localization marker defines a certain spatial domain with respect to a landmark (e.g., ‘inside’, ‘on the surface’, ‘behind’, ‘near’ or ‘under’), while the orientation marker specifies direction of motion with respect to this domain (e.g., motion to, motion from, or absence of motion).7 Thus, meanings like ‘in the direction towards the place near a landmark’ or ‘in the direction from the upper surface’, which in most languages of the world are expressed by means of adpositions or other syntactic constructions, can in Lezgic (and most other East Caucasian) languages be expressed by a complex case form, see (2).

(2) Tabasaran: adlative and superelative cases (Babaliyeva 2013: 43, 45)
    a. hamus äχü baba-x-na ʁäʁ-ür-za
       now grand mother-- go--1:
       ‘Now I will approach my grandmother.’
    b. žil.i-l-an sa-b c’iric’ ä‹b›qin
       earth-- one- stick. bring.
       ‘{Next time you come here,} bring me from the earth a stick.’
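The two-slot make-up of locative case forms (a localization marker combined with an orientation marker) is essentially a cross-product of two small sets. The following Python lines are a minimal sketch of that arithmetic, assuming only the five localizations and three orientations named in the text as examples. The English labels are placeholders for illustration, not morphemes of any Lezgic language, and `locative_cells` is a hypothetical helper name.

```python
from itertools import product

# Labels taken from the examples in the text; they stand in for
# language-specific markers and are not actual morphemes.
LOCALIZATIONS = ["inside", "on the surface", "behind", "near", "under"]
ORIENTATIONS = ["no motion", "motion to", "motion from"]


def locative_cells(localizations, orientations):
    """Return every (localization, orientation) pair.

    Each pair corresponds to one potential complex locative case form.
    """
    return list(product(localizations, orientations))


cells = locative_cells(LOCALIZATIONS, ORIENTATIONS)
print(len(cells))  # 5 localizations x 3 orientations = 15 cells
```

Under these assumptions, a meaning such as ‘in the direction towards the place near a landmark’ corresponds to the cell `("near", "motion to")`. Individual languages need not fill every cell of such a grid.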

The combination of up to eight localizations with three or four orientation values can yield dozens of specific locative forms, which has earned East Caucasian languages a reputation for possessing the richest case inventories in the world.8 As far as

7 See Comrie (1999), Kibrik (2003), and Daniel and Ganenkov (2008) on the details of morphology and functions of locative cases in East Caucasian.

8 Tabasaran was even mentioned in some editions of The Guinness Book of Records as the language which “uses the most noun cases, 48” (see, however, the skeptical discussion by Comrie and Polinsky [1998] on this topic).


Lezgic languages are concerned, it is notable that the same locative elements (probably adverbs) that became sources of localization markers have also entered another grammaticalization path, resulting in derivational locative prefixes on the verb (see section 3.6). There is no strict border between the ‘grammatical’ and the locative cases. Many locative cases express non-locative functions as well, including argument encoding. In Agul, for example, the superlative case is the main means of expressing instrumental meaning (‘by means of’), the adessive and postessive cases encode temporal and permanent possessors, respectively, the adelative encodes the causee in a causative construction (see section 3.1) and so on. Some of the ‘grammatical’ cases have an ultimately locative origin: the Udi dative in -a is originally an inessive case (the Proto-Lezgic dative was lost in the Udi nominal paradigm), and the so-called ‘partitive’ case in -qˁiš in Archi (which encodes the set from which an object is selected) seems to be composed of the localization marker -qˁ (‘between, in a mass’) and the elative suffix -š. On the whole, however, the etymology of most of the Lezgic cases is unclear, as they have an ancient origin that dates back to the Proto-Lezgic stage. It has also been hypothesized that at a more distant chronological level, probably as deep as Proto-East-Caucasian, the language possessed a binary case system with a morphologically unmarked absolutive (direct) vs. a marked ergative (oblique) case (see discussion in Alekseev [2003: 94–100]). Thus, it is most plausible that the plethora of modern oblique cases has resulted from the morphologization of the oblique form with former postpositions. A rare example of a ‘young’ case form whose diachronic lexical source is clearly discernible is the Agul comitative. The case suffix -qaj is the result of fusion of the postessive in -q with the converb qaj ‘being with’, ‘being in someone’s possession’ derived from the locative verb ‘be behind’, ‘possess’, cf. a putative reconstruction: gada.ji-qaj ‘with the boy’ < *gada.ji-q qaj ‘being with the boy’. The comitative case thus ultimately goes back to the head of a subordinate clause. As mentioned above, the postessive localization is used to mark possessors in Agul; the same localization is present in the verb ‘be behind’, ‘possess’ as a prefix.

2.4 Determiners

Traditionally, neither articles nor determiners are discussed in grammatical descriptions of East Caucasian languages. At the same time, there are (weakly) grammaticalized elements that are very close in function to definite and indefinite articles, and realize crosslinguistically typical grammaticalization paths leading to the emergence of such markers.9

9 See for example Kibrik (1977a: 331–334) on the “tendency of development of an article” in Archi.


In particular, the numeral ‘one’ is used very frequently as a determiner for introducing new discourse participants, both singular (3a) and plural (3b). It also occurs in various types of introductory expressions with temporal or locative semantics, like ‘one day’, ‘in one town’ and so on (3c). In a recent typological study of articles, Becker (2018: 148–149) discussed the Agul sa in such contexts as an instance of a special type of article she labels ‘presentational’. See also Nasledskova (2019) on the use of ‘one’ as an indefinite article in Rutul.

(3) Agul narratives: ‘one’ as an indefinite determiner (corpus examples)
    a. muʁan.di-ʔ alčarxa-a sa jašlu idemi
       Mugan- meet.- one elderly man.
       ‘In the Mugan plain, they meet an old man.’
    b. me agu-naa sa dijarkːa-jar.i-s
       this. saw.- one milkmaid--
       ‘Some milkmaids saw her.’
    c. aχpːa sa jaʁ.a gula-a ge-wur.i jac-ar
       then one day() get.lost.- that-() ox-.
       ‘And so, one day their oxen get lost.’

In maintaining reference to participants already introduced, demonstratives play a key role, either as noun phrase determiners (‘this X’), or as noun phrase heads themselves (‘this’ or ‘that’ as ‘he/she/it’). As reference tracking devices, different series of demonstratives are distributed depending on the discourse role of a given participant. Thus, in Archi, the proximal demonstrative jow ‘this (close to the speaker)’ is used to refer to the main protagonist, while the other two demonstratives jamu ‘that (close to the addressee)’ and tow ‘that (far from the interlocutors)’ refer to secondary participants. As a medial demonstrative, jamu is used in those contexts when the participant is close in status to the main one, and is not totally peripheral (Kibrik 1977a: 333). Similar discourse oppositions have been described for Tsakhur (Kibrik and Testelets 1999: 670–674) and Agul (Ganenkov, Maisak, and Merdanova 2009). Caucasian Albanian seems to be the only example of an East Caucasian language possessing definite articles separate from the system of (distal) demonstratives. Like the latter, definite articles in Caucasian Albanian distinguished between masculine, feminine and neuter gender. Although the Proto-Lezgic gender agreement had been lost in Caucasian Albanian, the gender distinction in the definite markers can probably be traced back to the ancient class/gender affixes. As stated by Gippert et al. (2008, II: 38), the Caucasian Albanian articles o, a and e were used in the same contexts where the Armenian enclitic definite articles -s (proximal), -d (medial), and -n (distal), respectively, are usually found.


2.5 Pronominal series markers

Series of indefinite and negative pronouns are derived from interrogative pronouns by means of special markers, which sometimes have discernible grammaticalization sources. For indefinite pronouns, conditional and concessive10 verb forms are commonly used as (partly morphologized) series markers. Thus, the pronouns corresponding to the English some-series are derived by means of the conditional copulas ejči(n) in Agul and wuš in Tabasaran, the concessive copula jat’ani in Lezgian, the concessive markers -(i)šaw in Archi and -xe=d in Tsakhur, see (4a). The literal meaning of such combinations can be presented as ‘whatever it may be’, ‘whoever it may be’ and so on. The series marker is phrase-final, and is hosted by an inflected form of an interrogative, cf. in Agul the absolutive fi ejči / fi-jči ‘something’ [what :], the dative fi.tːi-s ejči [what- :], the genitive fi.tːi-n ejči [what- :], etc. The numeral ‘one’, which also functions as a sort of indefinite article (see section 2.4), often precedes indefinite expressions. In free-choice indefinite pronouns, roughly corresponding to the English any-series, the series markers are concessive forms of the verb ‘be, become’ (also used as an auxiliary in tense and aspect forms and as a possibilitive modal, see sections 3.2 and 3.3), or the verb ‘want’, cf. the Lezgian xajit’a=ni ‘even if (one) is’ in (4b) and the Agul kːanči=ra ‘even if (one) wants’ in (4c). In the latter case, the free-choice meaning is expressed in combination with the interrogative bases in a straightforward manner, as ‘whatever (whoever, etc.) one may wish’.

(4) Indefinite pronoun series markers from verbal sources
    a. Tsakhur: ordinary indefinites
       hiǯō ‘what?’ > hiǯō-xe-d ‘something’, IV
       hašːu ‘who?’ > hašːu-xe-r ‘someone’, I
       nenke ‘when?’ > nenke-xe-d ‘sometime’, IV
       (Kibrik and Testelets : )
    b. Lezgian: free-choice indefinites
       wuč ‘what?’ > wuč xajit’a=ni ‘anything’
       wuž ‘who?’ > wuž xajit’a=ni ‘anyone’
       mus ‘when?’ > mus xajit’a=ni ‘anytime’
       (Haspelmath : )

10 As a rule, concessive forms themselves are built on conditional forms by means of an additive enclitic ‘also, even’, like =ni in Lezgian, =ra in Agul and Tabasaran, =al in Udi, or = (i.e., bare gender marker) in Tsakhur.


    c. Agul: free-choice indefinites
       fi ‘what?’ > fi kːanči=ra ‘anything’
       fiš ‘who?’ > fiš kːanči=ra ‘anyone’
       mus ‘when?’ > mus kːanči=ra ‘anytime’

Indefinites based on interrogatives are the most common crosslinguistic type of indefinites (see Haspelmath 1997: 135–140). This type is also found in the neighboring languages, in particular Azeri. In Udi, which borrowed the realis conditional clitic =sa from Azeri, the indefinite series marker is based on the 3rd person singular enclitic (instead of a copula, which is lacking in the language), cf. šu=nesa ‘someone’ < šu ‘who’ + =ne=sa [3=], i.e., ‘whoever it may be’. Thus, although it is a borrowed item, the Udi indefinite marker realizes the same grammaticalization pattern as its equivalents in other Lezgic languages (see Maisak [2019a] for details). Negative pronouns in Lezgic languages do not include the negative marker as such, but occur in clauses with negation on the verb. In some languages, negative pronouns are derived from interrogatives (and also from certain indefinite expressions) by means of the additive enclitic ‘also, even’, which is a very frequent and rather polyfunctional enclitic in East Caucasian languages on the whole, cf. in Agul fi=ra ‘nothing’ [what=], fiš=ra ‘nobody’ [who=], mus=ra ‘never’ [when=], etc. The numeral ‘one’ often precedes the pronoun, cf. in Lezgian sa zat’=ni ‘nothing’ [one thing=] or sa kas=ni ‘nobody’ [one person=]. Like other indefinite series markers, the additive enclitic is external to the inflected form of the pronoun, cf. in Agul the absolutive fi=ra ‘nothing’ [what=], the dative fi.tːi-s=ra [what-=], the genitive fi.tːi-n=ra [what-=], and so on. In Udi, negative pronouns also end in -al, which is identical to the additive clitic: hikːkːal ‘nothing’ (< he ‘what?’), šukːkːal ‘nobody’ (< šu ‘who?’), mašˁkːal ‘nowhere’ (< mačˁu ‘where, in what direction?’). It is not quite clear, though, whether this element really represents a “petrified” additive =al or not, and if so, what the source of -kː(kː)- is.11 In any case, the final -al here is an already lexicalized part of negative stems, and case inflections follow it, cf. the absolutive šukːkːal ‘nobody’, the dative šukːkːal-a, the ergative šukːkːal-en, etc.

2.6 Ordinal, collective, and distributive numeral markers

Various orders of numerals are derived from cardinal bases, or from nominalized cardinals.

11 Most probably, the (former) suffix -kː(kː)- is related to a dedicated negative pronoun marker of unknown origin found in some Lezgic languages, cf. fu=k’a ‘nothing’ (< fu ‘what’), fuž=k’a ‘nobody’ (< fuž ‘who’) in Tabasaran. Alternatively, the Udi -(kː)kːal may represent the morphologized agent noun / imperfective participle form of a speech verb, e.g., šukːkːal ‘nobody’ < šu ‘who’ + ukːal ‘saying’, ‘one who says’. The use of this form with interrogatives does not seem to be semantically


Ordinals are derived from nominalized cardinal numerals by means of a morphologized participle of a general speech verb (this is one of a number of grammaticalization paths for the verb ‘say’, see sections 3.4 and 4.1 for other examples). The languages differ with respect to the form of the participle employed in this function: while in some languages, the perfective participle is used, in other languages this is the future/debitive participle, and in still other languages the marker is identical to the imperfective participle, see (5) for examples.12 The source construction is probably that of naming: in Lezgic languages, ‘one whose name is N’ or ‘one that is called N’ is expressed as a relative clause headed by the speech verb (‘N-saying X’).

(5) Models for ordinal numerals (‘three’ as a base)
    a. Tabasaran: cardinal numeral + perfective participle of ‘say’
       šubur-pi ‘third’ < ‘three’ + ‘(one about which) it was said’
    b. Archi: cardinal numeral + future/debitive participle of ‘say’
       ɬeb-bosdub ‘third’ < ‘three’ + ‘(one about which) it will be said’
    c. Kryz: cardinal numeral + imperfective participle of ‘say’
       šibur liji ‘third’ < ‘three’ + ‘(one about which) it is being said’

This grammaticalization path of the verb ‘say’ prevails in the Lezgic languages, and is found in other East Caucasian languages to a lesser extent, but it seems to be quite uncommon crosslinguistically. The only other example I am aware of is found in Dravidian languages (Krishnamurti 2003: 266), where this is not the only available source of ordinal numeral markers. Collective numerals can be derived from nominalized cardinal numerals by means of the additive clitic, the same as employed in the negative pronouns (section 2.5). More precisely, these are collective noun phrases derived from numeral phrases, as not only a nominalized numeral can be the head of a phrase, but a noun as well, cf. the following pair from Agul or similar Lezgian examples in Haspelmath (1993: 234).

(6) a. ʡu-d=ra qaj-ne
       [two-]= :come:-
       ‘both came’
    b. ʡu gada=ra qaj-ne
       [two boy.]= :come:-
       ‘both boys came’

motivated, though, unlike the use of the additive which regularly occurs in negative clauses as an emphatic element like ‘(not) a single’, ‘(not) even’.

12 The model became obsolete in Udi and Budugh, where the borrowed Azeri suffix -(i)mǯi is now used to derive ordinals.


As for distributive numerals, partial or full reduplication is used to derive them from the cardinal bases, see (7). In Lezgic languages, reduplication is not a common mechanism of expressing grammatical oppositions, and the derivation of distributive numerals is probably the most regular use thereof.

(7) Reduplication in distributive numerals
    a. Agul: partial or full reduplication
       ʡu~ʡu ‘by two; two each’ (← ʡu ‘two’)
       xi~xibu ‘by three; three each’ (← xibu ‘three’)
       ʕʷerš~ʕʷerš ‘by hundred; hundred each’ (← ʕʷerš ‘hundred’)
       (Maisak and Ganenkov 2016: 3592–3593)
    b. Tsakhur: full reduplication
       q’oˁjre~q’oˁjre-na ‘by two; two each’ (← q’oˁjre ‘two’)
       xebīre~xebīre-na ‘by three; three each’ (← xebīre ‘three’)
       waˁš~waˁš-na ‘by hundred; hundred each’ (← waˁš ‘hundred’)
       (Kibrik and Testelets 1999: 173)
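The full-reduplication pattern in (7) is regular enough to be stated as a simple string operation. The Python lines below are only an illustrative sketch: `full_reduplication` is a hypothetical helper name, the tilde is the notational separator used in (7), and the optional suffix argument models the Tsakhur forms ending in -na. Partial reduplication (as in Agul xi~xibu) is deliberately left out, because the text does not state the rule that determines the copied portion.

```python
def full_reduplication(base: str, suffix: str = "") -> str:
    """Build a distributive numeral by copying the whole cardinal base.

    The two copies are joined with '~' (the notation of example (7));
    an optional suffix is attached after the second copy.
    """
    return f"{base}~{base}{suffix}"


# Agul: bare full reduplication of the cardinal base.
agul_two_each = full_reduplication("ʡu")
# Tsakhur: full reduplication followed by the suffix -na.
tsakhur_hundred_each = full_reduplication("waˁš", "-na")

print(agul_two_each)
print(tsakhur_hundred_each)
```

The two calls reproduce the attested forms ʡu~ʡu and waˁš~waˁš-na from (7).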

3 Grammaticalization of verbal categories

3.1 Voice and valence

Causative is the most common valence-changing operation in the languages of the group. It is predominantly expressed by means of syntactic constructions with verbs like ‘do’, ‘give’ or ‘let’, and to a lesser extent by morphological means. In periphrastic constructions, the embedded verb usually takes the infinitive form, the causer is always encoded by the ergative case (as a prototypical agent), and the causee can either retain the original encoding or appear in the dative or a locative case. Causative verbs often do not show clear signs of grammaticalization: thus, describing tun ‘make, cause’ in Lezgian as a standard way to express causative situations, Haspelmath (1993: 358) says that “there is no reason not to regard it as an ordinary complement-taking verb”. Still, one can argue that at least some of the syntactic causative constructions undergo clause union (clause fusion) and change from a biclausal to a monoclausal structure. For example, in Agul it is possible to keep the causee of a transitive verb in the original ergative encoding, or to express it by a locative case (adessive or adelative). While in the former case the encoding is only determined by the embedded verb, which makes the biclausality more obvious, in constructions like (8b) the case is ascribed by a complex predicate ‘eat-do’ (see also discussion of the Agul causative in Daniel, Maisak, and Merdanova [2012]).

(8) Agul: variation in the causee encoding
    a. baw.a gada.ji šurpa ʕut’a-s q’u-ne
       mother() boy() broth. eat.- do.-
    b. baw.a gada.ji-f-as šurpa ʕut’a-s q’u-ne
       mother() boy-- broth. eat.- do.-
       ‘Mother made the boy eat the soup (e.g., by threats).’

In Kryz, the choice of the causative verb depends on the transitivity of the embedded predicate: ‘do’ is used with intransitives, and ‘give’ with transitives, as in (9).

(9) Kryz: periphrastic ‘give’-causative (Authier 2009: 307)
    χinib-ar-ir fura-z jak ʁajn-iz vu-jiǯ
    woman-- man- meat. take- give-.
    ‘The women made the man buy meat.’

Causative verbs became morphologized in Tsakhur and Udi, where (now synthetic) causatives are structurally identical to light verb constructions (see section 5.1). In Tsakhur, causative complex verbs include the light verb haʔas ‘do’ following the truncated infinitive stem of the lexical verb, e.g., sak’al-aʔas ‘send back’ based on the infinitive sak’alas ‘come back’, or ɢajsan-haʔas ‘make sleep’ from ɢajsanas ‘fall asleep’ (Kibrik and Testelets 1999: 59, 79). In Udi, causatives are regular derivations from infinitives by means of a light verb -d- (which devoices to /tː/ after /s/), e.g., akːes-tː- ‘show’ from akːes ‘see’, campes-tː- ‘make write’ from campes ‘write’, etc. The light verb -d- does not occur freely as an autonomous predicate, and most probably goes back to Old Udi ‘give’ (Schulze 2001: 324).13 The suffixal causative in -ar(-r-) is attested in Lezgian, although its derivation is lexically restricted and not possible with every intransitive verb; as an example, cf. ks-un ‘fall asleep’ > ksu-r-un ‘put to bed’ (-un is the action nominal suffix, see Haspelmath 1993: 163). In Budugh, causative verbs are derived by switching the low back vowels of the stem to high front vowels, which is the result of incorporation of the verb *ʔi- ‘do’ (Authier and Haciyev 2016: 3561). Another phenomenon which comes close to valence-increasing derivation is the morphological ‘verificative’ in Agul, see section 3.5 for details. The inventory of valence-decreasing derivations is even more scarce, as passives, antipassives or decausatives tend to lack grammaticalized means of expression. The causative/decausative alternations are commonly expressed by labile verbs or by pairs of complex verbs with ‘do’ vs. ‘be, become’ as light verbs (see

13 The verb b- ‘do’ was regularly used as a causative light verb in Caucasian Albanian (Gippert et al. 2008, II: 46), but in the modern language there is but a handful of lexicalized derivatives employing this verb, like ačˁes-b- ‘lose’ (< ačˁes ‘disappear, get lost’) or apːes-b- ‘cook, boil’ (< apːes ‘get cooked, ripen’).


section 5.1). In Lezgian, the productive anticausative is formed by adding the light verb xun ‘be, become’ to the verb stem (Haspelmath 1993: 165–166). Suffixal detransitive voice exists in Kryz (to a lesser degree, it is represented in Budugh), where derivatives with the -ar- / -al- / -an- markers have passive, anticausative, and antipassive readings, as well as aspectual and modal nuances (Authier 2012: 133–134). Thus, in the imperfective, detransitive derivatives commonly have a habitual and deontic value, as in (10).

(10) Kryz: detransitive derivation (Authier 2012: 144)
     riki jiʁʁaǯiʁ va-rčar-e
     yard. every.day -sweep.-
     ‘The yard is / must be swept every day.’

Another remarkable exception is Udi, where a morphological decausative has appeared. Like the causative, it is based on a light verb construction, although it is the verb stem, not the infinitive, that is used as a coverb. The decausative is derived regularly from transitives, both simplex and complex verbs. In the former case, the light verb is added to the stem, so that the verb becomes complex, e.g., akː-ec- ‘be seen’ (< akː- ‘see’) or tad-ec- ‘be given’ (< tad- ‘give’). In the latter case, the decausative light verb is in complementary distribution with other light verbs, e.g., cam-ec- ‘be written’ (vs. cam-p- ‘write’) or gam-ec- ‘become warm’ (vs. gam-d- ‘make warm’). The light verb -ec- does not occur as a free lexical item, and has suppletive stems similar to those of motion verbs (in particular, the stem -ec- is found in past/perfective forms). According to Gippert et al. (2008, II: 45), the source of the decausative light verb was the Caucasian Albanian motion verb iġesown /iʁesun/ ‘go’; see also Schulze (2014) for a detailed discussion. Note that the grammaticalization of motion verbs is not very typical of the languages of the family; the only common path I am aware of is the use of the auxiliary ‘come’ in the adhortative construction (see section 3.3). 
There are no instances of morphologically marked reflexives or reciprocals in Lezgic languages; both are expressed by full pronominals. The local reflexive whose antecedent belongs to the same clause is commonly doubled, with the first instance taking the case of the antecedent and the second one taking the case required by the syntactic position of the argument, see the agent plus the beneficiary combination in (11).

(11) Rutul: complex reflexive (Makhmudova 2001: 175)
     adam.a ǯu ǯu-s χal hɨʔɨri
     Adam() self() self- house. do.
     ‘Adam built himself a house.’

Reciprocals are also doubled, with the numeral ‘one’ as a base, cf. the partial reduplication of the numeral in sun~suna ‘to each other’ (dative) in Udi, or su~sundu


‘each other’s’ (genitive) in Kryz, where the pronoun bears one case inflection. Alternatively, the reciprocal pronoun can represent a combination of two inflected numerals akin to a compound reflexive, with the same principle of argument encoding, cf. saji sajis ‘(tell) one another’ [one. one.] in Agul. There are certain syntactic constructions presenting the agent or the patient as semantically/pragmatically demoted. Although no special voice-like marking appears on the verb, the case assignment and/or the gender agreement changes with respect to the canonical ergative-absolutive pattern. Thus, in the involuntary agent construction, the agentive argument is encoded by the locative case which is usually the same as employed for the subject of the possibilitive modal (section 3.3) and the causee of transitive verbs (section 3.1), e.g., the adelative in Rutul (12). The construction presents the agentive participant as affecting the patient accidentally, without noticing what s/he is doing; the agent can also simply let something happen by overlooking and not preventing the situation. As a rule, the involuntary agent construction is available with labile verbs (e.g., the Rutul verb jaχas ‘break, be broken’ is labile) and certain intransitives, mainly change-of-state verbs, see Ganenkov, Maisak, and Merdanova (2008) and Shushurin (2017) for a discussion of the Agul and Lezgian data, respectively, and also Kittilä (2005) for a broader crosslinguistic overview.

(12) Rutul: the involuntary agent construction (Makhmudova 2001: 78)
     za-daa qːab jaχɨri
     I-. plate. .break.
     ‘I broke a plate (accidentally).’

As already mentioned in section 1.2, in a standard transitive clause the agent is encoded by the ergative, the patient is encoded by the absolutive and it is the patientive noun phrase that controls gender agreement (in those languages where the latter is present). 
In the bi-absolutive construction, both the agent and the patient appear in the absolutive case, and the gender agreement is partly controlled by the agentive noun phrase. Bi-absolutive constructions typically include imperfective periphrastic forms with the copula or existential verb, and a lexical verb in a nonfinite form (converb, participle). In example (13) from Archi, the first sentence represents a ‘canonical’ transitive clause with ergative-absolutive alignment, while the second sentence represents a bi-absolutive construction, with a zero agreement marker of the  gender. In the latter case, the gender agreement is split: while the lexical verb agrees with the patientive absolutive ( gender), the copula agrees with the agentive noun phrase ( gender).

(13) Archi: ergative-absolutive and bi-absolutive constructions (Kibrik 1975: 56)
     a. dija.mu čiχir c’ar-ši i
        father() wine. .drink.- .be.
        ‘Father drinks .’
     b. dija čiχir c’ar-ši w-i
        father. wine. .drink.- -be.
        ‘Father  .’

The typical functions of the bi-absolutive construction are “agent topicalization and its counterpart patient demotion” (Forker 2012: 80). Concerning the minimal pair in (13), Kibrik (1975: 56), who was the first to describe the functions of bi-absolutives, argues that the two sentences can be used as answers to different questions. The sentence in example (13a) focuses on the patient and can be uttered as a response to ‘What does the father drink?’. By contrast, (13b) is about what father does, and both the predicate and the patient are in focus; this sentence can answer the question ‘What does the father do?’.

3.2 Source structures for tense and aspect forms

Alongside synthetic verb forms, periphrastic constructions play an important part in the tense and aspect systems of Lezgic languages (though they are not very prominent in Udi). Periphrastic forms are composed of a non-finite component – usually a participle, a converb, or an infinitive – and a postposed auxiliary. The most common type of auxiliary is a copula, but in some languages, an existential verb ‘be (inside)’ is also employed; both these auxiliaries are morphologically defective and mainly occur in the present or the past tense. Non-finite forms are suffixal, and are derived from one of the aspectual stems, perfective or imperfective (infinitives, which do not show an aspectual distinction, may have their own stem). The most common periphrastic models include:
– converb (perfective or imperfective) + copula,
– converb (perfective or imperfective) + existential verb,
– participle (perfective or imperfective) + copula,
– infinitive + copula.

The specific markers of non-finite forms do not have to be cognate; what is common are the patterns themselves. Not all the patterns are attested in each and every Lezgic language, but the overall organization of the periphrastic verb paradigm is very similar: the opposition between perfective vs. imperfective forms (as encoded in the lexical verb) intersects with the opposition between the present vs. the past time reference (as encoded by the auxiliary). These semantic values reflect the source structure of periphrastic forms, but may change in the process of their development: thus, perfects originally have the structure “perfective converb + present copula”, although they end up being past tenses; the future tenses also include the present auxiliary, which reflects their aspectual and/or modal origin (e.g., ‘is going to do’ or ‘has to do’).
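The intersection of the two oppositions described above (aspect on the lexical verb, time reference on the auxiliary) can be pictured as a small grid. The Python lines below are a schematic sketch only: `source_structures` is a hypothetical helper, the labels are descriptive strings rather than forms of any language, and no claim is made that every cell is attested in every Lezgic language.

```python
from itertools import product

ASPECTS = ("perfective", "imperfective")  # encoded in the lexical verb
TENSES = ("present", "past")              # encoded by the auxiliary


def source_structures(nonfinite: str = "converb", auxiliary: str = "copula"):
    """List the source structures for one periphrastic model.

    A model pairs a non-finite form type with an auxiliary type,
    e.g. converb + copula or converb + existential verb.
    """
    return [
        f"{aspect} {nonfinite} + {tense} {auxiliary}"
        for aspect, tense in product(ASPECTS, TENSES)
    ]


grid = source_structures()
for cell in grid:
    print(cell)
```

The first cell of the default grid, "perfective converb + present copula", is the structure that the text identifies as the original source of perfects. Calling `source_structures("converb", "existential verb")` gives the corresponding grid for the second model in the list.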


The degree of grammaticalization of periphrastic forms varies considerably both on the formal and on the semantic side. Some historically periphrastic forms have turned into synthetic ones, mainly as a result of copula loss or the merging of the auxiliary with the lexical verb. Also, structurally similar forms in different languages may cover different ranges of usage along the same grammaticalization path. Thus, forms based on the patterns “perfective converb + copula” and “perfective converb + existential verb” with the auxiliary in the present realize the meanings along the well-known grammaticalization path  >  >   (). In Agul, the now-synthetic aorist in -ne like aq’u-ne ‘did’ [do.-] ultimately goes back to a periphrastic form, cf. *aq’u-na e [do.- ], lit. ‘having done + is’. The Lezgian aorist in -na is identical to the perfective converb (e.g., awu-na ‘did’ or ‘having done’), which most probably reflects the same pattern with the subsequent loss of the copula.14 Perfects and resultatives are generally less morphologized, with the auxiliary verb clearly distinguishable, even in case of merger with the lexical verb. Both in Agul and Lezgian, the perfect includes the existential verb as an auxiliary, cf. aq’u-naa ‘has done’ in Agul and awu-nwa ‘has done’ in Lezgian, which were originally combinations of aq’u-na a [do.- .be.] and awu-na awa [do.- .be.], respectively. Only in Udi are both the aorist and the perfect synthetic suffixal forms without a clear periphrastic source (e.g., the aorist bi ‘did’ and the perfect be ‘has done’). For an overview of the Lezgic perfects, see also Maisak (forthcoming). In Agul and Archi, perfects have acquired indirect evidential meaning, which is another possible evolution of resultatives and perfects (Bybee, Perkins, and Pagliuca 1994: 95–97). 
In Archi, the indirective past in -li is syncretic with the perfective converb; while Kibrik (1977a: 87–89) describes -li as an indirective suffix as such, Tatevosov (2001: 456–457) argues that it should be regarded as finite use of a converb, resulting from the loss of the auxiliary.15 Among the present tenses, general presents and habituals are the most common types. Progressives are rare: e.g., for Kryz, according to Authier (2009), a peripheral progressive construction (“progressif constatif”) is attested, which comprises the manner converb and the existential verb ‘be inside’.

14 At the same time, given the commonality of finite/non-finite syncretism in East Caucasian languages, one might explain this by a direct syntactic reanalysis (“direct insubordination”); see Mithun (2008) on a similar proposal for some languages of North America, and also Robbeets (2017) for a discussion of various scenarios of ‘finitization’ in Transeurasian languages.
15 Apart from perfects with indirect evidential meaning, the inventory of evidential markers in the Lezgic languages includes hearsay enclitics grammaticalized from speech verbs (section 3.4), and constructions with the auxiliary ‘find’ in Archi, which seems to be a contact-induced development (section 3.3).


(14) Kryz: progressive construction (Authier 2009: 276)
ik-re ki sa-b buʔu aǯdaha ʕašxva-ra ʕaǯu.
look-  one- large dragon.. .arrive- be.in..
‘He sees that a big dragon is approaching (him).’

It is more usual to have a present tense covering both ‘focalized’ progressive and more general durative uses, in addition to habitual and generic. When the origin of such presents is clear, they normally go back to a periphrastic pattern with an imperfective converb or participle and an auxiliary in the present tense. In Udi, another source structure of the present is attested with clear parallels in Azeri and Iranian languages (e.g., Tat, Talysh): the main present is historically the dative (originally locative) case of the infinitive, cf. uk-sa ‘eats, is eating’ < *uk-es-a ‘in eating’ [eat--], see also Maisak (2011: 40–43) for discussion. Like the periphrastic pattern with the existential verb (lit. ‘is in doing’), the Udi present also represents a locative model, although the locative component here is marked in the form of the lexical verb, not in the auxiliary. Futures are commonly distinct from presents in the languages of the family, although the sources of dedicated futures are not easily discernible. The pattern based on the infinitive can give rise to a future, which is the case in Agul and Rutul, e.g., in the latter ulesi ‘will eat’ < *ul-es i [eat- ]. In Archi, however, the same combination yields a modal form with deontic meaning, e.g., χabus i [.sing- .be] ‘has to sing’ (Kibrik 1977a: 206–207). The Tsakhur construction based on the infinitive in its turn can express both deontic modality and prospective meaning, e.g., ez-as=o-d [plow-=-] ‘has to plough / is going to plough’ (Kibrik and Testelets 1999: 269). 
A very common phenomenon is the future/habitual polysemy which results from the evolution of ‘old presents’ losing their ‘core’ present-tense meanings (see Haspelmath [1998] and Tatevosov [2005] on diachronic explanations of this type of development). For example, the Lezgian form in -da functions both as a future and a habitual/generic, and also occurs in stage directions and in narratives as a ‘historical present’. Some forms with a structure clearly pointing to an imperfective origin are restricted to future time reference, e.g., the Kryz future is a morphologized combination of the imperfective participle and a copula (cf. kurac’ija ‘will slaughter’ from *kurac’-i ja [slaughter- ]), and the Udi future in -al is syncretic with the nomen agentis derived from the imperfective stem (cf. uk-al=e ‘will eat’ with the 3rd person marker and ukal ‘eater, one who eats’). In Udi, another typical development of ‘old presents’ is attested: the imperfective form in -a, which used to be the main present tense in Caucasian Albanian (Gippert et al. 2008, II: 44), became the present subjunctive. In modern Udi, it is found almost exclusively in (finite) dependent clauses including complements of verbs like ‘want’, ‘need’, ‘know (what to do)’, purposive clauses etc. Subjunctives are not typical of the Lezgic languages, and given similar developments of ‘old
presents’ in unrelated languages of the area (e.g., Persian or Eastern Armenian), contact influence can be suspected here.

Returning to the structure of periphrastic forms, it should be made clear that such forms as aorists and perfects in the perfective domain, or presents, habituals and futures in the imperfective domain originally include auxiliaries in the present tense (although these can be subsequently lost or merged with the lexical verb). Those forms in which the auxiliary appears in the past tense give rise to pluperfects and ‘discontinuous’ pasts (i.e., “past with no present relevance”, in the terms of Plungian and van der Auwera [2006]), imperfects and past habituals, or futures-in-the-past which are also used as irrealis forms. Thus, in Agul the counterpart of the future is only used in the protases of counterfactual conditions, cf. the future aq’ase < aq’a-s e [do- ] ‘will do’ vs. the irrealis aq’asij < aq’a-s ij [do- :] ‘would have done’. Past auxiliaries in some languages develop into past enclitics which can combine freely with various finite forms, including forms that already express past time reference, like aorists or perfects.16 The Lezgian marker -j, the Udi clitic -j (-ij), and the Tsakhur ‘epistemic markers’ -jī and -nī, which are restricted to the past tense domain, all belong to this category, deriving “retrospective” variants of various tenses. Thus, “retrospective” presents are semantically imperfects (i.e., past imperfectives), “retrospective” perfects are semantically pluperfects (i.e., past perfects), “retrospective” futures are futures-in-the-past or irrealis moods, etc., see some examples from Udi in Table 2.17 The difference from the canonical periphrastic forms here is that the opposition between the present and the past series of tenses changes from ‘equipollent’ to ‘privative’: a dedicated marker in the past is opposed to the absence of marking in the present. 
Apart from the synthetic forms and the ‘primary’ periphrastic forms, Lezgic languages also make use of a wide range of ‘secondary’ periphrastic forms with the regular verb ‘be, become, happen’. Being morphologically regular and not defective, unlike copulas or stative existential verbs, the verb ‘be, become’ can potentially take any form when used as an auxiliary, including periphrastic forms. Thus, forms like those illustrated in (15) and (16) from Agul consist of an imperfective converb of the lexical verb and an auxiliary in the perfect and the future, respectively, both of which are periphrastic in origin. Both sentences are also examples of bi-absolutive constructions.

16 Plungian and van der Auwera (2006: 344) describe the function of this kind of marker as “retrospective shift”, as they change the default temporal interpretation of the verbal form “in introducing a temporal (or notional) break between the point of reference and the situation: metaphorically, one can speak about a kind of “detachment” or “shift””, the result of which is that the verbal forms in question become “more past”.
17 In Table 2, finite forms with the 3rd person singular marker =ne are given; note that person markers can occur both at the end of the verb form (as enclitics) as well as inside the verb stem (as ‘endoclitics’), see also section 5.3.


Tab. 2: Udi: basic and “retrospective” verb forms of ‘go away’.

Basic forms                               Retrospective forms
Perfect      tac-e=ne ‘has gone’          Pluperfect        tac-e=ne=j ‘had gone’
Present      ta=ne=sa ‘goes, is going’    Imperfect         ta=ne=sa=j ‘was going’
Subjunctive  taʁ-a=ne ‘(that s/he) go’    Past Subjunctive  taʁ-a=ne=j ‘(that s/he) would go’
Future       taʁ-al=e ‘will go’           Irrealis          taʁ-al=e=j ‘would have gone’
Prospective  taʁ-ala=ne ‘is going to go’  Past Prospective  taʁ-ala=ne=j ‘was going to go’
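
The regularity in Table 2 can be restated as a small sketch: each retrospective form appears to be the corresponding basic form plus the past clitic. The Python below is only an illustration under that assumption. The function and variable names are hypothetical, the forms are copied from Table 2, and the sketch ignores the -ij variant of the clitic and any other allomorphy.

```python
# Illustrative sketch only: it restates the pattern visible in Table 2 and
# is not a morphological analyzer. All names are hypothetical.

# Basic Udi forms of 'go away' (3rd person singular), copied from Table 2.
BASIC_FORMS = {
    "Perfect": "tac-e=ne",
    "Present": "ta=ne=sa",
    "Subjunctive": "taʁ-a=ne",
    "Future": "taʁ-al=e",
    "Prospective": "taʁ-ala=ne",
}

# Label of the retrospective counterpart of each basic form (Table 2).
RETROSPECTIVE_LABELS = {
    "Perfect": "Pluperfect",
    "Present": "Imperfect",
    "Subjunctive": "Past Subjunctive",
    "Future": "Irrealis",
    "Prospective": "Past Prospective",
}

# Assumption: the past clitic always surfaces as =j in these forms.
PAST_CLITIC = "=j"


def retrospective(form: str) -> str:
    """Attach the past clitic to an already finite form."""
    return form + PAST_CLITIC


def retrospective_paradigm() -> dict:
    """Map each retrospective label to the form derived from its basic form."""
    return {
        RETROSPECTIVE_LABELS[label]: retrospective(form)
        for label, form in BASIC_FORMS.items()
    }


if __name__ == "__main__":
    for label, form in retrospective_paradigm().items():
        print(f"{label}: {form}")
```

The single added marker on the past side, with nothing added on the present side, is what the text calls a ‘privative’ opposition.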

(15) ʜür.i-s hamiša χar jarʜa-j xu-naa. village- always hail. beat.- become.- ‘Our village always suffered from hail.’ (16) za-qaj kar aq’a-j xa-s-e me. I- work. do.- become.-- this. ‘He will be working with me.’ In Lezgian, the stative auxiliary ‘stay, remain’ occurs in a series of continuative forms. The auxiliary has the same distribution as the existential verb, cf. the general present raχa-zwa ‘talks, is talking’ (< raχaz awa, lit. ‘talking is’) and the continuative present raχa-zma ‘is still talking’ (< raχaz ama, lit. ‘talking stays’), see also (Haspelmath 1993: 130–131, 145). In Agul, the construction with the auxiliary verb aq’as (q’as) ‘do’ and the imperfective converb is attested, which expresses iterative meaning, e.g., ʕʷaj q’ase ‘will come from time to time’ (lit. ‘coming will-do’), as opposed to the plain future ʕʷase ‘will come’ (Merdanova 2004: 168).

Auxiliaries in modal and evidential constructions

Besides the ‘canonical’ periphrastic forms, a number of modal and/or evidential constructions are attested, with varying degrees of verb auxiliation. The verb ‘be, become’, which is one of the auxiliaries used to form various tense and aspect forms, is also used as a possibilitive modal, expressing both internal (‘can, be able’) and external possibility (‘may, be allowed’). As a rule, this verb
takes a complement headed by the infinitive, and a subject encoded by one of the locative cases: e.g., in Agul, Tabasaran (17) and Lezgian this is the adelative, also known as the apudelative (‘ad’ or ‘apud’ are the traditional labels for the localization ‘near’). (17) Tabasaran: possibilitive construction (Babaliyeva 2013: 40) did-x-an zav-ʔi-na=ra udu‹b›čʷ-uz š-ulu. that.-- sky--= rise- become- ‘It can even climb up in the sky.’ In Udi, the verb baksun ‘be, become’ has become a real auxiliary, as it does not control the case marking of the subject anymore, which solely depends on the lexical verb: thus, in (18), the subject of the transitive verb tašes ‘lead, carry’ takes the ergative. The 3rd singular form baneko of the same verb in the potential future is also used as a kind of parenthetical with epistemic meaning (‘maybe, probably’), as in (19). (18) kːačˁːi-n-en kːačˁːi-n-a taš-es ba‹ne›k-o? blind-- blind-- lead- become-: ‘Can the blind lead the blind?’ (Luk'an exlətbi Mŭq Xavar, 6:39) (19) ba‹ne›k-o, vaˁn=al hekjät-ä kːal-p-i čärkː-atːan become-: you.= tale- read--. finish- aχšˁum-kː-al=nan. laugh--=2 ‘Maybe, when you read this tale to the end, you will laugh as well.’ Although in many Lezgic languages morphological adhortative forms exist (i.e., ‘let’s do!’), constructions with the verb ‘come’ are also commonly employed for the same purpose. The auxiliary verb takes the imperative form and either ‘reinforces’ the dedicated adhortative, as in (20) from Udi, or is part of another construction, e.g., with the infinitive of the lexical verb, as in Rutul (21). It is also often the case that ‘come’ as an adhortative marker combines with various imperative forms, not being restricted to the 1st person plural value (e.g., ‘let me do it’, ‘let you do it’, ‘let him/her do it’, etc.). 
(20) Udi: adhortative (corpus example) ek-i qːohum bak-en come- relative. become- {You have a daughter, I have a son.} ‘Let’s become relatives!’


(21) Rutul: adhortative (Makhmudova 2001: 97) jɨq’-a si‹ǯ›ig-as! .come- let- ‘Let’s not allow!’ The imperative of the verb ‘come’ is used with the same adhortative function in Azeri, which could mean that this development is contact-induced in the Lezgic languages. Given the crosslinguistic commonality of the grammaticalization path in question, however, this is not necessarily the case (see Heine and Kuteva 2002: 69–70). It is much more probable that contact is involved in the development of another auxiliary, which among the Lezgic languages is attested only in Archi. In Archi, the construction with the auxiliary χos ‘find (accidentally), be found’ in the future tense is used to express conjecture or assumption: ‘it is probable that’. The auxiliary itself does not specify time reference, which is derived from the combination with the lexical verb, see the perfective converb in (22a), the imperfective converb in (22b) and the future converb in (22c), respectively. Although Kibrik (1977a: 92–93) describes the construction as a ‘possibilitive mood’, it rather stands at the intersection of epistemic modality and inferential evidentiality – in particular, what is called presumptive inference by Plungian (2001) or assumption by Aikhenvald (2004). (22) Archi: the presumptive inference construction (Kibrik 1977a: 216–217) a. to‹w›mu ručka-tːu šːetːe-li χo-qi that. pen-. buy.- .find.-: ‘He has probably bought pens.’ b. to‹w›mu ručka-tːu šːur-ši χo-qi that. pen-. buy.- .find.-: ‘He is probably buying pens (now).’ c. to‹w›mu ručka-tːu šːetːe-qi-ši χo-qi that. pen-. buy.-:- .find.-: ‘He will probably buy pens.’ There is another construction with ‘find’ (predominantly in one of the past tenses) which encodes a meaning from the evidentiality domain, namely the presence of direct evidence for the situation on the side of the speaker or another perceiving subject. 
Thus, the second part of (23) can be rendered literally as ‘I found (that) Mirza had gone away’. The degree of grammaticalization is probably not very strong in this case: the semantics of direct evidence are very close to the original lexical meaning of the auxiliary, and the subject noun phrase (encoding the one who has evidence for, or ‘found’, the situation) can be overtly expressed. Still, the distribution of the verb ‘find’ in this construction is almost the same as that of other Archi auxiliaries (in particular, the existential verb), see Kibrik (1977a: 239–240) for discussion.


(23) Archi: the direct evidence construction (Kibrik 1977a: 242)
zon notɬʼ-a-k qˁʷa-tːa, mirza uqˁa-li χu
I. house-- .come.- Mirza. .go.away.- .find.()
‘When I came home, Mirza had already left (as I found out).’

What is important is that while similar modal or evidential constructions with the auxiliary ‘find’ are lacking in other Lezgic languages, they are very common in the languages spoken further to the west of Daghestan, namely in Avar and Andic, as well as in the Tsezic languages.18 Given that Archi belongs to the Avar-dominated area, with a long history of bilingualism in Avar, it is very plausible that both ‘find’ constructions in Archi were copied from this language (Daniel and Maisak 2018).

Hearsay markers from speech verbs

In some Lezgic languages, markers going back to speech verbs are used to express reportative evidentiality, or hearsay (‘they say’, ‘as I was told’). Thus, Haspelmath (1993: 148) describes the hearsay evidential marker -lda in Lezgian as a recently grammaticalized suffix resulting from the contraction of the present habitual form luhuda ‘(one) says’; this suffix can follow various indicative verbal forms, see (24) with the aorist.

(24) Lezgian: hearsay evidential (Haspelmath 1993: 148)
baku.d-a irid itim gülle.di-z aqːud-na-lda.
Baku- seven man bullet- take.out--
‘They say that in Baku seven men were shot.’

In Agul, the enclitic -ʁaj has the same function and origin and most likely represents the reduced present (< ʁaja) or habitual (< ʁaje) form of the verb ‘say’, which is now hosted by various finite verb forms. In (25), the hearsay marker occurs on the present habitual, which is used as a ‘historical present’ within the narrative.

(25) Agul: hearsay evidential (corpus example)
χul.a-s qu-ʕʷa-guna, uč.i temeʜ aq’u-na sa
house- -go.- self() temptation. do.- one
ʜub rukːa-j-e=ʁaj, me ʜupː-ar.i-k-as.
sheep. slaughter.--= this sheep--/-
‘{He grazed the sheep,} and when he was going to return, he could not resist the temptation and slaughtered a sheep, they say, one of those sheep.’

18 See Kibrik et al. (2001: 307–318) for a detailed discussion of various constructions with ‘find’ in Bagwalal (Andic).


The development of hearsay markers from the verb ‘say’ should be kept apart from the grammaticalization of ‘say’ as a quotative and subordinating marker (see section 4.1). First, the hearsay and the quotative/subordinating functions of grammaticalized ‘say’ do not necessarily co-occur in one and the same language (hearsay markers are much less common). Second, the source constructions for the two tend to be different: as examples from Agul and Lezgian show, the present tense of the verb is a probable source for the hearsay marker, while it is mostly perfective and imperfective converbs that are used in the subordinating function.

Verificative

The morphological ‘verificative’ category is only attested in two Lezgic languages (Agul and Archi), and seems to be a crosslinguistic rarissimum. In both languages, there is a set of verificative verbal forms which express the meaning ‘to find out the truth value, or the value of an unknown variable’. In Archi, the verificative marker involves the segment -kːʷ-, which takes various verbal inflections, most commonly the infinitive in the purposive construction – ‘in order to find out whether …’ (26a). Kibrik (1977a: 290–292) even described the form in -kːus as a special kind of purposive converb; however, subsequent research made it clear that the verificative can occur in finite forms as well, as in (26b). As already hypothesized by Kibrik (1977a: 290), the verificative marker goes back to the verb akː- ‘see’, which is fused with the interrogative form of the lexical verb. The embedded lexical verb can also take various inflections, marking the time reference and the aspectual characteristics of the situation whose existence is being checked (e.g., ‘find out, whether s/he did, s/he does, s/he will do’, etc.).

(26) Archi: verificative
a. zon, halmaχdu w-i-r-kːu-s, uqˁa-li e‹w›di.
I. [friend. -be-]-- .go.away.- be.
‘I went in order to find out, whether (my) friend is there.’ (Kibrik 1977a: 291)
b. to‹w›mu baˁk’ bu-tɬ’u-r-kːu-qi zari.
[that. ram. -slaughter.-]--: I.
‘I will check whether he slaughtered a ram.’ (Daniel and Maisak 2014: 394)

In Agul, the verificative marker has several dialectal variants, cf. -čug- / -čuk’- / -čuq’- / -čuw- etc., or -magʷ- in Qushan Agul. It seems that the marker results from the morphological fusion of the verb agʷ- ‘see’ (a clear cognate of the Archi lexeme) with the preceding complement headed by the conditional form in -či. Thus, the -čug- variant is the contraction of the conditional suffix and the verb root (*-či agʷ- > -čug-),
while other similar variants probably involve an irregular sound change. The Qushan Agul verificative in -magʷ- remains mysterious, though, as the initial /m/ does not bear resemblance to the conditional affix, which is -t’en in this dialect. Like its Archi counterpart, the Agul verificative takes various verbal inflections, both finite and non-finite; it is especially frequent in the infinitive and the imperative.

(27) Agul: verificative
a. gi ʕut’u-naj-čuk’
[that() eat.-]-()
‘Check whether he has eaten.’ (Daniel and Maisak 2014: 382)
b. zun dad.a mus χupːar ucaj-čuk’.a-s-e
I() [father() when field. mow.:]-.--
‘I will check when Father is going to mow the field.’ (Maisak 2016b: 826)

Historically, verificative forms in both languages result from the morphological fusion of a matrix verb ‘see’ with its complement, which was originally an indirect polar or constituent question. The form of the complement differs: in Archi, the embedded clause of an indirect question takes the interrogative form, whereas in Agul, the conditional mood is used in dependent questions. While the morphological fusion is complete (in both languages, the verificative derivations are morphologically bound), it was not preceded by a full syntactic fusion, or clause union. The verificatives manifest clear biclausal properties, as it is possible to express the agentive argument (the ‘verifier’ in the ergative case), as in (26b) and (27b), and unlike in causatives, there is no change in the argument structure of the embedded clause.19 The verb ‘see’, which becomes morphologized as a verificative marker, shifts its meaning from passive perception to ‘check, find out’, with the concomitant change in subject encoding from the dative, which encodes experiencers, to the ergative, as with typical agents. Among other languages of the group, the same shift is attested in the Lezgian verb akː- ‘see’ which can also take an indirect question complement. 
Unlike in Agul and Archi, in Lezgian the verificative construction remains periphrastic, i.e., the matrix verb and the complement do not coalesce. At the same time, examples like (28) are also possible in spontaneous speech (in written language, it would rather be rugunwat’a akːʷaz, consisting of two prosodically autonomous words).

19 Other criteria of biclausality of the Agul verificative are examined in Maisak (2016b). See also Daniel and Maisak (2014) for comparison of the Agul and Archi data. Panova (2018) reports a typologically similar case of morphologically bound complementation with the verb ‘think, seem’ in Abaza, a West Caucasian language.


(28) Lezgian: verificative (Maisak 2016b: 853) am jak rugu-nwa-t’a-kːʷa-z fe-na. s/he [meat boil--]-see/- go- ‘He went to check whether the meat was cooked.’

Locative preverbs

An important feature of verbal morphology in the Lezgic languages, which, among the languages of the East Caucasian family, is shared only by the Dargwa branch, is the presence of elaborate locative prefixation. Locative prefixes, also known as ‘preverbs’, seem to have the same origin as locative case suffixes in the nominal paradigm (see section 2.3).20 This is especially apparent in those cases where a locative prefix matches a localization marker on a dependent noun, cf. the  ‘under’ localization marker kː(V)- both on the verb and on the dependent noun in (29).

(29) Tabasaran: locative prefixation and locative cases (Babaliyeva 2013: 37, 43)
a. ča-n χil-ar.i-kː kːa-ʔ-u
self- hand-- -put-
‘{She has collected them and} put it under her arms.’
b. kːa-da-‹b›ʁ-nu gardan.di-kː-an
--take-. neck--
‘Having taken it from under his neck …’

The same preverb-case ‘congruence’ can be seen even when the prefixal verb develops an idiomatic meaning, so the locative component is no longer evident. The corresponding argument retains the locative case form, although the encoding can later change into a more ‘canonical’ one for a given semantic role. For example, in Agul and Tabasaran, the verb ‘believe, trust’ encodes the ‘object of trust’ (the person whom one trusts) by the postessive case (30), which points to the fact that the initial consonant q- of the verb root is a lexicalized  prefix. In some Agul dialects, the verb ‘look’ has the stem qutːurfan- or qadurf- with the postessive encoding of the ‘object of looking’ (hence, the historical  prefix should be suspected in this verb as well). However, in the Huppuq’ dialect both the phonological shape of the verb (now χutːurfas, with the /q/>/χ/ shift) and the argument encoding have changed: a thing or a person at which one looks is encoded by the dative case in this dialect.

20 Alekseev (1985: 117–123) suggests Proto-Lezgic reconstructions for a number of locative preverbs, including *al- ‘on, above’, *ʔ- ‘inside’, *k- ‘in contact with’, *qː- ‘out’, among others.


(30) Tabasaran: the post-essive argument marking with ‘believe’ (Babaliyeva 2013: 203) saban uzu didi-q quʁ-un-dar-za. at.first I. that..- believe---1: ‘First of all, I don’t trust it.’ There can be up to three locative preverbs in a verb stem (though sequences of three locative preverbs are rare). As a rule, when a verb has a single locative preverb, this preverb marks localization, e.g., kːa-ʔu ‘put under’ in (29a). If two preverbs are used, the first one specifies localization, and the second one marks orientation: thus, in kːa-da-bʁnu ‘take from under’ in (29b) the orientation preverb is -da- with the ‘reversive’ meaning, ‘in the opposite direction’. Note that it is only localization markers that are cognate to corresponding nominal morphemes, while orientation preverbs usually do not match the orientation suffixes in nouns. In some languages, the second slot is described as being filled by ‘expressive’ prefixes, whose function can be formulated as “the attribution of a pejorative connotation to the verb” (see Alekseyev [2016: 3543] on Rutul) or pointing to “an energetic type of action” (see Authier and Haciyev [2016: 3561] on Budugh). Locative prefixation is not a fully productive mechanism: there are verb roots with many locative derivatives (including bound roots, which do not appear without preverbs at all), but there are also verb roots from which no locative verbs can be derived. For example, in Huppuq’ Agul, about half of the 120 dynamic verbal roots have prefixed derivatives, only about 30 of which have more than one prefixed derivative; in total, about 350 dynamic prefixed verbs have been found (Maisak and Ganenkov 2016: 3585). Though in most cases the identification of locative prefixes in a verb stem is straightforward, there are verbs for which it is not immediately clear whether they contain a prefix or not. 
Not only individual verbs in a single language, but languages themselves differ with respect to the degree of transparency of locative prefixation. Tatevosov (2000) argues that among the North Caucasian languages, languages with locative prefixes can be placed along a continuum ranging from having purely compositional verb forms (i.e., Kubachi Dargwa) to having purely idiomatic ones (which Tatevosov illustrates with Tsakhur, for which he suggests treating prefixes as highly lexicalized parts of verb roots); the West Caucasian language Adyghe occupies an intermediate position. Within the Lezgic group, Agul and Tabasaran on the one hand and Udi on the other can be considered to be on opposite ends of the spectrum. The transparent and logically organized systems of verbal prefixation in the former languages, which comprise hundreds of derivatives, contrast with the highly lexicalized and idiomatic system of Udi, where only through deep derivational (or comparative-historical) analysis about fifty prefixal verbs can be discovered (see Harris [2003a, 2003b] and Maisak [2008] for details and discussion). Archi, in turn, is the only Lezgic language where no locative prefixal verbs have been discovered.


The use of verbal prefixes for the expression of aspectual distinctions (which is typical of Slavic languages, and among the languages of the Caucasus is prominent in Ossetic and Kartvelian) is generally not attested in Daghestan. Tabasaran is the only language in which locative prefixes are used as a kind of ‘perfectivizing’ device. Verbs that do not contain a locative prefix include a prefix ʁ- (historically, the orientational prefix ‘up’) in one set of the perfective forms, and a prefix d- (presumably, going back to ‘down’) in another set, cf. the aorist ʁ-ap’-nu ‘did’ and the perfect d-ap’-na ‘has done’ from the root ap’- ‘do, make’, see also Babaliyeva (2013: 11, 162).

Repetitive preverbs

In Lezgian and Agul, a repetitive with the meaning ‘again’ stands out among the derivational prefixal morphemes, in that it occupies its own slot and is almost unrestrictedly productive. Haspelmath (1993: 174) goes so far as to claim that the repetitive in Lezgian is “so regular that it could even be considered an inflectional category of the verb”. The Lezgian repetitive prefix is q(i)- or χ(u)-; -χ- also functions as an infix. The prefixes and infix derive repetitive verbs from a few verbs each, but from all verbal lexemes a periphrastic repetitive can be derived by means of the auxiliary q-uwun ‘do again’ (< awun ‘do’). In Agul, the prefix is q- or qa- / qu- / qi-: these variants are conditioned partly phonetically and partly lexically.21 The prefixal repetitive can be derived from all verbs, except for a few statives, and the periphrastic repetitive strategy is not common. Examples of derived repetitives in Lezgian and Agul are given in (31) and (32).

(31) Lezgian: repetitive derivatives (Haspelmath 1993: 174–175)
q-lahun ‘say again’ < luhun ‘say’
q-fin ‘go away, go back’ < fin ‘go’
χ-gun ‘give again’ < gun ‘give’
χu-taχun ‘take back’ < tuχun ‘take, carry’
a‹χ›kːun ‘see again’ < akːun ‘see’
ksun quwun ‘sleep again’ < ksun ‘sleep’ + quwun ‘do again’

21 In the Huppuq’ dialect, this is the most external derivational prefix, which precedes locative prefixes, cf. qa-kːetːarxas ‘be destroyed again’ (< kːetːarxas ‘be destroyed, fall apart’ with the locative prefixal combination kː-etː- [--]). In the Keren dialect, the repetitive marker normally follows the locative prefixes, cf. kːetːa-q-arxas ‘be destroyed again’ from the same verb. Most probably, the position of the repetitive before the locative prefixes in Huppuq’ should be regarded as a comparatively recent “externalization” of this prefix.


(32) Agul: repetitive derivatives (Maisak and Ganenkov 2016: 3589)
q-aq’as ‘do again’, ‘repair’ < aq’as ‘do’
q-aʁas ‘say again’, ‘tell more’ < aʁas ‘say’
qa-uχas ‘drink again’ < uχas ‘drink’
qa-jc’as ‘give again’, ‘give back’ < ic’as ‘give’
qu-hatas ‘send again’ < hatas ‘send’
qi-šaw ‘come back!’ < šaw ‘come!’

Apart from the general repetitive meaning ‘do again’, derivatives from motion verbs and ‘give’ can express the meaning ‘backwards’. A few repetitive verbs are idiomatic: ‘do again’ also means ‘heal, cure, repair’, and ‘become again’ also means ‘get better, recover’. In discourse use, other meanings can be associated with the repetitive prefix, e.g., ‘an event is repeated with different participants’, ‘an event is added to a series of events’ or ‘an action is produced in response to some previous action’, see Maisak and Ganenkov (2016: 3589–3590) for examples from Agul. From the form of the repetitive prefix it is clear that it must be diachronically related to the preverb of the  localization, but synchronically the two morphemes are clearly distinct. Thus, in Agul, the  prefix attaches only to roots, including bound roots, while the repetitive prefix attaches to stems that may already contain a locative prefix (including the  prefix itself, e.g., qa-q-arxas ‘fall behind again’ < q-arxas ‘fall behind’). The semantic extension of ‘behind’ and ‘back’ to ‘again’ is common crosslinguistically (see e.g.,  >  in Heine and Kuteva [2002: 259–260]), so the path from  to repetitive is not surprising. In other Lezgic languages, in particular Rutul and Tsakhur, verbs with the  prefix can also have the meaning ‘again’ or ‘back’, although morphologically there seems to be no repetitive prefix distinct from this locative prefix. 
Remarkably, among the Agul dialects repetitive prefixation is attested only in two southern varieties, namely the Huppuq’ dialect and the Keren dialect, which have long been in close contact with neighboring Lezgian-speaking villages. Given that there is no repetitive prefix in the closely related Tabasaran language, it is most plausible to assume that the Agul repetitive marker q- was borrowed from Lezgian into the southern Agul dialects (see Maisak [2019b] for details). This is a rare example of purely ‘intragenetic’ affix copying from a closely related Lezgic language.

3.8 Infinitive and converbs

Although the origin of non-finite forms, especially the numerous converbs, is not always transparent, in many instances the affixes bear a clear resemblance to case markers. Thus, the infinitive marker -s / -z reflects the Proto-Lezgic dative case suffix *-sː (Alekseev 1985: 100), cf. akːus ‘to see’ in Archi, agʷas ‘to see’ in Agul, aguz ‘to look for’ in Tabasaran, etc. As a rule, infinitives do not inflect for case in modern Lezgic languages, with the exception of Udi. Already in Caucasian Albanian, infinitives inflected for case, with the ergative in -s-en functioning as a converb ‘by/with doing X’, and the dative/locative in -s-a expressing purposive meaning (Gippert et al. 2008, II: 46–47).22 The genitive case of the infinitive in -s-un has developed into a general action nominal (“masdar”), and can itself inflect for case and number.

Converbs, especially those expressing temporal relations, often include locative case markers. Those forms that convey a meaning of precedence in time usually employ elative cases, sometimes with a postposition/adverb like ‘back, after’. Thus, in Agul and Tsakhur, the verb takes the superelative case in a dependent temporal clause; in Agul, the case marker follows the perfective converb suffix -na, which normally does not take any case endings (33a). Interestingly, in the Fite dialect of Agul the superelative verb form and the postposed adverb ‘after’ became fused, so that the temporal marker -laχa is already a synchronically unanalyzable converb suffix (33b).

(33) Huppuq’ Agul vs. Fite Agul: temporal converb
a. upu-na-l-as χab
   say.--- after
b. upu-laχa
   say.-
‘when (s/he) said …’, ‘after (s/he) said …’

A regular series of locative converbs exists in Archi, where the converbial marker -ma can take various locative inflections (cf. the unmarked essive in -ma, the lative in -mak, the elative in -maš, the translative in -maχutː, etc.; see Kibrik [1977a: 105]). The locative converbs can have temporal meanings as well, e.g., the elative converb in combination with the word χarāši ‘back, after’ means ‘after having done’ or ‘starting with the moment when’, cf. uqˁa-maš χarāši ‘after (you) went away’ (Kibrik 1977a: 297).
In some languages, converbs with the temporal meaning ‘as soon as’, ‘immediately after’ include the comparative marker ‘as, like’ attached to a participle, e.g., agu-suman ‘as soon as (s/he) saw’ in Agul (Sulejmanov 1993: 157), ʁapi-si ‘as soon as (s/he) said’ in Tabasaran (Zagirov et al. 2014: 241), eʁala kːinä ‘as soon as (s/he) came’ in Udi. A similar temporal construction with the comparative particle kimi exists in Azeri. The now obsolete Persian loan word gah ‘time’ has become a general temporal suffix in Agul and Tabasaran. Thus, in Agul the marker -guna / -gana / -guni can attach to participles, demonstratives and also some adjectives, cf. aq’u-guna ‘when (someone) did’ from the perfective participle aq’u ‘having done’, te-guna ‘then, at that time’ from the distal demonstrative te ‘that’, bic’i-guna ‘when (someone) was a child’ from the adjective bic’i ‘small, little’.23 The word gah itself is not used as a free noun in the modern languages, being replaced by another loan waχtː (waχt) ‘time’. Notably, the latter can be used in a functionally equivalent temporal construction in a semi-grammaticalized fashion, while retaining its syntactic autonomy, see (34) from Tabasaran. Such a development illustrates a ‘new round’ of a crosslinguistically common path  >  (Heine and Kuteva 2002: 298–299).

(34) Tabasaran: temporal constructions (Khanmagomedov 1970: 137)
a. b-ik’ur-aji-gan
   -write.-.-
   ‘when (s/he) was writing’
b. b-ik’ur-aji waχt.na
   -write.-. time()
   ‘when (s/he) was writing’

22 The dative of the infinitive has also evolved into a present tense, see section 3.2.

3.9 Person agreement

Person agreement is not among the most prominent characteristics of the Lezgic languages (as opposed to the Dargwa languages, for example). It has developed only in two languages, Tabasaran and Udi, as an innovation. What agreement in both languages has in common is the fact that this is subject agreement (though for Tabasaran, this is only part of the story), and that the personal pronouns for the 1st and 2nd person were the diachronic sources for personal agreement clitics. In other respects, the two systems of agreement are very different, and they must have developed independently of one another. In Tabasaran, agreement is marked on the verb only.24 Subject agreement in the 1st and 2nd person is obligatory, but various kinds of other arguments and even adjuncts can optionally be indexed on the verb as well. Thus, it is possible to have more than one person enclitic on the verb simultaneously; when this happens, the subject marker will be the closest one to the verb (35). The 3rd person is always left unmarked.

(35) Tabasaran: person agreement (Babaliyeva 2013: 210)
uzu dumu uvu-z tuv-na-za-vuz.
I. that. you- give--1:-2:
‘I will sell it to you.’

23 The form of the Agul affix probably reflects the inessive case (or, alternatively, the ergative case in the temporal function), with the oblique stem in -una / -ana / -uni which is a common model for monosyllabic nouns (thus, guna comes from *gah.una with the elision of the final laryngeal and vowel reduction). See also Žirkov (1948: 124) on the origin of the Tabasaran temporal suffix -gan.
24 On person agreement in Standard Tabasaran, see Babaliyeva (2013: 198–214); on various dialectal systems, see also Magometov (1965: 196–216), Kibrik and Seleznev (1982), and Bogomolova (2012).

Notably, the Tabasaran person markers retain the ergative-absolutive case distinction which is lost in the pronominal system: while uzu ‘I’ is syncretic between the ergative and the absolutive, the corresponding person marker distinguishes between the agentive -za, illustrated in (35), and the patientive -zu. In Udi, all six person-number values are marked, although the origin of the 3rd person enclitics is not very straightforward; see Harris (2002: 178–183), Gippert et al. (2008, II: 52–54), and also Schulze (2011) for the most detailed discussion of diachronic issues. The position of person markers reflects the topic-focus structure of the sentence: they do not always stick to the finite verb, but occur on the right edge of a focused constituent, including question words (36) and the general negation marker. What is unusual and makes Udi close to typologically unique is the ability of person clitics to occur inside verb stems of both complex and simplex verbs: this ability has earned the Udi person markers the status of ‘endoclitics’, a rare type of clitics occurring inside words (see section 5.3).

(36) Udi: placement of person markers (corpus example)
p-i=zu, aχɨri maja=z taj-sa, mačˁu=z taj-sa?
say-=1 at.last where=1 go- where.to=1 go-
‘I said: where do I go, at last?’

A phenomenon that comes close to person agreement and might represent early stages in the rise of agreement systems is the postverbal pronoun copying in Kryz and Agul. In Kryz, extensive use of personal pronouns immediately after finite verbs is very common. According to Authier (2010: 31), who suggests the influence of Azeri verb morphology as a possible source for this innovation, “the repetition of the pronoun in postverbal position is quite systematic when its first instance is placed in (preverbal) focus, after a topicalized object”. Only the 1st and the 2nd person pronouns are attested as enclitics, and elements other than pronominal subjects (including non-subject arguments and even adjuncts) can also be doubled by a clitic (Authier 2010: 31–32), see also Authier (2009: 44).

(37) Kryz: postverbal pronoun copying (Authier 2009: 44)
(zin) barkan.ǯi-zina ja-rč’ar-e-zin
I. horse- -pass--I.
‘It’s by horse that I travel.’

Finally, in the Huppuq’ dialect of Agul, a pronoun doubling pattern has been discovered that includes a subject pronoun in the canonical preverbal position, paired with an identical instance of the same pronoun immediately following the verb. The doubling construction is found most often with the speech verb in matrix clauses introducing a quote, e.g., zun pune zun ‘I said I …’, zun aʁaa zun ‘I say I …’, mi aʁaa mi ‘s/he says s/he …’ and so on. In Maisak (2016c), I describe pronoun doubling in the spoken corpus of the Huppuq’ dialect, while Bogomolova (2018) investigates the phenomenon in a broader perspective of postverbal subject positioning and its putative relation to the rise of Tabasaran-style person agreement.

4 Grammaticalization of complex constructions

4.1 Emerging conjunctions and complementizers

The predominant subordinating strategy in Lezgic is non-finite: participles, converbs, infinitives, and case-forms of action nominals are used to head complements and various types of adverbial clauses. The existing complementizers and subordinators are either borrowed (like the multi-purpose clausal dependency enclitic ki, ultimately an Iranian loan), or have an unclear origin. Among the incipient conjunctions, certain lexicalized verbal forms or phrases can be identified, e.g., ‘because’ is often rendered as ‘why if say’ or ‘why having said’ (38), and the contrastive disjunction (‘or else’, ‘otherwise’) is usually expressed by means of a negative conditional form of an existential verb, ‘if it is not’ (39). Disjunction in which free choice is emphasized can be expressed by affirmative conditional forms of the copula, the existential verb or the verb ‘want’, cf. structures like k’ant’a X, k’ant’a Y ‘X or Y’, literally ‘if (you) want, X, if (you) want, Y’ in Lezgian (Haspelmath 1993: 335).

(38) Rutul: conjunctive expression ‘because’ (Alekseev 1994a: 249)
läc’.ur-dɨ xäd bala jiʔi-j, his xu-jnɨ huʁʷal luʁur a-j.
river- water. much .- why say.- rain. rain. .be-
‘In the river there was a lot of water because it had rained.’

(39) Agul: conjunctive expression ‘otherwise’ (corpus example)
fajša ča-s sa tika jakː=ra, sa guni=ra, čin latqa-s-e kun-ar, da-xu-či latqa-f-tːawa.
bring() we:- one piece. meat.= one bread.= we: put.off.-- clothes-. -become.- put.off.--:
‘Bring us a piece of meat and bread, then we take off our clothes; otherwise, we won’t take them off.’

Besides evolving into ordinal numeral markers (section 2.6) and hearsay markers (section 3.7), speech verbs tend to become another frequent type of complementizers/subordinators, a grammaticalization path which is widely attested crosslinguistically, see Heine and Kuteva (2002: 261–267). First of all, perfective or imperfective converbs of the verb ‘say’ are used with various complement-taking verbs, including speech verbs, verbs of thinking, emotional complement-taking predicates and some others. Such is the function of lanaha and luhuz in Lezgian, (du)pnu and k’uri in Tabasaran, puna and aʁaj in Agul, etc.

(40) Lezgian: complementizer from ‘say’ (Haspelmath 1993: 368)
wiri ha ik’ qsan-diz kütäh xa-na luhu-z am šad tːir.
[all that. so good- end become- say.-] s/he. glad :
‘She was glad that everything ended so well.’

The same markers are also common in purpose (41a) and reason (41b) clauses; in the former, they can optionally ‘reinforce’ verb forms that function as purposives themselves (e.g., infinitives or optatives/jussives).

(41) Tabasaran: purpose/reason subordinator from ‘say’ (Babaliyeva 2013: 291, 285)
a. bilet ʁada‹b›ʁ-uz k’u-ri žibdiʔ χil u‹b›čːʷu.
   [ticket. take- say.-] pocket- hand. put.in-
   ‘He put his hand in his pocket to take out the ticket.’
b. uvu balʲzak urχura p-nu žʷuv gazaf savaldu vu-di hisab map’an.
   [you() Balzac. read. say.-] self. very educated - consider do.
   ‘Don’t think you are very educated just because you are reading Balzac.’

4.2 Relative clauses and the make-up of participles

Relative clauses are headed by participles, and normally precede their heads. Morphologically, participles remain unmarked in the attributive position in some languages (e.g., in Agul and Tabasaran, cf. [42]), and take an attributive marker in other languages (e.g., in Rutul and Tsakhur, cf. [43]). In its simplest form, the participle just coincides with the verb stem: thus, ʕʷa ‘going’ in Agul is a bare imperfective stem of the verb.

(42) Agul: unmarked participle (corpus example)
ja ʕʷa kas, ja qu-ʕʷa kas…
or [go.()] person. or [-go.()] person.
‘{One couldn’t see} not a person going away, not a person coming back.’

(43) Tsakhur: participle with an attributive marker (Kibrik and Testelets 1999: 467)
χāqa aˁlʲhā-na gade
[home go.-] boy
‘a boy going home’

In some languages, new participles were built from the same periphrastic models as employed by finite tense and aspect forms (see section 3.2). For example, in Agul, alongside the simple perfective and imperfective participles identical to the two aspectual verb stems, two more perfective participles were derived from the perfective converb, and two more imperfective participles were derived from the imperfective converb, both in combination with auxiliary verbs in the participial form. Thus, the forms listed in (44) are morphologized periphrastic participles which differ from the morphologized indicative forms only in the form of the auxiliary.

(44) Agul: new series of participles
With the copula as an auxiliary
a. Aorist participle: aq’u-nde (< *aq’u-na i-de)
   do.-: do.- -
b. Habitual participle: aq’a-jde (< *aq’a-j i-de)
   do.-: do.- -
With the existential verb as an auxiliary
c. Perfect participle: aq’u-naje (< *aq’u-na a-je)
   do.-: do.- .be-
d. Present participle: aq’a-je (< *aq’a-j a-je)
   do.-: do.- .be-

Participial relative clauses in East Caucasian languages are known to have little, if any, syntactic restrictions on relativization targets: all syntactic positions are perfectly relativizable. At the same time, the role of the relativized argument cannot be deduced from the form of the participles themselves, which are not syntactically ‘oriented’ in any way.25 When the relativized element cannot be easily recovered, reflexive pronouns can be used, which function as resumptive pronouns. For example, in (45) from Lezgian, the reflexive wiči-kaj, taking the subelative case, stands in the argument position (the topic of speech) within the relative clause.

(45) Lezgian: reflexive used as resumptive (Haspelmath 1993: 342)
čun wiči-k-aj raχa-zwa-j kas
[we. self-- talk--] man.
‘the man we’re talking about’

Udi is the only exception among the languages of the group in that it makes use of an ‘Indo-European’ type of relative clauses with a relative pronoun mani ‘which’ in combination with the borrowed Persian/Azeri subordinator ki (46). Such clauses follow the head, and include a finite verb; undoubtedly, their spread in Udi is due to its contacts with the non-East Caucasian languages of the area.26 As for the origin of the relative pronoun itself, it seems to have an interrogative base (ma-) and is probably cognate to maja ‘where?’ and majin ‘from where?’.

(46) Udi: finite relative clause (corpus example)
me čuʁ-o, manu ki muʁecːcːe usen=e šejtan-en ʁačˁː-pː-e=ne…
this woman- [which.  18 year.=3 devil- tie--=3]
‘… this woman, whom Satan has kept bound for eighteen long years.’ (Luk'an exlətbi Mŭq Xavar, 13:16)

25 As Comrie and Polinsky (1999: 83–84) put it in their paper on relativization in Tsez, another East Caucasian language, “[t]he precise nature of the relationship between the null and the head NP is determined by semantic linking rules which are probably language-specific. (…) The hearer has to assign a plausible interpretation to the association between the head NP and an unexpressed constituent in the attributive clause. (…) If a plausible interpretation can be assigned (…) then the resulting relative clause construction is judged acceptable”.

4.3 Focus clefts

For the purposes of information structuring, many East Caucasian languages use focus constructions, in which the (right edge of the) focused constituent is marked by the position of a predicative element, usually a copula, a person agreement clitic, a question particle or some other kind of marker (see Harris [2002: ch. 10], and also Kalinina and Sumbatova [2007] for a discussion of various strategies). In some Lezgic languages, e.g., in Tsakhur and Udi, the focalization of nonverbal constituents does not influence the form of the predicate; what changes is only the position of the focalizing element. By contrast, in such languages as Agul, Tabasaran, Lezgian and Rutul, the focus construction has a cleft or pseudocleft structure. In such sentences, the focused part is followed by a copula, and the rest of the sentence comprises the background part, headed by a participle (substantivized, absolutive singular). In example (47) from Rutul, the focused construction is used in a wh-question, although it is quite common in assertions as well.

(47) Rutul: focus cleft (corpus example)
va šiv=i haʔa-d?
you() what.= [do.-]
‘What are you doing?’

In focus constructions, the focus can precede the background, which corresponds to ordinary cleft sentences in English or French. Alternatively, the background can precede the focus, as in (48) and (49); the latter variant corresponds to what is known as a wh-cleft, or pseudo-cleft. In Lezgian and Tabasaran, as mentioned by Haspelmath (1993: 352) and Babaliyeva (2013: 240), respectively, the pseudo-cleft structure is more common, although in other languages, e.g., Agul, corpus data show that the ordinary cleft is preferred.

(48) Lezgian: focus pseudo-cleft (Haspelmath 1993: 352)
tezetdin.a-n k’ʷal-e awa-j-di q’ʷe küsri tːir.
[Tezetdin- house- .be--] two chair. :
‘What was in Tezetdin’s house were two chairs.’

(49) Tabasaran: focus pseudo-cleft (Babaliyeva 2013: 240)
dušʔin al-i-b sa-b χumurzag vu-ji.
[there .be--] one- water.melon. -
‘What was there was just a water-melon.’

26 The use of this relative clause type is very restricted in modern Nizh Udi, but seems to be much more common in Vartashen Udi, especially in older written texts (in Vartashen Udi, a borrowed Armenian subordinator te is used instead of ki), see Gippert (2011) for details.

5 Verbal compounds: from two words to one and back

5.1 The structure of verbal compounds

As already mentioned, in the absence of regular affixal verbal derivation in Lezgic languages, and given the comparatively small number of non-derived, simplex verb stems, verbal compounding is the chief way of creating new verb lexemes. Verbal compounds, also known as complex verbs, consist of a lexical part, or coverb, and a light verb. Light verbs host all verbal inflectional marking (tense, aspect, mood, polarity, agreement, etc.), and usually include high-frequency verbal lexemes with a generalized meaning, like ‘be, become’, ‘do, make’, ‘give’, ‘say’, ‘hit, beat’, ‘go, come’ and some others. Coverbs can be represented by nouns, adjectives, adverbs,
ideophones, and also nominal or verbal (or ‘acategorical’) bound stems which cannot function as autonomous words. The latter group is not homogeneous: thus, in some complex verbs, regular verb stems are used which otherwise serve as input for inflection (cf. the derivation of decausatives in Udi, section 3.1). Some bound coverbs, however, represent obsolete (and often phonologically truncated) verb roots, which are not used outside complex verbs, but have clear cognates among the simplex verbs in related languages. A selection of Budugh lexemes illustrating various types of complex verbs is given in (50). Note that it is very common to have intransitive/transitive pairs of complex verbs with the same coverb, but using the light verb ‘be, become’ in the former case and ‘do, make’ in the latter.27

(50) Budugh: complex verbs (Authier and Haciyev 2016: 3550–3554)
q’us jɨxar ‘be, grow old’ < ‘old’, adjective + ‘be, become’
dɨχ jɨxar ‘hurry’ < ‘quickly’, adverb + ‘be, become’
didekir jɨxar ‘be born’ < ‘from mother’, noun in the ablative case + ‘be, become’
qele jɨxar ‘be angry’ < ‘in anger’, noun in the locative case + ‘be, become’
k’ev jɨxar / siʔi ‘close/be closed’ < ‘strong’, adjective + ‘do’/‘be, become’
ǯidir jɨxar / siʔi ‘hide (tr./intr.)’ < ‘hide’, bound stem + ‘do’/‘be, become’
ispor suʔu ‘quarrel’ < ‘quarrel’, Russian noun + ‘do’
fikir suʔu ‘think’ < ‘thought’, Azeri noun + ‘do’
neʕ juc’u ‘smell (intr.)’ < ‘smell’, noun + ‘give’
jɨʁ ʕaqu ‘fast’ < ‘day’, noun + ‘keep’
ʕul ʕosu ‘wait’ < ‘eye’, noun + ‘put’

Verbal vocabulary is mostly expanded by creating new complex verbs with verbs borrowed from Azeri or Russian serving as coverbs (Russian and Azeri nouns can also occur in complex verbs, as examples from [50] show). Russian verbs are normally borrowed in the form of the infinitive, while the Azeri loans have the form of the perfect participle in -miš, cf. the Agul examples arganizawatː aq’as ‘organize’ (< Russian organizovať ‘organize’ + ‘do’) or jašamiš xas ‘live’ (< Azeri jašamiš ‘live’ + ‘become’). There seems to be no strict border between free syntactic combinations (or idioms, in case of non-compositional meaning) on one hand, and complex verbs on the other. On the cline from syntax to morphology, different groups of complex verbs occupy different positions. Still, most of them can be shown to be moving towards the morphological pole, becoming tighter and approaching simplex verbs in their behavior (see section 5.2). A development in the opposite direction, however, has occurred in Udi, where simplex verbs were reanalyzed as bipartite, probably by analogy with the historically bipartite complex verbs, which by far outnumber non-derived, monomorphemic verb stems. This reanalysis, briefly sketched in section 5.3, has led to the rise of a crosslinguistically rare type of clitics, known as endoclitics, i.e., clitics that can occur inside words.

27 For a summary of complex verb formation in a number of Lezgic languages, see in particular Maisak and Ganenkov (2016: 3580–3583) on Agul, Chumakina (2016: 3599–3600) on Archi, Authier and Haciyev (2016: 3550–3554) on Budugh, Alekseyev (2016: 3538) on Rutul, and Harris (2002: ch. 4; 2008) and Schulze (2016: 3569–3571) on Udi.

5.2 From complex to simplex verbs

Describing complex verbs in Archi, Chumakina (2016: 3600) states that “[s]yntactically, all types of complex verbs demonstrate the characteristics of a single word: the order of the parts is fixed (the lexical part is followed by the light verb) and the insertion of other lexical material between these parts is not, as a rule, allowed”. The same characterization is largely applicable to complex verbs in other Lezgic languages: on the whole, these lexical units follow the path of univerbation both on the syntactic and the morphological side. At the same time, as Maisak and Ganenkov (2016: 3582) claim for Agul, complex verbs do not represent a uniform class: “some of them are close to free syntactic combinations of verbs and object noun phrases, while others are lexicalized to a considerable degree and approach simplex verb stems”. Thus, the possibility of non-contact placement of the lexical part and the light verb is very restricted indeed, but still possible in specific contexts. For example, in Agul, the two parts can be separated when the lexical meaning of the verb is topicalized, and hence the coverb occurs on the left periphery, while the light verb occupies its usual position. See (51) with the verb ʜar-aq’as ‘teach, make known’ whose coverb ʜar is derived from a root ʜa- ‘know’.

(51) Agul: complex verb “know do” (Maisak and Ganenkov 2016: 3582)
ʜar zun gi-s aq’a-s-e…
know I() that- do.--
‘As for teaching, I will teach him …’

As far as the valence patterns of complex verbs are concerned, in those compounds that include a transitive light verb (e.g., ‘do’ in combinations like “help do” meaning ‘to help’), the coverb mostly occupies the position of a patientive noun phrase in the absolutive case. In languages with gender agreement, the nominal coverb retains its gender, controlling agreement on the light verb. However, there are complex verbs of the same structure, in which the coverb is rather an “incorporated” component which is different from the patientive noun phrase. Thus, in (52) from Agul, the former noun q’at’ ‘piece’ is no longer the patient of the verb ‘do’ in the compound verb q’at’-aq’as ‘cut, detach’ (< “piece do”), because another absolutive noun phrase jerχe č’arar ‘long hair’ stands in the patient position.

(52) Agul: complex verb “piece do” (Maisak and Ganenkov 2016: 3582)
ruš.a uč-in jerχe č’ar-ar q’at’ q’u-ne.
girl() self- long hair-. piece do.-
‘The girl cut off her long hair.’

Parts of compound verbs can become so tight morphologically that not just the verbal part, but the whole “coverb + light verb” complex can serve as an input for derivational processes. Thus, although the repetitive marker in Agul (see section 3.7) is usually prefixed to the verb stem, including light verbs, some compounds prefer to have it before the coverb. Such derivatives as qa-gunt’-aq’as ‘gather again’ (from gunt’-aq’as ‘gather’) or qa-un-aq’as ‘call again’ (from un-aq’as ‘call’) show that what started as combinations with the light verb aq’as ‘do’ are now close to non-segmentable verb stems. The coverbs ʜar in ʜar-aq’as ‘teach, make known’ and gunt’ in gunt’-aq’as ‘gather’ are also examples of bound elements which do not occur outside complex verbs. Similarly, some light verbs occurring in compounds are not used anymore as autonomous verbal predicates. For example, in Udi, apart from the verbs ‘do’, ‘become’ and ‘say’, a handful of other light verbs can be identified which are not found as free verbs in the modern language, and for which only tentative etymologies can be suggested (e.g., ‘go, come’, ‘give’, ‘cut’, among others, see Schulze [2016: 3569]). Finally, it is possible that the light verb disappears as a result of phonological reduction, so that former compounds cannot be formally identified as bipartite at all. This is what happened in Lezgian with complex verbs including the light verb awun ‘do’. Such verbs can occur in two forms: while in their full forms the verb is present, in the reduced form the root ‘do’ is not visible. Thus, for the verb k’ʷalaχ awun ‘work’ (< “work do”), the full form of the aorist will be k’ʷalaχ awu-na, and the reduced form is k’ʷalaχ-na, with the aorist ending in -na added directly to the coverb (Haspelmath 1993: 178).
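
The relation between the full and the reduced Lezgian forms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `reduce_light_verb` and the hyphen-separated string format are hypothetical, and only the aorist pair cited above comes from the source; other forms of awun ‘do’ are not modeled.

```python
# Illustrative sketch only: models the single pattern cited above
# (full form "coverb awu-ENDING" vs. reduced form "coverb-ENDING").
# The function name and the string format are assumptions made for
# this example, not part of any published analysis.

LIGHT_VERB_STEM = "awu"  # aorist stem of awun 'do', as in awu-na


def reduce_light_verb(full_form):
    """Drop the light verb stem and attach its ending to the coverb."""
    coverb, light_verb = full_form.rsplit(" ", 1)
    stem, ending = light_verb.split("-", 1)
    if stem != LIGHT_VERB_STEM:
        raise ValueError(f"unexpected light verb stem: {stem!r}")
    return f"{coverb}-{ending}"


print(reduce_light_verb("k’ʷalaχ awu-na"))  # k’ʷalaχ-na
```

The sketch treats reduction as a purely formal operation on the written form; it says nothing about when speakers choose the full or the reduced variant.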

5.3 From simplex to bipartite verb stems in Udi

The number of simplex verbs in Udi is small, and the set of verbs that historically include locative prefixes (see section 3.6) is not very large either: Maisak (2008) estimates the number of simplex verbs at about 50–60, and lists about 50 prefixal verbs. According to Schulze (2016: 3570), complex verbs account for more than 75% of all verbal lexemes in modern Udi. Complex verbs in Udi have the same general structure as complex verbs in related languages, but on the whole, they are tighter than their counterparts in other Lezgic languages.28 One of the reasons for this is that, as pointed out above, some of the light verbs are already obsolete as autonomous verbal lexemes, and their original meanings can be discovered only via etymological analysis. What is important is that the overwhelming majority of light verbs have a stem that consists of a single consonant, e.g., p- ‘say’, b- ‘do’, the causative light verb d- and so on. Only the light verb bak- ‘be, become’ has a CVC root structure. As already mentioned in section 3.9, the position of personal agreement markers in Udi is not restricted to the verb, and when they occur in a complex verb, they can be placed either as enclitics on the light verb, or between the two components of a compound (53); the latter position can be interpreted as enclitic to the lexical part. There are also other types of markers (in particular, the three negative morphemes and the additive clitic), whose position depends on their scope: like personal clitics, in complex verbs they can occur between the lexical part and the light verb.

(53) Udi: endoclitics inside complex verbs
a. äš=e=b-sa
   work=3=-
   ‘s/he works’
b. cam=ez=p-i
   write=1=-
   ‘I wrote’

What has attracted much attention in the literature is the ability of these clitics to occur inside simplex verb stems. In such cases, the verb stem is divided into two parts, the second of which comprises only the last consonant.29 The clitic is placed before this last consonant, breaking up the otherwise indivisible stem, like beˁʁ- ‘look, watch’ or akː- ‘see’ (54). Note that whereas the verbs in (53) consist of two meaningful parts (äš-b- ‘work’ is “work + do” and cam-p- ‘write’ is “writing + say”), the two parts of simplex verbs (e.g., beˁ- and -ʁ, or a- and -kː) do not have meanings of their own.

(54) Udi: endoclitics inside simplex verbs
a. beˁ=ne=ʁ-sa
   look1=3=look2-
   ‘s/he looks’
b. a=z=kː-i
   see1=1=see2-
   ‘I saw’
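
The placement pattern just described can be made concrete with a minimal Python sketch. It is an illustration only, under assumptions that go beyond the text: the function names, the representation of stems as lists of segments, the small vowel inventory and the use of `None` for stems that cannot be split are all hypothetical. The clitic form is passed in as given, so allomorphy is not modeled, and only one of the two possible positions in complex verbs (between the lexical part and the light verb) is shown.

```python
# Illustrative sketch only. Stems are lists of segments so that length
# marks and pharyngealization stay attached to their segment. Function
# names, the vowel inventory and the None return value are assumptions
# made for this example.

VOWELS = set("aeiouɨäöü")  # assumed inventory, enough for the examples


def is_vowel(segment):
    return segment[0] in VOWELS


def split_simplex(stem):
    """Return (first part, final consonant), or None if the stem cannot be split."""
    if len(stem) < 2:        # monoconsonantal stems such as b- 'do'
        return None
    if is_vowel(stem[-1]):   # no final consonant, e.g. bu 'exist, be located'
        return None
    if stem[-1] == "r":      # stems in final -r disallow endoclisis
        return None
    return "".join(stem[:-1]), stem[-1]


def endoclitic_in_simplex(stem, clitic, suffix):
    """Place the clitic before the last consonant of a simplex stem."""
    parts = split_simplex(stem)
    if parts is None:
        return None
    first, last = parts
    return f"{first}={clitic}={last}-{suffix}"


def clitic_in_complex(lexical, light_verb, clitic, suffix):
    """Place the clitic between the lexical part and the light verb."""
    return f"{lexical}={clitic}={light_verb}-{suffix}"


print(clitic_in_complex("äš", "b", "e", "sa"))               # äš=e=b-sa
print(endoclitic_in_simplex(["b", "eˁ", "ʁ"], "ne", "sa"))   # beˁ=ne=ʁ-sa
print(endoclitic_in_simplex(["a", "kː"], "z", "i"))          # a=z=kː-i
print(endoclitic_in_simplex(["b", "u"], "ne", "sa"))         # None
```

Both functions insert the clitic before the final consonantal element, which makes the formal parallel between complex and simplex verbs easy to see.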

28 A number of arguments showing that complex verbs in Udi are single words can be found in Harris (2002: 76–87).
29 This means that for certain types of verb stems the division into two parts is not allowed. In particular, this is impossible for monoconsonantal stems like b- ‘do’ or p- ‘say’, or for stems with a CV structure like bu ‘exist, be located’ or bi- ‘die’. Stems with a final -r also disallow ‘endoclisis’ (for details on restrictions on the placement of personal markers, see Harris [2002: ch. 6] and Maisak [2015]).


Harris (2000, 2002) claimed that the existence of clitics that can occur inside words (‘endoclitics’) posits a problem to those linguistic theories that adhere to the principle of Lexical Integrity. According to this principle, the morphological composition of words is not accessible to the rules of syntax: however, the person agreement markers in Udi are clitics, and simplex verb stems are morphological objects. Later, a number of accounts appeared that sought to explain away the problem of endoclitics in various ways. For example, according to Luís and Spencer (2005), whose approach follows the theory of Generalized Paradigm Function Morphology, person markers should be regarded as a special type of affixes (‘phrasal affixes’), which are not syntactic but morphological objects, although they are placed with respect to syntactic categories. However, what is most noteworthy for the present discussion is the very fact that at a certain stage in the history of Udi, it became possible to divide verb stems into two parts.30 As a possible explanation, Harris (2002: 213–215) puts forward the ‘Slot hypothesis’, Udi endoclitics “occur in ancient ‘slots’ occupied by the markers of gender-class (CMs) in Proto-Lezgian”. However, it is not the case that personal agreement markers just ‘inherit’ the infixal slots of gender markers in all verbs: as Harris (2002: 219–221) acknowledges herself, for many verbs the hypothesis makes wrong predictions, so that one has to assume the change of gender marker position in individual verbs through reanalysis or extension. Moreover, gender agreement in Old Udi was most likely lost well before the spread of endoclisis: for example, no Lezgic-type gender agreement is attested in the Caucasian Albanian palimpsests. All of these facts together suggest that the ‘bipartition’ of simplex verb stems should rather be regarded as a process independent of the use of infixes for gender agreement. 
³⁰ This development is comparatively recent: e.g., Gippert et al. (2008, II: 52) note that in Caucasian Albanian palimpsests (written between the 7th and the 10th century A.D.) endocliticization of person markers “is still extremely rare, if present at all”.

It is probably not a coincidence that both in simplex and in complex verbs endoclitics occupy the position before the last consonant: in the case of complex verbs, this is due to the fact that most light verb stems are monoconsonantal. In fact, many complex verbs in modern Udi are hardly distinguishable from simplex verbs: a deep derivational or even etymological analysis is needed to understand whether the last consonant of a verb stem is a historical light verb or part of the root. One can thus hypothesize that the real reason underlying the divisibility of simplex verbs is the extension of the pattern “V-, VC- or CVC- lexical part + C- light verb” from complex verbs to all other verbal lexemes. That is, by analogy with complex verbs, which by far outnumber simplex ones, the latter began to be interpreted as including a detachable monoconsonantal (and meaningless) “pseudo-light verb”. It is this ‘convergence’ of complex and simplex verbs in Udi that has presumably led to the possibility of treating simplex verb stems as bipartite and, from the perspective of grammaticalization, resulted in a type of development that is the reverse of the much more common univerbation process.

Discussion and summary

As in many other languages, the grammatical developments in the Lezgic branch of the East Caucasian family include some that are crosslinguistically very common and others that are comparatively rare. Among them, there are instances of grammaticalization that are shared by related branches of the family, as well as those that are more typical of the Lezgic languages or even restricted to them.³¹ Those grammaticalization paths that are attested in many languages outside the Caucasus and are also typical of other East Caucasian languages include: the use of the numeral ‘one’ as an emerging indefinite article, the development of matrix verbs ‘do’ and ‘give’ into causative auxiliaries, the auxiliation of copulas and locative verbs in tense and aspect constructions, the use of the motion verb ‘come’ in an adhortative construction, the grammaticalization of ‘say’ as a quotative, complementizer or hearsay marker, the evolution of conditional/concessive verb forms into indefinite pronoun series markers, and the rise of subject agreement clitics from personal pronouns, among other things. By contrast, such developments as the grammaticalization of ‘say’ into an ordinal numeral marker, or the use of the verb ‘find’ as a modal/evidential auxiliary seem to be infrequent in the world’s languages, though they are quite common in the East Caucasian family. As rarissima in the world’s languages, including the East Caucasian family as a whole and the Lezgic group in particular, I would classify the rise of the morphological ‘verificative’ in Agul and Archi, as well as the reanalysis of simplex verbs as bipartite, which led to the phenomenon of ‘endoclisis’ in Udi.
³¹ Unfortunately, there are no detailed overviews of grammaticalization phenomena in East Caucasian languages as a whole or in individual groups (apart from the present sketch), so my estimates on this point are only tentative.

For some of the grammaticalization patterns, an external or ‘intragenetic’ contact-induced origin is very plausible: the use of the auxiliary ‘find’ in modal/evidential constructions in Archi must be due to the influence of Avar, the spread of the repetitive prefix in southern Agul dialects is most probably a copy from Lezgian, and the rise of the present tense from the dative/locative form of the infinitive in Udi seems to be influenced by the genetically unrelated languages of the area. There are a number of material borrowings (‘global copies’, in Lars Johanson’s [2002] terms) from the Turkic language Azeri, with which most Lezgic languages have been in long-standing contact, cf. the borrowed derivational affixes -suz, -lu, -či, conjunctions amma, ägär, ja… ja… and so on (some of these items are ultimately of Arabic or Persian origin, but have probably entered the Lezgic languages via Azeri). Plenty of parallels between Lezgic languages and Azeri can be observed in grammatical structure as well, e.g., the use of ‘come’ as an auxiliary in adhortative constructions, the development of a conditional copula into an indefinite pronoun marker, or the existence of a subject agreement system. When such parallels are not only functional, but also material (cf. the use of the Azeri conditional enclitic -sa in the Udi indefinite pronouns), or when they are restricted to just one or two languages (cf. the ‘locative’ present or the Differential Object Marking in Udi), contact influence can be plausibly suspected. However, in other cases there is no reason to prefer contact influence to independent development, or to the existence of much more global areal patterns whose exact origin can hardly be established (see especially Stilo [2015] on the area which subsumes at least the southern Lezgic idioms).

To summarize, in the present sketch I have described or at least mentioned the following instances of grammaticalization, which, of course, do not represent an exhaustive “lexicon of grammaticalization” of the Lezgic languages.
(Below, I follow the “source to target” order, given that in the main body of the paper the phenomena were considered by their target domains):

Nominal and pronominal sources
‘Time’ > temporal converb ‘when’
‘This’, ‘that’ (demonstratives) > definite determiner
‘One’ (numeral) > indefinite determiner
‘One’ (doubled) > reciprocal pronoun
Reflexive (doubled) > local reflexive
Reflexive > resumptive
Person pronouns > personal agreement clitics
Adelative (locative case) > subject of the verb ‘can’
Adelative (locative case) > causee of transitive verbs
Adelative (locative case) > involuntary agent
Dative > infinitive/purposive marker

Verbal sources
‘Being at smb.’s possession’ (converb) > comitative case
‘Be, become’ (auxiliary) > possibilitive modal (‘can, be able’, ‘may, be allowed’)
‘Come’ (auxiliary) > adhortative (‘let’s…’)
‘Do, make’ > causative auxiliary/light verb
‘Do, make’ (auxiliary) > iterative
‘Give’ > causative auxiliary/light verb
‘Go’ > detransitive light verb
‘Find, be found’ (auxiliary) > direct evidence
‘Find, be found’ (auxiliary, in the future tense) > presumptive inference, assumption
‘Say’ (present/habitual) > hearsay (reportative)
‘Say’ (converb) > quotative marker
‘Say’ (converb) > complementizer
‘Say’ (converb) > subordinator (purpose, reason)
‘Say’ (participle) > ordinal numeral marker
‘Say’ > light verb (in complex verbs)
‘See’ (with indirect question complement) > ‘check, find out’ > verificative marker
‘Stay, remain’ (auxiliary) > continuative
Auxiliary in the past > “retrospective shift” marker
Copula/‘be’/‘want’ in the conditional/concessive > indefinite pronoun series marker
Converb/participle in the elative case > temporal converb ‘when’, ‘after’
Infinitive in the genitive case > action nominal
Present > future/habitual/‘historical present’
Present > subjunctive

Adverbial sources
Locative adverbs > locative prefixes on verbs
Locative adverbs > locative suffixes on nouns

Other sources
Additive (‘also, even’) > negative pronoun series marker
Additive (‘also, even’) > collective noun phrase marker
Comparative marker ‘as, like’ > temporal converb ‘as soon as’
Locative prefix (‘up’, ‘down’) > perfective marker
Locative prefix ‘behind’ > repetitive marker
Reduplication (partial/full) > distributivity (in numerals)

Constructions
Converb (perfective) + copula > aorist (perfective past)
Converb (perfective) + existential verb > resultative / perfect / unwitnessed past
Participle (perfective) + copula > experiential/existential perfect
Converb (imperfective) + copula > habitual
Converb (imperfective) + existential verb > progressive / general present
Participle (imperfective) + copula > generic present / future
Infinitive + copula > prospective / future
Infinitive + past copula > past prospective / irrealis
Infinitive in the dative/locative case > general present
Coverb + light verb > complex verb
Biabsolutive construction
Focus cleft construction
Personal pronoun doubling construction

Lexicalizations
Lexicalization (fossilization) of gender markers in verbs and nouns
Lexicalization (fossilization) of plural markers in nouns
Lexicalization (fossilization) of locative prefixes in verbs
Lexicalization of complex verbs
‘Be, become’ in the future > ‘maybe, probably’ (epistemic parenthetical)
‘Find’ in the future > ‘maybe, probably’ (epistemic parenthetical)

‘Reversed’ univerbation
Simplex verb > bipartite “stem1 + stem2” verb

Acknowledgements

I wish to thank both editors and an anonymous reviewer for their comments, and Samira Verhees for her kind help with my English. Any remaining errors are my own.

Abbreviations 1 = 1st person singular, 1: = 1st person singular, agentive, 2 = 2nd person singular, 2 = 2nd person plural, 3 = 3rd person singular, , , ,  = nominal classes (genders),  = absolutive,  = additive,  = adverbial,  = aorist,  = ‘near’ localization,  = attributive,  = class (gender),  = comitative,  = complementizer,  = conditional,  = converb,  = copula,  = dative,  = detransitive,  = elative,  = ergative,  = “éventuel” (verb form),  = evidential,  = exclusive,  = feminine,  = future, : = potential future,  = genitive,  = habitual,  = hortative,  = imperative,  = ‘inside’ localization,  = infinitive,  = instrumental,  = imperfective,  = lative,  = light verb,  = manner (converb),  = neutral,  = negation,  = oblique,  = perfective,  = plural,  = ‘behind’ localization,  = prohibitive,  = perfect,  = present,  = preverb,  = past,  = particle,  = participle,  = question,  = repetitive,  = reportative,  = substantivizer,  = part of verbal stem (separated by clitics),  = ‘under’ localization, / = ‘under’/‘in contact’ localization,  = ‘on’ localization,  = temporal (converb),  = temporal (case),  = verificative. For unification reasons, the transcription and glosses in examples cited from others’ works have been changed or adapted; the glosses were added in case the original did not have them.

References

Aikhenvald, Alexandra. 2004. Evidentiality. Oxford: Oxford University Press.
Alekseev, Mikhail E. 1985. Voprosy sravniteľno-istoričeskoj grammatiki lezginskix jazykov. Morfologija. Sintaksis [The problems of comparative-historical reconstruction of the grammar of Lezgic languages: Morphology. Syntax]. Moscow: Nauka.
Alekseev, Mikhail E. 1994a. Rutul. In Rieks Smeets (ed.), The Indigenous languages of the Caucasus. Vol. 3. North East Caucasian languages. Part 2, 213–258. Delmar N.Y.: Caravan.
Alekseev, Mikhail E. 1994b. Budukh. In Rieks Smeets (ed.), The Indigenous languages of the Caucasus. Vol. 3. North East Caucasian languages. Part 2, 259–296. Delmar N.Y.: Caravan.
Alekseev, Mikhail E. 2003. Sravniteľno-istoričeskaja morfologija naxsko-dagestanskix jazykov. Kategorii imeni [Comparative-historical morphology of Nakh-Daghestanian languages. Nominal categories]. Moscow: Academia.
Alekseev, Mikhail E. & Ja. G. Testelets. 1996. “Severokavkazskij ètimologičeskij slovar’” i perspektivy kavkazskoj komparativistiki [“A North Caucasian etymological dictionary” and the perspective of Caucasian comparative-historical studies]. Izvestija Akademii Nauk. Serija literatury i jazyka 55(5). 3–18.
Alekseyev [Alekseev], Mikhail. 2016. Rutul. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation: An international handbook of the languages of Europe (HSK Handbücher zur Sprach- und Kommunikationswissenschaft 40/5), 3536–3545. Berlin: De Gruyter.
Arkadiev, Peter & Timur Maisak. 2018. Grammaticalization in the North Caucasian languages. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 116–145. Oxford: Oxford University Press.
Authier, Gilles. 2009. Grammaire Kryz (Langue caucasique d’Azerbaïdjan, dialecte d’Alik). Leuven–Paris: Peeters.
Authier, Gilles. 2010. Azeri morphology in Kryz (East Caucasian). Turkic Languages 14(1). 14–42.
Authier, Gilles. 2012. The detransitive voice in Kryz. In Gilles Authier & Katharina Haude (eds.), Voice, valency, and ergativity, 133–163. Berlin: Mouton de Gruyter.
Authier, Gilles & Adigözel Haciyev. 2016. Budugh. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation: An international handbook of the languages of Europe (HSK Handbücher zur Sprach- und Kommunikationswissenschaft 40/5), 3546–3563. Berlin: De Gruyter.
Babaliyeva, Ayten. 2013. Études sur la morphosyntaxe du tabasaran littéraire. Paris: Thèse de doctorat, École pratique des hautes études.
Becker, Laura. 2018. Articles in the world’s languages. Ph. D. diss., Universität Leipzig.
Bogomolova, Natalia K. 2012. Ličnoe soglasovanie v tabasaranskom jazyke: konceptualizator i ego adresat v strukture situacii [Person agreement in Tabasaran: the conceptualizer and the addressee in the structure of the situation]. Voprosy jazykoznanija 4. 101–124.
Bogomolova, Natalia. 2018. The rise of person agreement in East Lezgic: Assessing the role of frequency. Linguistics 56(4). 819–844.
Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect and modality in the languages of the world. Chicago: University of Chicago Press.
Chirikba, Viacheslav. 2008. The problem of the Caucasian Sprachbund. In Pieter Muysken (ed.), From linguistic areas to areal linguistics, 25–94. Amsterdam: John Benjamins.
Chumakina, Marina. 2016. Archi. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation: An international handbook of the languages of Europe (HSK Handbücher zur Sprach- und Kommunikationswissenschaft 40/5), 3595–3604. Berlin: De Gruyter.
Comrie, Bernard. 1999. Spatial cases in Daghestanian languages. Sprachtypologie und Universalienforschung 52(2). 108–117.
Comrie, Bernard & Maria Polinsky. 1998. The great Daghestanian case hoax. In Anna Siewierska & Jae Jung Song (eds.), Case, typology and grammar, 95–114. Amsterdam & Philadelphia: John Benjamins.
Comrie, Bernard & Maria Polinsky. 1999. Form and function in syntax: relative clauses in Tsez. In Michael Darnell, Edith A. Moravcsik, Frederick J. Newmeyer, Michael Noonan & Kathleen M. Wheatley (eds.), Functionalism and formalism in linguistics, Volume II: Case studies (Studies in Language Companion Series 42), 77–92. Amsterdam & Philadelphia: John Benjamins.
Daniel, Michael & Dmitry Ganenkov. 2008. Case marking in Daghestanian: limits of elaboration. In Andrej Malchukov & Andrew Spencer (eds.), The Oxford handbook of case, 668–685. Oxford: Oxford University Press.
Daniel, Michael A. & Timur A. Maisak. 2014. Grammatikalizacija verifikativa: ob odnoj aguľsko-arčinskoj paralleli [The grammaticalization of the verificative: on an Agul-Archi parallel]. In Mikhail A. Daniel, Ekaterina A. Ljutikova, Vladimir A. Plungjan, Sergej G. Tatevosov & Oľga V. Fedorova (eds.), Jazyk. Konstanty. Peremennye: Pamjati Aleksandra Evgen’eviča Kibrika [Language. Constants. Variables: In memory of Alexandr Evgen’evič Kibrik], 377–406. St. Petersburg: Aletejja.
Daniel, Michael, Timur Maisak & Solmaz Merdanova. 2012. Causatives in Agul. In Pirkko Suihkonen, Bernard Comrie & Valery Solovyev (eds.), Argument structure and grammatical relations: A crosslinguistic typology, 55–114. Amsterdam: John Benjamins.
Forker, Diana. 2012. The bi-absolutive construction in Nakh-Daghestanian. Folia Linguistica 46(1). 75–108.
Forker, Diana. 2017. Ergativity in Nakh-Daghestanian. In Jessica Coon, Diane Massam & Lisa Travis (eds.), The Oxford handbook of ergativity, 851–872. Oxford: Oxford University Press.
Ganenkov, Dmitry, Timur Maisak & Solmaz Merdanova. 2008. Non-canonical Agent marking in Agul. In Helen de Hoop & Peter de Swart (eds.), Differential subject marking (Studies in Natural Language and Linguistic Theory 72), 173–198. Dordrecht: Springer.
Ganenkov, Dmitry, Timur Maisak & Solmaz Merdanova. 2009. Diskursivnaja anafora v aguľskom jazyke [Discourse anaphora in Agul]. In NeFestšrift: Staťi v podarok (k jubileju A. E. Kibrika) [Not-a-festschrift: Papers presented on occasion of A. E. Kibrik’s birthday]. Online edition, March 2009. http://otipl.philol.msu.ru/~kibrik/content/pdf/Ganenkov_etal.pdf (accessed 01 April, 2020)
Gippert, Jost. 2011. Relative Clauses in Vartashen Udi: Preliminary Remarks. Iran and the Caucasus 15. 207–230.
Gippert, Jost, Wolfgang Schulze, Zaza Aleksidze & Jean-Pierre Mahé. 2008. The Caucasian Albanian Palimpsests of Mount Sinai. 2 vols. Turnhout: Brépols.
Harris, Alice C. 2000. Where in the word is the Udi clitic? Language 76. 593–616.
Harris, Alice C. 2002. Endoclitics and the origins of Udi morphosyntax. Oxford: Oxford University Press.
Harris, Alice C. 2003a. The prehistory of Udi locative cases and locative preverbs. In Dee Ann Holisky & Kevin Tuite (eds.), Current trends in Caucasian, East European, and Inner Asian linguistics: Papers in honor of Howard I. Aronson, 177–192. Amsterdam & Philadelphia: John Benjamins.
Harris, Alice C. 2003b. Preverbs and their origins in Georgian and Udi. In Geert E. Booij & Jaap van Marle (eds.), Yearbook of morphology 2003, 61–78. Dordrecht: Springer.
Harris, Alice C. 2008. Light verbs as classifiers in Udi. Diachronica 25(2). 213–241.
Haspelmath, Martin. 1993. A grammar of Lezgian (Mouton Grammar Library 9). Berlin: Mouton de Gruyter.
Haspelmath, Martin. 1997. Indefinite pronouns. Oxford: Clarendon Press.
Haspelmath, Martin. 1998. The semantic development of old presents: New futures and subjunctives without grammaticalization. Diachronica 15(1). 29–62.
Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Johanson, Lars. 2002. Contact-induced linguistic change in a code-copying framework. In Mari C. Jones & Edith Esch (eds.), Language change: The interplay of internal, external and extralinguistic factors, 285–313. Berlin: Mouton de Gruyter.


Kalinina, Elena & Nina Sumbatova. 2007. Clause structure and verbal forms in Nakh-Daghestanian languages. In Irina Nikolaeva (ed.), Finiteness: theoretical and empirical foundations, 184–249. Oxford: Oxford University Press.
Kassian, Alexei. 2015. Towards a formal genealogical classification of the Lezgian languages (North Caucasus): testing various phylogenetic methods on lexical data. PLoS ONE 10(2). http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0116950 (accessed 01 April, 2020)
Khanmagomedov, Bejdullakh G.-K. 1970. Očerki po sintaksisu tabasaranskogo jazyka [Studies in Tabasaran syntax]. Maxačkala: Dagučpedgiz.
Kibrik, Aleksandr E. 1975. Nominativnaja / èrgativnaja konstrukcija i logičeskoe udarenie v arčinskom jazyke [Nominative/ergative construction and logical accent in Archi]. In Vladimir A. Zvegincev (ed.), Issledovanija po strukturnoj i prikladnoj lingvistike. Vyp. 7, 54–62. Moscow: Moscow State University.
Kibrik, Aleksandr E. 1977a. Opyt strukturnogo opisanija arčinskogo jazyka. T. 2. Taksonomičeskaja grammatika [A structural description of the Archi language. V. 2. Taxonomic grammar]. Moscow: Moscow State University.
Kibrik, Aleksandr E. 1977b. Opyt strukturnogo opisanija arčinskogo jazyka. T. 3. Dinamičeskaja grammatika [A structural description of the Archi language. V. 3. Dynamic grammar]. Moscow: Moscow State University.
Kibrik, Aleksandr E. 1997. Beyond subject and object: Toward a comprehensive relational typology. Linguistic Typology 1. 279–346.
Kibrik, Aleksandr E. 2003. Nominal inflection galore: Daghestanian, with side glances at Europe and the world. In Frans Plank (ed.), Noun phrase structure in the languages of Europe, 37–112. Berlin: Mouton de Gruyter.
Kibrik, Aleksandr E. & Mikhail G. Seleznev. 1982. Sintaksis i morfologija glagoľnogo soglasovanija v tabasaranskom jazyke [Syntax and morphology of Tabasaran verbal agreement]. In Tabasaranskie ètjudy [Tabasaran essays], 17–33. Moscow: Moscow State University.
Kibrik, Aleksandr E. & Jakov G. Testelec (eds.). 1999. Elementy caxurskogo jazyka v tipologičeskom osveščenii [Aspects of Tsakhur from a typological perspective]. Moscow: Nasledie.
Kibrik, Aleksandr E., Konstantin I. Kazenin, Ekaterina A. Ljutikova & Sergej G. Tatevosov (eds.). 2001. Bagvalinskij jazyk. Grammatika. Teksty. Slovari [Bagwalal: Grammar. Texts. Dictionaries]. Moscow: Nasledie.
Kittilä, Seppo. 2005. Remarks on involuntary agent constructions. Word 56(3). 381–419.
Korjakov, Yurij B. 2006. Atlas kavkazskix jazykov [The Atlas of Caucasian languages]. Moscow: Institute of Linguistics of the Russian Academy of Sciences.
Krishnamurti, Bhadriraju. 2003. The Dravidian Languages. Cambridge: Cambridge University Press.
Luís, Ana & Andrew Spencer. 2005. Udi clitics: A Generalized Paradigm Function Morphology approach. Essex Research Reports in Linguistics 48. 35–47.
Luk’an exlətbi Mŭq Xavar [The Good News according to Luke]. Bakı, 2011.
Magometov, Aleksandr A. 1965. Tabasaranskij jazyk: Issledovanie i teksty [The Tabasaran language: analysis and texts]. Tbilisi: Mecniereba.
Magometov, Aleksandr A. 1970. Aguľskij jazyk: Issledovanie i teksty [The Agul language: analysis and texts]. Tbilisi: Mecniereba.
Maisak, Timur A. 2008. Glagoľnaja paradigma udinskogo jazyka (nidžskij dialekt) [Verbal paradigm in Udi (Nizh dialect)]. In Mikhail E. Alekseev, Timur A. Majsak, Dmitrij S. Ganenkov & Jurij A. Lander (eds.), Udinskij sbornik: Grammatika, leksika, istorija jazyka [Studies in Udi: Grammar, lexicon, history of the language], 96–161. Moscow: Academia.


Maisak, Timur. 2011. The Present and the Future within the Lezgic tense and aspect systems. In Gilles Authier & Timur Maisak (eds.), Tense, aspect, modality and finiteness in East Caucasian languages (Diversitas Linguarum 30), 25–66. Bochum: Brockmeyer.
Maisak, Timur A. 2015. Pozicija ličnyx klitik v udinskom jazyke po korpusnym dannym [Position of person clitics in Udi: a corpus-based perspective]. In Ekaterina A. Ljutikova, Anton V. Cimmerling & Maria B. Konošenko (eds.), Tipologija morfosintaksičeskix parametrov. Vyp. 2. Materialy meždunarodnoj konferencii “Tipologija morfosintaksičeskix parametrov 2015” [Typology of morphosyntactic parameters, 2: Proceedings of the international conference], 243–265. Moscow: MPGU.
Maisak, Timur A. 2016a. Tipologičeskoe, vnutrigenetičeskoe i areaľnoe v grammatikalizacii: dannye lezginskix jazykov [Typological, intragenetic and areal in grammaticalization: the Lezgic data]. Acta Linguistica Petropolitana 12(1). 588–618.
Maisak, Timur. 2016b. Morphological fusion without syntactic fusion: the case of the “verificative” in Agul. Linguistics 54(4). 815–870.
Maisak, Timur. 2016c. Subject pronoun doubling in Agul: spoken corpus data on a rare discourse pattern. Studies in Language 40(4). 955–987.
Maisak, Timur. 2019a. Borrowing from an unrelated language in support of intragenetic tendencies: the case of the conditional clitic -sa in Udi. Diachronica 36(3). 337–383.
Maisak, Timur. 2019b. Repetitive prefix in Agul: morphological copy from a closely related language. International Journal of Bilingualism 23(2). 486–508.
Maisak, Timur. Forthcoming. Structural and functional variations of the perfect in Lezgic languages. In Kristin Melum Eide & Marc Fryd (eds.), The Perfect Volume. Amsterdam: John Benjamins.
Maisak, Timur & Dmitry Ganenkov. 2016. Aghul. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation: An International Handbook of the Languages of Europe (HSK Handbücher zur Sprach- und Kommunikationswissenschaft 40/5), 3579–3594. Berlin: De Gruyter.
Makhmudova, Svetlana M. 2001. Morfologija rutuľskogo jazyka [Morphology of the Rutul language]. Moscow: Sovetskij pisateľ.
Merdanova, Solmaz R. 2004. Morfologija i grammatičeskaja semantika aguľskogo jazyka: na materiale xpjukskogo govora [Morphology and grammatical semantics of Agul: on the data from the Huppuq’ dialect]. Moscow: Sovetskij pisateľ.
Mithun, Marianne. 2008. The extension of dependency beyond the sentence. Language 84. 69–119.
Nasledskova, Polina L. 2019. Artikli v rutuľskom jazyke [Articles in Rutul]. Typology of Morphosyntactic Parameters 2(2). 79–94. http://tmp.sc/application/files/6415/7944/5488/Nasledskova-2019-2-2.pdf (accessed 01 April, 2020)
Nichols, Johanna. 2003. The Nakh-Daghestanian consonant correspondences. In Dee Ann Holisky & Kevin Tuite (eds.), Current Trends in Caucasian, East European, and Inner Asian Linguistics: Papers in Honor of Howard I. Aronson, 207–251. Amsterdam & Philadelphia: John Benjamins.
Nikolayev, Sergei L. & Sergei A. Starostin. 1994. A North Caucasian etymological dictionary. Moscow: Asterisk.
Panova, Anastasija B. 2018. Složnye predikaty s èlementom -dzyšča- v abazinskom jazyke: meždu morfologiej i sintaksisom [Complex predicates with the element -dzyšča- in Abaza: between morphology and syntax]. In Ksenija P. Semёnova (ed.), Malye jazyki v bol’šoj lingvistike. Sbornik trudov konferencii 2017 [Small languages in big linguistics. Proceedings of the 2017 conference], 167–173. Moscow: Buki Vedi.
Plungian, Vladimir A. 2001. The place of evidentiality within the universal grammatical space. Journal of Pragmatics 33(3). 349–357.


Plungian, Vladimir A. & Johan van der Auwera. 2006. Towards a typology of discontinuous past marking. Sprachtypologie und Universalienforschung 59. 317–349.
Robbeets, Martine. 2017. The development of finiteness in the Transeurasian languages. Linguistics 55(3). 489–523.
Schulze, Wolfgang. 1992. How can class markers petrify? Towards a functional diachrony of morphological subsystems. In Howard I. Aronson (ed.), Linguistic studies in the non-Slavic languages of the USSR, 189–233. Chicago: Chicago Linguistic Society.
Schulze, Wolfgang. 2001. The Udi Gospels: Annotated text, etymological index, lemmatized concordance. München: Lincom Europa.
Schulze, Wolfgang. 2011. The origins of personal agreement clitics in Caucasian Albanian and Udi. In Manana Tandaschwili & Zakaria Pourtskhvanidze (eds.), Folia Caucasica, FS für Jost Gippert zum 55. Geburtstag, 119–168. Frankfurt a. M. / Tbilisi: Universität Frankfurt / Staatliche Ivane-Javakhishvili-Universität Tbilisi.
Schulze, Wolfgang. 2014. The emergence of diathesis markers from MOTION concepts. In Javier E. Díaz-Vera (ed.), Metaphor and metonymy across time and cultures: Perspectives on the sociohistorical linguistics of figurative language (Cognitive Linguistics Research 52), 171–223. Berlin: De Gruyter.
Schulze, Wolfgang. 2016. Udi. In Peter O. Müller, Ingeborg Ohnheiser, Susan Olsen & Franz Rainer (eds.), Word-Formation: An international handbook of the languages of Europe (HSK Handbücher zur Sprach- und Kommunikationswissenschaft 40/5), 3564–3578. Berlin: De Gruyter.
Schulze-Fürhoff, Wolfgang. 1994. Udi. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Vol. 3. North East Caucasian languages. Part 2, 447–514. Delmar N.Y.: Caravan.
Shushurin, Philip. 2017. The oblique causer construction in Lezgian. Acta Linguistica Petropolitana (Trudy Instituta lingvističeskix issledovanij RAN) XIII(1). 830–861.
Sulejmanov, Nadir D. 1993. Sravniteľno-istoričeskoe issledovanie dialektov aguľskogo jazyka [A comparative-historical study of Agul dialects]. Makhachkala: DNC RAN.
Stilo, Donald L. 2015. An introduction to the Atlas of the Araxes-Iran Linguistic Area. In Bernard Comrie & Lucía Golluscio (eds.), Language Contact and Documentation. Contacto lingüístico y documentación, 343–355. Berlin etc.: Walter de Gruyter.
Tatevosov, Sergei G. 2000. Metafizika dviženija v grammatike estestvennogo jazyka: glagoľnaja prefiksacija v severokavkazskix jazykax [The metaphysics of motion in natural language grammar: verbal prefixation in North Caucasian]. Vestnik Moskovskogo universiteta. Serija 9: Filologija 6. 14–29.
Tatevosov, Sergei. 2001. From resultatives to evidentials: Multiple uses of the perfect in Nakh-Daghestanian languages. Journal of Pragmatics 33. 443–464.
Tatevosov, Sergei. 2005. From habituals to futures: Discerning the path of diachronic development. In Henk J. Verkuyl, Henriette De Swart & Angeliek Van Hout (eds.), Perspectives on Aspect, 181–197. Dordrecht: Springer.
Tuite, Kevin. 1999. The myth of the Caucasian Sprachbund: the case of ergativity. Lingua 108. 1–29.
Van den Berg, Helma. 2005. The East Caucasian language family. Lingua 115. 147–190.
Zagirov, Zagir M., Velibek M. Zagirov, Kazi K. Kurbanov, Bejdullax G.-K. Khanmagomedov & Kim T. Šalbuzov. 2014. Sovremennyj tabasaranskij jazyk [Modern Tabasaran]. Maxačkala: IJaLI DNC RAN.
Žirkov, Lev I. 1948. Tabasaranskij jazyk [Tabasaran]. Moscow & Leningrad: AN SSSR.

Juha Janhunen

7 Grammaticalization in Uralic as viewed from a general Eurasian perspective  Background: The Uralic family Chronologically and geographically, Uralic is one of the old and widespread language families of Eurasia, though by the number of languages it is rather a mediumsized family with about 40–50 distinct languages (depending on the criteria of counting). In historical times, the Uralic languages have formed an almost continuous horizontal belt extending from central Scandinavia (southern Lapland) in the west to the Baikal region (southern central Siberia) in the east. In the vertical direction, the territory covered by the Uralic languages comprises also the Carpathian basin, or Pannonia (Hungary), in the southwest and the Taimyr peninsula (northern central Siberia) in the northeast. There are reasons to assume that the language family spread primarily from east to west, and secondarily from south to north, with occasional tertiary offshoots in other directions. It is possible that the general eastto-west expansion had started in Pre-Proto-Uralic times, meaning that there may have been subsequently extinct Para-Uralic languages once spoken to the east of the easternmost historically documented Uralic languages (Janhunen 2013). The internal taxonomy of Uralic, based on both lexical and grammatical evidence, also suggests a gradual differentiation in the east-to-west direction, with each subsequent branch becoming increasingly shallower towards the west (see map 1). Although still occasionally contested, this taxonomy would imply that Uralic was initially divided into two major branches, Samoyedic in the east and FinnoUgric in the west. The geographical boundary between these branches goes along the watershed between the Ob and Yenisei basins, suggesting that the Uralic homeland may have been located in the southern part of this watershed, that is, in southern Siberia and/or northern Kazakhstan. 
The later divisions of Finno-Ugric produced, in an order from east to west, the Ugric, Permic, Mariic, Mordvinic, Finnic and Saamic branches. Among these, the status and internal coherence of the Ugric branch is still open to reanalysis. Although arguments have been brought in favour of a Ugric “unity” (Honti 1979, 1998), it also seems that there was an early internal division between Khanty (Khantic) and Mansi-Hungarian (Hungaric), while the features common to Khanty and Mansi, often lumped together as “Ob-Ugric”, are likely to be secondary. Judging by historical, dialectological and lexicological information, the internal division of most branches of Uralic took place somewhere between 2000 and 1000 years ago, that is, in the period extending, in terms of archaeology and cultural history, from the Iron Age to the Middle Ages. At the far end of this time scale there are the Saamic, Finnic, Mordvinic and Samoyedic branches, with an unambiguous Iron Age origin, while at the near end there are the Mariic and Permic branches, with a mediaeval or even later origin. The division of the alleged Ugric branch, if taxonomically real, must, however, have been a more complex phenomenon. The internal differentiation of Mansi (Mansic) and Khanty (Khantic) would seem to presuppose an early mediaeval time, while the separation of Hungarian from Mansi must have taken place much earlier, that is, in the Bronze Age or early Iron Age. The separation of Khanty from Mansi-Hungarian would have to have been an even earlier phenomenon. A conclusive dating will require a careful evaluation of the inherited vs. areally transmitted parallels.

https://doi.org/10.1515/9783110563146-007

Juha Janhunen

Map 1: An areal view of the Uralic family tree (from Janhunen 2001: 39). From the original eastern location of Proto-Uralic (PU), the family expanded westwards, producing a series of increasingly shallow intermediate protolanguages, including Proto-Finno-Ugric (PFU), Proto-Finno-Permic (PFP), Proto-Finno-Volgaic (PFV), Proto-Finno-Mordvinic (PFM) and Proto-Finno-Saamic (PFS), as well as the corresponding local branches, including Ugric, Permic, Mariic, Mordvinic, Finnic and Saamic. The mutual positions of the four western branches (Saamic, Finnic, Mordvinic and Mariic) are still disputed, as is the question concerning the internal unity of the Ugric branch.

The time frame of Proto-Uralic is an even more controversial issue. Traditionally, Proto-Uralic is placed at a time scale of some 6000–8000 years. This dating is suggested especially by the culturally diagnostic lexicon that can be reconstructed for Proto-Uralic, which clearly represents a Neolithic or even Mesolithic level of cultural evolution (Janhunen 2008: 228–230, 2009). Also, judging by the distribution of early Indo-European loanwords in Uralic, Proto-Indo-European (without Anatolian), which itself is often dated to about 5000 years ago and located somewhere in the western steppes of Eurasia, must have been younger than Proto-Uralic, since even the oldest Indo-European elements transmitted to Uralic are present only in Finno-Ugric and absent in Samoyedic. It has to be mentioned, however, that there have
been recent attempts at revising this framework, either by assuming even earlier contacts between Indo-European and Uralic, extending also to Samoyedic, or by assuming that Samoyedic has undergone a massive phase of relexification, which would imply that the traditional taxonomy of Uralic should also be revised (Salminen 1989 and others). So far, these attempts have not had any substance to them, which is why they will not be taken into account in the present chapter. Even so, the dating of Proto-Uralic remains a major issue for future research. Since, in any case, by the criteria of comparative linguistics, Proto-Uralic was spoken a very long time ago, we cannot expect much substance to have survived from the protolanguage into the modern Uralic languages. It is not surprising, therefore, that the number of Proto-Uralic lexical items, when defined as lexemes shared by both Samoyedic and Finno-Ugric, is only about 200 (cf. most recently Aikio 2002, 2006), a figure which may be compared with the considerably larger number of lexical items standardly reconstructed for Proto-Indo-European (even with Anatolian). On the other hand, the grammatical system and typological orientation of Proto-Uralic can be reconstructed in much more detail (Janhunen 1982). This is connected with the fact that traces of Proto-Uralic grammatical markers are preserved relatively well in the two geographical extremes of the family: in the west (Saamic-Finnic, but also Mordvinic and Mariic) and in the east (Samoyedic), suggesting that the languages in the centre (Permic and “Ugric”) are more innovative and, hence, less conservative. There are, indeed, multiple reasons to assume that the central branches of Uralic have undergone several waves of large-scale restructuring, apparently under the impact of neighbouring non-Uralic languages. This is perhaps nowhere as obvious as in the case of Hungarian, which, by all criteria, is the most “aberrant” Uralic language.

The areal and typological context

From the grammatical features that can be reconstructed for Proto-Uralic we know that it was, like the majority of the modern Uralic languages, of the Ural-Altaic type. In this connection, the Ural-Altaic type should be understood as corresponding to a macroscopic transcontinental areal zone, or “belt”, or “complex”, of several language families, including, apart from Uralic, also Turkic, Mongolic, Tungusic, Koreanic and Japonic. The areal similarities shared by these language families, which should not be mistaken for implying an original genetic unity, comprise several features of grammatical and lexical structure, including, for instance, a general head-final word order of both the basic clause (SOV) and of the nominal phrase (GAN), a predominantly nominative-accusative argument structure, a system of suffixally marked morphology, as well as a relatively simple syllable structure with no initial or final clusters (#CC, CC#) and with no tonal distinctions, but with the words bound together by a palato-velar vowel harmony.


Deviations from the prototypical Ural-Altaic features are found mainly on the margins of the area but occasionally also elsewhere. For instance, the basic word order has been “Europeanised” in some Uralic languages spoken in western and northern Europe (Finnic, Saamic, and, to a lesser extent, Hungarian). Some languages spoken in the Ural-Ob region (especially “Ob-Ugric”), but including also Hungarian (which originated in this very region), have developed features of prefixal morphology (e.g., for verbal aspectual differences). The synchronic status of vowel harmony varies greatly within the Ural-Altaic area, and towards the east, the palato-velar harmony (PVH) is gradually replaced by a so-called tongue-root harmony (TRH), involving the feature of “retracted tongue root” (RTR) for some vowels, as in Koreanic, Tungusic and Mongolic (Janhunen 1981), unless it is totally absent, as in Japonic. On the other hand, some languages in the east have developed tonal distinctions (Japonic, Middle Korean). Languages that have entered the Tibetan areal realm (Turkic, Mongolic) have locally adapted to the Tibetan type of syllable structure with initial clusters (#CC). Also, while many Ural-Altaic languages have predominantly bisyllabic lexical roots ending in a vowel (CVCV) with the possibility of extension by derivational suffixes (-C and -CV), there are also examples of what seem to have been primary monosyllabic roots (CV).

However, due to their fundamental typological similarity, the languages of the Ural-Altaic area have tended to develop along similar lines even when they have not been in direct contact with each other. For instance, in the realm of morphology, there are some striking material parallels between the individual Ural-Altaic families, but a closer look at the data reveals that they involve independent developments, which have coincidentally, though perhaps conditioned by the impact of the underlying language type, produced similar results (cf. Janhunen 2012, 2014). At the level of syllable structure, the phenomenon of vowel loss in non-initial syllables has created secondary monosyllabic roots ending in a consonant (*CVCV > CVC) almost everywhere in the Ural-Altaic area (Turkic, Mongolic, Koreanic, and most branches of Uralic), though the development has taken place at different times in the different families. The loss of intervocalic consonants, on the other hand, has created distinctive long or double vowels (*VCV > VV) in many languages (Turkic, Mongolic, and some branches of Uralic). More locally, ablaut alternations have arisen in several interconnected languages of the Ob region (Khantic, Samoyedic, cf. Katz 1975).

On a more general theoretical level it has been pointed out that the comparative method inherently distorts our understanding of the typology of any protolanguage, since it provides only a fragmentary picture of the underlying language and tends to eliminate or ignore any original irregularities. Thus, whichever non-agglutinative features there may have been in a prehistorical language like Proto-Uralic, they cannot be reached by the comparative method (Korhonen 1974). This is, in fact, true of any irregularities that may have been present in the protolanguage, but which have been levelled in the post-protolanguage period. As it is, almost all of the irregularities and morphophonological alternations observed synchronically in the Uralic languages are results of secondary innovations in the post-Proto-Uralic period. Even so, there are a few actual cases of morphophonological alternation (discussed further below) that seem to date back to Proto-Uralic. These cases typically involve individual lexemes and grammatical elements, and the generalizations that can be based on them are inevitably rather diffuse.

In spite of the tendency of parallel evolution conditioned by the basic typological similarity of all Uralic languages, the synchronic diversity within the Uralic family is considerable, especially when compared with the other families of the Ural-Altaic area, which all derive from relatively new protolanguages no more than 1000 to 2500 years old. As a result, the routes of evolution taken by each given Uralic branch, language and dialect have often been very different and involve a diversity of processes of innovation, restructuring, and grammaticalization. It is, therefore, very difficult to provide a general picture of the typological trends observed across the Uralic family (cf. the attempt made by Tauli [1966]). A detailed analysis of historical typology has only been carried out for a few actual Uralic branches and languages. A case in point is Saamic, which is known to have undergone a major restructuring of word structure, resulting in completely new patterns of information structure, with a complex system of quantitative and qualitative alternations within word stems and inflected forms (Korhonen 1969). Such a restructuring may well have been due to the influence of substratal languages of a non-Ural-Altaic type.

Cyclicity in structural evolution

It is, consequently, virtually impossible to generalize the processes of grammaticalization that have taken place in the individual Uralic branches and languages, as each branch and language has taken a path of its own, often under the influence of other languages, reflecting the wide geographical distribution of the Uralic family. Therefore, the present chapter will mainly be concerned with the emergence of grammatical distinctions and categories in Proto-Uralic, rather than with the more trivial development of lexical elements into grammatical markers at more recent time levels. To begin with, it may be taken for certain that Proto-Uralic was a fully developed language, once spoken by an actual speech community in a restricted territory, the Uralic linguistic homeland (Urheimat). There are clear indications that Proto-Uralic itself was a language of the Ural-Altaic type, implying that it shared most of the characteristics today regarded as “Ural-Altaic” in the geographical and typological sense. This gives the Ural-Altaic language type a general age corresponding to the specific age of Proto-Uralic, that is, perhaps up to 6000–8000 years. In this connection it has to be stressed that, contrary to claims occasionally made by typologists working mechanically with statistical evidence, the Ural-Altaic language type is not a result of universal tendencies in language structure, that is,
it is not a language type that would have an inherent likelihood to arise with no circumstantial factors being involved. The very fact that the Ural-Altaic languages, from Uralic to Japonic, form a geographical continuum suggests that the common features that we may term “Ural-Altaic” are mainly due to language contact and language spread. Since the Uralic family is the oldest entity of this type, it is possible that some other families with a similar synchronic typology have joined the Ural-Altaic areal context later. This is, for instance, likely in the case of Japonic, which may have originated as a language of a different (Sinitic) typology (Janhunen 1997) but which later underwent a process of “Altaicization” or “Ural-Altaicization” under the areal impact of other Ural-Altaic languages. The extralinguistic circumstances of this typological transition can be relatively reliably reconstructed (Janhunen 1999). It is, therefore, important to recognize that typologies can change, and even Uralic may have undergone periods of typological reorientation and reformation in the Pre-Proto-Uralic period. Moreover, some typological features may have been acquired, lost and re-acquired several times in the past. Structural features often develop cyclically, which is why a feature present synchronically in a language need not be “original” even when it would seem to correspond to the typological profile of the areal and genetic context of the language concerned. This is often the case with morphology: a language may have had primary morphology that was lost (like Pre-Proto-Sinitic to Proto-Sinitic, or Old English to Modern English), but it may also develop a new secondary morphology (like modern Mandarin Chinese). An example from Uralic is Hungarian, which has, among other features, the richest case system of all modern Uralic languages, with some 20 separate suffixally marked cases. 
However, only one of the Hungarian case markers, the “superessive” in -(V)n, can unambiguously be traced back to a Proto-Uralic source, the locative in *-na, while most of the other case markers represent recently suffixalized spatial nominals (cf. e.g., Klemm 1928–1942: 180–219). This means that Hungarian at some stage lost most of its primary nominal morphology and later re-developed a secondary morphological system of the same type, but from new elements.

Another typological feature that often evolves in cycles is vowel harmony. Proto-Uralic is conventionally assumed to have possessed a progressive vowel harmony of the palato-velar type. This harmony is assumed to have affected both word roots (“radical harmony”) and suffixes (“suffixal harmony”). However, only two contrasting vowel qualities can be reconstructed for the non-initial syllable in Proto-Uralic: a low vowel and a high vowel, and actual vowel harmony can be verified only for the low vowel, which seems to have had two allophones, a back *a [ɑ] and a front *ä [æ], depending on the vocalism of the initial syllable, which could be either one of the back vowels *a *o *u *ï or one of the front vowels *ä *e *ü *i. Clear traces suggesting that the low vowels alternated harmonically in Proto-Uralic are preserved only in a few branches of Uralic, notably Northern Finnic, Southern Estonian, Erzya Mordvin, Northern Samoyedic (Nenets, Enets, Nganasan), and, as it would seem, Hungarian. Elsewhere, vowel harmony has been lost either by neutralizing the contrast between back and front vowels in non-initial syllables (as in Saamic and Northern Estonian) or by merging all vowel qualities towards a generic non-distinctive neutral vowel, which may or may not be subsequently lost (as in most other modern Uralic languages).

The evidence for vowel harmony in Proto-Uralic is, however, weakened by the fact that some Uralic languages seem to have developed a secondary palato-velar harmony. This has typically happened under the influence of the neighbouring Turkic languages, as is the case in Hill Mari, Southern Mansi, Eastern Khanty and Kamas. However, if this is so, we can question the age of vowel harmony also in some other Uralic languages, such as, for instance, Hungarian. We may also note that even those Uralic languages that would seem to have an “original” vowel harmony have undergone periods of disturbed harmony. Proto-Finnic, for instance, which had increased the paradigm of distinctive vowel qualities in non-initial syllables, had only two harmonically alternating pairs of vowels, the low vowels *a vs. *ä and the high rounded vowels *u vs. *ü, while there were, at the same time, three harmonically indifferent vowels, the front vowels *e *i and the back vowel *o. The harmonically indifferent back vowel *o could also have a retrogressive influence on preceding front vowels, as in *elä- ‘to live’ : *elä-nto > modern Finnish ela-nto ‘livelihood’.
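The allophonic rule described above for the Proto-Uralic low vowel can be restated procedurally. The following is only a minimal sketch under stated assumptions: the function names and the archiphoneme symbol "A" are hypothetical conveniences, and the strings produced are mechanical outputs of the rule, not attested or reconstructed forms.

```python
import unicodedata

# Vowel classes of the initial syllable as listed in the text.
BACK_VOWELS = set("aou\u00ef")          # *a *o *u *ï
FRONT_VOWELS = set("\u00e4e\u00fci")    # *ä *e *ü *i


def low_vowel_allophone(initial_vowel: str) -> str:
    """Return back 'a' after a back initial vowel, front 'ä' after a front one."""
    v = unicodedata.normalize("NFC", initial_vowel)
    if v in BACK_VOWELS:
        return "a"
    if v in FRONT_VOWELS:
        return "\u00e4"
    raise ValueError(f"not a recognised initial-syllable vowel: {initial_vowel!r}")


def harmonize(stem: str, suffix: str) -> str:
    """Attach a suffix whose low vowel is written 'A' (hypothetical notation),
    realising 'A' according to the first vowel of the stem."""
    stem = unicodedata.normalize("NFC", stem)
    first = next((ch for ch in stem if ch in BACK_VOWELS | FRONT_VOWELS), None)
    if first is None:
        raise ValueError(f"stem has no vowel: {stem!r}")
    return stem + suffix.replace("A", low_vowel_allophone(first))
```

For example, `harmonize("kota", "nA")` yields a back-vowel suffix and `harmonize("el\u00e4", "mA")` a front-vowel one. The sketch deliberately leaves the high vowel out, since the text notes that harmony can be verified only for the low vowel.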

Morphophonology in Proto-Uralic

It may be concluded that the morphophonological role of vowel harmony in Proto-Uralic was minimal, though it may have had significance as a boundary marker in that it gave phonological words, especially inflected words, a harmonically coherent shape. It is possible that vowel harmony was also phonetically active for the high vowel of the system, though this cannot be verified on the basis of synchronic comparative evidence. The Proto-Uralic high vowel has been reconstructed variously as either *e (mid-high) or *i (high) on the basis of the Finnic stem type in which a final [i] alternates with non-final [e], as in tuli ‘fire’ :  tule-n. If we assume that this vowel participated in vowel harmony, we have to postulate a harmonic distinction between a front *i [i] and a back *ï [ɨ]. Since, however, no other Uralic branch preserves these qualities, it is also possible to reconstruct this vowel as a neutral vowel of the schwa type *ə [ə], a solution supported by the fact that in several Uralic branches and languages, as in Mariic, this vowel is lost in final position and reduced to a schwa medially, as in Meadow Mari tol ‘fire’ :  tolə-n, synchronically perhaps - tol-ə-n, while the low vowels *a *ä are often (though not always) preserved segmentally also in final position, as in *kota ‘dwelling’ > Meadow Mari *kudə [kuδo] :  kudə-n [kuδən].


Since vowel harmony was an automatic phenomenon in Proto-Uralic we may conveniently ignore it in the phonological reconstructions and write the two vowels as *a (low) vs. *i (high or reduced). The distribution of these vowels in non-initial syllables was mainly lexically determined, with some stems and suffixes containing *a and others *i. There are, however, indications that the two vowels may also have alternated morphophonologically in some lexemes, especially in verbal stems, opposed with regard to the feature of transitivity, as in  *kaja- ‘to leave’ vs.  *kaji- ‘to remain’, a distinction preserved synchronically in, for instance, Mari  .1  kod-e-m vs.  kod-a-m (Erkki Itkonen 1962: 100–111). In such cases, one of the forms is likely to represent a petrified derivative of the other, but we do not know which of the two should be regarded as the basic form. In any case, an alternation of this type suggests that within the seemingly indivisible reconstructed stems there may have been obscured morphological boundaries, or that the different vowel qualities may have been conditioned by the presence of additional morphological elements that were lost before the Proto-Uralic stage. Another, much better documented, morphophonological alternation is connected with the phenomenon of the so-called consonant stems, that is, the alternation of the final high or neutral vowel *i with zero before suffixal syllables. Unambiguous synchronic traces of consonant stems are preserved only in the western end of the Uralic family, in Finnic and Saamic, as in Finnish tuli ‘fire’ :  (< ) *tul-ta. There are, however, clear indications that the phenomenon was present already in Proto-Uralic, as in *kani- > *kan- ‘to go’ (preserved only in Samoyedic) :  *kanta- ‘to carry’ (preserved also in several other branches, Finnish kanta-). 
The exact conditions under which consonant stems could be formed are unknown, but they were obviously connected with the rules of consonant phonotactics in the protolanguage. Suffix-initial consonants before which consonant stems are attested include *p *t *k *m *n *y. Modern languages occasionally show doublets in which an older, synchronically non-productive consonant stem has been replaced by a younger, synchronically productive vowel stem, as in Finnish kuole- ‘to die’ : (vowel stem, productive) kuole-ma ‘death’ < *kaali-ma < *kali-ma vs. (consonant stem, non-productive) kal-ma id. < *kal-ma (concerning the vowel developments, cf. Aikio 2012). In view of the phenomenon of consonant stems we might also speculate that the stems conventionally reconstructed with a final high vowel actually ended in a consonant with no following vowel. This would, however, alter the picture we have of Proto-Uralic root structure, in that there would have been monosyllabic roots ending in a consonant (*(C)VC) and even in a consonant cluster (*(C)VCC), which in the context of Ural-Altaic typology appears unlikely. The presence of a final vowel in these roots in Finnic and Saamic is certainly easier to explain as a retention than as an innovation (epenthesis of a secondary final vowel), and even Proto-Samoyedic had a final (reduced) vowel in those stems that contained a consonant cluster, as in Finnic *yänte- ‘sinew, bow string’ = Samoyedic *yentə < Proto-Uralic *yänti. A related issue concerns a group of stems which in many Uralic languages appear as
monosyllabic vowel stems of the type CV or (in Finnic) CVV. It has been proposed that this stem structure might even go back to Proto-Uralic (Eugene Helimski, p.c.). However, it appears more likely that these stems originally contained a medial consonant of the “laryngeal” type (*x), which, moreover, is preserved in Saamic as the velar stop *k, as in *suxi- ‘to row’ > Pre-Proto-Saamic *suuki- (> North Saami suhka-). Finally, there is a structural phenomenon that shows a striking similarity in the two margins of the Uralic family: Finnic and Saamic in the west, and Samoyedic in the east. This is the phenomenon of consonant gradation, which involves the alternation of “strong” (tenues or geminate) consonants in open syllables (and other “strong” positions) with “weak” (lenes or single) consonants in closed syllables (and other “weak” positions). The parallelism in the synchronic patterns led early scholars to postulate consonant gradation also for Proto-Uralic, leading to the famous Gradation Theory, which was once the dominant doctrine in Uralic Studies (Setälä 1912). Although the idea of a Proto-Uralic origin of consonant gradation has received support even much later (Helimski 1995), it is incorrect. A closer look at the data shows that the phenomenon has arisen independently in several Uralic languages under the impact of a well-preserved bisyllabic stem structure and an accompanying non-distinctive trochaic stress pattern. In fact, consonant gradation is absent even in some Finnic languages (Veps) and present only in a few Samoyedic languages (Nganasan and Ket Selkup), which means that it was not yet present as a phonemic phenomenon in Proto-Finnic and Proto-Samoyedic. The Finnic and Saamic patterns of consonant gradation are also structurally different in the details, suggesting that they are independent developments.

The origin of the parts of speech

What has been said above of root and stem structure in Proto-Uralic applies only to “meaningful” or “lexical”, that is, non-deictic words, which formed the bulk of the lexical resources of the protolanguage. There was, however, a smaller group of “non-meaningful”, or grammatical, words, which may generally be characterized as “deictics”. These had a systematically different structure in that they were true monosyllabic vowel stems (*(C)V). The deictics formed a limited class, which comprised the personal pronouns 1 *mV : 2 *tV : 3 *sV (with different vowels in the singular and plural), the demonstratives *(Ø)V : *tV : *cV (with a front vowel for proximal and a back vowel for distal reference), the interrogatives *kV : *mV (with several vowel options), the copula-existential verb *o-, the negation verb *e-, and possibly a few others. The structural and functional opposition between deictics and non-deictics was apparently the most ancient division that differentiated parts of speech from each other in Pre-Proto-Uralic. Both deictics and non-deictics in Proto-Uralic could be used as free forms, that is, as independently uttered words with syntactic functions. They could, however,
also be combined with suffixal elements, which typically consisted either of a single asyllabic consonant (*-C) or a syllabic sequence of a consonant and a vowel (*-CV). When combined with each other, suffixes could also form complexes which could contain consonant clusters (*-C-CV > *-CCV). While all basic word roots, both deictics and non-deictics, seem to have ended in a vowel, suffixally modified words could also end in a consonant, which potentially gave a three-way contrast between a final consonant (*-C), a consonant followed by a high or reduced vowel (*-Ci), and a consonant followed by a low vowel (*-Ca). The distinction between a high vowel and no vowel is, however, often difficult to reconstruct, and in trisyllabic or longer words the consonant stem may have been used as a free form already in the protolanguage, as in *kuñi- ‘to close eyes’ : free form (consonant stem) *kuñi-l ‘tear’ : bound form (vowel stem) *kuñi-li- > modern Finnish basic form kyynel : consonant stem in  kyynel-tä : vowel stem in  kyynele-n. A related issue concerns the alternation of vowels and zero in a number of suffixal elements. In some cases, the alternation may be due to a chronological difference. This could be true of the markers of personal predication and possession for the first and second person singular, which represent the grammaticalized and suffixalized traces of the corresponding personal pronouns 1 *mi : 2 *ti. The markers of predication are “shorter”, containing only a consonant: .1 *-m : .2 *-t, while the markers of possession are “longer”, containing a full syllable: .1 *-mi : .2 *-ti, a situation still synchronically observable in, for instance, Finnish, cf. e.g., mene- ‘to go’ : .2 mene-t ‘you go’ vs. talo ‘house’ : .2 talo-si (< *-ti) ‘your house’. This formal difference could indeed correspond to a chronological difference, implying that the markers of predication were possibly grammaticalized earlier.
However, a somewhat similar alternation between a low vowel and no vowel (or a high or neutral vowel) can be observed in several other suffixal elements, such as, for instance, the nominalizing suffixes *-ya and *-ma, which also appear in the shapes *-y (or *-yi) and *-m (or *-mi) with no clear functional difference.

The “meaningful” words in Proto-Uralic represented two classes, or parts of speech, which may be termed nouns and verbs, or nominals and verbals, a situation synchronically characteristic of all languages of the Ural-Altaic type. In Proto-Uralic this dichotomy was based on morphosyntactic criteria alone, for formally there was no difference between verbal and nominal roots. Thus, both classes comprised typically bisyllabic roots, which could end in either one of the two vowels *a or *i that were permitted in non-initial syllables. Both nominal and verbal roots were also affected by the phenomenon of consonant stem formation. In this respect, Proto-Uralic differed from some other languages of the Ural-Altaic type. In Proto-Japonic, for instance, all nominal roots ended in a vowel ((C)V), while all verbal roots ended in a consonant ((C)VC-) (or a non-syllabic vowel, that is, a glide), making the two parts of speech inherently different. Similarly, in Mongolic and Tungusic some stem types, notably nasal stems ending in (*)n, are restricted only to nominals. No similar restrictions can be reconstructed for Proto-Uralic, a situation that suggests, at least, a profound formal affinity between nominals and verbals.


In the absence of a formal difference, the principal manifestation of the dichotomy between nominals and verbals was that they were combined with different sets of suffixes, in that most suffixes could only be attached to either one or the other class of word. This was connected with the functions of the suffixes: nominal suffixes expressed typically parameters like case and number, while verbal suffixes expressed distinctions connected with tense, aspect, modality and voice. A difference may, however, have been involved in the fact that only nominals could occur as free forms without suffixes, that is, with a zero suffix (-Ø). Plain nominal roots without suffixes were used in the function of subject/agent and, possibly, indefinite or non-specific object/patient. Verbal roots, on the other hand, seem to have been always accompanied by suffixes. This is still the situation in the peripheral Uralic languages both in the west (Finnic-Saamic) and east (Samoyedic). In some of the languages located closer to the centre of the family it is, however, possible to use the plain verbal root as a free form either in the function of a second-person imperative (as in Mariic and Permic) or in that of a third-person present-tense indicative (as in Hungarian). Comparative evidence suggests that this is a secondary situation, at least partially conditioned by the influence of neighbouring languages.

Interaction between nominals and verbals

The dichotomy between nominals and verbals is, consequently, connected with their different grammatical functions, which require different sets of grammatical markers, that is, markers of inflectional morphology. This same dichotomy is also reflected in derivational morphology. To make a verbal root function as a nominal, or a nominal root as a verbal, Proto-Uralic used derivational suffixes for deverbal nominals and denominal verbals, as is still the case synchronically in all Uralic languages. These suffixes are inherently word-class-specific, in that the elements turning verbals to nominals (nominalization) are generally not identical with those turning nominals to verbals (verbalization, on which cf. Laakso [2008]). However, some of the suffixes forming deverbal nominals are also attested in denominal nominals, and even more typically, some of the suffixes forming denominal verbals can also be used to form deverbal verbals. For instance, denominal factitives (‘to make or use something for something’) and deverbal causatives (‘to make somebody do something’) are in many Uralic languages expressed by identical suffixes, ultimately based on the Proto-Uralic primary causative-factitive marker *-ta- (extended shape *-t.ta-) as in Finnish lippu ‘flag’ :  lipu-t.ta- ‘to flag’ (‘to use the flag’), nukku- ‘to sleep’ :  nuku-t.ta- ‘to make somebody sleep’. This suggests that the morphological distinction between nominals and verbals is, after all, less sharp than would at first glance seem to be the case.

There are indeed also other phenomena that suggest an original affinity between nominals and verbals in Uralic. For one thing, nominals can be used as predicates (nominal predication), in which case they can take predicative personal suffixes (enclitic pronouns) similar to those taken by verbs, a feature also known as “nominal conjugation”. Verbals, on the other hand, can in the function of finite predicates take number suffixes, which are identical with those used in the nominal declension, a property that is normally seen as an agreement phenomenon. There are even some suffixal elements that can be used both as adverbal case markers of the nominal inflection and as converb markers of the verbal conjugation. The most obvious example is offered by the element *-n, which in the protolanguage seems to have marked the connective (adnominal genitive and adverbal instrumental) case of nominals and the modal converb of verbals, as synchronically still in Saamic and Mariic, cf. e.g., Meadow Mari kol ‘fish’ :  >  kolə-n ‘of (a/the) fish’, nal- ‘to take’ : . nalə-n ‘(by) taking’. The corresponding negation form is based on an analogous ambivalent use of the privative (caritive) marker *(-k)-ta, which functions both as a case suffix (the privative or caritive case) and a converb marker (the privative or negative converb), as in Meadow Mari  kol-de ‘without (a/the) fish’, . nal-de ‘without taking’.

In the above examples, the suffixes marking cases and converbs are conventionally considered to belong to the realm of nominal case declension, which is why their use on verbals is known as “verbal declension”. This claim can be contested, for formally there is no a priori reason to assume that case suffixes are more primary than converb markers. It is also debatable whether number marking is more basic in nominals than on verbals. However, an argument in favour of the primary nature of the case function is that converbs in most Uralic languages, as also in other languages of the Ural-Altaic type, are based on case forms of nominalized verbals.
In modern Finnish, for instance, the connective and privative (abessive) markers can be attached to verbal stems only by inserting a nominalizing element, as in Finnish teh-de-n ‘(by) doing’ (with the connective case marker -n preceded by the nominalizing suffix -de-), teke-mä-ttä ‘without doing’ (with the abessive case marker -ttä preceded by the nominalizing suffix -mä-). It would be possible to analyse the case-marked verbals with no overt nominalizing suffix as examples of a “zero nominalizer”, e.g., Meadow Mari . nalə-n = nalə-Ø-n (take--) : . nal-de = nal-Ø-de (take--). The synchronic validity of this analysis remains to be confirmed, as the presence of the “zero” in the sequence is impossible to verify. Most importantly, however, the Uralic languages show examples of the phenomenon of nomenverba, that is, lexical roots that function both as nominals and verbals and which, hence, can take both nominal and verbal suffixes. Nomenverba are also attested in some of the neighbouring language families of the Ural-Altaic type, notably Turkic (Doerfer 1982) and Mongolic (Kara 1993), to a lesser extent in Tungusic, but not (or very marginally) in Koreanic or Japonic. The status of nomenverba varies in the modern Uralic languages, but they are particularly conspicuous in Finnic, where they can be divided into two types: primary and secondary. By

secondary nomenverba we understand stems that are historically secondary nominal and verbal derivatives from primary simple roots. The final segment of these stems is typically one of the high or mid-high vowels u ü i o (orthographically u y i o), which in this position are secondary and derive from sequences of an original stem-final vowel plus a glide (*iw *iy *aw *ay), as in Finnish haukku- ‘to bark’ : haukku ‘barking’, synty- ‘to be born’ : synty ‘birth’, leikki- ‘to play’ : leikki ‘play, game’ (a recent borrowing from Swedish), usko- ‘to believe’ : usko ‘belief’. Synchronically, it is in most cases not possible to tell whether we are here dealing with deverbal nominals or denominal verbals, but diachronically most of these examples seem to involve primary verbal roots. In some cases this is still visible from correlative derivatives, as in oppi- ‘to learn’ : oppi ‘learning’ and loppu- ‘to end’ () : loppu ‘end’, which are based on the primary stems *oppe-, *loppe-, still preserved in the causative forms ope-tta- ‘to teach’, lope-tta- ‘to end’ (). By primary nomenverba we understand stems that show no derivational elements. Such items are relatively rare or even absent in many Uralic languages, though occasional examples are found in, for instance, Hungarian, as in fagy ‘to freeze’ : fagy ‘frost’. The languages that preserve the original root-final vowels, like Finnic, offer more examples, surprisingly many of which can be traced back to Proto-Uralic or other early historical stages. These items typically denote natural features or phenomena, e.g., *kocki(-) ‘dry’ > ‘low water’ > ‘rapids’ : ‘to dry’, *lomi(-) ‘snow’ : ‘to snow’, *tuxli(-) ‘wind/feather’ : ‘to blow (of wind)’, *sula(-) ‘to melt’ : ‘melted mass’ (not in Samoyedic), or also physiological functions like *kunci(-) ‘urine’ : ‘to urinate’.
While the verbal representations of these examples are intransitive, there are also nomenverba which can function as transitive verbs like *pala(-) ‘bite’ : ‘to bite’ > ‘to swallow’ (> ‘to burn’), *sala(-) ‘secret’ : ‘to conceal’ > ‘to steal’. A potentially important example is *kan-ta- ‘to carry’ : *kan(-)ta ‘heel, base, lower part’, in which the verbal representation is a causative based on the primary root *kani- : *kan- ‘to go’ (preserved only in Samoyedic), suggesting that even derived forms could function as nomenverba. Nomenverba pose a challenge both to synchronic and diachronic analysis. Synchronically, we could operate with the concept of “zero derivation”, but the decision as to which function, nominal or verbal, is primary and which is derived, would inevitably be arbitrary. Diachronically, we could also speculate that the one or the other function involves a derivational element which had been lost by the Proto-Uralic stage and which therefore cannot be identified physically. However, it is also possible that the Uralic nomenverba are traces of a time when nominals and verbals were not yet grammaticalized as separate parts of speech. This would not mean that Pre-Proto-Uralic was in any way less “developed” than the later stages of Uralic, but simply that Pre-Proto-Uralic was perhaps a language in which nominals and verbals were not distinguished as strictly as they are today in the modern Uralic languages. This assumption has implications for the status of morphology in Uralic: it is possible that Pre-Proto-Uralic had undergone a grammatical cycle from less morphology to more morphology, a cycle that has continued in various directions in the modern Uralic languages. A case in point is Hungarian, which has clearly undergone a drastic reduction of inherited morphology, only to restore a complex morphological system from new elements later.

Adjectival parts of speech

In this connection, the status of adjectives, or “property words”, in Uralic requires attention. In this respect, the Ural-Altaic typological area is divided: in some families, especially in the west, adjectival meanings are expressed only, or mainly, by nominals (as in Turkic), while other families, especially in the east, use verbal adjectives, or adjectival verbals, that is, stative property verbs (as in Koreanic and Japonic). Uralic languages (like also Mongolic and Tungusic) generally have nominal adjectives, or adjectival nominals, meaning that adjectives can be regarded as a subclass of nominals with few features of their own. The main criterion for distinguishing adjectives from other nominals in the synchronic grammar is syntactic, reflecting the fact that adjectives often are used in the position of adnominal attributes, or also as nominal predicates (after which a headnoun may be assumed to be elliptically deleted). Adjectives can also be lexicalized as regular nominals (nouns) and be used as independent headnouns themselves, as in Finnish suomalainen ‘Finnish’ > ‘Finn’, vihreä ‘green’ > ‘green colour’. As far as morphology is concerned, adjectival nominals do not differ substantially from other nominals. They do, however, more often than other nominals, involve derived forms based on simple lexical roots. Many Uralic languages have specific suffixes for deriving adjectival nominals from other nominals, as -inen in Finnish, e.g., puna ‘red colour’ : puna-inen ‘red’, and -Ës in Hungarian, e.g., vér ‘blood’ : vër-ës ‘bloody’ > ‘red’. Even so, there are no derivational elements that would automatically allow a nominal to be recognized as an adjective, and many adjectival nominals are plain roots with no derivational elements.
Another morphological feature of adjectival nominals is that they may have special forms of comparison (comparative and superlative) as in Finnish suuri ‘big’ :  suure-mpi :  suure-mpa- (: suure-mma-) ‘bigger’ vs.  suur-in :  suur-impa- (: suurimma-) ‘biggest’. Forms of comparison are, however, not universal in Uralic, for comparison can also be expressed lexically (by lexicalized items for ‘more’ and ‘most’) or syntactically (by the so-called comparative construction, normally involving the use of the ablative case form to express the point of comparison). Moreover, in specific contexts, even regular nominals can have forms of comparison, as in Finnish ranta ‘shore’ :  ranne-mpi ‘closer to the shore’ :  rann-in ‘closest to the shore’. This leaves only occasional modal forms as a relatively unambiguous diagnostic property of adjectival nominals, as in Finnish  suure-sti ‘in a big way, greatly’.

A potentially important morphosyntactic property of adjectival nominals is that, in some Uralic languages, especially in Finnic and Samoyedic, when used adnominally, they agree with the headnoun in number, as in Tundra Nenets ngarka-q myak°-q-na (big- tent---) ‘in (the) big tents’, or also in case, as in Finnish suur-i-ssa talo-i-ssa (big-- house--) ‘in big houses’. The details of agreement vary, depending, among other things, on the diachronic and synchronic status of the case suffixes. In Hungarian, for instance, only attributively used demonstrative pronouns, but not adjectives, agree in case and number, as in ez-ek-ben a nagy ház-ak-ban (--  big house--) ‘in these big houses’. Historically the agreement of adnominal modifiers is a recent innovation, and for Proto-Uralic no agreement can be reconstructed. It seems that, originally, nominals were used as adnominal modifiers with no morphological marking, as is still possible in compounds, as in Finnish puu+talo (wood+house) ‘wooden house’, which may be compared with the adjectival construction pu-inen talo (wood- house) ‘wooden house; house made of wood’. The replacement of derivationally specified adjectives by plain nominals is, however, not possible in the predicative (elliptic) position, in which other means (such as case marking) have to be used. It has to be added, however, that in one branch of Uralic – Saamic – the class of adjectives is more deeply grammaticalized, in that many adjectives have distinct attributive and predicative forms, as in North Saami  čielga :  čielggas ‘clear’ = Finnic *selke-tä > Finnish selkeä (used in both attributive and predicative functions). It may be concluded that adjectives have gradually been grammaticalized into a specific subclass of nominals, but the distinction with regard to regular nominals still remains minimal in the modern Uralic languages.
More interestingly, however, adjectives also show occasional features that may be seen as traces of verbality, suggesting that in Pre-Proto-Uralic, adjectival meanings may have been expressed by verbals even more widely. Synchronic property verbs are among the Uralic languages attested in Samoyedic, where, for instance, some colour terms are verbals, as in Tundra Nenets nyar°ya- ‘to be red’, pəryidye- ‘to be black’. In adnominal position, nominalized forms (participles) of these verbals have to be used, e.g., . nyar°ya-na ‘red’, . pəryidye-nya ‘black’. Certain derivational elements conveying an adjectival meaning, such as privatives (caritives), are also verbals in Samoyedic, as in Tundra Nenets ngaewa ‘head’ :  ngaewa-syə- ‘to be without a head’ : . ngaewa-syə-da ‘headless’ :  ngaewa-syiq ‘without a head’ (cf. Salminen 1997: 55–56). In other Uralic languages, derived adjectival verbals are rare, but not non-existent: one relevant type is exemplified by Finnish sini ‘blue colour’ :  sin-inen ‘blue’ : sine-rtä- ‘to be blue/bluish, to shine with blue colour’, similarly pune-rta- ‘to be red’, kelle-rtä- ‘to be yellow’, vihe-rtä- ‘to be green’. There are, indeed, reasons to assume that Pre-Proto-Uralic had many more adjectival verbals than the modern Uralic languages. One possible trace of this earlier state is formed by the once obviously very productive type of adjectives in *-ta, attested in the western branches of Uralic, especially in Finnic, as in Finnish make-a ‘sweet’, ripe-ä ‘quick’ (a/ä < *-ta), and many others (Laakso 1990). Some items in this group contain the derivational element *-ki- (> -ke-) and are ultimately based on verbals, e.g., Finnish sur/e- ‘to feel sorrow’ : sur-ke-a ‘deplorable’, cf. sur-u, surk-u ‘sorrow’ (note that for this item, rather complex interaction with the Germanic etymon sorrow < *sorg- cannot be ruled out; Ante Aikio, p.c.). Also, for many items there is a parallel translative-inchoative verbal in -ne- < *-mi-, as in val-ke-a ‘light, flashy, white’ (> ‘fire’) :  val-ke-ne- ‘to grow white, light, clear’. The element *-ta may be compared with the nominalizer *-ta, as attested in, for instance, the Finnic “infinitives” in *-ta-, e.g., kul-ke-a’ ‘to go’ < Proto-Finnic *kul-ke-ta-k. The translative-inchoative verbals may also be the source of the Uralic comparative in *-m-pa (preserved in Finnic, Saamic and Hungarian), which, in that case, would contain the nominalizer *-pa, as attested, for instance, in Finnish as the marker of the “present participle”. Parallels for the development of comparative forms from nominalized verbals can be found in the Turkic languages (Ramstedt 1917). Assuming, however, that the distinction between nominals and verbals was once absent, or transitional, as is suggested by the nomenverba in Uralic, the original verbal nature of adjectives loses some of its potential relevance. Among the roots that can be reconstructed for Proto-Uralic, there are at least two that seem to have triple uses as verbals, adjectives and regular nominals (nouns). These roots are *lämpi(-) ‘(to be) warm’, reflected as Finnish lämp-ö ‘warmth’ :  lämpene- ‘to become warm’ (Aikio 2002: 13) and probably also lempi ‘love’ : lempe-ä ‘kind, mild’, as well as *pilmi(-) ‘(to be) dark’, reflected as Finnish pilvi : pilve- ‘cloud’ : pime-ä ‘dark’ : pime-ne- ‘to grow dark’. Unfortunately, both of these items involve irregular phonological developments.
The latter item has a verbal cognate in Samoyedic, attested in Tundra Nenets as paewə- ‘to be dark’ :  paew°-dya ‘dark’, but the phonological correspondence is not exact, as the Samoyedic forms originally contained a subsequently lost derivational element (*-y-), which conditions the allomorph of the nominalizing suffix (-dya < *-ntä, Tapani Salminen, p.c.).

The origin of verbal predication

As mentioned above, both nominals and verbals could function as plain predicates in Proto-Uralic, as is still the case in several Uralic languages. The parallelism between nominal and verbal predicates is most clearly visible in the phenomenon of “nominal conjugation”, in which nominals are used as predicates with no accompanying copula. This feature is synchronically best attested in the Northern Samoyedic languages (Nenets, Enets, Nganasan), in which the unmarked nominal stem can function as a predicate with reference to the third person singular, while for plural and dual reference the corresponding number markers are added, as in Tundra Nenets nyísya ‘father’ – ‘(he) is (a) father’ :  nyísa-xəh ‘the two fathers’ – ‘(they two) are fathers’ :  nyísya-q ‘fathers’ – ‘(they) are fathers’. When referring to the first and second persons, the corresponding predicative suffixes are attached, again with no intervening copular element involved. Stem-final and stem-internal morphophonology (of both vowels and consonants) shows unambiguously that nominal predicates are marked for person by adding the personal endings directly to the nominal stem, as in Tundra Nenets .1 nyísya-d°m ‘I am (a) father’ : .2 nyísya-n° ‘you are (a) father’ : .1 nyísya-waq ‘we are fathers’ : .2 nyísya-daq ‘you are fathers’ (cf. Salminen 1997: 130). In such forms there is no formal reason to assume the presence of a copula, although, of course, it would be possible to speculate that a “zero copula” is present in the sequence.

The use of actual copulas varies among the modern Uralic languages. In the westernmost branches (Finnic and Saamic) copulas are regularly used in all contexts, probably due to the influence of neighbouring (Indo-European) languages. In the eastern branches, third-person predicates are normally expressed without an overt copula, but copulas are, or can be, used in other persons, as well as in modally and temporally marked forms. Negation and existential functions normally also require the use of a copula. At least two copula-existentials can be reconstructed for Proto-Uralic. The primary unmarked copula seems to have been *o- (> Proto-Samoyedic *a-), which belongs to the limited class of monosyllabic deictic roots and was probably originally not conjugated as a verb, though verbal forms are attested in the modern branches of Uralic, notably in the Finnic nominalized forms  *o-ma ‘(he/she/it) is’ :  *o-ma-t ‘(they) are’ (> Finnish on : ovat). The other copula may be reconstructed as *lexi- (> Proto-Samoyedic *yi- > *i-) and may originally have had modal connotations, as suggested by its use for potential mood or future tense in several Uralic languages, as in Finnish lie- ‘may be’, Hungarian lë-sz- ‘will be’.
The most common copular stem in almost all modern Uralic languages (but probably not in Samoyedic) was *o-li- (in some languages > *woli-), which may be seen as an extension of the primary root *o-, or possibly also as a combination of the two copular stems *o- and *le(xi)-. It is possible that verbal predicates originally behaved similarly to nominal ones. If so, the plain verbal stem was used in the function of a modally and temporally unmarked third-person singular predicate, as is the case in Hungarian, e.g., vár- ‘to wait’ : 3 vár ‘(s/he) waits’. The corresponding third-person plural was probably already in Proto-Uralic marked by the plural suffix *-t, also used to mark nominal plurals, and an analogous situation must have been valid for the dual, marked by the suffix *-kV-. In the other persons the predicative personal endings were used. This plain verbal predicative paradigm is, however, only rudimentarily present in the modern Uralic languages, notably in the Finnic-Saamic singular first and second-person present-tense forms, as in Finnish mene- ‘to go’ : .1 mene-n (< *meni-m) ‘I go’ : .2 mene-t (< *meni-t) ‘you () go’. Synchronically, a similar situation is also observed in Hungarian, where all temporally unmarked (present-tense) forms are simple combinations of the verbal root and a personal ending, e.g., .1 vár-ok ‘I wait’ : .2 vár-sz ‘you () wait’ : .1 vár-unk ‘we wait’ : .2 vár-tok ‘you () wait’ : .3 vár-nak ‘they wait’, though the material shapes of the personal endings in Hungarian involve a large number of secondary innovations (Rédei 1989). In most other Uralic languages the simple combinations of verbal stem + personal ending have been replaced by more complex structures, which typically incorporate grammaticalized elements of conjugational class marking (as in the Mariic and the Permic languages), tense-aspect marking (as in Ob-Ugric), or nominalization (to a varying degree in all Uralic branches and languages).

Since the personal predicative endings are originally personal pronouns attached to the predicate, they are inherently neither verbal nor nominal. Apparently, in Pre-Proto-Uralic, pronouns marking the subject were cliticized and ultimately suffixalized due to the general preference for suffixal morphology in languages of the Ural-Altaic type, e.g., Pre-Proto-Uralic *mi+meni > *meni+mi > *meni=mi > *meni-mi > Proto-Uralic *meni-m ‘I go’. This development can also be seen in the markers of personal possession, which can be attached to verbal predicates to express a connection with the object, resulting in the so-called “objective conjugation”. Synchronically, the objective conjugation is attested in Samoyedic, Ugric and Mordvinic, though the Mordvinic system seems to involve an independent secondary innovation (see Keresztes 1999). However, on the basis of Samoyedic and Ugric, this feature is normally reconstructed for Proto-Uralic. The actual functions of the objective conjugation vary, and the actual markers have also undergone changes. Prototypically, verbal predicates marked with possessive suffixes imply reference to a definite or specific object, but synchronically the reference can also be connected with the absence of focus (as in Nenets) or simply the transitivity of the verbal (as in Selkup).
The system originally referred only to a third-person object with no distinction with regard to the number of the object, but in Northern Samoyedic the plurality and duality of the object can synchronically also be expressed. Hungarian, on the other hand, has developed a specific form of reference to a numerically ambiguous second-person object with a first-person singular subject, e.g., lát- ‘to see’ : 1 lát-ok ‘I see’ : 1.3 lát-om ‘I see (him/her/it/them)’ : 1.2 lát-lak ‘I see you ( and )’. An unsolved problem is posed by the Samoyedic finite conjugation, which involves an element termed “finite morpheme”, inserted between the verbal stem and the personal ending. The form of this element varies depending on language and stem type, but in Nenets, where it is synchronically segmentable, it typically involves, in vowel stems, a schwa (*ə), which in the forms referring to a plural object follows the plural marker, as in Tundra Nenets xada- ‘to kill’ : -.2 xada-ə-n° ‘2 killed’ : -.2 xada-ə-r° ‘2 killed 3’ : --.2. xada-y-ə-d° ‘2 killed 3’ (cf. Salminen 1997: 99–101). For consonant stems, the finite morpheme has the shape *-nga-, as in Tundra Nenets nger- ‘to drink’ : -.2 nger-nga-r° ‘2 drank 3’. The original form of the finite marker is still open to dispute, and it is not clear whether the elements attested after vowel stems and consonant stems are etymologically identical. However, since the finite morpheme has no inherent

temporal-aspectual function, and it is at least synchronically not connected with nominalization, it might also derive from a separate auxiliary. This would mean that the Samoyedic finite conjugation would be based on a periphrastic construction, in which personal marking is added to a cliticized and suffixalized auxiliary, which, in turn, might involve an original copula-existential root (in that case perhaps *ə- < *a-).

Nominalization of verbals

It is, consequently, obvious that Uralic verbals and nominals originally shared several sets of morphological markers, a situation that to a varying extent still characterizes the modern Uralic languages. Many of these morphological parallels may be seen as agreement phenomena, in which the verbal or nominal predicate follows the person and number of the nominal subject. Thus, the markers of verbal personal predication can be combined with nominals to express nominal predication (personal agreement with subject, as still observed in Samoyedic), the markers of nominal possession can be combined with verbals to express relation to an object (personal agreement with object, as in Samoyedic, Ugric and Mordvinic), and the nominal dual and plural markers can be used on verbal predicates to express the duality or plurality of actors (number agreement with the subject, as in most Uralic languages, though the dual as a category is synchronically present only in Samoyedic, Ob-Ugric and Saamic). Even some nominal case suffixes are attested on verbals in modal and/or privative (caritive) functions similar to those observed with nominals (as discussed above). In the realm of derivational morphology, nominals and verbals also have parallels, as many derivational suffixes can be attached to both classes of words. There are, however, morphological categories that are strictly specific to verbals, and these are probably the strongest basis for viewing verbals as a distinct part of speech already in Proto-Uralic. Most importantly, only verbals can be nominalized. By nominalization we understand the “transcategorial operation” (Malchukov 2004) by which a verbal is transformed into a nominal by adding a nominalizing marker to the verbal base.
The nominal character of the nominalized verbal is evident both morphologically and syntactically, in that it takes the markers of nominal morphology (for case, number, and personal reference) and can act in the syntactic positions of a nominal (subject, object, attributive, nominal predicate). At the same time, however, a nominalized verbal retains some of its verbal characteristics syntactically, in that it can take the arguments characteristic of a verbal (subject, object, adverbial). We might say that a nominalized verbal functions in the sentence actively as a nominal and passively as a verbal. It is important to stress that nominalized verbals, or “verbal nouns”, when so defined, are not simple deverbal nominals, for the latter retain no verbal characteristics and function both actively and passively as regular nominals.

The terminology concerning nominalization in general linguistics is a source of considerable confusion. For many purposes it is useful to make a distinction between adjectival and substantival uses, or “representations” (Kuznecova, Helimski, and Grushkina 1980: 249–256), of nominalized verbals. This distinction is also covered by the terms “infinitives” vs. “participles”. In this dichotomy, “participles” are prototypically used as modifiers, that is, either in combination with a headnoun or as nominal predicates (implying the ellipsis of a headnoun), while “infinitives” occur as independent headnouns, that is, as the heads of noun phrases functioning as a subject or an object. In typological literature, however, the term “nominalization” is occasionally restricted to referring only to “infinitives”, while “participles” are defined as a separate type of verbal forms (cf. e.g., Shagal 2017). In a different descriptive framework, “participles” may also be defined as “actor nominals”, while “infinitives” are “action nominals”. In a further division, “actor nominals” can be divided into “direct” or “conjunct” and “indirect” or “disjunct” “participles”, the latter also being known as agentive or oblique “participles”. However, all these divisions are irrelevant for many languages of the Ural-Altaic type, in which, typically, as in Modern Mongolian, any nominalized verbal can be used as an adnominal modifier or as an independent headword, as well as in the function of a nominal predicate. This seems to have been the situation also in Proto-Uralic. The definition of nominalization is connected with the stance we take with regard to the form vs. function dichotomy in grammatical description. From the formal point of view, nominalized verbals are non-finite forms, but from the functional point of view they can appear in both non-finite and finite functions.
Verbal forms unmarked for nominalization can, by contrast, only be used in finite functions, that is, as independent predicates. Nominalized verbals are, therefore, inherently multifunctional, which explains their widespread use in the Uralic languages, in which they typically play a central role in the verbal paradigm, gradually marginalizing the actual finite forms and taking up their functions. In many Uralic languages all, or almost all, finitely used verbal forms are historical nominalizations. For instance, in the Finnic present tense finite paradigm only the singular first and second person forms (as discussed above) are primary unmarked plain verbals combined with the corresponding personal endings, while the forms used in the other persons involve nominalizations, as in Finnish 3 mene-e ‘he/she/it goes’ : 3 mene-vä-t ‘they go’ (< Proto-Finnic  *mene-pä :  *mene-pä-t), 1 mene-mme ‘we go’ : 2 mene-tte ‘you go’ (< Proto-Finnic - *mene-k-mVC : *mene-k-tVC). In this respect, the Uralic languages share the trend shown by other languages of the Ural-Altaic type.

An important role taken by nominalized verbals is the marking of chained clauses. Unlike some other languages of the Ural-Altaic type, unmarked verbals in Uralic cannot function as adverbial modifiers to other verbals, that is, as “zero-marked” converbs. There are also no primary converbs in Uralic with the exception of the relatively sparsely attested phenomenon of “verbal declension”, involving the connective and privative case forms of verbals (as discussed above). Even so, converbial forms are synchronically very common in all Uralic languages, and they are typically based on case forms of nominalized verbs. Some of these forms are transparent and could also be called “quasiconverbs”, as in Finnish teke- ‘to do’ :  teke-mä ‘doing, done’ (synchronically used as a deverbal nominal and as an “agentive participle”) :  teke-mä-ssä ‘(while) doing’ :  teke-mä-stä ‘from doing’ :  teke-mä-än ‘in order to do’ :  teke-mä-llä ‘by doing’ :  teke-mä-ttä ‘without doing’, etc. Other forms with similar functions are less transparent formally, and also less obvious functionally, meaning that they may synchronically be recognized as fully grammaticalized converbs, as in Finnish teh-de-n ‘by doing’ (< do-- *tek-tA-n) : teh-dä-kse-en ‘in order to do’ (< do---3), teh-ty-ä-än ‘having done’ (< do--..--3), similarly Hungarian të- ‘to do; to put’ : té-ve ‘by doing, putting’ (< do--), të-vé-n ‘having done, put’ (< do---).

Marking of tense, aspect and mood

All Uralic languages have synchronic systems of tense, which to a varying extent interact with aspectuality. Typically, a formal distinction is made between present and past forms, which often correspond to uncompleted (imperfective) vs. completed (perfective) action. Some Uralic languages also have a fully grammaticalized future tense. Most tense forms are expressed morphologically, and, in the string of morphemes, the tense markers normally occupy a position between derivational suffixes and person markers. Other aspect-related features, including distinctions of the Aktionsart type, are expressed by derivational suffixes. This also means that the marking of the basic temporal distinctions is normally obligatory, while the marking of more elaborate aspectual distinctions of the Aktionsart type tends to be optional. The future tense marker in Nenets is an exception because it is a derivational feature (Salminen 1997: 54–55); however, this situation is connected with the generally atypical tense-marking system in the language (on which see more below).

It is somewhat unclear to what extent the situation suggested synchronically by the modern Uralic languages was also characteristic of Proto-Uralic. The Samoyedic, and, in particular, the Northern Samoyedic languages, suggest that the temporal reference of a finitely used verbal may originally have been determined by the inherent aspectual content of the lexeme. In such a system, as still observed in, for instance, Nenets, unmarked “aorist” forms of verbs with an inherent perfective content are used with reference to an immediate past tense, while verbs with an inherent imperfective content refer to the present tense, as in Tundra Nenets xa- ‘to die’ :   3 xa-° ‘(he/she/it) (just) died’ vs. yilye- ‘to live’ :   3 yilye-° ‘(he/she/it) lives’. It is tempting to assume that a similar system was present already in Proto-Uralic, but since it is not otherwise typical of the languages of the Ural-Altaic complex it could also be a case of a secondary innovation in the Samoyedic branch, or even just in the Northern Samoyedic languages. It should also be noted that the synchronic Samoyedic forms are not simple verbal stems as they also contain the “finite morpheme” (as discussed above), whose diachronic status is unclear.

The overwhelming strategy of building up tense paradigms in the Uralic languages has been the use of nominalized verbal forms in finite function. For Proto-Uralic, a considerable number of nominalizers can be reconstructed, including *-pa, *-ta, *-ca, *-ka, *-ma, *-ya, some of which also seem to have had variants with a prothetic nasal: *-mpa, *-nta, while others have variants with no final vowel: *-k, *-m, *-y. The reasons underlying these, as it would seem, morphophonological variations, remain unknown. In any case, while all these markers are the sources of nominalized verbal forms (“participles” and “infinitives”) as well as converbs in the modern Uralic languages, they have also served as the basis for most of the synchronic tense-aspect markers. It is, however, difficult to reconstruct a specific temporal-aspectual reference for any one of the Proto-Uralic nominalizers, as they occur in different roles in the individual branches and languages. Typically, *-pa, *-ta, *-k(a) seem to have served as markers of the present-tense range, while *-ca, *-m(a), *-y(a) tend to refer to the past-tense range. The Finnish present-tense paradigm, for instance, is based on the nominalizers *-pa (3 and 3) and *-k (1 and 2), while the entire past-tense paradigm (“imperfect tense”) is based on *-y, as is the case in Saamic, too.
In some other branches, as in Samoyedic, the past tense paradigm is based on the nominalizer *-ca. On the other hand, the nominalizers *-y and *-ca are also attested as suffixes deriving fully deverbalized and even lexicalized nominals with no temporal reference, as in Tundra Nenets yəbtə- ‘to molt’ : yəbto ‘goose’ (< molt- * yəbtə-y ‘one who molts’), ngəm- ‘to eat’ : ngəmca ‘meat’ (< eat- *əm-sa ‘thing to eat’). Some of the nominalizers have also ended up being used as modal markers. The only modally marked category that can be reconstructed for Proto-Uralic is the imperative, which in its basic form with reference to the second person singular is marked by *-k, as in Finnish elä-’ = Tundra Nenets yilye-q ‘live!’ < *elä-k. In the other persons, the variant *-ka- is used, although the details of the imperative paradigm vary even between closely related languages. In the central branches of Uralic (Mariic and Permic), the singular second-person imperative is unmarked (as mentioned above). The element *-k also marks the connegative form, used in the invariable main verb in combination with the negation verb *e-, as in Finnish 1 e-n elä-’ = Tundra Nenets nyí-d°m yilye-q ‘I do not live’ (note that the negation verb in Samoyedic does not take the finite suffix, but has irregular stem alternations). Other modal forms involve secondary innovations, though there is scattered evidence of the possible early presence of a potential mood formed by the element *-n(V)-, which
may or may not have been an original nominalizer. Conditional mood forms in several Uralic languages typically incorporate a marker of the past-tense range, as in Finnish 3 elä-isi = elä-is-i ‘he would live’ (with the final -i identical with the past tense marker -i(-)).

Apart from the simple morphologically formed tense forms, several Uralic languages have secondary periphrastic tense forms. In the westernmost branches and languages, that is, Finnic and Saamic, as well as in Hungarian, these involve recently grammaticalized counterparts of the Western European periphrastic tenses. Finnish, for instance, has periphrastic forms for the perfect and pluperfect, e.g., ole-n tul-lut (be-.1 + come-.) ‘I have come’ (literally: ‘I am come’), ol-i-n tul-lut (be--1 + come-.) ‘I had come’ (literally: ‘I was come’). These can also be modally marked, as in Finnish ol-isi-n tul-lut (be--.1 + come.) ‘I would have come’. Hungarian, by contrast, has a grammaticalized periphrastic future tense based on the auxiliary use of the verb fog ‘to grasp, to take’, e.g., fog-ok jön-ni (take-.1 + come-) ‘I will come’ (literally: ‘I take to come’). There are also progressive constructions, which typically involve converbially used case forms of nominalized verbals, as in Finnish ole-n teke-mä-ssä (be-.1 + do--) ‘I am doing’.

A cross-linguistically rare phenomenon is observed in Nenets and Enets, where a past-tense marker can be attached to predicatively used nominals, e.g., xasawa ‘man’, (in predicative use:) ‘he is a man’ : xasawa-sy° ‘he was a man’, with -sy° marking the past tense (“preterite”). Taken at face value, such forms would seem to be “nominal tense forms” (Salminen 1997: 130), but it appears more plausible to analyse the tense marker in them as a clitic, with an intervening “zero morpheme” functioning as a copula (Janhunen 2010: 170–172), i.e., xasawa=Ø-sy° (man-).
This analysis is confirmed by the fact that the same clitic can be taken by finite verbals, in which case they also contain the “finite morpheme”, e.g., yilye-ə-sy° (live--) ‘he lived’. Diachronically it is a question of a periphrastic construction in which a predicatively used nominal or verbal was followed by the separate word *+i-sä ‘was’, which itself was composed of the copular stem *i- (< Uralic *lexi-) and the Uralic nominalizer *-ca in the function of a past-tense marker. It may be noted that this same past-tense marker, attested in its basic function elsewhere in Samoyedic, has in Nenets and Enets been grammaticalized further to the marker of an interrogative mood with a past-tense reference, as in Tundra Nenets to-sa-n° (come-.-.2) ‘did you come?’.
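
The aspect-driven reading of the unmarked Nenets “aorist” described earlier in this section can be illustrated with a minimal Python sketch. The toy lexicon, the aspect labels and the function name `aorist_reading` are assumptions made purely for illustration; only the two stems and their readings are taken from the examples cited above.

```python
# Toy lexicon (illustrative assumption): inherent aspect of two Tundra Nenets
# stems, as given in the examples in the text.
INHERENT_ASPECT = {
    "xa-": "perfective",       # 'to die'
    "yilye-": "imperfective",  # 'to live'
}


def aorist_reading(stem: str) -> str:
    """Return the temporal reading of the unmarked aorist form of a stem.

    Inherently perfective stems are read as immediate past,
    inherently imperfective stems as present.
    """
    aspect = INHERENT_ASPECT[stem]
    return "immediate past" if aspect == "perfective" else "present"


for stem in INHERENT_ASPECT:
    print(stem, "->", aorist_reading(stem))
```

The point of the sketch is that no tense marker enters the computation: the temporal reading follows from the lexical entry alone, which is what distinguishes such a system from one with obligatory morphological tense.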

Basic argument structure

The Uralic languages as a family are normally classified as rather typical examples of the standard nominative-accusative strategy, with the subject in an unmarked nominative case, the object in a suffixally marked accusative case with the ending
*-m (Wickman 1956), and with the elements organized in a verb-final sequence (SOV), as in most other languages of the Ural-Altaic type. The actual picture is, however, considerably more complicated. For one thing, one particular branch of Uralic, (Eastern) Khantic, has evolved in the direction of an ergative language, in which the original accusative has been lost, while the original locative in *-na has received the function of an ergative marker, though still with some peculiarities (Kulonen 1989: 297–302). This feature, which distinguishes Khantic from all other Uralic languages, including the other “Ugric” languages (Mansic and Hungarian), is certainly innovatory and was probably caused by the impact of contacting languages. Unfortunately, it is not clear which languages exactly were responsible for this exceptional impact on Khantic, and by what type of mechanism (substrate, adstrate or superstrate). The accusative case has also been lost in several other Uralic languages, including Permic, Hungarian, and most forms of Mansic. At the same time, the connective case, originally marked by the ending *-n and functioning as an adnominal genitive and an adverbal instrumental, has also been lost, which has led to a predominantly head-marked possessive construction, as in Hungarian a fiú könyv-e ( boy book.3) ‘the boy’s book’, although a pleonastic strategy with the dative case is also available: a fiú-nak a könyv-e ( boy-  book-.3) ‘the boy’s book’. In another pattern of development, observed most clearly in Finnic and Saamic, the accusative has formally merged with the connective, as in Finnish poja-n kirja (boy + book) ‘the boy’s book’ vs. nä’-i-n poja-n (see--.1 + boy-) ‘I saw a/the boy’ (with the word order changed to SVO under the impact of European languages). 
A similar merger of the genitive and accusative is observed in several other languages of the Ural-Altaic type, suggesting that it is an inherent tendency in this language type (Janhunen 2005). At the same time, other case forms, notably the partitive in Finnic, have taken over the role of object marking, as well as, in some cases, subject marking, leading to constructions that have also been seen as implying features of the ergative type (Itkonen 1974–1975), though this may remain a matter of interpretation. Although the merger of the accusative and genitive is a secondary phenomenon in the Uralic languages concerned, it had an antecedent in Proto-Uralic, in that the corresponding plural forms were originally marked by a single uniform plural connective ending, reconstructable as *-y. This was both formally and functionally different from the singular connective and accusative case forms in *-n vs. *-m, and also from the nominative plural marked by the ending *-t, which is, incidentally, used as the form of a definite plural object in Finnic. Traces of the plural connective ending *-y survive in modern Uralic languages variously as plural genitives (as in Saamic) or plural accusatives (as in Samoyedic), or also in the composition of the entire plural paradigm of nominal declension (in several branches, including, most importantly, Finnic and Saamic). The connective form seems to have been the only marked plural oblique case in Proto-Uralic. The modern plural paradigms of nominal declension are, consequently, innovations in all branches of Uralic, and their grammaticalization has taken several different paths, including the introduction of secondary plural markers. The same is true of the dual: in Samoyedic the case endings in the dual paradigm are attached to the postpositionally used spatial *nä, e.g., Tundra Nenets mya-k°h nya-na (tent- + -) ‘in the two tents’, while in Saamic the dual is present only in the pronominal system and the corresponding personal markers and verbal forms.

An important feature suggesting that Uralic may once have involved more ergativity than is synchronically present in the modern languages is that the genitive (connective) case is used as subject marker for nominalized verbals, including converbial forms, as in Finnish: the agentive construction minu-n teke-mä-ni (1- + do--.1) ‘the one done by me’; or converb of simultaneous subordinate action minu-n teh-de-ssä-ni (1- + do---.1) ‘when I do/did (it)’. Even more characteristically, the paradigm of the “objective” conjugation, present in several Uralic languages, is typically based on the use of the possessive suffixes as person markers, as in Hungarian “subjective” conjugation: olvas-ok (read-.1) ‘I read’ vs. “objective” conjugation: olvas-om (read-.1) ‘I read () it’ (literally: ‘my read’), cf. possessive declension: könyv-em ‘my book’ (book-.1). Instead of “objective conjugation” (or “definite” conjugation, as it is also sometimes referred to) we might therefore speak of “possessive conjugation”, or, more generally, of the “possessiveness of predication”.
Even so, it has to be remembered that the patients of verbals in the “objective” conjugation are nevertheless in the accusative case in those languages that have a marked accusative, though the accusative marker may also be secondary, as in Hungarian a könyv-et olvas-om ‘I read the book’ ( book + read-.1, containing the accusative marker -(V)t of unknown origin). Formally, also, the possessive paradigm of nominals and the “objective” paradigm of verbals have diverged in some details, meaning that the sets of markers used in the two paradigms are no longer necessarily synchronically fully identical.

It may be noted that several other Ural-Altaic languages can also attach both predicative and possessive markers to a finitely used verbal. The functions of the “possessive conjugation” may, however, vary, and Uralic stands alone in the Ural-Altaic area in using the possessively marked forms to indicate a relation with the object and transitivity. In Turkic and Tungusic, the distribution of the predicative and possessive markers depends mainly on the morphological category of the verb, as well as, ultimately, on the chronology of grammaticalization of the forms concerned. More rarely, a single verbal form can take both sets of markers with a difference in function: this is the case with Buryat, where the general nominalizer with a future-tense connotation (the “futuritive participle”) implies simple future with the predicative person markers, while with the possessive markers it conveys the notion of a deontic future (Yamakoshi 2017).

A related issue is connected with the marking of the object in imperative clauses. An important parallel between the languages of the extreme west (Finnic)
and east (Samoyedic) of the Uralic area is that the object of a second-person imperative verbal is in the unmarked nominative case, as in Finnish lue-’ kirja (read.2 + book) ‘read a/the book!’, and Tundra Nenets pad°r pad°-q (letter write.2) ‘write a letter!’. The imperative form itself in these languages is expressed by an original nominalizer (Proto-Uralic *-k) with no actual person marking, though person marking is present in the corresponding plural forms. The absence of object marking may, of course, be explained by the fact that these clauses normally lack an overt subject, leaving the unmarked nominal unambiguously to represent the object. However, the unmarked “object” may also be analysed as the “subject” of the clause, meaning that the original sentence structure was possibly of the type ‘the letter (is to be) written’. Since there is no overt passive marking, the construction is strongly reminiscent of an ergative system (Janhunen 2000/2001; cf. also Comrie 1975). Passive in general is a marginal feature of the Uralic languages, typically based on derivational causative forms, as in Finnic, Saamic and Hungarian, and often not reaching the prototypical state of a “personal passive” (cf. e.g., Schlachter 1985). Some Uralic languages have also a category of medial or reflexive verbs, which follow a specific conjugational pattern. In Hungarian these are verbs following the so-called ik-conjugation (cf. e.g., Havas 2004: 134 et passim), while Northern Samoyedic has both a class of “alteration” verbs, which incorporate a “semi-reflexive” marker in the stem, and a specific set of personal endings expressing reflexivity (Salminen 1997: 81–83, 95–96, 103–105). The diachronic background of all these forms remains unclear and disputed.

The grammaticalization of person marking

Another argument in favour of the assumption that Pre-Proto-Uralic may have been an ergative language is found in the personal pronouns, which occur in two sets: one with the basic structure *CV, and the other with an additional suffixal dental nasal *-n, that is, *CV-n. This dichotomy is observed, in particular, in the singular pronouns, which may be reconstructed as 1 *mi : 2 *ti : 3 *sV and 1 *mi-n : 2 *ti-n : 3 *sV-n, respectively, while the dual and plural pronouns were formed either by changing the root vowel or by adding suffixal number markers. Traces of the two sets are preserved somewhat randomly in the modern languages, but as a rule each language has generalized one or the other form for each person. The forms with a final *-n seem to be preserved as a complete set in Saamic and Mordvinic, though with a velarized vocalism, as in North Saami 1 mon : 2 ton : 3 son, and Moksha 1 mon : 2 ton : 3 son. Finnic preserves the third-person pronoun in the form of *sän ~ *sen (> Finnish hän ‘he/she’, Estonian en- ‘oneself’) and has additionally the personal interrogative *ke-n ‘who’ and the relative *ku-n ‘who, which’, while Hungarian has ön < *sVn, used as a second-person honorific pronoun. For Proto-Samoyedic, the forms 1 *mən : 2 *tən (with *ə < *i) can be reconstructed, though in Nenets and Enets these have been replaced by forms expanded by the element *-ti, as in 1 Tundra Nenets məny° = Tundra Enets moji (< *mən-ti). The forms without the final *-n are preserved in the bisyllabic compounds 1 *mi+nä : 2 *ti+nä, which contain the additional deictic element *+nä (incidentally, identical with the locative case suffix *-na), and which are the source of the synchronic system in Finnic, e.g., Finnish 1 minä : 2 sinä (with si- < *ti-), and probably also in Permic and Mariic (cf. Bartens 2000: 149–153).

In some Uralic languages, the system of personal pronouns has undergone fundamental restructurings. Hungarian, for instance, preserves the plain second-person pronoun 2 të and the interrogative ki ‘who’, but has, at the same time, the innovative first-person pronoun 1 én, which has a cognate in Mansic 1 *äm. Mansic and Khantic, on the other hand, have introduced a secondary stem *nV- for the second-person pronouns, as in Northern Mansi and Northern Khanty 2 nang (Kulonen 2001). The most radical development has been the introduction of dummy nominals, which in combination with the possessive suffixes have replaced some of the personal pronouns in some languages, as in Tundra Nenets -.2 pidə-r° ~ pudə-r° : -.3 pi-da ~ pu-da (< *pid°-da ~ *pud°-da < *pix°dəda), based on the noun puxəd° < *pixəd° ‘body’. An early dummy nominal seems to have been *keti ‘skin’ > ‘person’ (Aikio 2006: 17–19), which is the base of the pronominal accusatives in Samoyedic, as in Tundra Nenets syiq- : .1 syiq-m° : 2 syi-t° : 3 syi-ta (Salminen 1997: 131), but also in Hungarian .1. en-g-em : 2 té-g-ed (Helimski 1982: 88–97). The rest of the pronominal case paradigm in both Samoyedic and Hungarian is based on spatials, some of which have a transparent connection with regular nominals, as in Hungarian (*)bel- ‘inside, interior’ = bél : bel- ‘intestines’.
The spatials themselves form sets marked by primary case endings, e.g., Hungarian  ben-n ‘inside’ :  bel-ől :  bel-e, to which possessive suffixes can be added, yielding the “case forms” of the personal pronouns, as in 1  benn-em ‘in(side) me’ :  belől-em ‘from inside me’ :  belé-m ‘into me’. The same spatials have then also been grammaticalized as secondary case endings of regular nominals, e.g., kéz ‘hand’ :  kéz-ben :  kéz-ből :  kéz-be. In spite of such examples of secondary restructuring, the fact remains that the Uralic singular personal pronouns originally had two forms, with and without a final *-n. Formally, this final *-n was identical with the marker of the connective case, which had both adnominal genitival and adverbal instrumental functions. Although it can no longer be verified, it is quite possible that the marked forms of the pronouns were originally used to express not only the possessor of a nominal headnoun and the actor of a nominalized verbal, but also the actor of any transitive predicate, while the plain forms of the pronouns would have been used to express the subject of not only equative and existential sentences, but also of any intransitive predicate. If this was so, then the clauses involving pronouns with a final *-n would have been ergative constructions. This would also explain why the personal
pronouns in *-n in many Uralic languages have a defective case paradigm. Although all of this is to some extent speculation, it is difficult to find another explanation of the situation in which the personal pronouns are attested in two sets of forms. It is well known that the ergative function is universally often expressed by forms identical with the genitive and/or instrumental cases. Naturally, when the system was restructured with the loss of ergativity as a grammatical principle, both sets of pronouns could have survived in random combinations. This raises the question concerning the origin of the suffixes of person marking in Uralic. Although the fact that the possessive suffixes are “longer” (1 *-mi : 2 *-ti) than the predicative personal markers (1 *-m : 2 *-t) might simply indicate a chronological difference of the two sets (as suggested earlier), it is also possible that the possessive suffixes derive from the longer forms with the final nasal, while the predicative forms derive from the forms without a nasal. This would correspond to the pattern observed in other languages of the Ural-Altaic type, notably Mongolic (cf. Ramstedt 1933), in which the possessive markers are longer than predicative markers, because the former are based on the genitives of the personal pronouns, while the latter are based on the corresponding basic forms (nominatives). Altogether, the evolution of the person markers has taken very similar paths in all those Ural-Altaic languages that have morphological person marking (Uralic, Turkic, Mongolic, Tungusic), although the process of suffixalization and grammaticalization has taken place at very different times in the different families. In Uralic, both sets of personal markers can be traced back to the protolanguage. 
An additional confirmation of the Proto-Uralic origin of the possessive markers is the morphophonological alternation observed in this set, in that the oblique forms of the possessive suffixes are preceded by a so-called “pronominal n”, which, moreover, deletes the *-m (< *-n-m-) in the first-person forms, as in Tundra Nenets .1 myaq-m° (< *mät-mə) ‘my tent’ : myaq-n° (< *mät-nə) ‘of my tent’. This is a pattern shared by both Samoyedic and Finnic, demonstrating that it is a Proto-Uralic feature. It is also important to note that the possessive suffixes were originally ordered after number and case markers, that is: stem-number-case-possessive. This order is preserved in most Uralic languages, but in some of the central branches the case markers have been secondarily moved to follow the possessive suffixes, that is: stem-possessive-case, cf. e.g., Finnish talo-ssa-ni (house--.1) vs. Hungarian ház-am-ban (house-.1-). Although this reorganization of the morpheme string is partly due to the external impact of neighbouring languages, in particular, Turkic, it is also connected with the fact that the case suffixes themselves, as in Hungarian, are of a secondary spatial origin, and these spatials were originally attached to the possessively marked basic forms. However, a similar reorganization is also observed in sequences containing the “original” case marker of the locative, which in Hungarian functions as the so-called “superessive” (‘on’) and in Finnic as an essive (‘as’), cf. e.g., Finnish talo-na-ni (house--.1) ‘as my house’ vs. Hungarian ház-am-on (house-.1-) ‘on my house’. In some Uralic languages, the feature
of possessive person marking has been reduced or lost by replacing the synthetic constructions involving possessive suffixes with analytic ones involving adnominal genitives, as in Estonian minu maja (1. + house) ‘my house’ = Spoken Finnish mu-n talo (1- + house) ‘my house’ vs. Standard Finnish minu-n talo-ni (1 + house-.1).
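
The two orderings of case and possessive suffixes contrasted earlier in this section (Finnish talo-ssa-ni vs. Hungarian ház-am-ban) can be made concrete with a small Python sketch. The template names `case_first` and `possessive_first`, the function `build_form`, and the segmentation of the Hungarian form (which follows the three-slot gloss) are illustrative assumptions rather than an established formalism.

```python
# Two suffix-order templates (labels are illustrative assumptions):
# case before possessive (the original order) vs. possessive before case
# (the secondary reordering found in some branches).
TEMPLATES = {
    "case_first": ("stem", "case", "possessive"),
    "possessive_first": ("stem", "possessive", "case"),
}


def build_form(morphemes: dict, template: str) -> str:
    """Join the morphemes in the slot order given by the named template."""
    return "-".join(morphemes[slot] for slot in TEMPLATES[template])


# Finnish 'in my house': stem + inessive case + first-person possessive.
finnish = build_form(
    {"stem": "talo", "case": "ssa", "possessive": "ni"}, "case_first"
)

# Hungarian 'in my house': stem + first-person possessive + inessive case.
hungarian = build_form(
    {"stem": "ház", "case": "ban", "possessive": "am"}, "possessive_first"
)

print(finnish)
print(hungarian)
```

Both forms are built from the same three slots; only the template differs, which mirrors the claim that the reordering affected the sequence of markers rather than their inventory.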

Verbalization and insubordination in Uralic

It is perhaps useful to place the phenomena discussed above in the context of the terminology used in general typology and the various theories of grammaticalization. Two terms that appear particularly relevant in this connection are “verbalization” and “insubordination”. Unfortunately, both terms are somewhat awkward and misleading and not necessarily applicable to Uralic language data, at least not in the sense intended by those who introduced them. However this may be, verbalization has been defined as the “reanalysis of nominal predicates as verbal predicates”, while insubordination is supposed to refer to the “reanalysis of (subject) complement clauses [as main clauses]” (Malchukov 2013: 201). Taken at face value, it is possible to find phenomena corresponding to these definitions in the Uralic languages, but this might lead to a terminological confusion.

In Uralic, the term “verbalization” is best defined as what it is – the transformation of nominals (and other non-verbal parts of speech) into verbs. This is a standard procedure in Uralic derivational morphology and produces denominal verbals (as already mentioned earlier). Often, the derivational affixes used in this connection convey various additional connotations, but there are also semantically neutral general verbalizers, as exemplified by the Finnic suffix *-tA- (> -t- : -A-), originally a causative marker, as in Finnish halu ‘desire’ : halu-t- : halu-a- ‘to desire’. This suffix is synchronically very productive and is also used to verbalize newly borrowed foreign items, which may be nouns, as in yooga ‘yoga’ : yooga-t- : yooga-a- ‘to do yoga’, but also verbs, as in duba-t- : dubba-a- ‘to dub’ (with the marginal phoneme /b/ participating in the consonant gradation pattern originally characteristic only of the native geminates /kk pp tt/). Another candidate for verbalization in Uralic could be offered by the phenomenon of nominal predication.
However, nominals used in predicative position are, by definition, not verbals, nor are they “verbalized” in any way, even if they take the same predicative personal endings as are taken by finitely used verbs. The correct term for this phenomenon is “predicativization”. In examples like Tundra Nenets .1 nyísya-d°m ‘I am (a) father’ (as discussed earlier) we have, at least synchronically, an equative clause containing only a predicatively used noun and a personal marker, but no verbalizing element. This is different from languages like Yukaghir, where in otherwise similar constructions there is a material copular element between the nominal stem and the predicative ending (Janhunen 2010: 166–
168). If we look for an example of a true verbalization of a nominal in Uralic, we are left only with the nomenverba, in which a nominal is turned to a verbal with “zero derivation”, as in Finnish tuuli : tuule- ‘wind’ :  tuule- ‘to blow (of wind)’. However, in these cases there is no inherent evidence confirming that the direction of the derivational operation is from nominal to verbal, for it could also be from verbal to nominal. It is also questionable whether the use of nominalized verbals in predicative function, as in Finnish 3 (he) mene-vät ‘they go’ = .- mene-vä-t ‘the going ones’ (< *mene-pä-t), can be regarded as examples of “verbalization”, or of “insubordination” for that matter. There are, however, real examples of the “reanalysis of complement clauses as main predicates”, though these examples are rather marginal. Estonian, for instance, has an “oblique mood” that is formed, in the present tense range, by the finitely used partitive case form of the present participle, as in tege- ‘to do’ :  (ta) tege-vat ‘(s/he) is said to do (so)’= . tege-va-t (Ikola [1953], discussed also by Malchukov [2013: 178–179]). In the past tense range, the corresponding form can be simply identical with the perfective participle, e.g., . (ta) tei-nud ‘(s/he) is said to have done (so)’, which, however, is an abbreviation from (ta) olevat teinud, with .- ole-va-t as the actual “oblique mood” indicator. Another “subordinate” form used in a finite function is the Nenets auditive, marked by suffixal complex *-manon- in combination with the possessive suffixes, as in Tundra Nenets ye- ‘to ache’ :  .3 yewanon-ta ~ ye-won-ta ‘it feels like it aches’ (Salminen 1997: 115). Diachronically, the auditive suffix seems to contain both a nominalizer (*-ma) and a suffixalized trace of the regular noun muh : mun- ‘sound’. Altogether, it is rather ill chosen to speak of verbalization of verbals. 
This term is also not suitable to characterize the forms of the “objective” conjugation of verbs in those Uralic languages that have this category. Prototypically, these forms are simply possessively marked verbal stems in finite use, as in Hungarian tud- ‘to know’ : .1 tud-om ‘I know it’ : .2 tud-od ‘you know it’, etc. In these forms, at least synchronically, there is no nominalizing element, which is why they remain full verbs. The same is true of the corresponding Samoyedic forms, in which only the finite morpheme is inserted between the verbal stem and the possessive suffix, as in Tundra Nenets xada- ‘to kill’ : -.1 xada-ǝ-w° ‘I killed it’ : -.2 xada-ǝ-r° ‘you killed it’, etc. The possessive suffixes functioning as personal endings can, however, also be added to nominalized verbs functioning, for instance, as tense forms of the objective conjugation, as in Hungarian -.1 tud-t-am ‘I knew it’ : -.2 tud-t-ad ‘you knew it’, etc. Obviously, nominalization as such is no prerequisite for the use of the objective conjugation.

The question is how the forms of the objective conjugation were originally to be understood. Most probably they were simply possessive forms of the verb, as has often been suggested, meaning that a form like Tundra Nenets -.1 xada-ǝ-w° ‘I killed it’ would originally have meant ‘my kill’ (but with no overt nominalizer). However, since the typical sentence with a verbal form in the objective conjugation also contains a nominal object in the accusative, the entire construction would be equative, in which case a sentence like Tundra Nenets ti-m xada-ǝ-w° ‘I killed the reindeer’ (with ti ‘reindeer’ : ti-m) would have originally meant something like ‘the reindeer was my kill’. If this is so, the accusative suffix would originally have functioned as a topic marker, i.e., ‘as for the reindeer, it was my kill’. This would explain why in some Uralic languages, as in Tundra Nenets, the objective conjugation is still synchronically used with a topicalized object, while a non-topicalized object in the focus position would require the verb to be in a “subjective” form, e.g., ti-m xada-ǝ-d°m (with -.1 xada-ǝ-d°m) ‘I killed the reindeer (and not something else)’. Altogether, this would corroborate the assumption that Pre-Proto-Uralic was once an ergative language, in which the accusative case developed secondarily from marked topicalization.

Conclusion: Layers of grammaticalization in Uralic

A look into the comparative and reconstructable properties of the Uralic languages, as summarized in the previous sections, reveals that Pre-Proto-Uralic may in some respects have differed from Proto-Uralic, while Proto-Uralic may also have involved features that are no longer systematically preserved in the modern Uralic languages. Some of the most important developments that seem to have happened between Pre-Proto-Uralic and Post-Uralic include the following:

(1) Gradual crystallization of the grammatical distinction between nominals and verbals: There are indications that the distinction between nominals and verbals may have been less strict in Pre-Proto-Uralic than in the modern Uralic languages. This is suggested, in particular, by the relatively large number of nomenverba in the reconstructable Proto-Uralic lexical corpus.

(2) Nominalization as a source of the verbal paradigm: The principal primary phenomenon that distinguished verbals from nominals was nominalization. Nominalized verbal forms subsequently acted as the main source for the entire verbal paradigm, including not only actual nominalized forms (participles and infinitives) and their adverbial derivatives (converbs), but also the forms used as finite predicates. Ultimately, nominalized forms have almost completely marginalized the apparently more primary finite forms that contained no nominalizers.

(3) The decline of verbal adjectives: As far as verbals were distinguished from nominals in Pre-Proto-Uralic, they seem to have covered also adjectival meanings (stative descriptive verbs) to a larger extent than is the case in the modern Uralic languages, in which adjectives are prototypically nominals, only syntactically distinguishable from nouns.

(4) The transition from ergativity to the nominative-accusative strategy: Several circumstances suggest that Pre-Proto-Uralic may have had clauses built upon the ergative principle, while in the modern Uralic languages the nominative-accusative type of argument structure prevails. The accusative (marked by *-m) was already a Proto-Uralic feature, but it may have been a late innovation in Pre-Proto-Uralic. In the plural paradigm, Proto-Uralic had only a single oblique form, marking a general connective case. In imperative clauses the role of the subject was taken by the patient, which, therefore, was unmarked, as is still the case in certain Uralic languages.

(5) The grammaticalization of suffixal person marking: Quite certainly, person marking was an innovation of the Pre-Proto-Uralic period, since the reconstructable markers of both the possessive and predicative sets clearly derive from the corresponding personal pronouns. In agentive position the (singular) pronouns were combined with the nasal element *-n, which was identical with the connective (genitive-instrumental) case marker of the nominal (singular) paradigm and which, hence, may also have functioned as an ergative case marker.

The above points all concern selected structural and functional aspects of grammaticalization, with the focus on the interaction between the parts of speech and the evolution of their syntactic roles. This type of grammaticalization could perhaps also be termed “transgrammaticalization”, since it is connected with the functional transitions between grammatical categories, and not with the simple evolution of lexical elements into grammatical markers. In fact, at the Proto-Uralic level, most grammatical markers, typically realized as suffixes, were already fully grammaticalized and had no transparent connection with ordinary lexical elements occurring as free morphemes.
Apart from the personal endings, which were indeed directly derived from the corresponding personal pronouns, the only example of a possibly grammaticalized free morpheme is offered by the numeral *kektä ~ *käktä ‘two’, which may have been the source of the dual marker *-kV-, as attested in a number of Uralic languages (Samoyedic, Ugric, Saamic). The dual seems to have been restricted to references to individualized pairs of actors, which is why, for instance, paired body parts were used only in the plural, as is still the case in the languages having the category of dual, cf. e.g., Tundra Nenets nguda ‘hand’ :  nguda-q ‘(the two) hands’. In the individual Uralic branches and languages of the Post-Proto-Uralic period many more cases of lexical grammaticalization can be found. It is particularly typical that new case suffixes have been formed out of spatials, postpositions and regular nouns. Occasionally, the development is so recent that the case marker remains fully identical with the corresponding free morpheme: for instance, the Hungarian

Grammaticalization in Uralic as viewed from a general Eurasian perspective


marker of the “temporal” case -kor is identical with the noun kor ‘time, epoch’, and as a case marker it does not even follow the vowel harmony, as in öt-kor (five-) ‘at five o’clock’, which is why it is probably best analysed as a clitic, i.e., öt=kor. Other examples of recently cliticized case markers are the Estonian comitative marker =ga, as in isa=ga (father=) ‘with father’, which is based on the postposition kaasa ‘together with’ (Finnish kansa’ ~ kanssa), as well as the Veps ablative post affix =päi ‘from’, which is based on the postposition päi ‘towards’ (Finnish päin), itself the lexicalized plural instrumental case form of the noun (*)pää ‘head’, synchronically used in connection with the inner and outer local case markers, as in hebo ‘horse’ :  hebo-s ‘in the horse’ :  hebo-s=päi ‘from (inside) the horse’ :  hebo-l ‘on the horse’ :  hebo-l=päi ‘from (above) the horse’ (Grünthal 2015: 62). Recent research has suggested that the Finnic dichotomy of inner and outer local cases (with the coaffixes *-s- vs. *-l-), with somewhat less systematic counterparts in Saamic, Mordvinic and Mariic, is probably also based on suffixalized and grammaticalized spatials (Aikio and Ylikoski 2007; Ylikoski 2016). A general conclusion from these considerations is that the Uralic branches and languages have followed different paths with regard to the grammaticalization of different morphological categories. Even developments that have led to the grammaticalization of similar functional features in the different languages are often due to parallel tendencies conditioned by the underlying typological similarity of all languages in the Ural-Altaic belt. An example of such a feature is definiteness: in the Uralic languages it can be marked by methods as diverse as separate articles (definite vs. indefinite) before the nominal headword (in Hungarian), by a cliticized (definite) post-article (in Mordvinic), by differential object marking (nominative vs. 
accusative/connective vs. partitive, in Finnic), or by the use of different types of conjugation (subjective vs. objective, as in Samoyedic, Ugric and Mordvinic). In all of these patterns, definiteness is intertwined with other features, most notably specificity, and it affects both the grammar and the lexicon. Interestingly, it also appears that the further back in time we go, the fewer features the early forms of Uralic seem to have shared with the prototypical Ural-Altaic language type. Pre-Proto-Uralic was obviously in some respects typologically rather different from its modern descendants, although, of course, it was in no way a less developed language. The idea, once expressed even by serious linguists (e.g., Ravila 1943), that the early forms of Uralic could be a source for glottogonic speculations, has been shown to be mistaken. It is also difficult to tell whether, and in what respects, the early stages of Uralic were less or more “complex” than the modern Uralic languages. There are indications that there has been an increase in morphological complexity in, for instance, the realm of person marking. Some languages, like Hungarian, have undergone periods of destruction and reconstruction of morphological complexity in, for instance, the nominal declension. The loss of ergativity may have involved a simplification or, at least, an adaptation to an increasingly dominant areal typology of the nominative-accusative type. Ultimately, we have to


recognize that all stages in the evolution of the Uralic languages belong to a typological cycle, which successively erases and rebuilds patterns and distinctions (Janhunen 2000).

A note on the notation

The Proto-Uralic phonemes are in this paper rendered by the symbols: (nasals) *m *n *ñ *ng, (stop obstruents) *p *t *c *k, (sibilant fricative) *s, (dental or retroflex affricate) *z, (spirants) *d *j *x, (glides) *w *y, (liquids) *l *r, (high vowels) *u *ü *ï *i, (mid-high vowels) *o *e, (low vowels) *a *ä. For modern languages with a non-Roman orthographical basis, or with no written use, a phonemic transcription on similar principles is used. Languages with a Roman-based orthography are rendered according to the normative spelling.

A note on the linguistic data

Much of the linguistic data quoted in this paper from the Uralic languages is common knowledge which does not require references. References are, however, given for data that are potentially controversial or that are open to different alternative analyses. Detailed information concerning the individual Uralic languages and their history may be found in the two handbooks edited by Sinor (1988) and Abondolo (1998), as well as in the literature quoted there. Due to the generalizing nature of the present paper, linguistic material is quoted selectively and with a focus on a few well-known representative languages, notably Finnish, Hungarian and Tundra Nenets, on which standard grammatical descriptions are readily available.

Acknowledgements

I would like to thank Ekaterina Gruzdeva and Tapani Salminen for kindly reading this paper and commenting on several issues, both theoretical and substantial. I would also like to thank Andrej Malchukov and Walter Bisang for their insightful editorial remarks as well as for organizing the conference on grammaticalization where the first draft of this paper was presented.

Abbreviations 1 = first person, 2 = second person, 3 = third person,  = abessive,  = ablative,  = accusative,  = adessive,  = adjectival,  = adverbial,  = allative,  = aorist,


 = attributive,  = auditive,  = causative,  = coaffix,  = comitative,  = comparative,  = conditional,  = connective,  = copula,  = converb,  = case ending,  = dative,  = definite,  = dual,  = epenthetic,  = elative,  = essive,  = factitive,  = finite,  = genitive,  = illative,  = imperative,  = imperfective,  = indirect,  = inessive,  = infinitive,  = interrogative,  = intransitive,  = lative,  = locative,  = modal,  = nominal,  = negative,  = nominalizer,  = number marker,  = objective,  = oblique,  = partitive,  = passive,  = plural,  = participle,  = predicative,  = perfective,  = privative,  = proximal,  = present,  = preterite,  = possessive suffix,  = singular,  = spatial,  = superlative,  = temporal,  = transitive,  = translative,  = verbal,  = predicative personal ending

References

Abondolo, Daniel (ed.). 1998. The Uralic languages (Routledge Language Family Descriptions). London: Routledge.
Aikio, Ante. 2002. New and old Samoyed etymologies 1. Finnisch-Ugrische Forschungen 57. 9–57.
Aikio, Ante. 2006. New and old Samoyed etymologies 2. Finnisch-Ugrische Forschungen 59. 9–34.
Aikio, Ante. 2012. On Finnic long vowels, Samoyed vowel sequences, and Proto-Uralic *x. Mémoires de la Société Finno-Ougrienne 264. 227–250.
Aikio, Ante & Jussi Ylikoski. 2007. Suopmelaš gielaid l-kásusiid álgovuođđu sáme- ja eará fuolkegielaid čuovggas [The origin of the l-cases in Finnic languages in the context of Saamic and other related languages]. Mémoires de la Société Finno-Ougrienne 253. 11–71.
Bartens, Raija. 2000. Permiläisten kielten rakenne ja kehitys [The Structure and Development of the Permic Languages]. Mémoires de la Société Finno-Ougrienne 238.
Comrie, Bernard. 1975. The antiergative: Finland’s answer to Basque. In R. E. Grossman, L. J. San & T. J. Vance (eds.), Papers from the 11th Regional Meeting of the Chicago Linguistic Society, 112–121. Chicago: University of Chicago.
Doerfer, Gerhard. 1982. Nomenverba im Türkischen. Studia Turcologica Memoriae Alexii Bombaci Dicata. (Napoli). 101–114.
Grünthal, Riho. 2015. Vepsän kielioppi [A grammar of Veps]. (Apuneuvoja suomalais-ugrilaisten kielten opintoja varten 17). Helsinki: Suomalais-Ugrilainen Seura.
Havas, Ferenc. 2004. Objective conjugation and medialisation. Acta Linguistica Hungarica 51. 95–141.
Helimski, Eugene. 1982. Drevneishie vengersko-samodiiskie yazykovye paralleli (Lingvisticheskaya i ètnogeneticheskaya interpretaciya) [The oldest Hungarian-Samoyedic linguistic parallels (A linguistic and ethnogenetic interpretation)]. Moskva: Nauka.
Helimski, Eugene. 1995. Proto-Uralic gradation: Continuation and traces. Congressus Octavus Internationalis Fenno-Ugristarum (Pars I: Orationes plenariae et conspectus quinquennales. Jyväskylä). 17–51.
Honti, László. 1979. Features of Ugric languages (Observations on the question of Ugric unity). Acta Linguistica Academiae Scientiarum Hungaricae 29. 1–25.
Honti, László. 1998. Ugrilainen kantakieli – erheellinen vai reaalinen hypoteesi? [The Ugric protolanguage – a mistaken or a real hypothesis?] Mémoires de la Société Finno-Ougrienne 228. 176–187.
Ikola, Osmo. 1953. Viron ja liivin modus obliquuksen historiaa [On the history of the modus obliquus in Estonian and Livonian]. (Suomi 106). Helsinki: Suomalaisen Kirjallisuuden Seura.


Itkonen, Erkki. 1962. Beobachtungen über die Entwicklung des tscheremissischen Konjugationssystems. Mémoires de la Société Finno-Ougrienne 125. 85–125.
Itkonen, Terho. 1974–1975. Ergatiivisuutta suomessa [Ergativity in Finnish] 1–2. Virittäjä 78. 378–398, 79. 31–65. Reprint: Opuscula Instituti linguae fennicae 46. Helsinki: Helsingin yliopisto. Suomen kielen laitos.
Janhunen, Juha. 1981. Korean vowel system in North Asian perspective. Han-geul 172. 129–146.
Janhunen, Juha. 1982. On the structure of Proto-Uralic. Finnisch-Ugrische Forschungen 44. 23–42.
Janhunen, Juha. 1997. Problems of primary root structure in Pre-Proto-Japanic. International Journal of Central Asian Studies 2. 14–30.
Janhunen, Juha. 1999. A contextual approach to the convergence and divergence of Korean and Japanese. International Journal of Central Asian Studies 4. 1–23.
Janhunen, Juha. 2000. Reconstructing Pre-Proto-Uralic typology: Spanning the millennia of linguistic evolution. Congressus Nonus Internationalis Fenno-Ugristarum, Pars I (Orationes plenariae & Orationes publicae). 59–76. Tartu.
Janhunen, Juha. 2000/2001. The Nenets imperative sentence and its background. Finnisch-Ugrische Mitteilungen 24/25. 71–85.
Janhunen, Juha. 2001. On the paradigms of Uralic comparative studies. Finnisch-Ugrische Forschungen 56. 29–41.
Janhunen, Juha. 2005. On the convergence of the Genitive and Accusative cases in languages of the Ural-Altaic type. In M. M. Jocelyne Fernandez-Vest (ed.), Les langues ouraliennes aujourd’hui: Approche linguistique et cognitive [The Uralic languages today: A linguistic and cognitive approach]. Bibliothèque de l’École des Hautes Études, Sciences historiques et philologiques, Tome 340: 133–144. Paris: Librairie Honoré Champion.
Janhunen, Juha. 2008. Some Old World experience of linguistic dating. In John D. Bengtson (ed.), In hot pursuit of language in prehistory: Essays in the four fields of anthropology. In honor of Harold Crane Fleming, 223–239. Amsterdam: John Benjamins.
Janhunen, Juha. 2009. Proto-Uralic – what, where, and when? Mémoires de la Société Finno-Ougrienne 258. 57–78.
Janhunen, Juha. 2010. Enclitic zero verbs in some Eurasian languages. In Lars Johanson & Martine Robbeets (eds.), Transeurasian verbal morphology in a comparative perspective: genealogy, contact, chance (Turcologica 78), 165–180. Wiesbaden: Harrassowitz Verlag.
Janhunen, Juha. 2012. Non-borrowed non-cognate parallels in bound morphology: Aspects of the phenomenon of shared drift with Eurasian examples. In Lars Johanson & Martine Robbeets (eds.), Copies versus cognates in bound morphology. (Brill’s Studies in Language, Cognition and Culture 2), 23–46. Leiden: Brill.
Janhunen, Juha. 2013. A legkeletibb uráliak [The easternmost Uralians]. (Székfoglalóelőadások a Magyar Tudományos Akadémián. A 2013 május 6-án megválasztott akadémikusok székfoglalói). Budapest: Magyar Tudományos Akadémia. (Also published in: Nyelvtudományi Közlemények, vol. 110. 7–30. Budapest [2014]).
Janhunen, Juha. 2014. Ural-Altaic: The polygenetic origins of nominal morphology in the Transeurasian zone. In Martine Robbeets & Walter Bisang (eds.), Paradigm change: In the Transeurasian languages and beyond. (Studies in Language Companion Series 161), 311–335. Amsterdam: John Benjamins.
Kara, György. 1993. Nomina-verba mongolica. International Symposium on Mongolian Culture. (Taipei: Meng Tsang Wei yüan hui). 151–154.
Katz, Hartmut. 1975. Generative Phonologie und phonologische Sprachbünde des Ostjakischen und Samojedischen. Münchener Universitätsschriften, Finnisch-Ugrische Bibliothek 1. München.
Klemm, Antal. 1928–1942. Magyar történeti mondattan [Hungarian historical syntax], vol. 1–3. Budapest: Magyar Tudományos Akadémia.


Keresztes, László. 1999. Development of Mordvin Definite Conjugation. Mémoires de la Société Finno-Ougrienne 233.
Korhonen, Mikko. 1969. Die Entwicklung der morphologischen Methode im Lappischen. Finnisch-Ugrische Forschungen 37. 203–362.
Korhonen, Mikko. 1974. Oliko suomalais-ugrilainen kantakieli agglutinoiva? [Was the Finno-Ugrian protolanguage agglutinative?] Virittäjä 78. 243–256.
Kulonen, Ulla-Maija. 1989. The passive in Ob-Ugrian. Mémoires de la Société Finno-Ougrienne 203.
Kulonen, Ulla-Maija. 2001. Zum n-Element der zweiten Personen besonders im Obugrischen. Finnisch-Ugrische Forschungen 56. 151–174.
Kuznecova, A. I., E. A. Helimski & E. V. Grushkina. 1980. Ocherki po sel’kupskomu iazyk. Tazovskii dialekt [An outline of the Selkup language. The Taz dialect], Tom I. (Publikaciï otdeleniia strukturnoi i prikladnoi lingvistiki, vypusk 8). Moskva: Izdatel’stvo Moskovskogo universiteta.
Laakso, Johanna. 1990. Translatiivinen verbinjohdin NE itämerensuomalaisissa kielissä [The translative derivational suffix NE of verbs in Baltic Finnic]. Mémoires de la Société Finno-Ougrienne 204.
Laakso, Johanna. 2008. On verbalizing nouns in Uralic. Finnisch-Ugrische Forschungen 47. 267–304.
Malchukov, Andrej. 2004. Nominalization, verbalization: Constraining a typology of transcategorical operations. (Lincom Studies in Language Typology 8). Munich: Lincom Europa.
Malchukov, Andrej. 2013. Verbalization and insubordination in Siberian languages. In Martine Robbeets & Hubert Cuyckens (eds.), Shared grammaticalization: With a special focus on the Transeurasian languages, 177–208. Amsterdam: John Benjamins.
Ramstedt, G. J. 1917. Suomalais-ugrilaisen komparatiivin syntyperä [The origin of the Finno-Ugrian comparative form]. Virittäjä 21. 37–39.
Ramstedt, G. J. 1933. Persoonapäätteellisen verbitaivutuksen synnystä [On the origin of verbal personal conjugation]. Suomalaisen Tiedeakatemian esitelmät ja pöytäkirjat 1935. 125–128.
Ravila, Paavo. 1943. Uralilaisen lauseen alkuperäisestä rakenteesta [On the original structure of the sentence in Uralic]. Virittäjä 1943. 247–263.
Rédei, Károly. 1989. Über die finnougrische Konjugation unter besonderer Berücksichtigung der ungarischen Personalssuffixe. Journal de la Société Finno-Ougrienne 82. 82–88.
Salminen, Tapani. 1989. Classification of the Uralic languages. Proceedings of the Fifth International Finno-Ugrist Students’ Conference (IFUSCO 1998.). (Castrenianumin toimitteita 35). 15–24.
Salminen, Tapani. 1997. Tundra Nenets inflection. Mémoires de la Société Finno-Ougrienne 227.
Schlachter, Wolfgang. 1985. Hat das Finnische ein Passiv? Finnisch-Ugrische Forschungen 47. 1–144.
Setälä, E. N. 1912. Über art, umfang und alter des stufenwechsels im finnisch-ugrischen und samojedischen. Finnisch-Ugrische Forschungen 12. 1–128.
Shagal, Ksenia. 2017. Towards a typology of participles. Helsinki: University of Helsinki (Department of Modern Languages) dissertation.
Sinor, Denis (ed.). 1988. The Uralic languages: Description, history and foreign influences. Handbuch der Orientalistik VIII, 1. Leiden: Brill.
Tauli, Valter. 1966. Structural tendencies in Uralic languages. (Indiana University Uralic and Altaic Series 17). Bloomington: Mouton de Gruyter.
Wickman, Bo. 1956. The form of the object in the Uralic languages. Uppsala: Almqvist & Wiksells Boktryckeri Aktiebolag.
Yamakoshi, Yasuhiro. 2017. Shinehen Buriyaato go no 2 shurui no mirai hyougen – Bunshi no teidoushika ni kan suru 3 ruikei [Two future expressions in Shinekhen Buryat: Three typological models of the verbalization of participles]. Hoppou Jinbun Kenkyuu 10. 89–96.
Ylikoski, Jussi. 2016. The origins of the western Uralic s-cases revisited: Historiographical, functional-typological and Samoyedic perspectives. Finnisch-Ugrische Forschungen 63. 6–78.

Andrej L. Malchukov

8 Grammaticalization in Ewen (North-Tungusic) in a comparative perspective

1 Introduction

Tungusic languages are traditionally considered a branch of Altaic, although this affiliation is not uncontroversial (Cincius 1949; Benzing 1955b; Dörfer 1978; Robbeets 2005, 2015; Janhunen 2014). Tungusic languages can be divided into Northern (Ewen, Ewenki, Negidal, Solon), Eastern (Nanai, Ulcha, Orok, Oroch, Udihe), and Southern (Manchu, Jurchen) branches, even though this classification has also been contested in the literature (see Whaley, Grenoble, and Li [1999]; Georg [2004]; Janhunen [2012b] for discussion). Some scholars (e.g., Avrorin 1959) group the Eastern branch with the Northern, while others (e.g., Sunik 1962) collapse it with the Southern branch. For general information on Tungusic languages, the reader is referred to such publications as Benzing (1955b), Menges (1968), Alpatov et al. (1997), Vovin & Alonso de la Fuente (Forthcoming). Janhunen (2012b) presents a brief introduction to the history of Tungusic languages and peoples. The present discussion is focused on grammaticalization in North-Tungusic Ewen, also known as Even or Lamut (Cincius 1947; Benzing 1955a; Novikova 1960; Malchukov 1995, 2008) and takes a comparative outlook, seeking external evidence for certain grammaticalization paths from related languages whenever possible. Typologically, North-Tungusic languages are agglutinating languages with SOV word order, and in a number of other respects they are representative of Altaic languages. For example, in the domain of phonology, they feature vowel harmony (cf. the form of the locative case -lA, which varies with stems of the back vs. front vowel series: Ewen moo-la ‘in wood’ vs. möö-le ‘in water’). In the domain of morphology, all Tungusic languages (except for Manchu) feature possessive suffixes on the possessed noun (cf. d’uu-s [house-yours] ‘your house’), and, in syntax, all of them employ non-finite forms (converbs, participles) to render complex clauses.
There is also considerable typological variation among the branches of Tungusic; in particular, Northern Tungusic shows a richer agglutinating structure as compared to East Tungusic and, in particular, South Tungusic languages. In the case of South-Tungusic Manchu, this should be in part due to language contact with Sinitic.1 The discussion of grammaticalization scenarios in Tungusic is complicated by the fact that specialists differ as to the genealogical position of Tungusic. As mentioned above, some argue for the relatedness of Tungusic with other Altaic (or Transeurasian)

1 Tsumagari (2012) shows that, more recently, some East-Tungusic languages like Udihe have also undergone simplification due to the influence of Chinese.

https://doi.org/10.1515/9783110563146-008


languages (Turkic, Mongolic, possibly also Korean and Japanese), while others assume that similarities are due to areal factors (see e.g., Robbeets [2015], and Janhunen [2014] for different views). Correspondingly, many paths of grammaticalization may be disputed on accounts that assume cognacy of the markers in question beyond Tungusic proper (unless one assumes, of course, that Tungusic is the most archaic and preserves lexical sources for grammaticalized items). In many cases I follow Benzing’s classic comparative grammar of Tungusic (Benzing 1955b), which has the advantage of being comprehensive, but also noncommittal with regard to the Altaic controversy. Although Benzing was generally reluctant to posit lexical sources for Tungusic grammatical markers, there are clearly cases where such an origin can be recovered (see the spectacular case of negation below). Recently, Alonso de la Fuente (2011) demonstrated that at least in some cases it is possible to find the source construction for grammatical markers in Tungusic; in particular, he relied on Manchu, which, he claims, is not as innovative in its grammatical structure as commonly believed. In certain respects this approach leads to reviving the proposals from early Manchu studies (such as Zakharov [[1879] 2012]).
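The vowel-harmony alternation of the locative -lA mentioned in this introduction (moo-la vs. möö-le) can be sketched as a small selection rule. This is only an illustrative sketch: the function names, the choice of "diagnostic" vowels, and the `lexical_series` parameter are assumptions made for this example and are not part of the chapter's analysis. The transcription used here does not always reveal the harmony series of a stem, so stems without a diagnostic vowel (such as turki ‘sledge’, attested with -le later in the chapter) have to be specified lexically.

```python
import unicodedata

# Assumption for this sketch: in the transcription used here, "a"/"o" signal
# the back series and "e"/"ö" the front series; other vowels are ambiguous.
BACK_DIAGNOSTIC = set("ao")
FRONT_DIAGNOSTIC = set("eö")


def harmony_series(stem, lexical_series=None):
    """Return 'back' or 'front' for a stem (hypothetical helper)."""
    if lexical_series is not None:
        return lexical_series
    letters = set(unicodedata.normalize("NFC", stem).lower())
    if letters & BACK_DIAGNOSTIC:
        return "back"
    if letters & FRONT_DIAGNOSTIC:
        return "front"
    raise ValueError(f"series of {stem!r} must be given lexically")


def locative(stem, lexical_series=None):
    """Attach the harmonizing locative -lA as -la (back) or -le (front)."""
    series = harmony_series(stem, lexical_series)
    return f"{stem}-{'la' if series == 'back' else 'le'}"


print(locative("moo"))             # moo-la
print(locative("möö"))             # möö-le
print(locative("turki", "front"))  # turki-le
```

Only forms quoted in the chapter are used as examples; the rule is not meant to predict unattested forms, and it does not cover the allomorph -dulA.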

2 Grammaticalization of nominal categories

The nominal word classes in Ewen (and other Tungusic languages) include nouns, adjectives, pronouns and numerals. The morphological structure of nouns has the following template: -number-case-possession: d’u-l-taki-tan [house---3] ‘to their houses’.

2.1 Grammaticalization of number

In Ewen (and other North-Tungusic languages) the singular is unmarked, while plural is marked by the suffix -l ~ -r (the latter allomorph appears with n-final stems); cf. d’uu ‘house’, d’uu-l ‘houses’; hirkan ‘knife’, hirka-r ‘knives’. The origin of the plural marker is not totally clear but most authors agree that it originates from a collective marker. In particular, Cincius (1947: 118) suggested that -l is originally a collective marker and further linked it historically to the Written Mongol il ‘people, clan’. While the latter suggestion remains to be proved, the origin of plural markers from collective markers is generally accepted (Cincius 1947; Benzing 1955b; Sunik 1982: 41). Indeed, South-Tungusic languages like Manchu have not grammaticalized plural markers to the same extent and employ collective markers (-sa, -ta, -ri) to render nominal plurality. Some of these markers are also found in North-Tungusic, where they are used in collective function with a restricted set of lexical items, often in combination with the regular plural marker: cf. Ewe-sel ‘Ewens’. It is interesting to note that some East-Tungusic languages like Nanai use this complex marker -sAl


(a combination of the collective marker -sA- with the plural -l) as a regular plural inflection; cf. Nanai hoton-sal ‘towns’ (Benzing 1955b: 1025). Renewal of plural morphology by way of strengthening of the plural marker is thus more advanced in East-Tungusic Nanai and Ulcha (as well as in the areally adjacent North-Tungusic outliers Negidal and Solon; see Grenoble & Whaley [2003] for further discussion of plural marking in Tungusic).
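The plural allomorphy described in this section (-l by default, -r with n-final stems, where the stem-final n does not surface) can be stated as a two-branch rule. The sketch below is a minimal illustration based only on the two pairs quoted in the text; the function name `pluralize` is hypothetical, and the rule deliberately ignores the collective markers and the complex marker -sAl.

```python
def pluralize(stem: str) -> str:
    """Form the Ewen plural as described in the text (hypothetical helper).

    n-final stems drop the final n and take -r; all other stems take -l.
    """
    if stem.endswith("n"):
        return stem[:-1] + "-r"
    return stem + "-l"


print(pluralize("d’uu"))    # d’uu-l
print(pluralize("hirkan"))  # hirka-r
```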

2.2 Grammaticalization of case markers

The inventory of case markers differs greatly across Tungusic languages. As in other domains, North-Tungusic languages reveal a richer morphological structure, while South-Tungusic languages are relatively impoverished, and East-Tungusic languages take an intermediate position. Thus, while North-Tungusic languages feature 12–14 cases (14 in Ewen, 12 in Ewenki), in East-Tungusic the system is reduced (to 8–9 cases in Nanai dialects), and it is still more reduced in Manchu, which features only 5 cases (Nominative, Accusative, Genitive, Dative-locative, Ablative), of which the Nominative is unmarked. The genitive case is absent in most Tungusic languages, except for Manchu, since possessive relations are head-marked. Table 1 (based on Sunik 1982: 160–161, and Benzing 1955b: 1027) summarizes case inventories in four major Tungusic languages, taking into consideration only cases directly pertaining to our study:2

Tab. 1: Case forms in Tungusic.

Case                                                   Ewen            Ewenki    Nanai      Manchu
(nominative)                                           Ø               Ø         Ø          Ø
(accusative)                                           -w/-u/-m/-bu    -wA       -wA/-bA    -be
(accusative 2: designative, indefinite accusative)     -gA             -jA       -gA        −
(dative)                                               -du             -du       -du        -de
(locative)                                             -lA             -lA       -lA        −
(allative)                                             -tki            -tki      -či        −
(ablative)                                             -duk            -duk      −          -či/-ci
(instrumental)                                         -d’i            -d’i      -d’i       −
(prolative)                                            -li             -li       (-li)      −
(elative)                                              -gič            -git      -diAdi     −
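For readers who want to query Table 1 rather than read it, the case forms can be transcribed as a small lookup structure. This is a sketch under stated assumptions: the names `CASES`, `TABLE_1` and `missing_cases` are invented for this illustration, `None` stands for a cell shown as "−" in the table, and the data cover only the cases listed in Table 1, not the full inventories of the four languages.

```python
# Row labels of Table 1, in table order.
CASES = [
    "nominative", "accusative", "designative", "dative", "locative",
    "allative", "ablative", "instrumental", "prolative", "elative",
]

# One list per language, aligned with CASES; None marks a "−" cell.
TABLE_1 = {
    "Ewen":   ["Ø", "-w/-u/-m/-bu", "-gA", "-du", "-lA", "-tki", "-duk", "-d’i", "-li", "-gič"],
    "Ewenki": ["Ø", "-wA", "-jA", "-du", "-lA", "-tki", "-duk", "-d’i", "-li", "-git"],
    "Nanai":  ["Ø", "-wA/-bA", "-gA", "-du", "-lA", "-či", None, "-d’i", "(-li)", "-diAdi"],
    "Manchu": ["Ø", "-be", None, "-de", None, None, "-či/-ci", None, None, None],
}


def missing_cases(language):
    """List the Table 1 cases for which the language has no form."""
    return [case for case, form in zip(CASES, TABLE_1[language]) if form is None]


for language in TABLE_1:
    print(language, missing_cases(language))
```

The output makes the gradient described in the text visible: no gaps for the two North-Tungusic languages, one for Nanai, and six for Manchu among the cases considered here.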

2 Gruntov (2002: 175) reconstructs a system of 12 cases for Proto-Tungusic, but some of them (e.g., the postulated old accusative in -i) are restricted to pronominal paradigms, and some other forms (genitive -i, locative -de, as distinct from the dative-locative -du) are securely attested only in Manchu.


The discrepancy in the number of cases between the branches of Tungusic cannot be reduced to a single causal factor. Some cases have been lost in Manchu or syncretized due to phonetic changes (as in the case of the Instrumental, which merged with the Genitive; see Benzing [1955b: 1026]), or survive as lexicalizations, while others constitute innovations which developed in North-Tungusic. At least some of these case suffixes can be shown to derive from spatial postpositions. Spatial postpositions are relational nouns specifying spatial/topological relations such as doo- ‘inside’, ančin- ‘aside’, oj- ‘on’, herde- ‘under’ (cf. mugdeken herde-du-n [stump bottom-3] ‘under the stump’). In some cases, these relational nouns are demonstrably derived from a body part. Novikova (1960: 226) and Sunik (1982: 213) suggested that the dative -du derives from doo ‘inside’, originally ‘stomach, guts’; this etymology has been recently supported by Janhunen (2014: 326).3 The meaning ‘inside X’ can still be detected in the spatial use of Ewen “dative” -du in contrast to (general) locative -lA; the former is predominantly used for being or moving inside (Malchukov 2008). Gruntov (2002: 138) appropriately calls this use of -du, which he reconstructs for North-Tungusic, “inessive”. The general pattern of reanalysis can then be represented in this way: (1)

Ewen
a. D’uu doo-n
   house inside-3
   ‘inside of the house’
b. d’uu-du-n
   house--3
   ‘his house’

On this account, the possessive markers in the original structure were used to index the lexical noun (formal possessor) within the postpositional structure (that is, the noun d’uu ‘house’ in [1a]). After encliticization of the spatial noun (postposition), the possessive suffixes were co-opted to index a possessor of a noun phrase (as in bej d’u-du-n [man house--3.] ‘in the man’s house’). It is interesting to note that Mongolian languages, where possessive suffixes are generally less grammaticalized, show that syntactic reanalysis of possessive morphology may even predate univerbation. Janhunen (2012a: 195) cites Mongolian examples like shiree-n dee-rcen [table- above--2.] ‘on your table’, where the possessive enclitic indexes a possessor of the lexical noun rather than signaling a relation between the lexical noun and the spatial noun, in a manner reminiscent of what we find in Ewen. This scenario (  > Relational  > Adposition > ), which is frequent cross-linguistically (see Lehmann 1995; Heine and Kuteva 2002), provides

3 There is a seemingly similar locative-dative suffix -du in Mongolic languages, which is often taken as a cognate of the Tungusic dative and reconstructed for Proto-Altaic (EDAL, Gruntov 2002), but Janhunen (2014: 326) considers the form -du as secondary, derived either by reduction from the reflexive dative form in -dur, or from a combination of the more basic -d and -r markers. The correspondence between doo ‘inside’ and Mo. doo ‘middle’ seems hardly accidental though.


an explanation for certain peculiarities in the use of case markers in Tungusic. First, it accounts for the appearance of “double case” patterns in Tungusic, which, viewed synchronically, should be treated as either allomorphy or historically composite case exponents. For example, one of the allomorphs of the locative case -lA appears in certain conjugations in the form -dulA, which can be naturally derived from encliticization of the spatial postposition do- in the locative case (cf. turki-le ‘on the sledge’ but oron-dula ‘on the reindeer’). Second, it accounts for why possessive suffixes follow case suffixes, in violation of the iconic order (generally case has the broadest scope – over the whole NP). Also in Uralic, where there is variation as to the order N-pos-case or N-case-pos, the latter order has been explained (Comrie 1978) through its origin in a construction with an encliticized locative postposition which inflected for possession. Some other case markers arguably originate from spatial nouns, too. One case discussed in the literature is *gi- ‘side’ (Novikova 1960: 227; Sunik 1982: 235–237), which does not survive as a separate lexeme in modern Tungusic languages but contributed to several locative formatives. In particular, it can be detected in the form of complex adverbial markers (bar-gi.da ‘other side’), as well as in locative cases (Elative -gi.č, also comitative -gi.li ‘by the side’; [Benzing 1955b: 1036]). Remnants of *gi ~ ETg *d’ea ‘side’ can be found in compounds such as Nanai oči-d’ia ‘North; Northern land/side’, perxi-d’ie ‘west; western land/side’ (Sunik 1982: 247). Novikova (1960: 226–227) further suggested that the locative case -lA comes from *la ‘place’, surviving as a derivational affix in Tungusic languages (cf. Ewen bi-le.k ‘village’).
In both cases the general path leads from spatial postpositions in compound structures to derivational suffixes, and finally to inflectional suffixes (case-markers), in accordance with developments documented in the literature on grammaticalization. Among the case markers represented in Table 1, the designative case in -gA- (whose connection to the partitive -jA- is unclear; see Kazama [2012] for discussion) is of special interest. The designative case is not only unique morphologically, as it always occurs in combination with possessive affixes, but it is also semantically peculiar. As noted in Malchukov (1995), the designative is functionally unusual as it seemingly fulfills a double case function. On the one hand, it marks its host as a direct object (like a garden-variety accusative case), but in addition it interprets a possessor as a recipient/beneficiary (thus fulfilling the function of a dative marker). Compare the construction with the designative case in (2a) with the regular construction with the accusative (2b), where the benefactive interpretation is lost: (2)

Ewen
a. Etiken kuŋa turki-ga-n bö-n.
   old_man child sledge--3. give-.3
   ‘The old man gave a sledge to the child.’
b. Etiken kuŋa turki-wa-n bö-n.
   old_man child sledge--3. give-.3
   ‘The old man gave the child’s sledge.’


Malchukov (2008; cf. Malchukov and Nedjalkov 2010) suggests relating the designative case in -gA- to the homophonous verb ga- ‘take’. His analysis is typologically inspired: he draws a parallel to verb-serializing languages, where ‘take’-verbs are a frequent origin of both instrumental and object markers (Heine and Kuteva 2002). On his analysis, the construction with a designative case originated from an embedded purposive clause with the purposive converb in -gA- and headed by the verb ga- ‘take’; in the course of reanalysis, the verb form undergoes simplification through haplology: (3)

Ewen a. Hin turki-ga-s emu-re-m. your sledge--2 bring--1 ‘I brought the sledge for you.’ b. *[Hin turki(-w) ga-ga-s] emu-re-m. your sledge(-) take-.-2 bring--1 ‘I brought the sledge for you.’

Although some details of this scenario need to be worked out,4 it explains the aforementioned peculiarities of the designative marker, which make it unusual with respect to morphology and semantics. The development of the verb ‘take’ to an object marker also finds parallels in neighboring languages, and can be considered an areal feature (cf. the use of the coverb bǎ < ‘take, hold’ to mark specific direct objects in Mandarin Chinese; Sun and Bisang, this volume). Some of the case markers represented in Table 1 above have later undergone semantic evolution, sometimes along well-known grammaticalization paths. Thus, the dative meaning of -du should be a later development of the original locative meaning (Locative > Dative), given its origin in the postposition ‘inside’. Novikova (1960) suggests that the instrumental case marker in -č (mo-č ‘with a stick’) is cognate with the proprietive marker (cf. oro-č ‘having reindeer’; see also the names of reindeer-breeding Ewen and Ewenki groups), as well as – combined with the plural marker – with the non-productive sociative marker (cf. ŋin-či-l ‘with the dog’). If so, this development instantiates a familiar path Proprietive > Comitative > Instrumental. An interesting case of lexicalization of case markers is the restriction of the original ablative -duk to the comparative construction in Arman (a divergent dialect of Ewen), with a parallel development in Nanai (observed in Gruntov [2002: 136]).

4 In a recent article, Alonso de la Fuente (2015: 35) suggests that the origin of the designative marker may have involved not only *ga-, but also *gaju- ‘to bring or take something back’.

Grammaticalization of possessive agreement

As in other Tungusic languages (except for Manchu), possessive relations in Ewen are head-marked through the use of possessive agreement suffixes on the possessed

Grammaticalization in Ewen (North-Tungusic) in a comparative perspective


Tab. 2: Possessive suffixes in Ewen in comparison to pronouns.

                  Possessive suffixes    Pronouns in Ewen    Tungusic pronouns (reconstructed after Benzing 1955b)
Personal
  Singular 1      -W ~ -mU ~ -bU         bii                 *bI
  Singular 2      -s ~ -sI               hii                 *sI
  Singular 3      -n ~ -nI               noŋan               *nI
  Plural 1 excl.  -WUn ~ -mUn ~ -bUn     buu                 *büä
  Plural 1 incl.  -t ~ -tI               mut                 *münti
  Plural 2        -sAn                   huu                 *süä
  Plural 3        -tAn                   noŋartan            *ti
Reflexive
  singular        -I ~ -mI ~ -bI         meen                ?
  plural          -WUr ~ -mUr ~ -bUr     meer                ?

noun. More specifically, for 1st and 2nd person possessors the construction is double-marked, as these additionally feature special forms of the possessive pronoun distinct from the regular personal pronouns; cf. min d’uu-w ‘my house’, hin d’uu-s ‘your house’, but noŋan d’uu-n ‘his/her house’. The possessive suffixes are represented in Table 2 above, contrasted with the pronouns from which they presumably derive. It is generally assumed that possessive suffixes come from encliticization of personal pronouns to the head noun (Benzing 1955b). This can be seen more readily through comparison of the possessive markers of the 1st and 2nd person with the corresponding pronouns (see the second and third columns in Table 2). The form of the 3rd person pronouns and its relation to the 3rd person possessive suffixes need a special explanation (see Section 3 below for discussion). Possessive endings occur on nouns, postpositions of nominal origin (see Section 2.2 above), and also on non-finite forms (participles) in non-finite complement, relative and adverbial clauses (see Section 5 below). They also occur on some finite forms, where they are likely to have developed through insubordination (see Section 5.4).

Grammaticalization in other nominal classes

One of the more obvious cases of grammaticalization in the domain of nominal morphology is the formation of numerals designating tens; in Ewen as in other Tungusic languages they derive through compounding digits with the noun for ‘ten’: ilan ‘three’ + mian ‘ten’ → ilanmiar ‘thirty’. Also interesting in this respect are special classes of numerals used for certain objects, whose use is reminiscent of numeral classifiers; cf. -ndu ~ -nru for counting houses, -ŋrA for counting (domestic)


animals (reindeer), -rdA for scores in a game, etc. (Cincius 1947; Benzing 1955b: 1054). Numeral classifiers are familiar from South-East Asian languages (e.g., Sinitic);5 a closer parallel is a celebrated case of Nivkh featuring dozens of (bound) classifier forms forming different classes of numerals (Gruzdeva 2004). At least in some cases, these markers in Ewen (Tungusic) can also be shown to have developed from compounds. Compare the special numeral series in -ndu used for counting houses, which presumably originated through compounding: ila-ndu ‘three (houses)’ < ilan ‘three’ + d’uu(g) ‘house’ (Benzing 1955b: 1054). Certain forms of pronouns are also interesting in that respect. In literature, the 1st and 2nd person pronouns in Tungusic have been extensively discussed because they show intriguing parallels to other Altaic languages, but the nature of these similarities is a matter of controversy (see Janhunen [2014] for a recent discussion). For discussion of grammaticalization, however, 3rd person pronouns are of particular interest, as they show a composite structure in all Tungusic languages, except for Manchu (see Table 2 above): Noŋa.n ‘he/she’; Noŋa.r.tan ‘they’. The last segments derive from possessive markers (-n 3.; -tAn 3.) but are fossilized and should be regarded synchronically as part of the stem. This becomes clear through comparison of the case paradigm of pronouns with the inflection of the possessed nouns; cf. the dative form of the pronouns and of the possessed nominal: noŋan-du-n ‘to him’; noŋar-du-tan ‘to them’ vs. oran-du-n ‘to his reindeer’; ora-r-dutan ‘to their reindeer’. Now, in the case of possessed nominals, we are dealing with regular combination of case and possessive agreement suffixes. In the case of pronouns, however, the last segments are devoid of any function and should be considered as part of the stem. 
As a result, cases are formally marked by infixation, which is otherwise unheard of in Tungusic languages (and in Altaic languages, for that matter). This peculiarity can only be explained by a renewal of pronouns by generic nouns in the possessed form. The generic noun in question is obscure, but Benzing (1955b: 1057) tentatively reconstructs it as *ŋuga and compares it with Ma. guwa ‘other; man’ (see, though, Alonso de la Fuente [2013: 37 ff.] on the entangled history of Ma. guwa, and especially [2013: 37] for a critical assessment of Benzing’s proposal). The renewal of (3rd person) pronouns in Tungusic bears a remarkable similarity to the replacement of personal pronouns by the stems of “dummy nouns” in Samoyedic (see Janhunen, this volume), which might suggest an areal convergence. Interestingly, some Tungusic languages have “repaired” the anomalous infixing pattern through restructuring. Thus, in Negidal, the 3rd person possessive marker -tin/-tan has been reanalyzed as a plural marker, leading to reordering of the marker with respect to case suffixes; cf. Neg. noŋa-l-tin > noŋa-ti-l (Xasanova and Pevnov 2003: 255). Similarly, in Udihe, the reanalysis of the possessive marker as a plural marker has led to externalization of case morphology in pronominal forms: cf. nua-ti-du ‘to them’ vs. amin-di-ti ‘to their father’ (Cincius 1949: 99). Thus, in both languages the petrified 3rd person possessive marker has been co-opted as a plural marker, which affected the ordering of the suffix with respect to the case marker in 3rd person plural pronouns. The externalization of cases in 3rd person plural pronouns thus follows the familiar pattern of externalization of inflectional morphology recorded in the literature (e.g., Bybee 1985; Haspelmath 1998 and others).

5 Manchu, under the influence of Chinese, has also grammaticalized a number of numeral classifiers (Gorelova 2002: 206–209 after Zakharov 2012).

Grammaticalization of verbal categories

The finite verb in Ewen inflects for tense/mood and person/number. Additionally, verbs, both finite and non-finite, can take aspect and voice suffixes. The usual ordering of the markers of the verbal categories is: voice-aspect-tense/mood-person/number; e.g., maa-mač-čoot-tı-tan [kill----3] ‘(they) used to kill each other’. The categories of aspect and voice are not paradigmatic, that is, several markers of these categories can co-occur in a word form. The order of these markers is partially conventional (as described by Robbek [1982] for aspectual markers), but partially determined by differences in the semantic scope of the respective suffixes (Nedjalkov 1991, 1992; Malchukov 1995). This is especially true of “lexical suffixes”, which have concrete semantic content, such as venitive -nA-, causative -wkAn-, desiderative -m- and some others. Compare the variable order of suffixes in the following examples, where the order reflects variation in semantic scope: it-ne-wken- [see-] ‘make (smb.) go and see’; ič-uke-ne- [see--] ‘go (in order) to make (smb.) see’. As will be clear from the discussion below, many of these lexical suffixes can indeed be traced to verbs in compound structures.

Grammaticalization of markers of valency and voice

Voice/valency changing categories in Ewen include the following: mediopassive in -B- (cf. aaŋa-b- ‘be open’); adversative passive in -W- (cf. maa-w- ‘be killed’, udala-w- ‘be caught by rain’ from udal- ‘rain’); causative in -WkAn- (cf. hör-uken- ‘make/let go’, ič-uken- ‘let see, show’); reciprocal in -mAČ- (cf. čor-mat- ‘hit each other’); sociative in -ldA- (cf. höre-lde- ‘go together’, less frequently in a reciprocal sense: baka-lda- ‘meet’ from bak- ‘find’). Among these markers, the (adversative) passive in -W- is of particular interest diachronically. According to a common view (Nedjalkov 1991; Li and Whaley 2012), which goes back to the early work by von der Gabelentz and Zakharov (Zakharov 2012: 159), the passive -W- derives from the verb bu- ‘give’. In more recent work (Nedjalkov 1991; Malchukov 1995; Li and Whaley 2010), the shift from causative to passive meaning is conceived to proceed through an intermediate permissive-causative stage (meaning ‘let’). Manchu is more conservative in that respect insofar as -bu- may have either a causative meaning (mostly with intransitives) or a passive meaning (with transitives).

(4) Manchu (Nedjalkov 1991: 5)
a. Bata i-mbe wa-ha
   enemy he- kill-
   ‘The enemy killed him.’
b. Bata-be wa-bu-ha
   enemy- kill--
   ‘(He) made (somebody) kill the enemy.’
c. Bata-de wa-bu-ha
   enemy- kill--
   ‘(He) was killed by the enemy.’

In Ewen, -W- is used predominantly in the passive sense; the causative use is marginal and restricted to a few lexical items (on the productive causative -wkAn-, see below). Still, Ewen also provides evidence for the causative origin of the passive marker, even though in a more indirect way. Indeed, the passive in Ewen is not a canonical passive, but rather an adversative passive, as familiar from Japanese (Malchukov 1995). The adversative passive is peculiar semantically, as it carries an implication that the action is adverse for the subject, but it is also peculiar syntactically, as it may be both valency-reducing (like regular passives) and valency-increasing (like causatives). Compare the canonical (valency-decreasing) passive construction in (5b) and the adversative (valency-increasing) construction based on an impersonal verb in (5c).

(5) Ewen (Nedjalkov 1995)
a. Nugde etike-m ma-n
   bear old_man- kill-.3
   ‘The bear killed the old man.’
b. Etiken nugde-du ma-w-ra-n
   old_man bear- kill---3
   ‘The old man was killed by the bear.’
c. Etiken uda-la-w-ra-n
   old_man rain----3
   ‘The old man is caught by the rain.’

Thus, adversative passives may be seen as a category intermediate between passives and causatives (Malchukov 1995). Viewed diachronically, the adversative passive may be considered an intermediate stage in the causative-to-passive reanalysis, as is also the case in Tungusic. The development from causatives to passives has also been documented for other Altaic languages (Kormušin 1976; Robbeets 2007). Interestingly, the same grammaticalization path  >  >  is also attested in Sinitic and some other East Asian languages (Yap and Iwasaki 1998; Chappell and Peyraube 2011: 793), suggesting an areal connection. The regular causative in Ewen with the marker -wkAn- derives from a combination of the passive-causative -W-/-bu- with the element -kan-, which is interpreted in the literature either as a (second) causative marker or as an intensifier (Kormushin 1978). Sunik (1962: 128) further proposed to derive -kan- from the verb ŋun ~ gun ‘say’. In any case, the rise of the new marker should be attributed to the need to disambiguate the polyfunctional -W- marker by way of strengthening. Less is known about the origin of the other markers; the sociative -ldA- is apparently a loan from Mongolian (Poppe 1965; Benzing 1955b: 1069). The origin of the reciprocal -mAt- remains unclear, but Alonso de la Fuente (2011) suggested that it may be a contraction of the converb-auxiliary construction (V-me + oo-(t)- ‘become’). According to Alonso de la Fuente (2011), the original function was durative, and it later developed into a reciprocal through intermediate stages of pluractional use (-mAt- still has marginal multiplicative uses in Ewen). This analysis is interesting, as the rise of new verbal markers through contraction of converbs and auxiliaries is otherwise well attested; yet this suggestion, as the author himself acknowledges, is not without problems, as the source construction is not attested for Manchu.

Grammaticalization of aspect and Aktionsarten

In traditional Tungusic studies aspectual forms are not consistently differentiated from other cases of verbal derivational morphology (with modal or valency-related functions). Among aspectual (Aktionsart) suffixes proper, the following are fairly productive in Ewen: progressive in -D’- ~ -D’Id- (cf. höre-d- ‘be going’); durative in -d’AAn- (cf. höre-d’een- ‘go for a long time’); inchoative in -l- (cf. höre-l- ‘begin to go’); resultative in -Č(I)- (cf. ıl- ‘stand up’ and ıla-t- ‘stand’); momentative in -sAn- ~ -s- (cf. hukle-sn-e-n ‘slept for a while/went to sleep’); distributive in -kAČ- (cf. köke-ket- ‘die one after another’); iterative in -WAAČ- (cf. hör-rööt-te-n- ‘usually go’); and habitual in -G(A)rA- (cf. hör-ger- ‘used to go’). These forms show considerable variation in frequency and productivity in Northern Tungusic (see Robbek [1982] for frequency estimates for Ewen; cf. Gorelova [1979] on Ewenki); in both Ewen and Ewenki, the progressive, inchoative and iterative markers are among the most frequent. Gorelova (1980) further notes that the progressive (imperfect) marker -d’- is the most frequent in Ewenki; in fact, she considers it (on frequency grounds) to be the only inflectional aspectual marker, and treats all other suffixes as derivational markers expressing Aktionsart distinctions.


The majority of these markers have cognates in other Tungusic languages; still, there are no convincing etymologies for most of them. Ramstedt (1957: 162) suggested that the inchoative -l- derived from the verb il- ‘stand up’. This etymology is typologically plausible, but needs further corroboration, as the use of il- ‘stand’ for this aspectual meaning is not attested in Tungusic to the best of my knowledge. A more convincing suggestion by Ramstedt (1957: 162, 163) is to derive the directional form -ji- (as in Ma. ga-ji- ‘bring’) from ji- ‘go’ found in East Tungusic. In addition, Alonso de la Fuente (2011) more recently proposed to derive the progressive marker -d’- (cognate with the future marker) from d’i- ‘go’ as well (see Section 4.4 below for discussion). Nevertheless, for the majority of aspectual suffixes the origin is not clear, as the process of univerbation is not readily documented. One of the rare examples in North-Tungusic where grammaticalization is underway and the source construction is clear is the formation of the “attenuative” aspect -soo- in the Podkamennaya Tunguska variety of Ewenki from a converb-auxiliary combination:

(6) Ewenki (Vasilevic 1948: 153)
a. Nasa-soo-ra-n
b. < nasa-s oo-ra-n
     wave- become--3
   ‘(he) just waved’

In Manchu, however, the processes of univerbation are more pervasive, which makes Manchu especially revealing for the study of grammaticalization processes. This was already appreciated in the pioneering studies of Zakharov (2012), and this line of research has been followed up more recently by Avrorin (2000), Gorelova (2002), and Alonso de la Fuente (2011). For example, the (past) continuous form in -mbi-xe derives from a combination of the converb and the copula (Avrorin 2000; Zakharov 2012: 174; Gorelova 2002: 292):

(7) Manchu (Avrorin 2000)
a. D’e-me bi-xe
   eat- be-
   ‘was (at) eating’
b. D’e-mbi-xe
   eat--
   ‘was eating’

Also, the directional aspect in -ŋgi- arises from a periphrasis involving the light verb uŋgi- ‘send’, as originally proposed by Zakharov (Zakharov 2012: 163; Sunik 1962: 126; Gorelova 2002: 250; Alonso de la Fuente 2011: 54); cf. Ma. ala-ŋgi [report-go] ‘go to report’ < ala-me uŋgi [report- send] ‘send to report’. It is likely that some of the other aspectual markers listed above also developed through contraction of the converb-plus-auxiliary construction, but their origin remains less certain.6

Grammaticalization of modal markers

Modal categories, which in traditional Tungusic studies are not consistently differentiated from aspect, include the following: desiderative -m- (cf. höre-m- ‘want to go’); conative -sčI- (cf. höre-sči- ‘try to go’); and purposive-directional -nA- (cf. it-ne- ‘go to see’). Among these markers, the modal marker -m- is generally assumed to derive from the verb *mu- ‘want’, which has been lost in Tungusic (Benzing 1955b). More convincingly, as the lexical source is preserved, the marker -nA- has been shown to derive from ŋene- ‘go’ (Zakharov 2012: 163; Sunik 1962: 127; Alonso de la Fuente 2011).

(8) Ewen
a. it-ne-n
   see--.3
   ‘he went to see’


> 2nd stage (Udihe)       > 3rd stage (Nanai)   > 4th stage (Orok)
> PERF/Indir.-evid.       > Preterite           > General Past
> Imperfect/Dir.-evid.    > Validational        > Ø

The 2nd stage is found in Udihe, the 3rd in Nanai, and the last one with the old past tense lost in Orok (Malchukov 2000). So markedness reversal in the tense system resulted in modalization of original past indicative forms to forms with evidential and affirmative value. The modalized forms in Nanai illustrate the markedness reversal pattern most clearly. The most commonly used past tense form in (10a) is

Janhunen (2012a: 158) cites the following form from the Chakhar dialect of Mongolian, where the “terminative” (past) marker -b- in the “long” (emphatic) form -b=aa has developed a modal (“precautionary”) meaning: ald-b=aa ‘you dropped it’ > ‘be careful not to drop it!’ (citing the work by Sechenbaatar).


of participial origin, while the form in (10b) is the original past form, which is now rarely used and restricted to emphatic contexts:

(10) Nanai
a. Mi un-kim-bi.
   I. say-.-1
   ‘I said.’
b. Mi un-ke-i.
   I. say--1
   ‘I did say.’

In North-Tungusic languages this development ([old]  >  > ) is less conspicuous. In Ewen the temporal paradigm has been enriched by some forms of participial origin (the imperfect in -Ri mentioned above, some forms with future reference), but this has not led to modalization of the original finite forms (aorist/nonfuture and future). A situation more reminiscent of East-Tungusic is observed in Negidal, which is spoken in the Amur region and is influenced by the neighboring East-Tungusic languages. As noted by Xasanova and Pevnov (2003: 276), in Negidal the future indicative form in -d’A- is restricted to interrogative and exclamative contexts:

(11) Negidal (Xasanova and Pevnov 2003: 276)
Ioowa iče-d’e-m?
What see--1
‘What shall I see (there)?’

The future indicative form can be used for questions, including rhetorical questions, but not for future assertions. Given that this form features genuine verbal agreement and is in competition with the future form of participial origin (the same future marker but combined with possessive-style agreement), its peculiar illocutionary function should also be due to the modalization of finite forms ousted from indicative uses by more recent forms of participial origin.

As noted above, the rise of evidential forms (both direct and indirect evidentials) in Tungusic is a by-product of the shifts in the structure of the verbal system, and is related to the process of verbalization and insubordination discussed below. The development of the reportative evidential markers in some languages follows a more common path of grammaticalization, recruiting a verb of speech as a reportative form. The early stages of this development are also attested in Ewen.
In Ewen, like in other Tungusic languages, direct speech is normally introduced by the verb göön-/un- ‘say’, used in the finite form when preceding, and in the converbial form when following the quoted material (cf., e.g., Ewen converbial forms gööniken/ göömi/göönid'i ‘saying/having said’). Some Ewen varieties exhibit an overuse of


göön in the invariable form, which indicates a further stage of the grammaticalization process:

(12) Ewen (Novikova 1980: 136)
Gaaran'd'a göön-ni: "Oŋalgan, göön, hii čaamaj nuŋu bi-se-nri. oon,
-bird say-.3 -bird  you most fool be--2 how
göön, d'eb-d'i-n, göön, ia-č čiki-d'i-n, göön."
 eat--3  what- cut--3 
‘The Gaaran'd'a-bird said: “Oŋalgan-bird, what a fool you are! How will he (the fox) eat you, how will he cut the tree …”.’

A further step in the grammaticalization process is represented in Nanai, which has developed the quotative enclitic -(j)Am. As suggested by Avrorin (1961: 275–276), this enclitic has developed from the verb un- ‘say’ in the form of the simultaneous converb in -mi/-mAri. It is used as a citation particle with both verbs of speech and mental predicates.

(13) Nanai (Avrorin 1961: 275)
Mi haj-wa un-ke-i?! Gamasom-ba baogo-o-ri=am
I. what- say--1 Son-in-law- find--.=
pondad'o-i edi olbinda=m.
sister-. not(.) fetch-..2=
‘Haven’t I said?! We have to find our son-in-law, don’t bring along your sister.’

Grammaticalization of (subject) agreement

Depending on tense/mood categories, a verb form in Ewen can take two distinct series of person-and-number suffixes to show agreement with the subject. As is clear from Table 3 below, the second set features two distinct subparadigms, but the distinction is minimal and reduces to whether or not the mood form includes a separate plural marker in the plural. Note that the second set of suffixes in Table 3 is similar to the possessive suffixes on nouns (cf. Table 2 above), suggesting that they developed through reanalysis of nominal forms. The first set of agreement suffixes, even though ultimately of pronominal origin, seems to have a more complex history. Benzing (1955b: 1079–1080) analyses the paradigm as heteroclitic, arising through the blending of a series based on -n-nominalizations (haa-nri ‘you know’ < haa-n-si) with the aorist series based on the marker -RA- combined with pronouns (haa-ra-m ‘I know’ < haa-ra-n-bi). On the other hand, Sunik (1962) and Kormušin (1984) maintain that all indicative forms are of the latter origin (i.e., haa-nri ‘you know’ < haa-Ra-N-si). Kormušin (1984) further suggested that even certain forms in the predicative paradigm (3 in -n, 2 in -s, 1 excl -u) may have been borrowed from the possessive paradigm or at least influenced by the latter forms. While the details of the development are obscure, what is clear is that the finite agreement is older than the possessive agreement and cannot be fully derived from the pronominal enclitics, but is of more complex origin.

Tab. 3: Subject agreement endings in Ewen.

            1st set                         2nd set
            future, nonfuture indicative    hypothetical mood, imperfect    preventive, subjunctive
SG 1        -m                              -W                              -W
SG 2        -nrI                            -s                              -s
SG 3        -n(I)                           -n                              -N
PL 1 EXC    -RU                             -WUn                            -l-bUn
PL 1 INC    -p                              -t                              -l-tI
PL 2        -s                              -sAn                            -l-sAn
PL 3        -r                              -tAn                            -l

Grammaticalization of (verbal) negation

One of the most spectacular cases of grammaticalization is the evolution of negation in Tungusic (see also Robbeets [2015] for comparative discussion). In North-Tungusic languages (as in Uralic), negation is periphrastic and is rendered by the inflected negative verb e- (see Table 3) and a lexical verb in a special non-finite (connegative) form with -R(A), which is cognate with the aorist marker.

(14) Ewen
e-he-m gaa-d.
not.do--1 know-.
‘(I) don’t know.’

In some East-Tungusic languages like Nanai, the negative verb was postposed to the lexical verb and underwent suffixation:

(15) Nanai
Gad-a-se-mbi
take---1
‘I didn’t take’


Subjunctive > Participle -ri > Participle -ča > Participle -mat > Deverbal noun

In other Tungusic languages some of the aforementioned deverbal forms may be more verbalized. For example, in the closely related Ewenki, the form with -nA is a regular perfective participle, as can be seen from its capacity to combine with the verbal negative auxiliary (Bulatova and Grenoble 1999). The perfect participle with -čA is also more verbalized in Ewenki than in Ewen; in particular, in Ewenki this form can take person agreement directly, while in Ewen it needs to be combined with a copula. As I suggested in earlier work (Malchukov 2013), the noun-verb continuum (or noun-verb squish in terms of Ross [1973]) should, in diachronic terms, be attributed to the verbalization of deverbal nouns, which develop into verbal forms through a participle stage.

Grammaticalization of converbs

Adverbial clauses can be formed by converbs or by case-marked participles; as shown later, the latter are also a frequent source of the former. The converbs can be divided into several types based on morphological criteria: (i) some are non-inflected (e.g., the simultaneous converb with -mi, the terminal converb with -kAn), (ii) others inflect for number (e.g., the manner converb in -nikAn [SG]/-nikAr [PL]), and (iii) still others take possessive agreement to agree with the subject. This distinction has a syntactic correlate: the first two groups are same-subject converbs (i.e., the converbial subject is identical to the matrix subject and thus left unexpressed). The third group of person-inflected converbs may have a different subject, which is then expressed in the same form as the subject of complement and relative clauses (that is, in the possessive form of the pronoun, and cross-referenced through the possessive series of agreement). The following examples illustrate the distinction between a same-subject converb (in -mi) and a different-subject converb (in -RAk-); semantically, these forms are similar insofar as they can be used in a range of meanings expressing different temporal (‘when’) and contingency (‘if’, ‘because’) relations between the two events; the choice is determined exclusively by syntax:

(21) Ewen
a. Bej muču-mi anŋat-ti-n
   man return- stay_overnight--3
   ‘When the man returned, he stayed overnight.’
b. Bej muču-raka-n anŋat-ti-w
   man return--3() stay_overnight--1
   ‘When the man returned, I stayed overnight.’


There are various complications pertaining to this distinction. Thus, certain different-subject converbs should more appropriately be called variable-subject converbs, as they may take a different or the same subject; in the latter case they take reflexive-possessive endings to render coreferentiality of the subjects. Under certain conditions, same-subject converbs might also be employed in different-subject constructions, and vice versa (see Malchukov [2008] for more discussion of Ewen, and Nedjalkov [1995] for comparative Tungusic data), but in general the distinction between the two types of converbs is robust.

Another type of adverbial clause is built on case-marked participles. The following example shows an adverbial clause expressing anteriority built on the perfect participle in combination with the locative case marker (-čA-lA- < -čA + -lA-):

(22) Ewen
Bej muču-ča-la-n anŋat-ti-w
man return---3 stay_overnight--1
‘After the man returned, I stayed overnight.’

The structure of the adverbial clause is basically no different from that of the complement clause, except for the fact that the clause is in the adverbial position and correspondingly the participle takes an oblique case. In the case of subject identity, the participle would take the reflexive-possessive ending instead of the personal possessive.

(23) Ewen
muču-ča-la-j anŋat-ti-w
return---. stay_overnight--1
‘After I returned, I stayed overnight.’

Such constructions tend to be reanalyzed into converbs (similar structures in Mongolic are appropriately called ‘quasi-converbs’ by Janhunen [2012a]); in fact, the form with -čAlA- may have undergone this reanalysis in some Ewen dialects; at least, in Western Ewen dialects it is currently one of the most frequent forms expressing anteriority.
For some other forms this is still more obvious; thus, the same-subject anterior form with -Rid’i (SG)/-Rid’ur (PL) is transparently derived from a combination of the imperfective participle -Ri with the instrumental case -D’i and the reflexive-possessive marker (-i [SG], -(w)ur [PL]). Still, synchronically it needs to be analyzed as a converb, which is evident not just from its (non-compositional) semantics, but also from the fact that the corresponding different-subject forms with possessive endings are not in use:

(24) Ewen
muču-rid’i anŋat-ti-w
return-. stay_overnight--1
‘After returning, I stayed overnight.’


At a later stage of grammaticalization the source construction is restructured and no longer transparent. Benzing (1955b: 1085) suggests deriving the different-subject conditional form with -RAk- from a combination of -RA (aorist) with the locative case -ki, but this proposal remains speculative because the suffix -ki does not survive as a separate case marker. More compelling is the etymology of the same-subject conditional form with -mi, which can be shown to derive from the deverbal noun with -n in combination with the reflexive-possessive suffix (-mi < -n + (w)i). Here the Eastern Tungusic languages provide decisive evidence (Petrova 1936), as they preserve the plural forms of these markers incorporating the plural form of the reflexive endings; cf. Nanai: -mer < -n + wur (Benzing 1955b: 143). Finally, it should be noted that some of the converbs which are either non-inflected or inflected for number (e.g., the manner converb in -nikAn/-nikAr) most likely originate in appositive participles, as already suggested by Sunik (1962: 178). Both paths (Participle + Case > Converb; Appositive Participle > Converb) have also been extensively documented cross-linguistically (Haspelmath 1995). To conclude, the major source of converbs in Tungusic languages is case-marked participles. This does not close the cycle of development of non-finite clauses, though, as both participles and converbs might develop finite-clause uses through a process commonly referred to as insubordination.

Reanalysis of complex constructions: Insubordination and verbalization

The term insubordination, coined by Evans (2007), refers to the reanalysis of subordinate clauses as main clauses. Insubordination has recently attracted a lot of attention in the typological literature (Evans 2007; Mithun 2008; Evans and Watanabe 2016), not least because it challenges received assumptions in the grammaticalization literature about directionality in this domain. Indeed, in the grammaticalization framework it is expected that a main clause will be downgraded to a subordinate clause (clause union), rather than a subordinate clause being upgraded to a main clause. As in many burgeoning fields of ongoing research, the terminology in this domain is not fully established; thus, the terms ‘insubordination’ (Evans 2007), ‘desubordination’ (Mithun 2008), ‘verbalization’ (Malchukov 2013), ‘finitization’ (Givón 2016), and ‘clause disengagement’ (Cristofaro 2016) refer to overlapping but not identical processes. For Tungusic and the broader Altaic languages insubordination is of particular importance, as many finite forms can be shown to be of non-finite origin; in fact, those languages show a tendency to recycle their non-finite forms (Robbeets 2009; Malchukov 2013). This process has been shown to involve multiple historical cycles (Robbeets 2009, 2017) and extends beyond Altaic languages to other languages of North Asia (Malchukov 2013).

There are different types of insubordination to be considered; in the case of Tungusic, all of them involve the finite use of originally non-finite forms. One type is adjunct-insubordination, which can be illustrated by the use of the purposive converb with -dA- as a future (distant) imperative. It stands to reason that imperative uses as in (25b) developed from purposive converbs embedded under a verb of speech:

(25) Ewen (Malchukov 2001)
a. [Bej em-de-n] gön-em.
   man come--3.. say-.1
   ‘I said that he should come.’
b. Em-de-n!
   come--3..
   ‘Let him come (later)!’

As noted by Malchukov (2001), the direction of development is from the converb to the imperative use, in accordance with the insubordination scenario, not the other way around. Evidence for this is found in the fact that the converb takes possessive-style agreement. Moreover, it can also take reflexive-possessive agreement, which would be completely out of place in a main clause, but can be explained if it originated in a construction headed by the imperative form with the subordinate subject coreferential to the main verb subject.

(26) Ewen
a. [D’eb-de-j] em-ni.
   eat--.. come-.2
   ‘Come (in order) to eat.’
b. D’eb-de-j!
   eat--..
   ‘(you.sg) Eat (later)!’

Reanalysis of purposive converbs into future imperatives is attested in other Tungusic languages as well. While adjunct-insubordination is fairly frequent cross-linguistically (recall, for example, the optative usage of standalone conditional clauses such as If I were there found in many European languages), for Tungusic (and other Altaic languages), complement insubordination is of particular importance. In my earlier work (Malchukov 2013) I claimed that insubordination of subject complement clauses paved the way to the renewal of finite forms by participles.
The general path of complement-insubordination involving reanalysis of the subject complement is schematically represented below, and is illustrated by the use of the imperfective participle with -Ri, which has established itself as an imperfect tense in Ewen dialects: [Sb Part-agr] [] → [Sb Part-agr] ø → [Sb] [].

(27) Ewen
a. [Bej-il hör-ri-ten] bi-d’i-n.
   man- go-.-3. be--3
   ‘The men probably left/are leaving.’
b. Bej-il hör-ri-ten
   man- go--3()
   ‘The men left.’

On this account, the participial form initially headed a subject complement and combined with a copula (in the present tense the copula could be zero). When discussing similar structures in Nanai, Avrorin (1981) suggested that reanalysis of the erstwhile main (existential) predicate as a modal particle could have paved the way for this reanalysis, and this is a possible scenario for Ewen as well (cf. the future auxiliary in [27a], also used as a hesitation particle bid’in ‘maybe’ in Ewen). Another possibility is that reanalysis occurred in present tense contexts where the copula (bisni ‘is’) was generally optional (with 3rd person subjects). Most importantly, the participial form retains its possessive-style agreement in the resultant finite structure, as shown in (27b). Indeed, possessive-style agreement can be regarded as a hallmark of insubordination, revealing that the structure was initially a nominalized one (a subject complement). On this account, complement insubordination is more prominent in Tungusic than adjunct insubordination, as it provides a major mechanism for the renewal of finite clauses in Tungusic and other Siberian languages (Malchukov 2013; cf. Robbeets 2010, 2017).

Malchukov (2013) suggested distinguishing insubordination, which involves upgrading of the erstwhile clausal complement, from verbalization, which involves the reanalysis of nominal predicates into verbal ones. He illustrates verbalization processes through the rise of the periphrastic perfect construction, with the schematic representation provided below: [Sb] [N/Part] [] → [Sb] [V2 Aux] (→ [Sb] []).

(28) Ewen
a. Bej [hör-če] [bi-si-n]
   man go-. be--3
   ‘The man had left.’
b. Bej [hör-če bi-si-n]
   man go-. be--3
   ‘The man had left.’

As is clear from the comparison of (27) and (28), verbalization processes are similar to insubordination in that both may lead to finitization of participles (indeed, verbalization is treated as a variety of ‘direct insubordination’ by Robbeets [2017]). Yet, there are also important differences between the two scenarios with respect to input structures and their outputs. As noted above, verbalization processes, unlike insubordination, do not originate in an embedded clause. With regard to the output structures, insubordination gives rise to finite predicates which still feature possessive-style agreement, while verbalization does not. Thus, while acknowledging similarities between the two scenarios (cf. a recent discussion in Robbeets [2017]), for now I regard insubordination and verbalization as two distinct mechanisms behind the replacement of finite forms by forms of non-finite origin in Tungusic.

Conclusions

Above I presented an overview of processes of grammaticalization and reanalysis in North-Tungusic Ewen, drawing on external evidence wherever possible, but also making use of internal reconstruction. As is clear from the presentation above, many pathways described for Tungusic are common cross-linguistically (those recorded in Heine and Kuteva [2002] are in italics):

– Collective > Plural (Section 2.1)
–   > Spatial Adposition >  (Section 2.2)
– Pronouns > possessive agreement (on nouns) (Section 2.3)
– Pronouns > person agreement (on verbs) (Section 4.6)
–  >  (Section 2.2)
–  >  >  (Section 4.1)
–  >  (Section 4.4)
–  >  >  (Section 4.3)
– Past desiderative > Subjunctive (Section 4.5)
–  >  (Section 4.4)
–  >  (Section 4.3)
–  >  (Section 4.5)
–  > reportative clitic (Section 4.5)
– Negative verb > negative affix (Section 4.7)
– Participle + (oblique) case > Converb (Section 5.3)
– Verbal noun > participle (Section 5.2)
– Participle > finite verb (Section 5.4)

Moreover, certain paths are areally conditioned. For example, the reanalysis of spatials (discussed in 2.2) is prominent in different branches of Altaic, but also beyond (for example, in Uralic; see Janhunen, this volume). The same can be said of the use of case-marked participles as converbs (discussed in 5.3), which is very common in Eurasia (see the work of Čeremisina and her school; e.g., Čeremisina [1986]). As mentioned in 4.1 above, the development of causative to passive is attested in different branches of Altaic, while the development of ‘give’ to causative is typical of Sinitic and other East Asian languages. Also, insubordination scenarios (in particular complement insubordination [discussed in 5.4], involving repeated renewal of finite forms) are common in Altaic/Transeurasian (Robbeets 2010), and more generally in North Asian languages (Malchukov 2013).

Some developments are less well attested cross-linguistically; of particular interest here is the development of verbs with non-factual semantics (or non-factual inferences) into negative modal auxiliaries (sick > cannot; afraid > cannot, won’t; lazy > won’t; see 4.3 above). Although developments of dynamic modals from verbs of concrete semantics (know > can) are well attested in the literature, the development of negative modals has not been attested so far (it is not recorded in Heine and Kuteva [2002]). Tungusic data also provides extra evidence for polygrammaticalization phenomena where a single lexical item yields different categories (depending on the input construction). A spectacular case of polygrammaticalization, recently discussed by Alonso de la Fuente (2011), involves the development of ŋene- ‘go’ into ventive aspect -ne- (it-ne-n ‘went to see’), necessitative participle/mood -nne- (ma-nna-n ‘must kill’), imperative -ŋa- (ma-ŋa-nri ‘you must kill’), as well as into the modal connegative discussed above (ma-ŋa turkurem ‘I can’t kill’).

Certain paths of grammaticalization show the role of structure dependency; certain characteristics may facilitate or inhibit grammaticalization (reanalysis). The most spectacular case is that of negation (see 4.7); the negative auxiliary was fused with the lexical verb only when postposed; preposed negative auxiliaries do not show any signs of univerbation.
The reason for this is not totally clear, but one can probably invoke the role of analogy; analogically driven grammaticalization (Lehmann 1995) would support only fusion of postposed auxiliaries in a language relying exclusively on suffixation.

More generally, the Tungusic data shows the role of construction-dependency in grammaticalization processes. Perhaps it is best illustrated by the processes of verb-auxiliary contraction (univerbation). As recently demonstrated by Alonso de la Fuente (2011), verbal categories in Tungusic frequently arise from reanalysis of the converb marker and auxiliary as a new aspectual or voice marker. Such processes are better attested for Manchu, possibly under the influence of Mongolic languages. In Mongolian, the immediate aspect -ski- has arisen from a combination of the immediate converb with -s plus the auxiliary xi- ‘do’ (Janhunen 2012a: 178, 166); the progressive marker -jaj- is a contraction of the imperfective converb marker -j plus the auxiliary bai- ‘be’ (Janhunen 2012a: 217), etc. Alonso de la Fuente (2011) cites a number of similar cases from the field of Altaic languages, and also from others (Dravidian, Eskimo). Although such developments are well known in grammaticalization research, the challenge they present for grammaticalization theory (apart from descriptive challenges, as they make the source less easily recoverable) has not been sufficiently appreciated, it seems. Indeed, in the case of converb-auxiliary contraction, it is not even clear what should be regarded as the source item for grammaticalization; while otherwise lexical sources would be given preference, in this case the auxiliary is very general and contributes less to the meaning of the resultant construction than the other component. Even speaking of context dependence with respect to these cases seems to be an inappropriate understatement.

Finally, the Tungusic data shows that certain paths of grammaticalization and reanalysis are system-driven. This is best illustrated by the process of modalization of the original finite forms, which are displaced by forms of non-finite origin (see 4.5). As a result, the old finite forms develop into various types of modal or evidential forms, eventually ending as emphatic affirmative forms. This path (Indicative > Direct Evidential > Affirmative) is reminiscent of the processes by which old indicative forms have been modalized in European languages (cf. ‘subjunctives from old presents’ described in the work by Bybee, Perkins, and Pagliuca [1994] and Haspelmath [1998]).

Acknowledgements

I am grateful to Juha Janhunen and José Andrés Alonso de la Fuente for their useful comments on the draft version of this article. The usual disclaimers apply.

Abbreviations

Abbreviations follow the Leipzig Glossing Rules. Additional abbreviations include  – anterior,  – connegative,  – designative case,  – nonfuture,  – perfect particle,  – particle,  – ventitive.

References

Alonso de la Fuente, J. A. 2011. Tense, voice and Aktionsart in Tungusic. Another case of “Analysis to synthesis”? Wiesbaden: Harrassowitz Verlag.
Alonso de la Fuente, J. A. 2013. Manchu guwafu ‘crutch; pole’ and gûwa ‘(an)other’. International Journal of Diachronic Linguistics and Linguistic Reconstruction 10. 27–54.
Alonso de la Fuente, J. A. 2015. Tungusic historical linguistics and the Buyla Inscription. Studia Etymologica Cracoviensia 20. 17–46.
Alpatov, V. M., I. V. Kormušin, G. C. Pjurbeev & O. I. Romanova (eds.). 1997. Yazyki mira. Mongol'skie yazyki, tunguso-man'čžurskie yazyki, yaponskij yazyk, koreiskij yazyk [Languages of the world: Mongolic, Tungus-Manchu, Japanese, Korean]. Moskva: Indrik.
Avrorin, V. A. 1959. Grammatika nanajskogo jazyka. Vol. 1. [A grammar of Nanai]. Leningrad: Nauka.
Avrorin, V. A. 1961. Grammatika nanajskogo jazyka. Vol. 2. [A grammar of Nanai]. Leningrad: Nauka.

Avrorin, V. A. 1981. Sintaksičeskie issledovanija po nanajskomu jazyku [Studies in syntax of Nanai]. Leningrad: Nauka.
Avrorin, V. A. 2000. Grammatika man'čžurskogo pis´mennogo jazyka [A Grammar of written Manchu]. Sankt Petersburg: Nauka.
Avrorin, V. A. & B. V. Boldyrev. 2001. Grammatika oročskogo jazyka [A grammar of Oroch]. Novosibirsk: Nauka.
Benzing, Johannes. 1955a. Lamutische Grammatik. Wiesbaden: Steiner.
Benzing, Johannes. 1955b. Die tungusischen Sprachen. Versuch einer vergleichenden Grammatik. Wiesbaden: Steiner.
Bulatova, Nadezhda & Lenore Grenoble. 1999. Evenki. München: Lincom.
Bybee, Joan. 1985. Morphology: a study of the relation between meaning and form. Amsterdam: Benjamins.
Bybee, Joan, Revere Perkins & William Pagliuca. 1994. The evolution of grammar: Tense, aspect, and modality in the languages of the world. Chicago: University of Chicago Press.
Čeremisina, M. I. (ed.). 1986. Strukturnye tipy sinteticheskich polipredikativnych konstrukcij v jazykach raznych sistem [Structural types of the synthetic polypredicative constructions in the languages of different language types]. Novosibirsk: Nauka.
Cincius, V. I. 1947. Očerk grammatiki èvenskogo (lamutskogo) jazyka [An outline of Ewen (Lamut) grammar]. Leningrad: Učpedgiz.
Cincius, V. I. 1949. Sravnitel’naja fonetika tunguso-man’čžurskix jazykov [Comparative phonetics of Tungus-Manchu languages]. Leningrad: Nauka.
Chappel, Hilary & Alain Peyraube. 2011. Grammaticalization in Sinitic languages. In Bernd Heine & Heiko Narrog (eds.), A handbook of grammaticalization, 786–796. Oxford: Oxford University Press.
Cristofaro, Sonia. 2016. Routes to insubordination: a cross-linguistic perspective. In Nicholas Evans & Honore Watanabe (eds.), Dynamics of insubordination (Typological Studies in Language), 393–422. Amsterdam: Benjamins.
Dörfer, G. 1978. Classification problems of Tungus, Tungusica (vol. 1). Wiesbaden: Harrassowitz Verlag.
Evans, Nicholas. 2007. Insubordination and its uses. In Irina Nikolaeva (ed.), Finiteness. Theoretical and empirical foundations, 366–431. Oxford: Oxford University Press.
Evans, Nicholas & Honore Watanabe (eds.). 2016. Dynamics of insubordination (Typological Studies in Language). Amsterdam: Benjamins.
Georg, Stefan. 2004. Unreclassifying Tungusic. In Carsten Naeher (ed.), Proceedings of the first international conference on Manchu-Tungus studies (Tunguso Sibirica 9), 45–57. Wiesbaden: Harrassowitz Verlag.
Givón, Talmy. 2016. Nominalization and re-finitization. In Claudine Chamoreau & Zarina Estrada-Fernández (eds.), Finiteness and nominalization, 271–297. Amsterdam: Benjamins.
Gorelova, Lilia M. 1979. Kategorija vida v evenkijskom jazyke [Category of aspect in Ewenki]. Novosibirsk: Nauka.
Gorelova, Liliya. 2002. Manchu Grammar. Leiden: Brill.
Grenoble, Lenore A. & Lindsay J. Whaley. 2003. The case for dialect continua in Tungusic: Plural morphology. In Dee Ann Holisky & Kevin Tuite (eds.), Current trends in Caucasian, East European, and Inner Asian linguistics papers: In honor of Howard Aronson, 97–122. Amsterdam: John Benjamins.
Gruntov, Ilja. 2002. Rekonstrukcija padezhnoj sistemy proaltajskogo jazyka [Reconstruction of case system in Proto-Altaic]. Moscow, Russia: Russian State University for the Humanities dissertation.
Gruzdeva, Ekaterina. 2004. Numeral classifiers in Nivkh. Sprachtypologie und Universalienforschung 57(2/3). 300–329.

Haspelmath, Martin. 1995. The converb as a cross-linguistically valid category. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective, 1–55. Berlin: Mouton de Gruyter.
Haspelmath, Martin. 1998. The semantic development of old presents: new futures and subjunctives without grammaticalization. Diachronica 15. 29–62.
Heine, Bernd & Tania Kuteva. 2002. World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Janhunen, Juha. 2012a. Mongolian (London Oriental and African Language Library 19). Amsterdam & Philadelphia: John Benjamins.
Janhunen, J. 2012b. The expansion of Tungusic as an ethnic and linguistic process. In A. Malchukov & L. Whaley (eds.), Recent advances in Tungusic linguistics, 5–16. Wiesbaden: Harrassowitz Verlag.
Janhunen, Juha. 2014. Ural-Altaic: The polygenetic origins of nominal morphology in the Transeurasian zone. In Martine Robbeets & Walter Bisang (eds.), Paradigm change: In the Transeurasian languages and beyond, 311–337. Amsterdam: Benjamins.
Janhunen, Juha. This volume. Grammaticalization in Uralic languages.
Kazama, S. 2012. Designative case in Tungusic. In L. Whaley & A. Malchukov (eds.), Advances in Tungusic studies, 123–155. Wiesbaden: Harrassowitz Verlag.
Kormušin, Igor V. 1976. O passivnom značenii kauzativnyx glagolov [On passive meaning of causative verbs]. Turcologica 1976. 89–93.
Kormušin, Igor V. 1984. Sistemy vremen glagola v altajskich jazykach [Verbal tense systems in the Altaic languages]. Moscow: Nauka.
Lebedev, V. D. 1978. Jazyk evenov Yakutii [Language of Ewens in Yakut Republic]. Leningrad: Nauka.
Lehmann, Christian. 1995 [1982]. Thoughts on grammaticalization. München: Lincom.
Levin, V. I. 1935. Samoučitel’ evenskogo jazyka [Teach yourself Ewen]. Leningrad: Učpedgiz.
Li, Fengxiang & Lindsay J. Whaley. 2012. The grammaticization cycle of causatives in Oroqen dialects. In L. Whaley & A. Malchukov (eds.), Advances in Tungusic studies, 167–183. Wiesbaden: Harrassowitz Verlag.
Malchukov, Andrej. 1995. Even (Languages of the World/Materials, vol. 12). München: Lincom.
Malchukov, Andrej. 2000. Perfect, evidentiality and related categories in Tungusic languages. In Bo Utas & Lars Johanson (eds.), Evidentiality in Turkic, Iranian and neighboring languages, 441–471. Berlin: Mouton de Gruyter.
Malchukov, Andrej. 2001. Imperative constructions in Even. In V. S. Xrakovskij (ed.), Typology of imperative constructions, 159–180. München: Lincom.
Malchukov, Andrej. 2008. Sintaksis èvenskogo jazyka: strukturnye, semanticheskie, kommunikativnye aspekty [Syntax of Ewen: structural, semantic, discourse aspects]. Sankt Petersburg: Nauka.
Malchukov, Andrej & Igor Nedjalkov. 2010. Ditransitive constructions in Tungusic languages. In Andrej Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in ditransitive constructions: A comparative handbook, 316–351. Berlin: Mouton de Gruyter.
Malchukov, Andrej. 2013. Verbalization and insubordination in Siberian languages. In Martine Robbeets & Hubert Cuyckens (eds.), Shared grammaticalization with special focus on the Transeurasian languages, 177–208. Amsterdam & Philadelphia: John Benjamins.
Menges, Karl Heinrich. 1943. The function and origin of the Tungus tense in -ra, and some related questions of Tungus grammar. Language 19. 237–251.
Menges, Karl Heinrich. 1968. Die Tungusischen Sprachen (Handbuch der Orientalistik 1: Der Nahe und der Mittlere Osten 5: Altaistik 3: Tungusologie). Leiden: Brill.
Mithun, Marianne. 2008. The extension of dependency beyond the sentence. Language 83. 69–119.
Nedjalkov, Igor V. 1991. Recessive-Accessive polysemy of verbal suffixes. Languages of the World (1). 4–31.

Nedjalkov, Igor V. 1992. Zalog, vid, vremja v tungusoman'čžurskix jazykax [Voice, tense and aspect in Tungusic languages]. Unpublished dissertation. St. Petersburg.
Nedjalkov, Igor V. 1995. Converbs in Evenki. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective: Structure and meaning of adverbial verb forms – adverbial participles, gerunds, 97–136. Berlin & New York: Mouton de Gruyter.
Nedjalkov, Igor V. 1997. Evenki. London: Routledge.
Novikova, K. A. 1960. Očerki dialektov èvenskogo jazyka, vol. I [Sketches of Ewen dialects]. Leningrad: Nauka.
Novikova, K. A. 1980. Očerki dialektov èvenskogo jazyka, vol. II [Sketches of Ewen dialects]. Leningrad: Nauka.
Petrova, T. I. 1936. Ul’čskij dialekt nanajskogo jazyka [Ulcha dialect of Nanai]. Moskva & Leningrad: Učpedgiz.
Ramstedt, Gustaf John. 1957. Vvedenije v altaiskoje jazykoznanije [Introduction to Altaic studies]. Moskva: Izdatel’stvo inostrannoj literatury.
Robbeets, Martine. 2005. Is Japanese related to Korean, Tungusic, Mongolic and Turkic? (Turcologica 64). Wiesbaden: Harrassowitz Verlag.
Robbeets, Martine. 2007. The causative-passive in the Trans-Eurasian languages. Turkic Languages 11. 235–278.
Robbeets, Martine. 2009. Insubordination in Altaic. Voprosy Filologii: Serija Uralo-Altajskie Issledovanija 1. 61–80.
Robbeets, Martine. 2015. Diachrony of verb morphology. Japanese and the Transeurasian languages (Trends in Linguistics 291). Berlin & Boston: De Gruyter Mouton.
Robbeets, Martine. 2017. The development of finiteness in the Transeurasian languages. Linguistics 55(3). 2–35.
Robbek, V. A. 1982. Vidy glagola v evenskom jazyke [Verbal aspect in Ewen]. Leningrad: Nauka.
Robbek, V. A. 1992. Grammatičeskije kategorii evenskogo glagola [Verbal categories in Ewen]. St. Petersburg: Nauka.
Ross, John Robert. 1973. Nouniness. In Bas Aarts, David Denison, Evelin Keizer & Gergana Popova (eds.), Fuzzy grammar, 351–422. Oxford: Oxford University Press.
Sun Linlin & Walter Bisang. This volume. Grammaticalization in Mandarin Chinese.
Sunik, O. P. 1962. Glagol v tunguso-man'čžurskix jazykax [The verb in Tungus-Manchu languages]. Leningrad: Nauka.
Sunik, O. P. 1982. Suščestvitel’noe v tunguso-man’čžurskix jazykax [The noun in Tungus-Manchu languages]. Leningrad: Nauka.
Tsumagari, T. 2012. A Note on Udihe Phonology from an Areal-typological Perspective. In A. Malchukov & L. Whaley (eds.), Recent advances in Tungusic linguistics, 81–89. Wiesbaden: Harrassowitz Verlag.
Xasanova, M. & A. Pevnov. 2003. Mify i skazky negidal’cev (ELPR Publication series) [Myths and legends of Negidals]. Kyoto: Suita.
Yap, Foong Ha & Shoichi Iwasaki. 1998. The emergence of ‘give’ passives in East and Southeast Asian languages. In Mark Alves, Paul Sidwell & David Gil (eds.), SEALS VIII: Papers from the Eighth Annual Meeting of the Southeast Asian Linguistics Society, 193–208. Canberra: Pacific Linguistics.
Zakharov, I. 2012 [1879]. Grammatika man’čžurskago jazyka [A Grammar of Manchu]. Folkestone: Brill.
Vovin, A. & J.-A. Alonso de la Fuente (eds.). Forthcoming. Tungusic Languages. London: Routledge.
Whaley, L. J., L. Grenoble & F. Li. 1999. Revisiting Tungusic classification from the bottom up: a comparison of Evenki and Oroqen. Language 75(2). 286–321.

Edward Vajda

9 Areal features in Yeniseian grammaticalization

Abstract: Pastoral nomads speaking Ugric, Samoyedic, Turkic, Tungusic and Mongolic languages occupied most of North Asia when Russians arrived in the late 16th century. The forests of central Siberia were also home to Yeniseian-speaking hunters whose languages contained such areally unique features as a prefixing verb template, possessive prefixes, and phonemic tones. The surrounding families have vowel harmony and are exclusively suffixing. While the pastoralists acquired no grammatical traits from their hunter-gatherer neighbors, they significantly influenced Yeniseian morphosyntactic structures, as young brides from nearby reindeer-breeding groups entered the hunter-gatherer bands. Yeniseian developed a suffixing case system through grammaticalization of native postpositional constructions. Verbs shifted from prefixing to suffixing through reanalysis of inherited position classes with no borrowing of actual morphemes. This accommodation evolved farthest in Ket, the family’s sole surviving member today. The origin of other core Yeniseian features such as noun gender and plural suffixes on nouns remains unclear. External comparison with Na-Dene (Athabaskan-Eyak-Tlingit) – a North American family with which Yeniseian shares a genealogical link – can help clarify the original structure of the language ancestral to Yeniseian.

https://doi.org/10.1515/9783110563146-009

Introduction

Yeniseian contains the critically endangered Ket and several extinct relatives – Yugh, Kott, Assan, Arin, Pumpokol – all related at a time depth probably less than 2,500 years. Figure 1 shows their distribution as documented by tsarist fur tax records during the first century of Russian activity in the area. Asterisks locate Ket speakers today. The syllables (ses, čes, šet, set, tet) are hydronymic formants characteristic of different Yeniseian daughter languages. Substrate river names containing them are dispersed far beyond areas where speakers were historically documented (Figure 2). The outline contains an area of compact dialectal diversity and possibly the Proto-Yeniseian homeland. See Vajda (2019) for more discussion of Siberian hydronyms, including those in -ši located west of Lake Baikal, which may be Yeniseian. Substrate river names of incontrovertible Yeniseian origin suggest that pastoralists interacted with Yeniseian speakers long before such contact was historically documented.

Fig. 1: Yeniseian-speaking groups in 1600 CE (Vajda 2019: 187).

Fig. 2: River names of Yeniseian linguistic provenance (Vajda 2019: 189).

Ket (Werner 1997c; Vajda 2004; Georg 2007; Kotorova and Nefedov 2015; Nefedov 2015) and its close sister Yugh (Werner 1997b) are fairly well described. The extant documentation of Kott, which belongs to a different primary branch, consists of several hundred words, along with crucial grammatical paradigms showing the language’s morphosyntactic categories, though no phrase- or sentence-level examples appear in the published descriptions (Castrén 1858; Werner 1997a). Assan, of which only a few hundred words are known (Werner 2005: 122–141), is closely related to Kott. Arin and Pumpokol are even more sparsely recorded (Werner 2005: 142–187). They do not seem particularly close to either Ket-Yugh or Kott-Assan, or to each other, and may constitute two additional primary branches, though a few innovative features of verb structure are shared between Arin and Ket-Yugh, suggesting an early period of common development (Vajda 2017b). The present investigation relies mainly on Ket-Yugh and Kott data.

Like other Inner Asian families, Yeniseian is SOV and strongly head marking, with an elaborate system of case suffixes and postpositions. Unlike the other families, its possessive markers are preposed to the possessum, rather than suffixed. Also distinct is the polysynthetic finite verb, which in modern Ket consists of ten morpheme positions, eight of which were historically prefixes. Subject and object NPs trigger verb-internal agreement but are themselves morphologically unmarked. Another unusual feature for Inner Asia is the Yeniseian noun class system – inanimate vs. animate, the latter subdivided among singular nouns into masculine and feminine classes. Adjectives and adverbs are weakly differentiated morphologically and essentially constitute a single superordinate ‘modifier’ class whose members are not normally inflected by phrasal or clausal morphosyntax; case endings and postpositions can only attach to adjective or adverb stems that have first been nominalized. All Yeniseian languages exhibit a similar typological profile, as far as the extant documentation can reveal.

Although the time depth of Proto-Yeniseian is modest and the extant documentation of most daughter languages is deficient, a comparison of Ket-Yugh with Kott grammatical forms supports important conclusions about historical processes of change in the family. Internal reconstruction adds further insights. Finally, the external link with Na-Dene helps illuminate the original structure of the ancestral language. A key theme in our study is that morphosyntactic innovations in Yeniseian are often linked with areal contact. Section 2 treats nominal morphology, followed by finite verb structure in Section 3 and complex constructions in Section 4. Section 5 summarizes the history of areal contacts and links specific layers of innovation with molecular genetic traces of population mixing. The conclusion in Section 6 reiterates that the most unusual structural features of modern Ket arose through long-term contact with little outright borrowing of lexical or morphological material.

 Nominal categories . Noun class A distinction between animate- and inanimate-class nouns is a core trait of all Yeniseian languages. Singular animate nouns are divided into masculine and feminine subclasses. Class is unmarked in the noun stem, but triggers formal distinctions in verb-internal subject and object agreement (section 3.3). It is also reflected in plural suffix allomorphs: -n generally for animate-class nouns (masculine and feminine), and -(a)ŋ for inanimate-class: (1)

Morphologically regular noun plurals (Southern Ket 1) a. animate-class: qīm ‘woman’ → qimn ‘women’ hīɣ ‘male’ → hiɣn ‘males’ b. inanimate-class: a’t ‘bone’ → areŋ ‘bones’ dɔ’n ‘knife’ → dɔnaŋ ‘knives

In some instances, the suffix has merged irregularly with the stem, though the original animacy distinction may still be evident:

(2) Noun plural formation with stem irregularities (Southern Ket)
sēs ‘river’ → sàs ‘rivers’
i’ ‘day’ → ɛkŋ ‘days’
qɔ̀j ‘bear’ → qōn ‘bears’

¹ Most examples derive from my fieldwork on Southern Ket (2005–2009) and appear in transcription using the diacritics: /ā/ high-even tone, /a’/ abrupt laryngealized tone, and /à/ falling tone. Geminate vowels /aa/ have rising-falling tone (unmarked). Multisyllabic words usually have a pitch peak on the initial syllable resembling word stress (also unmarked). The less frequent peak on the second syllable is marked with an acute accent /á/.

Some exceptions form a sub-pattern, as when all stems ending in the universal nominalizing suffix -s are pluralized as -s-in, regardless of noun class distinctions: ugdɛ ‘long’ → ugdɛ-s ‘one that is long’ (anim- or inanim-class) → ugdɛ-s-in ‘ones that are long’. Also, most kinship terms and a few other group plurals take -aŋ rather than the regular animate suffix -n: ām ‘mother’ → amáŋ ‘mothers’, hɯ’p ‘son’ → hɯvaŋ ‘sons’, ēs ‘god’ → esáŋ ‘gods’. The origin of these patterns, like the grammaticalization pathway of the two main plural allomorphs, remains unclear.

2.2 Possession

Possession is expressed by preposing a pronominal marker to the possessum noun:

Tab. 1: Ket possessive clitics.

               singular possessor   plural possessor
1              =b=                  =na=
2              =k=                  =na=
3 masculine    =da=                 =na=
3 feminine     =d=                  =na=
3 inanimate    =d= (singular or plural)

In modern Ket, these markers have become ditropic clitics, preferring to encliticize to the preceding word and procliticizing to the following possessum noun only at the beginning of an intonation group. Examples (3a–b) show 3rd person masculine-animate-class =da= to illustrate the pronunciation of all Ket possessive markers. (3)

3rd person masculine singular possessive marker in different phonological contexts
a. Proclitic at the beginning of an intonation group
   tura ke’t da=qu’s hapta
   that man 3..=tent it.is.erected
   ‘That man’s tent is set up (= As for that man, his tent is set up).’
b. Enclitic elsewhere
   qà ses-ka=ra qu’s hapta
   big river-=3.. tent it.is.erected
   ‘On the big river his tent is set up.’

The ditropic behavior of Ket possessive markers represents an accommodation to the structure of the surrounding languages, from which marriage partners were often taken. The detachment of what were originally possessive prefixes represents a typological shift from prefixing to suffixing.

2.3 Cases and postpositional constructions

As with possessive markers, no postposed relational morphemes were actually borrowed from neighboring languages. Rather, it is the pronunciation of native morphology that gradually came to mimic that of the surrounding suffixing languages. Ket case suffixes and postpositions parallel the surrounding families in their array of functions, and in being postposed in an agglutinative arrangement (Vajda 2008a). However, the suffixing nature of Yeniseian bound relational morphemes developed from complex possessive constructions containing noun roots preceded by possessive prefixes. Vajda (2013b) argued that Yeniseian possessive prefixes were connected to their possessum by a generic possessive -ŋ-, cognate with the nasal element in n-class Athabaskan nouns. Because possessive constructions form the basis of nearly half of all case forms and most postpositional constructions, this nasal has left a number of vestigial traces. The three case forms that require possessive connectors to attach to their host noun are shown in Table 2:

Tab. 2: Possessive-based case forms in Ket.

possessor                          dative     ablative    adessive
1 singular                         =bʌ-ŋ-a    =bʌ-ŋ-al    =bʌ-ŋ-ten
2 singular                         =ku-ŋ-a    =ku-ŋ-al    =ku-ŋ-ten
3 masculine singular               =da-ŋ-a    =da-ŋ-al    =da-ŋ-ten
3 feminine singular, or
  inanimate (singular or plural)   =di-ŋ-a    =di-ŋ-al    =di-ŋ-ten
1/2/3 animate plural               =na-ŋ-a    =na-ŋ-al    =na-ŋ-ten

These forms can stand as independent words in the absence of an actual pronoun or noun host. In the more frequent instances where a noun or pronoun directly precedes them, they cannot detach prosodically from their base and resemble the case suffixes typical of Inner Asia.
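The composition of these possessive-based case forms (possessive base, nasal connector, case ending) is regular enough to generate mechanically. The sketch below rebuilds the paradigm from its three ingredients. The dictionary keys are illustrative labels chosen here rather than the source's own glosses, and the function names are hypothetical.

```python
# Possessive bases as they appear before the connector -ŋ-.
# Keys are illustrative labels (assumptions), not the source's glosses.
POSSESSIVE_BASES = {
    "1sg": "bʌ",
    "2sg": "ku",
    "3sg.masc": "da",
    "3sg.fem/inanimate": "di",
    "animate.pl": "na",
}

# Case endings that require the possessive connector.
CASE_ENDINGS = {"dative": "a", "ablative": "al", "adessive": "ten"}

CONNECTOR = "ŋ"  # the generic possessive nasal


def case_form(possessor, case):
    """Build one possessive-based case form, e.g. '=di-ŋ-a'."""
    base = POSSESSIVE_BASES[possessor]
    ending = CASE_ENDINGS[case]
    return f"={base}-{CONNECTOR}-{ending}"


def paradigm():
    """Return the full possessor-by-case paradigm as nested dictionaries."""
    return {
        possessor: {case: case_form(possessor, case) for case in CASE_ENDINGS}
        for possessor in POSSESSIVE_BASES
    }


if __name__ == "__main__":
    for possessor, forms in paradigm().items():
        print(possessor, forms)
```

Attaching a host noun is then simple concatenation, as in ses + =di-ŋ-a.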


(4) Possessive-based case forms shown with the inanimate-class Ket noun sēs ‘river’
a. dative
   ses=di-ŋ-a
   river=3--
   ‘to a ~ the river’
b. ablative
   ses=di-ŋ-al
   river=3--
   ‘from a ~ the river’
c. adessive
   ses=di-ŋ-ten
   river=3--
   ‘at a ~ the river’

The remaining four Ket case forms lack possessive connector elements and obligatorily attach to a preceding noun or pronoun:

(5) Bare (not possessive-augmented) case forms in Ket
a. locative
   ses-ka
   river-
   ‘at the river’ ~ ‘on the river’
b. prolative
   ses-bes
   river-
   ‘along the river’ ~ ‘past the river’
c. comitative-instrumental
   ses-as
   river-
   ‘together with the river’ ~ ‘by means of the river’
d. caritive
   ses-an
   river-
   ‘without the river’

The Ket bare case suffixes resemble those of neighboring languages more closely than is true of the three possessive-based case forms. It is possible the bare case suffixes once also had possessive augments, which contracted and eroded (see Table 3 below). Other spatial and temporal relations in Yeniseian are expressed by postpositional constructions, most of which are possessive-based, though without the connector


nasal. Some postpositions transparently derive from nouns (ba’l ‘gap’ → =bal= ‘between’, ʌqad ‘back, spine’ → =ʌʌd= ‘on the surface of’). Postpositions generally contrast with case suffixes in being unable to occur word-finally; constructions containing them invariably require a word-final case suffix (dative, ablative, locative, or prolative). The examples in (6) show postpositional constructions with a possessive-based final case suffix (dative) as well as a bare case suffix (locative):

(6) Ket postpositional constructions
a. dative
   a’q=na=bal-di-ŋ-a
   trees=..=between-3--
   ‘to (the place) between the trees’
b. locative
   a’q=na=bal-ka
   trees=..=between-
   ‘(located) between the trees’

Like possessed nouns, postpositional constructions like those shown in (6) require a possessive clitic to connect the postposition to the head noun (in this case animate-class plural =na=), as well as a following case ending. If the case formant requires a possessive augment (ablative, dative, adessive), then inanimate-class possessive =di= connects it to the preceding postposition; the possessive marker is absent, however, before case endings not requiring a possessive augment (locative, prolative, comitative-instrumental, caritive). Although phonological erosion and semantic bleaching have obscured the etymologies, and thus the grammaticalization pathways, of most Modern Ket case suffixes, Table 3 illustrates how one such element developed from a possessed body-part root (*pǝqad ~ *ǝqad ~ *qobad ~ *qǝpad ‘back’, ‘spine’, ‘obverse side’):

Tab. 3: Comparison of case and postposition morphology in two Yeniseian languages.

Ket                          Kott                          Proto-Yeniseian
-ʌʌt- (+ case suffix)        -(ŋ)-ha:t                     < *-ŋw-hǝqad
‘on top of’ (postposition)   ‘on’ (locative case suffix)   < ‘-back’

The nasal possessive connector was retained in Kott plural forms but has dropped out altogether in Ket, where the morpheme is a postposition preceded directly by a pronominal possessive marker. Table 4 shows the etymologies of several other postpositions in Modern Ket, along with the Proto-Yeniseian noun from which each grammaticalized.

Tab. 4: Etymologies for several Ket postpositions.

Ket postposition                                  Modern Ket noun                                        Proto-Yeniseian noun
=bal- ‘between’                                   ba’l ‘space’, ‘gap’                                    *ba’r ‘space’, ‘gap’
=kūp- ‘in front of’                               kūp ‘beak’, ‘tip’                                      *kūb ‘beak’, ‘tip’
=hɯj- ‘inside of’                                 hɯ̄j ‘stomach’, ‘room’                                  *pɯ̄j ‘stomach’
=ɯ̄n- ‘beneath’                                    ɯ’n ‘sled runner’                                      *ki’n ‘belly’, ‘underside’
=tāne ‘in the direction of’ (< *-tan-di-ŋ-a)      tanno ‘aiming’ (< ta’n ‘path’ + qo ‘tracking’)         *ta’n ‘path’
=qōne ‘(motion) right up to’ (< *-qon-di-ŋ-a)     qō ‘mouth’ (back-formed from qo’n ‘lips’)              *qo’n ‘lips’, ‘face’

The hyphen

after the postposition indicates that it must be followed by a case ending such as locative -ka, dative -di-ŋ-a, ablative -di-ŋ-al, specifying stationary location, destination, or point of origin. In a few cases, Modern Ket did not preserve the noun root as an independent lexical item with the original meaning from which it grammaticalized into a postposition; in such cases, the nearest etymologically related noun in Modern Ket is provided instead. The last two postpositional roots in Table 4 partly merged with the dative case suffix -a, which has lost its possessive augment. This contraction probably occurred because no other case suffix is used with either of these roots, whereas several case forms can alternate after other postpositions (=ɯ̄n-di-ŋ-al ‘out from beneath’, =ɯ̄n-di-ŋ-a ‘(motion) beneath’, =ɯ̄n-di-ŋ-ta ‘located beneath’), a contrast that apparently helped maintain their formal elaboration. Like possessive-augmented case forms, postpositional constructions serve as independent words in the absence of a noun or pronoun host: na=bal-di-ŋ-al ‘out from between them (animate plural)’, d=ʌʌt-ka ‘located on top of it’. Although Ket nouns and pronouns with bound relational morphemes resemble the suffixing case forms of the surrounding languages, Yeniseian nominal inflection derives from a completely different underlying structure. The absence of any nominative or accusative case forms also distinguishes Yeniseian morphosyntax from the case paradigms of Uralic, Turkic and Tungusic, where nominal suffixes can express argument structure.
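The way a postpositional construction is assembled (head noun, possessive clitic, postposition, then a case ending with or without the =di-ŋ- augment) can be sketched in a few lines of Python. This is a minimal illustration of the rule as stated in this section, not an analysis tool: the function and variable names are invented here, all seven case endings are accepted for simplicity even though only some of them are reported after postpositions, and contracted forms such as =tāne are not handled.

```python
# Case endings that need the possessive augment =di-ŋ- after a postposition.
AUGMENTED_CASES = {"dative": "a", "ablative": "al", "adessive": "ten"}

# Case endings that attach directly, without a possessive augment.
BARE_CASES = {
    "locative": "ka",
    "prolative": "bes",
    "comitative-instrumental": "as",
    "caritive": "an",
}


def postpositional_construction(noun, possessive, postposition, case):
    """Assemble noun=possessive=postposition-(di-ŋ-)case as a segmented string."""
    head = f"{noun}={possessive}={postposition}"
    if case in AUGMENTED_CASES:
        # Inanimate-class possessive di plus connector ŋ links the case ending.
        return f"{head}-di-ŋ-{AUGMENTED_CASES[case]}"
    if case in BARE_CASES:
        return f"{head}-{BARE_CASES[case]}"
    raise ValueError(f"unknown case: {case!r}")


if __name__ == "__main__":
    # Reproduces the segmented forms given in example (6).
    print(postpositional_construction("a’q", "na", "bal", "dative"))
    print(postpositional_construction("a’q", "na", "bal", "locative"))
```

With the inputs from example (6), the function returns the same segmented strings as the examples, a’q=na=bal-di-ŋ-a and a’q=na=bal-ka.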

2.4 Adjectives

As follows from the strongly head-marking profile of Yeniseian, nouns and verbs typically carry all phrasal and clausal inflection, while adjectives and other modifiers are normally uninflected. However, there are rare instances where adnominal modifiers in Ket and Yugh agree with their head noun in number or class. About a dozen basic adjectives have what appear to be plural-inflected forms when modifying a plural noun.

(7) Ket adjective stems with number alternations
    qà qu’s ‘big tent’ → qàŋ qu’ŋ ‘big tents’
    ugdɛ būl ‘long leg’ → ugdɛŋ buleŋ ‘long legs’

Most Ket adjectives are unvarying and lack number agreement. Vajda (2013a) argued that the element -ŋ represents an ancient adjectival derivational suffix, preserved through reanalysis as a plural marker in a handful of adjective stems. Most adjectives either retained the suffix everywhere, or seem to have lost it entirely through phonological attrition. A minority of adjectives apparently allowed two pronunciations in free variation during the attrition process. In these stems, the suffix -ŋ, which resembled the inanimate-class noun plural suffix -ŋ, was reanalyzed as plural agreement. The fact that exactly the same adjectives show plural agreement in Yugh (χɛ’ ~ χēŋ ‘big’, ugdi ~ ugdiŋ ‘long’) demonstrates that this occurred prior to Russian contact. Data from other Yeniseian languages are insufficient to help answer the question of when the reanalysis occurred, though the absence of recorded singular/plural adjective pairs in Kott suggests it was limited to the Ket-Yugh daughter branch.

2.5 Action nominals

Yeniseian action nominals (often called ‘infinitives’ in previous studies) function like participles (8a), gerunds (8b), or infinitives (8c).

(8) Action nominal usage in Ket
a. suul-bɛ̀r kɛ’t
   sled-make. person
   ‘a person who makes sleds’ (literally, ‘sled-making person’)
b. ap suul-bɛ̀r b-in-ut
   my sled-make. 3.--end
   ‘I finished making sleds.’ (literally, ‘My sled-making ended.’)
c. suul-bɛ̀r-ɛsaŋ ap q’ɔj
   sled-make.- my wish
   ‘I want to make a sled.’ (literally, ‘To sled-make is my wish.’)

Most action nominal forms in Ket and Yugh seem to lack any derivational morphology and resemble bare verb roots like bɛ̀r ‘making’ (or verb roots compounded with the noun incorporate from the corresponding finite verb stem, like suul-bɛ̀r ‘sled-making’). However, Vajda (2014: 518–519) showed that Yeniseian action nominals derive from a complex morphological formula shared precisely with Na-Dene. This formula originally involved a sibilant prefix and a nasal suffix. The sibilant prefix was retained in basic Kott verbs (ši-ten ‘to lie down’, ši-pi ‘to make’) but was almost entirely lost in Ket and Yugh, which instead retain many more vestiges of the suffix. The action nominal suffix -ŋ is identical phonologically with the adjective-deriving suffix discussed above in 2.4 and shows the same irregular morphophonology: sometimes disappearing altogether, sometimes eroding to falling tone, and sometimes being optionally retained. The same variation between eroded vs. retained suffix occurs in a small number of action nominals (Southern Ket bɛ̀r ~ bɛrɛŋ ‘making’), in which case the suffixed form has usually been reanalyzed as expressing pluractional meaning: bɛ̀r ‘to make once’ ~ bɛrɛŋ ‘to make many times’. Other action nominals invariably retain the suffix without marking any distinction in event plurality: Ket aniŋ ‘playing’, iliŋ ‘eating’, ejiŋ ‘going’, bagdeŋ ‘stretching’. The reanalysis of some instances of the surviving action nominal suffix to express multiple events parallels how the moribund adjective-deriving suffix (possibly the same morpheme) was reanalyzed as a marker of plural agreement in noun phrases.

 Finite verb structure . Preliminary remarks Vajda (2017b) argued that the most common types of change in Ket-Yugh and Kott finite verb structure were metathesis or reanalysis of morpheme positions inherited from the following position-class structure in Proto-Yeniseian:

Tab. 5: The Proto-Yeniseian finite verb template. P

incorporate

P

obj agr

P

thematic consonant(s)

P

 agr

P

P

conjugation aspect marker l-impv n-pfv

P a sbj agr

P

P-

P-

verb root

pfv + stative

anim pl sbj agr

b stative

The same tendency for metathesis and reanalysis to prevail over the complete loss or new addition of morpheme positions can be traced in the evolution of Na-Dene templatic verb morphology as well (Vajda 2017b), a tendency crucial for tracing the divergence of Yeniseian and Na-Dene from a common structure. The finite verb templates in both families originated from the earlier coalescence, in a language ancestral to both, of a light verb and heavy verb sequence into a single morphological word. Because each verb root had its own agreement and tense-aspect-mood affixes, their amalgamation produced a linear mix of lexical and inflectional morpheme positions, with TAM affixes appearing on both sides of the main verb root, sometimes redundantly. Very few of these morphemes disappeared completely in either family, though some survived only as relics. Yeniseian appears to have entirely lost verb-final continuative *-ɬ, but retained the same suffix productively in its position after what had originally been the light verb root so that this element became a prefix in the amalgamated template. The Ket-Yugh conjugation prefix qo- (< *ɢa-) was retained only in the past tense forms of a few basic verb stems meaning ‘kill’ (examples 13a–d) and as the base of inceptive stems (example 18b). Elements in the left periphery of the verb phrase, such as object markers and incorporated nouns, attached to the template later. Some of them may have joined the morphological verb already in Dene-Yeniseian times, but in any event were likely already present in the verb phrase in the same relative order. Otherwise, very few morpheme positions were acquired or lost during the subsequent, separate evolutions of the Yeniseian and Na-Dene verb templates.

Fig. 3: Origin of Dene-Yeniseian templatic finite verb morphology.
Source sequence: [subject agreement prefix + light verb + aspect suffix] followed by [subject agreement prefix + heavy verb + aspect suffix]
Resulting morpheme positions, left to right:
   3 inan. *w-, 3 anim. *x-
   conjugation prefix: *si-, *ɢa-
   aspect prefix: continuative *ɬ-, completive *ŋʷ-, stative *j-
   1, 2 subject agr prefix, 3 anim. or plural *d-
   verb root
   aspect suffix: continuative *-ɬ, completive *-ŋʷ, stative *-j

3.2 Tense, mood, aspect

If the model of template genesis outlined above (Table 5 and Figure 3) is correct, then the TAM marking system in Yeniseian is extremely ancient. Grammaticalization of light verbs into tense occurred very long ago. The Yeniseian daughter languages combine the conjugation markers (ancient light verbs) with the aspect markers on their right to express tense, aspect and mood. Ket has six productive TAM classes (Vajda 2004: 45–48), distinguished by which combination of conjugation marker (q- or s-) and aspect affix (l or n) is used to express the past and non-past indicative and imperative forms. Imperative formation also involved reanalysis of pre-root pronominal *d- as an imperative marker. This form survives only before vowel-initial bases, as in these Ket forms:

(9) Ket imperative forms with survivals of pre-root *d-
a. hə’ŋ a4-n2-d1-un0
   fish.net 4/2/1-set0
   ‘Set the fish net!’
b. a4-d1-in0
   4/1-stand0
   ‘Stay where you are!’

The superscript numbers in (9) mark the original template positions corresponding to Table 5, whereas in Modern Ket the prefix sequences a4-n2-d1- and a4-d1- function as a single unit to mark the imperative in the given stems. In Na-Dene, the same pronominal element d- was reanalyzed as a valence decrease prefix, while pre-root continuative ɬ- was reanalyzed as valence increase, its original aspectual function remaining productive verb-finally, a position where this suffix did not survive at all in Yeniseian. Vajda (2017b) shows how every component of the Na-Dene classifier complex has a cognate in Yeniseian, with the markers in question often having undergone quite different pathways of semantic change in each family. Affixes on both sides of the verb root continued to mark stative-resultatives (or perfective-statives), i.e., verbs expressing a state resulting from a prior action without expressing the action itself. Such verbs are common in both Ket-Yugh and Kott:

(10) Kott transitive (a) and stative-resultative (b) verb forms
a. hič-iːn7-a6-th5-o3-l2-ok0-in-2 (-ok0 < ?Proto-Yeniseian *-wədʲ ‘make’)
   hurry-7-3..6-5-3/2-1-make0-..-2
   ‘They hurried him.’
b. hič-iːn7-a6-th5-o3-l2-a:1-‘uk0-i-1 (< *jə1-wədʲ0-ej-1)
   hurry-7-3..6-5-3/2-1-make0--1
   ‘He had been hurried.’ (‘He was in a state of having been made to hurry.’)

(11) Ket transitive (a) and stative-resultative (b) verb forms
a. da8=nan7-u6-k5-si4-bɛt0
   3..8=bread7-3.6-with5-4-make0
   ‘She makes it into bread.’ ~ ‘She makes bread with it.’
b. nan7-u6-k5-si4-ja1-bajaj0
   bread7-3.6-with5-4-1-be.made0
   ‘It has been made into bread.’ ~ ‘Bread has been made with it.’

The Ket base -bajaj0 ‘be.made’ (dialectally -bɛdej ~ -bɛt ~ -bej ~ -baj) is a portmanteau of *-wedʲ ‘make’ and the stative-resultative suffix *-ej, which interacts with the


P1 detransitivizing prefix jʌ- ~ ja- to form a sort of circumfix around the verb root in stative-resultative stems. Both elements originated from the same stative affix *-j, the prefixal version of which was originally suffixed to the Dene-Yeniseian light verb (and which evolved into the I-component of the Na-Dene classifier). The fate of the ancient perfective marking nasal affix is also interesting from the point of view of grammaticalization. The stative prefix in Yeniseian merged with the pre-root subject markers into a single position class. This series of subject affixes originally marked only agreement with speech-act-participant subjects, as is still true of modern Na-Dene languages. In Yeniseian, stative *-jə- was reanalyzed as the 3rd person agreement marker jə- ~ ə- ~ -a-, while perfective *-ŋʷ- was reanalyzed as the plural or pluractional marker -ŋ-. The paradigm fragment in (12) illustrates these elements:

(12)
a. di-in-doq
   18-2-fly0
   ‘I flew.’
b. d-in-da-ŋ-doq-ŋ
   18-2-1-1-fly-0
   ‘We flew.’
c. k -in-doq
   28-2-fly0
   ‘You (sg.) flew.’
d. k -in-ka-ŋ-doq-ŋ
   28-2-2-1-fly-0
   ‘You (pl.) flew.’
e. d-in-doq
   38-2-fly0
   ‘He flew.’
f. d-in-a-ŋ-doq-ŋ
   38-2-3-1-fly-0
   ‘They flew.’

Subject plural marking spread by analogy to past tense forms, where the original perfective marker appears in P2 as a past tense marker (in- < *si-ŋʷ). In Yeniseian, the rightward metathesis of the perfective-stative marker into the position directly before the verb root apparently helped preserve the original pre-root subject markers from erosion in many intransitive stems. A few transitives sporadically retain the reanalyzed perfective suffix, as illustrated by the Yugh examples in (13a–d). This irregular verb, shown in its full paradigm in Werner (1997b: 194–195), also contains a rare survival of the original Dene-Yeniseian tense/mood alternation between conjugation markers *ɢa- and *si-, which still appear in their original position of P3 rather than P4:

(13)
a. k8-aŋ4-ɨs3-ej0
   28-3..4-3-kill0
   ‘You (sg.) killed them.’
b. k8-aŋ4-χ3-ej0-aŋ-1-ɨn-2
   28-3..4-3-kill0--1--2
   ‘You (pl.) kill them.’
c. d8-aŋ4-ɨs3-ej0
   38=3..4-3-kill0
   ‘He kills them.’
d. d8-aŋ4-χ3-ej0-aŋ-1-ɨn-2
   38=3..4-3-kill0--1--2
   ‘They killed them.’

More evidence that this affix originally expressed aspect rather than plural agreement comes from occasional Ket intransitive singular forms (14), where the original change-of-state suffix -iŋ remains as a base augment in a form with no plural meaning at all:

(14) b3-il2-a1-tel-iŋ0
     3.3-2-1-freeze-0
     ‘It froze.’

Because -iŋ resembles a plural suffix, such forms tended to be reanalyzed as iteratives, meaning ‘freeze many times’. Suffix-less forms such as b3-in2-a1-te:l0 came to be used to mean ‘it froze once’, though the lengthened vowel in -te:l0 ‘freeze’, caused by erosion of -iŋ, indicates that the change-of-state suffix was originally present in these single event forms, as well. The intransitive marker -a- in P1 also often elides in Ket, so that the meaning ‘it froze’ has been recorded as b3-il2-te:l0 and b3-in2-te:l0, as well as b3-in2-a1-te:l0, b3-il2-a1-te:l0, and b3-il2-a1-teliŋ0. The rightmost portion of the verb in modern Ket tended to erode as the template realigned itself toward a suffixing structure (section 3.4 below). Several TAM categories in Ket are expressed by particles directly before the finite verb which never join it phonologically. Following the general tendency of preposed grammatical elements in Ket to mimic suffixes, these particles tend to encliticize to the preceding word: ba ‘habitual past’, asn ~ as ‘habitual future’, an ‘habitual present’, sin ‘single action in the indeterminate past’, qam ‘single action in the immediate future’, sim ‘irrealis/conditional (any tense)’, qān ‘optative (let it become)’, atn ~ at ‘negative imperative (don’t!)’, bǝ̄n ‘negative particle (also negates other parts of speech)’, and bin ‘mirative (used to narrate an event unexpected to the speaker)’.
Some probably include survivals of the light verb roots *si- or *ɢa-: qān < *qa-ŋʷ-j ‘optative’ (where -ŋʷ is the perfective suffix and *-j the resultative), sim < *si-ŋʷ ‘conditional’; qam < *qa-ŋʷ ‘proximal future action’. The origins of the habitual particles asn ~ as (future), ba (past), and an (present) are unclear. Indeterminate past sin is grammaticalized from an adverb meaning ‘once’ (cf. Kott alšin ‘one time’). Mirative bin is grammaticalized from the reflexive pronoun bin ‘self’. Negative imperative atn ~ at may derive from interrogative atn ‘why?’. Ket-Yugh negative bǝ̄n and Kott mon reflect Proto-Yeniseian *wǝnj, probably a negated copula, though its morphological structure is unclear. The Ket copula usaŋ ~ usam ‘is, was, are, were’ and translative postposition -esaŋ ‘in order to become’ also probably contain ancient auxiliary verb roots followed by the perfective suffix *-ŋʷ.


3.3 Subject/object agreement

Agreement in Dene-Yeniseian followed accusative alignment, with a strong preference for animate subjects in bi-argument clauses. As shown in Figure 3 earlier, 1st and 2nd person subject agreement stood before the heavy verb root, and 3rd person agreement before the conjugation prefix (the original light verb). Object agreement stood farther to the left and probably originated from the incorporation of postpositional constructions into the verb complex. Pre-Proto-Yeniseian innovated a verb-final animate-plural subject suffix -n, which resembles the animate-class noun plural suffix -n. The Athabaskan verb complex contains an animate-class nominalizing enclitic of a similar shape in the same position that could be homologous, which would indicate that the Yeniseian animate-class noun plural suffix -n was innovated by analogy to the verb suffix. By Proto-Yeniseian times, a major innovation in subject person marking was under way. The original pre-root subject markers had largely eroded except in a subset of intransitives (and a smaller number of transitives) where the presence of the stative affix appears to have protected them from elision. To compensate for their loss, each daughter branch of Yeniseian innovated a new subject marking position. In Ket, a subject person-marking prefix was added to the leftmost edge of the verb; these markers became ditropic clitics in modern Ket. Some Ket transitives retain traces of the original pre-root subject markers (base anlaut g- in forms [15b] and [15e]). Kott, by contrast, innovated a new verb-final suffix position by extending to finite verbs the predicate concord suffixes used on nouns, pronouns, adjectives and adverbials in copular constructions, probably under the influence of the suffixing Turkic languages spoken by neighboring tribes. This did not develop in Ket and Yugh, spoken much farther north along the Yenisei river in minimal contact with Turkic speakers. The new Kott agreement suffix marked subject person and number and attached to the right of the animate plural subject agreement marker inherited from Proto-Yeniseian.

(15) Ket subject marking
a. d=b-il-bɛd
   18=3.3-2-make0
   ‘I made it.’
b. k=b-il-gɛd
   28=3.3-2-make0
   ‘You (sg.) made it.’
c. d=b-il-bɛd
   38=3.3-2-make0
   ‘He made it.’
d. d=b-il-bɛd-n-
   18=3.3-2-make0-1
   ‘We made it.’
e. k=b-il-gɛd-n-
   28=3.3-2-make0-1
   ‘You (pl.) made it.’
f. d=b-il-bɛd-n-
   38=3.3-2-make0-1
   ‘They made it.’

(16) Kott subject marking
a. b-a-la-paj-aŋ-
   3.4-3/2-make1.-3
   ‘I made it.’
b. b-a-la-pa-u-
   3.4-3/2-make2.-3
   ‘You (sg.) made it.’
c. b-a-la-pex-Ø-
   3.4-3/2-make-3-3
   ‘He made it.’
d. b-a-la-pe-n--toŋ-
   3.4-3/2-make.-2-1.-3
   ‘We made it.’
e. b-a-la-pet-n--oŋ-
   3.4-3/2-make.-2-2.-3
   ‘You (pl.) made it.’
f. b-a-la-pet-n--Ø-
   3.4-3/2-make.-2-3-3
   ‘They made it.’

In Kott, the new agreement series resulted in multi-site marking of plural animate-class subjects by suffixes in positions -2 and -3. In Ket, the new subject position on the verb’s leftmost edge led to the rise of multi-site subject person marking in any stem that had preserved the original subject markers in P1. Vajda (2017b) describes additional processes of metathesis in Ket that interchanged the subject and object marking positions in certain stems, but with no semantic reanalysis of the original agreement functions. All of these innovations together ultimately created the three productive transitive agreement-marking configurations and five productive intransitive configurations found in the language today (Vajda 2004, 2009, 2014).

3.4 Shift from prefixing to suffixing verb morphology

One of the most striking typological features of Ket is its mix of prefixing and suffixing verb structures, all conforming to the same position class template shown in Table 6.

Tab. 6: Position-class template of the Modern Ket finite verb.

Position   Content
P8         sbj person agr
P7         incorporate
P6         obj or sbj agr
P5         thematic consonant(s)
P4         3 anim agr + tense/mood portmanteau
P3         3 inan agr
P2         tense mood aspect
P1         1, 2 sbj or obj agr, or stative-resultative prefix
P0         base (verb root, often with fossilized aspect suffix)
P-1        anim pl sbj agr
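Because every morph in the segmented verb forms of this chapter carries its position-class index, a form can be mapped onto the template mechanically. The Python sketch below shows one way to do so for forms written with plain trailing digits (the flattened rendering of the superscripts, e.g. d8-i6-t5-o4-l2-oŋ0). The regular expression, the function names, and the plain-digit notation are assumptions made for illustration and are not part of the source's analysis.

```python
import re

# A morph is any run of characters other than digits, "=" and "-",
# followed by its position index, which may be negative (e.g. n-1).
MORPH_WITH_POSITION = re.compile(r"([^\d=\-]+)(-?\d)")


def parse_positions(form):
    """Split a segmented verb form into (morph, position) pairs."""
    return [(m.group(1), int(m.group(2))) for m in MORPH_WITH_POSITION.finditer(form)]


def follows_template_order(form):
    """Check that position indices strictly decrease from left to right."""
    positions = [position for _, position in parse_positions(form)]
    return all(left > right for left, right in zip(positions, positions[1:]))


if __name__ == "__main__":
    for morph, position in parse_positions("k8-ik7-si4-vɛs0-n-1"):
        print(f"P{position}: {morph}")
```

A check such as follows_template_order is only a consistency test on the notation. It says nothing about which positions a given stem must fill.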


Basic verbs tend to be strongly prefixing, with the semantic head in base position (P0). The same is true of the Yugh and Kott verbal lexicons.

(17) Prefixing verbs in Ket
a. d8-i6-t5-o4-l2-oŋ0
   18-3..6-5-4/2-see0
   ‘I saw her.’
b. ku8-l2-di1-vʌk0
   28-2-1.1-pull0
   ‘You (sg.) pulled me.’
c. d8-o4-n2-tɛt0
   18-3..4-2-hit0
   ‘I beat him.’
d. du8-b3-bɛt0
   3..8-3.3-make0
   ‘He makes it.’
e. k8-ik7-si4-vɛs0-n-1
   28-into.open.space7-4-pass0-..-1
   ‘You (pl.) arrive.’

The only productive suffix in such verbs is animate-class plural -n in position -1. An example appears in (17e), but this morpheme can also be added to the forms in (17a–d) to indicate an animate-class plural subject. Most finite verb forms in modern Ket, however, are strongly suffixing. Many productive verb stem patterns incorporate an action nominal into position 7, so that the semantic head occupies the verb’s initial syllable. The verb root in the original semantic head position (P0) has eroded both semantically and phonologically, merging with the surrounding aspectual affixes to form a sort of suffix marking transitivity, iterativity, or inceptivity. This transformed the Ket verb into a strongly suffixing structure resembling the morphologies of neighboring families. This typological shift occurred with no alteration of the inherited template and no borrowing of foreign morphological material.

(18) Suffixing, action-nominal stem verbs in Ket
a. d8-igbɛs7-ku6-ɣ5-o4-l2-bɛt0
   18-arrive.7-2.6-5-4/2-.0
   ‘I brought you (sg.) (many times).’
b. igbɛs7-a6-ɣ5-a4-qan0
   arrive.7-2.6-5-4-0
   ‘He begins arriving.’


The base -bɛt in (18a) originally meant ‘make’, and still carries this meaning in basic prefixing verbs (example 17d). Inceptive -qan (past tense -qon) derives from the tense-aspect morphemes *qa4-n2 reanalyzed as the verb base in position 0, presumably after the original heavy verb root eroded completely in these forms. This would explain the absence of the P2 aspect suffix elsewhere in these verbs and the past-tense shift of /a/ to /o/ in the base qan, an alternation that normally affects only P4 morphemes. Four more examples of highly productive action-nominal stems, causatives based on reanalysis of a verb root originally meaning ‘put’, can be seen in (23) and (24) in the next section. The other productive verb stem patterns require incorporated nouns or adjectives in P7. A full description of Ket incorporation can be found in Vajda (2017a). Three examples are given below:

(19) Incorporating stems in Ket
a. d8-itn7-il2-bɛt0
   18-yukola7-2-make0
   ‘I prepared yukola (dried fish).’
b. d8-ūs7-a6-h5-a4-tɛt0
   18-bear.spear7-3..6-5-4-hit0
   ‘I spear him (a bear).’
c. d8-ugdɛ7-t5-a4-b3-sin0
   18-long7-5-4-3.3-cause.to.become0
   ‘I lengthen it.’

Incorporating stems are the only productive patterns that retain a functioning heavy verb root in P0. These stems still conform to the suffixing model because they contain a lexical root in the verb’s initial syllable (the incorporated noun, adjective, or directional). The overwhelming prevalence of verb types with a lexical stem at the leftmost edge developed gradually through accommodation to the exclusively suffixing verb structures in the surrounding languages. A final piece of evidence supporting this evolution can be found in the pronunciation of P8 subject person agreement markers. Originally, these were prefixes and mostly remained so in Yugh, as well as in basic Ket verbs of the type shown in (17).
In verbs with a root morpheme incorporated into P7, however, they became clitics to prevent them from forming a syllable to the left of the stem’s lexical portion. If P7 began in a vowel, then the consonant portion of the P8 agreement marker remained as a prefix (19a–c above). If P7 began in a consonant, then the P8 marker encliticized to the preceding word. If no such word is available, the marker elides entirely, except P8 feminine da=, which procliticizes to the verb.


(20) Pronunciation of Ket P8 subject clitics
     d8=don7-si4-bɛd0 ‘I make a knife’ (donsibɛd or =d # donsibɛd)
     k8=don7-si4-bɛd0 ‘you (sg.) make a knife’ (donsibɛd or =k # donsibɛd)
     d8=don7-si4-bɛd0 ‘he makes a knife’ (donsibɛd or =d # donsibɛd)
     da8=don7-si4-bɛd0 ‘she makes a knife’ (da=donsibɛd or =da # donsibɛd)

For a full account of P8 phonotactics, see Vajda (2004: 74) and Kotorova and Nefedov (2015: 39–40). The behavior of P8 subject markers parallels that of possessive markers in the nominal morphology. Both were originally word-initial prefixes but became ditropic clitics that prefer to encliticize to a preceding word. Their evolution caused modern Ket to be pronounced as a suffixing rather than prefixing language.

Voice, valency and alignment

Vajda (2015) describes valency in the Ket finite clause. Grammaticalization of thematic consonants created two new valency types. Ket intransitive motion verbs become transitives by adding P5 thematic k (originally meaning ‘with’) to create applicatives with comitative meaning:

(21) Ket intransitive motion verbs
a. qima=ra iksives [=da8 igda7-si4-bes0]
   grandma=3..8 into.open.space7-4-pass0
   ‘Grandma comes.’
b. ōp (d)tajga [< du8=t5-aj4-ka0]
   father 3..8=5-4-walk0
   ‘Father walks around.’

(22) Transitivized Ket motion verbs (comitative applicatives)
a. ōp hiɣ-dɯl doktajga [du8=o6-k/t5-aj4-ka0]
   father man-child 3..8=3..6-with/5-4-walk0
   ‘Father leads (= walks around with) the boy.’
b. qima iliŋis daiɣuksives [< da8=igda7-u6-k5-si4-bes0]
   grandma food 3..8=into.open.space7-3.6-with5-4-pass0
   ‘Grandma brings (= comes with) food.’

The second instance involves P5 thematic q, which originally expressed ‘motion into’ or ‘location inside’. Ket evolved four productive stem types where thematic q

combined with eroded forms of a verb root originally meaning ‘put’ came to express inceptive meaning. The examples in (23), which contain the action nominal toʁojiŋ ‘to dry’, illustrate the two types of inceptives and their etymological origins:

(23) Inceptive causatives with thematic q
a. da8=toʁojiŋ7-q5-i4-b3-it0
   3..8=dry.7-inside5-4-3.3-put.once0
   ‘She starts drying him off (once).’ (lit., ‘She puts it once into drying’)
b. da8=toʁojiŋ7-q5-a4-b3-da0
   3..8=dry.7-inside5-4-3.3-put.0
   ‘She starts drying it off (repeatedly).’ ~ ‘She keeps drying it off.’

(24) Inceptive anticausatives with thematic q
a. da8=toʁojiŋ7-q5-is4-a1-tn0
   3..8=dry.7-inside5-4-3.1-be.put.once0
   ‘She starts drying off (once).’ (lit., ‘She gets put once into drying.’)
b. da8=toʁojiŋ7-q5-a4-ja2-dij0
   3..8=dry.7-inside5-4-3.1-be.put.0
   ‘She starts drying off (repeatedly).’ ~ ‘She keeps drying off.’

Despite its plethora of agreement marking configurations, modern Ket follows accusative alignment. However, because the language prefers subjects with the highest clausal agentivity or sentience, there are no true passive forms. Stative-resultative verbs do not convey the action itself, only the resultant state, and do not normally mention the causal agent in the same clause. Only under recent influence from Russian have resultatives occasionally been recorded as participial passives with instrumental-case marked agents. The examples in (25) are from Werner (1997c: 214):

(25) Transitive action verb (a) and its quasi-passive (b) counterpart
a. kɛ’t tɯ’s d8-il7-u6-k5-si4-vet0
   man rock 3.8-small.piece7-3.6-from5-4-make0
   ‘A person breaks the rock.’
b. tɯ’s kɛr-as il7-u6-k5-s4-aja1-vet0
   rock man- small.piece7-3.6-from5-4-1-make0
   ‘A rock is (in a state of having been) broken by the person.’

Sentences like (25b) are atypical for Yeniseian, where bi-argument clauses rarely allow inanimate-class subjects. Our native-speaker consultants were hesitant to accept them.

Accusative alignment with a strong preference for the most active argument in the clause to occupy the subject position is a trait inherited from Dene-Yeniseian. This is contrary to Vajda (2008b), where the Proto-Yeniseian animate- and inanimate-class marking distinction was mistaken for earlier semantic alignment. Na-Dene languages likewise generally have accusative alignment, with the same preference for subjects of the highest animacy. Only Tlingit innovated semantic alignment, apparently under areal influence (Mithun 2013: 676–679). Athabaskan seems to have reanalyzed the inherited Dene-Yeniseian distinction between 3rd person animate and inanimate agreement into a kind of inverse marking system (e.g., Navajo direct yi- vs. inverse bi-) – a typological feature found in other North American families.

Predicate concord

Another striking typological mixture in Yeniseian is the use of predicate concord suffixes in locational or existential clauses, in contrast to the (historical) use of mostly prefixes to mark subject agreement in finite verb forms. Predicate concord suffixes attach to qualitative adjectives and spatial adverbs:

(26) Ket predicate concord suffixes on (a) adjective, (b) spatial adverb
a. bū sel-du
   he bad-3..
   ‘He is bad.’
b. āt kiseŋ-di
   I here-1.
   ‘I am here.’

These suffixes were inherited from Proto-Yeniseian, as can be seen by comparing them in Ket and Kott:

(27) Ket predicate adjective agreement
a. bɯd-di
   strong-1.
   ‘I am strong.’
b. bɯd-gu
   strong-2.
   ‘You. are strong.’
c. bɯd-du
   strong-3..
   ‘He is strong.’
d. bɯd-eŋ-dʌŋ
   strong-.-1.
   ‘We are strong.’
e. bɯd-eŋ-kʌŋ
   strong-.-2.
   ‘You. are strong.’
f. bɯd-eŋ-aŋ
   strong-.-3..
   ‘They. are strong.’

(28) Kott predicate adjective agreement
a. bik-taŋ
   strong-1.
   ‘I am strong.’
b. bik-u
   strong-2.
   ‘You. are strong.’
c. bik-tu
   strong-3..
   ‘He is strong.’
d. bik-toŋ
   strong-1.
   ‘We are strong.’
e. bik-oŋ
   strong-2.
   ‘You. are strong.’
f. bik-i’-jaŋ
   strong-.-3..
   ‘They. are strong.’

Predicate nouns do not take concord suffixes in modern Ket, though Castrén recorded such forms in the 19th century (Werner 1997b: 209). The ability of nouns, adjectives, and spatial adverbs to take concord suffixes parallels the fact that they incorporate into finite verb stems. Yeniseian predicate concord originated from a prefixing configuration consisting of an incorporated adjective, adverb or noun + subject agreement prefix + copular verb (probably ~*ǝŋj). In the Kott forms above, the 1 suffix -taŋ derives from 1 prefix t + copular aŋ. In plural forms, the velar nasal was reinterpreted as a plural marker, as also happened with subject pronominals in finite verb forms. Yugh preserves tonal signatures (laryngealization or pharyngealization) left by the eroded word-final copular verb:

(29) Yugh predicate concord morphology
a. bɯd-diʔ
   strong-1.
   ‘I am strong.’
b. bɯd-guʔ
   strong-2.
   ‘You. are strong.’
c. bɯd-duʔ
   strong-3..
   ‘He is strong.’
d. bɯd-ɯŋ-dʌːħŋ
   strong-.-1.
   ‘We are strong.’
e. bɯd-ɯŋ-kʌːħŋ
   strong-.-2.
   ‘You. are strong.’
f. bɯd-ɯŋ-ɛːħŋ
   strong-.-3..
   ‘They. are strong.’

A comparison of the Ket and Yugh inanimate-class predicate concord suffix shows that they derive from the same type of prefixing structure found in finite verbs, consisting of inanimate class agreement prefix *b- and copula ~*-ǝŋj:

(30) Inanimate-class concord suffix forms in Ket (a) and Yugh (b)
a. bɯd-am (< *b-ǝŋj)
   strong-3.
   ‘It is strong.’
b. bɯd-ɛʔ
   strong-3.
   ‘It is strong.’
c. bɯd-eŋ-am
   strong-.-3.
   ‘They. are strong.’
d. bɯd-ɯŋ-ɛʔ
   strong-.-3.
   ‘They. are strong.’

The reanalysis of agreement prefix + copula into a suffix occurred before Proto-Yeniseian and may have been influenced by contact with now-extinct Inner Asian languages or by early pastoral contacts. If this is correct, it represents yet another instance where native morphology accommodated to broader areal patterns, yielding the unusual typological blend of structures found in modern Ket.

Complex constructions

Preliminary remarks

Cross-linguistic comparisons of Yeniseian syntax owe much to the overview of Ket syntax in Werner (1997c: 332–359) and the thorough treatment of Ket clause linkage in Nefedov (2015). The diachronic study of complex constructions in the family is limited by the fact that all sentence-level examples represent Ket or Yugh, the same primary branch, and most were recorded after Russian bilingualism had become pervasive. Complex constructions in Ket contain either a conjugated finite verb form or its corresponding action nominal. Ket differs from other Siberian languages, which require participial verb bases in subordinate clauses (Anderson 2003: 28–34). At the same time, Ket differs from most polysynthetic languages in modifying dependent clause verb forms with postposed subordinators derived from nominal case suffixes or postpositions. Once again Ket reveals itself to be a typological hybrid.

Adverbial relations

Polysynthetic verb morphology merges with postposed relational modifiers most frequently to express interclausal adverbial relations. Nefedov (2015: 212–213) lists 37 case or postpositional markers used to subordinate one event to another in narration. All attach to fully conjugated verb forms, while 19 of them can alternatively attach to action nominals.

(31) Temporal subordinate clauses in Ket
a. Simultaneity (Nefedov 2015: 172)
   bū=d b-il-ɛl o-ɣon-bɛs
   he=3.. 3.--sing 3..-went-
   ‘He sang while walking.’

b. Posteriority (Werner 1997c: 350)
   āt qarɛ ɛŋŋuŋ bɔ-ɣɔtn-kupka, āt qasɛŋ ki’ iŋɢus=t hapto
   I that village 1.-go-before I there new house=1 build.it
   ‘Before moving to that village, I’ll build a new house there.’
c. Anteriority (Kotorova and Nefedov 2015: 285)
   kàl b-in-oʁut-qaɣa, ap hɯ’p uska d-imes
   fight. 3.--end-after my son back 3..-came
   ‘After the war ended, my son came back (home).’

Table 7 shows the etymologies of these and several other Ket postposed relational morphemes, which grammaticalized or regrammaticalized from nouns, adjectives, adverbs, case suffixes or postpositional constructions. The use of action nominals in place of conjugated verb forms for some types of subordination may be due to areal influence, since the Ket action nominal is functionally closest to the participial forms found in Uralic and Turkic subordinate clauses.

Tab. 7: Origin of some Ket postposed subordinators.

complementizer | grammaticalization source
-bes (a) ‘while’ (background action) | -bes ‘through’, ‘past’ (prolative case suffix)
-kupka (b) ‘before’ (prior action) | kūb ‘beak, tip’ + locative case suffix -ka ‘in front of’ (postpositional construction)
-qaɣa (c) ‘after’ (subsequent action) | demonstrative stem qa ‘that (far away)’ + locative -ka (adverb ‘after’, ‘afterward’)
-ka ‘when’, ‘if’ | locative case suffix -ka
-diŋal ‘since’ (= ‘from that time’, or ‘because’) | di-ŋ-al (-- ‘from’, ‘out of’) (ablative case suffix)
-diŋa ‘until’ | di-ŋ-a (-- ‘to’) (dative case suffix)
-dugdɛ ‘during’ (concurrent action) | d- possessive clitic + ugdɛ (adjective ‘long’)

Complementation

Nefedov (2015: 154–155) lists 47 complement-taking predicates in Ket and the formal means used to achieve complementation. Some reflect Russian influence, such as bila ‘how’ used as a complementizer (32a). The modal word nada ~ nara is a borrowing of Russian nado ‘need’; Ket complex predicates containing it are probably calques (32b).

(32) Complementation strategies influenced by Russian
a. sīn baam ɛndirunsɔŋ bila kʌ’j
   decrepit old.woman 3...forgot how walk.
   ‘The decrepit old woman forgot how to walk.’ (Nefedov 2015: 125)
b. ki’ do’n kasɛs nada
   new knife take. need
   ‘It’s necessary to buy (= take) a new knife.’

One device that predates Russian influence is the translative complementizer -ɛsaŋ ‘in order to’, which attaches to conjugated verbs (33a), action nominals (33b, c), or nouns (33d) to form a variety of complex predicates.

(33) Complementation with translative suffix -ɛsaŋ ‘in order to (do, be, become)’
a. āt sa’q d-i-ɣej-ɛsaŋ dit-tos
   I squirrel 1-3..-kill- 1.-intend
   ‘I intend to kill a squirrel.’
b. lɔvɛt-ɛsaŋ da=qɔ’j
   work.- ..=wish
   ‘He wants to work.’ (literally, ‘Working is his wish.’)
c. hɯ’p da=ōp suulbɛr-ɛsaŋ d-a-tivij
   son his=father sled-make.- 3.-3..-ask
   ‘The son asks his father to build sleds.’ (Nefedov 2015: 138)
d. tunil qotɛ tunbis qim-ɛsaŋ siɣʌtʌnoq
   then ahead such woman- 3...became
   ‘Afterwards she became that type of woman.’ (Kotorova and Nefedov 2015: 627)

Complementation with -ɛsaŋ exemplifies the characteristic Ket combination of polysynthetic verb morphology with postposed relational modifiers.

Relative clauses

Ket has three types of relative clauses: headless (a), preposed (b) and postposed with complementizer (c).

(34) Ket relativization strategies
a. da=aŋ-ɢ-ej-s
   3...=3..--kill-
   ‘the one. who killed them’
b. da=aŋ-ɢ-ej qīm
   3...=3..--kill woman
   ‘the woman who killed them’
c. kisɛŋ qīm qorɛ=ra aŋ-ɢ-ej
   here woman =3.. 3..--kill
   ‘Here is the woman that killed them.’

Postposed relative clauses (34c) strongly resemble those of Russian, with the relativizer qorɛ ‘that, which, who’ grammaticalized from the demonstrative qorɛ ‘that (distal)’, probably under Russian influence. Headless relative clauses (34a) are a type of nominalization built with the same suffix -s used to nominalize all other parts of speech. Any finite verb form can be nominalized in this way. Example (34b) contains a preposed relative clause consisting of a simple verb clause with fully conjugated finite verb – a relativization strategy found nowhere else in Siberia and probably representing the original Yeniseian technique.

Coordination

Compound sentences in Ket tend to be asyndetic, with native coordinating conjunctions such as haj ‘and’, tam … tam ‘either … or’ generally found with parallel elements in the simple clause. For this reason, modern Ket has borrowed several basic Russian coordinating conjunctions: ili ‘or’, i ‘and’, no ‘but, however’, as well as adversative a ‘and, but’. Unlike subordination, which involves relational suffixes, coordination in Ket never alters the verb form. The recent adoption of Russian coordinating conjunctions represents only a minor modification to the traditional method of asyndetic linkage of polysynthetic verb clauses.

As is typical for a polysynthetic language, the finite verb expresses the lion’s share of the clausal morphosyntax. Ket complex constructions are typologically unusual for their extensive adaptation of postposed relational modifiers from nominal morphology to express adverbial relations and predicate complementation, and for their recent adoption of Russian coordinating conjunctions, in contrast to the overall paucity of loanwords in the language (Vajda 2009).

The sociolinguistic layering of morphosyntactic change in Yeniseian

Specific types of contact-induced linguistic change arise from specific types of human interaction, not automatically through the mere fact of geographic propinquity. Certain aspects of Ket population genetics, as revealed by human DNA studies, together with anthropological data, can help reveal reasons for contact-induced language change. The Ket Y-chromosome, passed down from father to son, is overwhelmingly haplotype Q – the same basic haplotype distantly shared by most Native American males. By contrast, the diversity of Ket mitochondrial DNA haplogroups, passed down through the mother, indicates that wives were taken from adjacent tribes over many millennia. The exogamous, strongly patriarchal nature of traditional Ket society suggests a contact scenario whereby Ket gradually underwent structural changes as female outsiders entering the group as brides acquired the language in early adulthood, while outright borrowing of lexemes and morphemes was culturally disfavored. Yeniseian-internal reconstruction and external comparisons with Na-Dene suggest which grammatical traits were inherited and which were innovated. Table 8 attempts a chronological layering of these traits, along with the DNA haplogroups plausibly associated with each layer. See Vajda (2016) for more on the human genetic facets of Yeniseian prehistory.

Tab. 8: Sources of Yeniseian grammatical traits, with parallels in human genetics.

a. Ancient (before  years ago) Dene-Yeniseian inheritance (Y-DNA haplotype Q, mt-DNA haplotype A): core templatic prefixing structure of finite verb and action nominals derived from that structure; possessive prefixes based on pronominal and nasal connector; basic lexicon, derivational affixes, and postpositions.

b. ? Prehistoric contact ( to  years ago) with unknown North Asian languages (mt-DNA haplotypes U, F, shared by modern Kets with earlier Kitoi Culture burials dated to , – , BCE): remnants of the Kitoi people or other hunter-gatherers in south-central Siberia may be the source of certain traits not inherited from Dene-Yeniseian yet already present in Proto-Yeniseian before contact began with pastoral peoples: noun plurals, masculine and feminine gender agreement subclasses in singular animate nouns, predicate concord suffixes reanalyzed from inherited copular verb construction.

c. Intermarriage with Uralic, Turkic, Mongolic, and Tungusic pastoralists over the past three millennia (mt-DNA haplotypes H, C, D): suffixal-agglutinating case system (possibly influenced by earlier contact, as well), realignment of finite verb toward suffixing. Idiosyncratic internal developments connected with the shift to suffixing include metathesis and reanalysis of finite verb position classes, as well as the addition of a new subject agreement position, leading to multi-site subject marking.

d. Recent (largely post-) Russian bilingualism during a time of rapid Ket language loss and obsolescence: minor influences on grammatical structure, including shifts in noun class and case marking; see Minayeva (2003) for a detailed treatment; borrowing of conjunctions and a few other function words.

Summary

Morphosyntactic change in Yeniseian appears frequently linked to language contact. The assimilation of young brides speaking unrelated languages led to the accommodation of an originally prefixing morphology to the exclusively suffixing morphologies of the surrounding tribes. Excepting the past several decades of intense Russian influence, these processes involved little or no outright borrowing of morphological material, producing in modern Ket a number of rare, if not wholly unique typological combinations. Morphological-word initial possessive prefixes on nouns and subject person agreement prefixes on verbs became ditropic clitics that prefer to attach to a preceding word so that most Ket nouns and verbs are pronounced with a lexical root in their first syllable. The Ket finite verb system restructured to a suffixing configuration, while fully retaining its inherited position-class template. Verbs with a final lexical root preceded by prefixes became unproductive and survive only in basic vocabulary. Alongside these contact-induced changes, inherited adjective and action nominal-deriving suffixes, and certain tense-aspect-mood suffixes were reanalyzed as plural or pluractional markers. Idiosyncratic metathesis of position classes in the Ket verb, together with processes of reduction of the original word-final verb root and adjacent TAM affixes led to the rise of multi-site subject marking and a bizarre array of agreement marking configurations, all of which are merely idiosyncratic lexical variations on the language’s inherited accusative alignment pattern. Ket syntax shows a hybrid blend of clause-level polysynthesis with suffixal agglutination of relational markers in subordinate clauses and other complex constructions.

One particularly frequent type of reanalysis involves the reinterpretation of several different classes of morphemes as plural or pluractional markers.
Surviving remnants of an unproductive adjective-deriving suffix as well as the action nominal suffix – both of the shape -ǝŋ and possibly originating as the same morpheme – were reanalyzed as markers of plurality or pluractionality, apparently based on nothing more than chance homonymy with the common noun plural suffix -ǝŋ. An analogous change occurred in the finite verb template, where the perfective-stative suffix -ŋ was reanalyzed as a plural marker, leading to irregular multi-site marking of subject plurality in certain verbs. These changes served to obscure the inherited position-class model of the verb in all of the Yeniseian daughter languages. In Ket and Yugh, the appearance of what seems like plural markers on a small number of adjectives created a striking exception to the otherwise strongly head-marking profile of Yeniseian morphology. It is possible that the expanded use of plural suffixes in the Yeniseian daughter languages was at least partly motivated by contact with

suffixal-agglutinating languages where morphological marking of plurality figured prominently. Finally, although the unusual typological shift from prefixing to suffixal agglutination was clearly induced through areal pressure, the specific types of grammaticalization evident in the Yeniseian daughter languages largely follow well-worn semantic pathways that are evident in many other languages worldwide. Body part and other concrete nouns grammaticalized to spatial adpositions, and spatial adpositions became clausal subordinators.

Abbreviations  = ablative case,  = adessive case,  = animate-class,  = action nominal,  = caritive case,  = dative case,  = feminine animate subclass,  = inanimate-class, = inceptive (beginning of action),  = comitative-instrumental case,  = intransitive,  = locative case,  = masculine animate subclass,  = nominalizer,  = object agreement marker,  = plural,  = possessive,  = prolative (prosecutive) case,  = perfective,  = past tense,  = relativizer,  = subject agreement marker,  = singular,  = thematic consonant,  = translative (= ‘in order to be’)

References

Anderson, Gregory D. S. 2003. Yeniseic languages from a Siberian areal perspective. STUF 56(1/2). 12–39.
Castrén, M. A. 1858. Versuch einer jenissej-ostjakischen und kottischen Sprachlehre. Sankt-Peterburg: Imperatorskaja Akademija Nauk.
Georg, Stefan. 2007. A descriptive grammar of Ket. Part I: introduction, phonology and morphology. Kent, UK: Global Oriental.
Kotorova, Elizaveta & Andrey Nefedov (eds.). 2015. Comprehensive dictionary of Ket with Russian, German, and English translations. Munich: Lincom Europa.
Minayeva, Vera. 2003. Russian grammatical interference in Ket. STUF 56(1/2). 40–54.
Mithun, Marianne. 2013. Contact and North American languages. In Raymond Hickey (ed.), The handbook of language contact, 695–713. Oxford: Wiley-Blackwell.
Nefedov, Andrey. 2015. Clause linkage in Ket. Utrecht: LOT.
Vajda, Edward. 2004. Ket (Languages of the world/materials 204). Munich: Lincom.
Vajda, Edward. 2008a. Head-negating enclitics in Ket. In Edward Vajda (ed.), Subordination and coordination strategies in North Asian languages, 179–201. Amsterdam: John Benjamins.
Vajda, Edward. 2008b. Losing semantic alignment: from Proto-Yeniseic to Modern Ket. In Mark Donohue & Søren Wichmann (eds.), The typology of semantic alignment, 140–161. Oxford: Oxford University Press.
Vajda, Edward. 2009. Loanwords in Ket. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the World’s languages: a comparative handbook, 471–494. Berlin: Mouton de Gruyter.
Vajda, Edward. 2013a. Metathesis and reanalysis in Ket. Tomsk journal of linguistics and anthropology 1(1). 14–26.
Vajda, Edward. 2013b. Vestigial possessive morphology in Na-Dene and Yeniseian. In Sharon Hargus, Edward Vajda & Danny Hieber (eds.), Working Papers in Athabaskan (Dene) Languages 2012 (Alaska Native Language Center Working Papers No. 11), 71–91.
Vajda, Edward. 2014. Yeniseian. In Pavel Stekauer & Rochelle Lieber (eds.), Handbook of derivational morphology, 509–519. Oxford: Oxford University Press.
Vajda, Edward. 2015. Valency properties of the Ket verb clause. In Andrej Malchukov & Bernard Comrie (eds.), Valency classes in the world’s languages, 630–668. Berlin & New York: Mouton de Gruyter.
Vajda, Edward. 2016. Dene-Yeniseian. In Oxford research encyclopedia of linguistics. Oxford Online.
Vajda, Edward. 2017a. Ket incorporation. In Michael Fortescue, Marianne Mithun & Nicholas Evans (eds.), The Oxford handbook of polysynthesis, 906–929. Oxford: Oxford University Press.
Vajda, Edward. 2017b. Patterns of innovation and retention in templatic polysynthesis. In Michael Fortescue, Marianne Mithun & Nicholas Evans (eds.), The Oxford handbook of polysynthesis, 363–391. Oxford: Oxford University Press.
Vajda, Edward. 2019. Yeniseian and Dene hydronyms. In Gary Holton & Thomas Thornton (eds.), Language and toponymy in Alaska and beyond: Papers in honor of James Kari, 183–201. Honolulu and Fairbanks: University of Hawai‘i Press and Alaska Native Language Center.
Werner, Heinrich. 1997a. Abriss der kottischen Grammatik. Wiesbaden: Harrassowitz.
Werner, Heinrich. 1997b. Das Jugische (Sym-Ketische). Wiesbaden: Harrassowitz.
Werner, Heinrich. 1997c. Die ketische Sprache. Wiesbaden: Harrassowitz.
Werner, Heinrich. 2005. Die Jenissej-Sprachen des 18. Jahrhunderts. Wiesbaden: Harrassowitz.

Agnes Korn

10 Grammaticalization and reanalysis in Iranian

Introduction

The Iranian (Ir.) branch of Indo-European is a group of languages spoken over a large area and attested over nearly three millennia (see Table 1 and Figure 1). To introduce the Iranian languages mentioned in this chapter in a very rough and brief way I list them broadly from east to west so far as Old and Middle Iranian are concerned, which at the same time arranges them in chronological order of their attestations.

While there are major differences between the Ir. languages, some general trends of development may be identified, and there seem to be clusters of categories where grammaticalization processes are particularly active. Overall, the following groups of functions seem to grammaticalize in parallel ways: (a) transitivity / actionality / control; (b) aspect / durativity / mood; (c) animacy and person marking.

Tab. 1: Iranian languages mentioned in this chapter.

Old Iranian (ca.  to ca.  BC)
  Avestan and Old Persian

Middle Iranian (ca.  BC to ca. AD )
  Eastern Iranian: Khotanese (texts from ancient Turkestan, present Xinjiang, China; chiefly Buddhist), Sogdian (along the Silk Road, Central Asia into China; texts from various religions), Bactrian (written in Greek script, chiefly manuscripts from the Sassanian era), Chorasmian (grouped with Middle Iranian for reasons of its grammatical structure although most texts are from the Islamic period)
  Western Iranian: Parthian, Middle Persian

Contemporary Iranian languages (since the advent of Islam)
  Eastern Iranian: Ossetic, Yaghnobi, Shughni, Munji, Wakhi, Pashto
  Western Iranian: Zazaki, Kurdish (Sorani, Kurmanji, Southern Kurdish), Semnani, Taleshi, Tati, Vafsi, Gilaki, Sangesari, Balochi, Bashkardi, Laki, Central dialects, Caucasian Tat, Luri, Bakhtiari, (New) Persian

https://doi.org/10.1515/9783110563146-010

Fig. 1: Map showing the approximate location of Iranian languages (selection).

To illustrate these clusters, the present chapter will also mention processes that are not grammaticalization in the strict sense, such as the reanalysis of suffixes and periphrastic constructions. Needless to say, this chapter is only a selection of what could be discussed.1 The starting point is set by the rich morphology of Proto-Indo-European (PIE), both in the nominal (three numbers, three genders, eight cases) and verbal systems (two voices, three verbal stems originally encoding aspect, with several moods each, and additional formations for aktionsart and causativity). Owing to phonological changes and related processes, many of these categories are lost during Middle Iranian times. Conversely, new categories arise, which are expressed by grammaticalization of free morphemes or by periphrastic constructions.

1 Unless otherwise noted, references for the points mentioned in what follows are found in Korn (2016a) and Korn (2017a), on which this chapter is mainly based. Old Ir. forms with an asterisk are meant as general Old Iranian, without the phonetic specificities of Old Persian or Avestan. The examples taken from published sources have been somewhat unified in transcription, glossing (which is mostly mine) and translation for the present purposes. Unreferenced Balochi examples are from fieldwork with Maryam Nourzaei, Bashkardi is from the recordings made by Ilya Gershevitch in 1956 (cf. Korn 2017b).

 Grammaticalization of nominal categories . Case Starting within the Old Ir. period, the inherited system of eight cases (still in use in Avestan) is subject to syncretism (cf. Table 2), yielding six cases in archaic Middle Iranian (Khotanese, “light stems” in Sogdian), then three (late Khotanese, Chorasmian). Subsequently, the two-way distinction of direct () vs. oblique () is displayed by the older stages of Middle Western Iranian and Bactrian, and by the “heavy stems” in Sogdian. Numerous New Ir. languages (among them Semnani, Kurmanji, Pashto, Yaghnobi) have preserved this two-case-system, and the far-reaching functions of the oblique case (including the marking of direct and indirect objects, of the possessor, and of the agent in ergative constructions). At this point, new case marking arises through the grammaticalization of adpositions. For instance, an innovation apparently common in Gilaki and Balochi has led to the functions of the inherited oblique marker being limited to a neo-genitive, while the marker -ā was introduced for the remaining oblique functions, perhaps as a “consequence of the functional overloading of the genitive when it became a marker of the direct object” (Thordarson [2009: 169] about the Ossetic dative). This -ā might have been copied from New Persian, which has grammaticalized =rā (dialectally =ā, and originally a postposition ‘with respect to’) to mark, inter alia, defi-

Tab. 2: Case syncretism in Iranian (simplified). ProtoIranian, Avestan

I: Old Persian

II: older Khotanese, Sogdian “light stems”















III: later Khotanese, Chorasmian

-



- ( *-ahya,  *-ānām) Pronouns:  *mana (etc.)









- 



IV: Sogdian “heavy stems”, Parthian, earlier Bactrian & Middle Persian; Kurmanji, Zazaki, Pashto, Yaghnobi, etc.

V: later Bactrian & Middle Persian; New Persian, Sorani, etc.

  -Ø;  *-ē; Pron.  *azam

 *-ahya > *-ē > -Ø,  *-ānām > -ān (but   *-ana,  *-abiš in some Pamir languages); Pron.  man

 -Ø,  -ān; Pron.  man

468

Agnes Korn

nite direct and indirect objects (cf. Section 2.2).2 An  marker -de employed, e.g., in Sangesari in a parallel way to Persian =rā might likewise derive from a locational adposition (cf. Stilo 2009: 707 f.). case markers can be combined to yield more cases. In Balochi, -rā (which would thus have been borrowed twice) may be added to the oblique case (-ā-rā) to add pragmatic emphasis, and the oblique marker -ā is suffixed to the genitive to yield a locative as in example (1) in the dialect of Turkmenistan and Afghanistan; in the latter, its use is restricted to humans (see Korn [2008a] for the Balochi locative). (1)

Balochi (Afghanistan) (Korn 2008a: 88)
čōrika āt {watī pis u mās}-ay-ā
boy come..3 own father and mother--4
‘The boy came to his parents (lit. father and mother).’

Locational nouns are likewise grammaticalized: a suffix deriving from Old Ir. *arda- ‘side’ gives a neo-dative in Shughni and Wakhi; it may be used in Ossetic to reinforce the allative (Weber 1980: 133). Ossetic shows nine cases, and while it is not clear which ones are inherited, some are clearly secondary, and overall the system seems to have been adjusted to that of the neighbouring Caucasian languages by grammaticalizing combinations with various postpositions (see Thordarson [2009: 124–171], Weber [1980]; for a recent account of the Ossetic case system, see Belyaev [2010]). There are also grammaticalized adpositions which are used in case-like function. The most important element here is Old Ir. *hača ‘from’ (the Khotanese cognate jsa is already frequently suffixed to the ablative-instrumental). It forms a genitive of pronouns in Taleshi, Southern Tati and several Eastern Ir. languages (e.g., čaman, aš-ta ‘from me / you’ etc. in Taleshi, CLI 299); other prepositions can be prefixed to some Eastern Ir. personal and demonstrative pronouns as well (e.g., Munji dāmox ‘in/on us’; see Wendtland [2009: 182 f.] for these forms). In several other Ir. languages, free prepositions (usually meaning ‘to[wards]’) mark direct and indirect objects.4

2 Middle Persian shows rāy predominantly in lexical function (‘on account of’ etc.), but also has examples for its use with (similar numbers of) direct and indirect objects; in nearly all instances, rāy occurs on objects which are definite and animate (cf. the data in Jügel [2015: 193–216]).
3 Note also the “group inflection”, which is regular in many New Ir. languages.
4 The developments of the case system in Iranian are conveniently summarized in Stilo (2009) and Windfuhr (1992). (Needless to say, one might disagree with some point or another.)

Grammaticalization and reanalysis in Iranian


2.2 Animacy and the marking of participants

The (partial) loss of inherited categories motivates changes in the ways case and number are marked. The marking of the direct object illustrates this particularly clearly. In the Old Ir. and the more archaic Middle Ir. languages, it is only the syntactic context that triggers a certain case (adapting terminology used by Bashir [2008: 49–52], this might be termed “syntactic object marking”, e.g., Avestan uta druuā˚ aspəm viste ‘the wicked one obtains the / a horse’ [JamaspAsa 1982: § 82]), but many New Ir. languages show “Differential Object Marking” (DOM; see Bossong [1985]), i.e., the presence of case marking “depends on inherent semantic properties of the object (animacy, person) or its referential status (definite, indefinite, specific, non-specific)” (Bashir 2008: 52). For instance, identified direct objects ([±animate]) are marked with the particle =rā in New Persian (see Section 2.1) while the generic noun is used for an unidentified object, e.g., asb-Ø mībīnam ‘I see a horse / horses’ (unmarked for case and number) vs. asb=rā mībīnam ‘I see the (specific) horse’. In other Ir. languages, it is animacy (combined with definiteness) that triggers case marking. DOM thus occurs with inherited case endings as well as with innovated markers and adpositions.

2.3 Number

The widespread loss of final syllables in Middle Iranian also had the effect of eliminating much of the inflectional plural marking, although some of it survives in the oblique plural -ān (< Old Ir. genitive plural) in many Ir. languages (cf. Table 2). This suffix is reanalysed as a plural marker in Middle Persian, Parthian, Bactrian, and many New Ir. languages. Some Ir. languages attach the case markers used in the singular to this plural suffix. The inflection of secondary plurals is agglutinative as well. Novel plurals arise from abstract or collective formations, such as Middle Persian -(ī)hā, and -t, which is found in a number of Eastern Ir. languages. Both come from suffixes deriving abstract nouns (owing to which the Sogdian plural is inflected as a feminine singular). In Middle Persian, there is a difference between the two plural formations in that -(ī)hā denotes “individual plurality” (Skjærvø 2009: 205), cf. kōf-īhā vs. kōfān ‘mountains’ (cf. German Berge vs. Gebirge ‘mountains’). Nouns grammaticalized as plural markers include -gal (‘group, herd’), which is widely employed, e.g., in the Luri, Bakhtiari and Kurdish groups, in some of the varieties called “Central dialects” and in those found in the province of Fārs.


2.4 Possession, adpositions and noun phrase structure

In Ir. languages that have preserved the two-way distinction of direct vs. oblique case, the latter is also used in genitive function. Ir. languages with a separate genitive case include Balochi (specializing the inherited oblique, cf. Section 2.1). Alternatively, a clitic, traditionally called “ezāfe”, is used systematically in Persian and occasionally in Bactrian to attach dependent elements to their head nouns (2a). This clitic goes back to the Old Ir. relative pronoun (cf. Section 4.2 below). Kurdish and Zazaki also use the ezāfe construction, where it has different forms depending on gender and case. The pronominal clitics (enclitic pronouns) are used in all functions of the oblique case and are an alternative means to express possession (2b).

(2)

New Persian
a. ketāb=e man
   book= I
b. ketāb=am
   book=1
‘my book’

There is no inherited verb for ‘to have’, and the so-called mihi est construction (3a) is used instead. Depending on parameters such as alienable vs. inalienable possession, adpositions may be used instead of an oblique or other case (3b). (3)

a. Balochi (Jahani and Korn 2009: 666)
   tarā brās nēst
   you.. brother .exists
   ‘you don’t have brothers (lit. for you brother doesn’t exist)’
b. taī kirr-ā dān ast=ẽ
   you.. side- rice exists=.3
   ‘Do you have rice (lit. is there rice at your side)?’

In Persian and some other Ir. languages, the verb dār- (originally ‘hold’) has undergone a semantic shift to ‘have’.5 New adpositions are formed by the grammaticalization of nouns. These conform to the noun phrase patterns of the language, thus yielding a head-initial pattern, e.g., in Persian (e.g., rū=ye ‘on [lit. face=]’) and a head-final pattern in Balochi as in (3b), lit. ‘on the side of ’.

5 Notably, and in a way entirely parallel to that seen in Western European languages, the same verb is also used as auxiliary for the perfect, and for the progressive in some Ir. languages (see Sections 3.1, 3.4).


2.5 Determiners

While there is neither an inherited definite nor an inherited indefinite article, Ir. languages show a number of inherited demonstrative pronouns, which are also used as pronouns of the 3rd person. Demonstratives yield definite articles in Sogdian (stems x-/w-, as in [41] below, and y- / m-) and in Bactrian (i, m-) while they continue to be used as demonstratives at the same time (see Wendtland [2011a] for Sogdian and Gholami [2011] for Bactrian). The relative pronoun seems to be the origin of the definite article in Chorasmian (e.g., yā (a)sm-a ‘the sky’ [Durkin-Meisterernst 2009: 343]) and Digor Ossetic (CLI: 468), and is probably one of the sources of the Bactrian article ι (Sims-Williams 2000–2012/II: 214). A suffixed definite article -ak is found in Sorani and (in varying forms) in the Southern Kurdish dialects described by Fattah (2000). It appears to derive from *-aka- (Cabolov 1978: 12),6 which otherwise is a suffix for nominal derivation (e.g., Middle Persian haftag, New Persian hafté ‘week’ from haft ‘seven’); it is originally diminutive, but used so frequently in Iranian that it largely lost any meaning. Spoken Persian likewise shows the occasional use of a “referential” -é (pesar-é ‘that boy [there]’; Windfuhr and Perry [2009: 432]). Interestingly, the plural suffix -ān (see Section 2.3) follows the definite article (pyāw-ak-ān ‘the men’ [McCarus 2009: 598]). Sorani also has an indefinite article, likewise suffixed, viz. the clitic =êk,7 which surely derives from the numeral ‘one’ (Old Ir. *aiwa-ka-, Persian yak; cf. Cabolov [1978: 13]). Another variant of this numeral (*aiwa-) yields a clitic (Middle Persian =ē(w), New Persian =ī, similarly in many Ir. languages) that is often called “indefinite article”, but “specificity marker” (in the sense of Heine [1997: 72 f.]) might be a better term; it is quite different in use from the Kurdish article (see Section 4.2 for the use of this clitic in relative clauses).

A new 2nd person plural pronoun arises in various Eastern Ir. languages which is based on the 2nd person singular pronoun, giving a form that looks like a combination with the 1st person plural in Bactrian and some Pamir languages (thus apparently ‘you.SG-we’), and other derivations based on the 2nd person singular in Pashto, Ormuri etc. The motive for this innovation is likely to lie in a phonological process šm > m operating in these languages, which made the 2nd person plural pronoun (*šmāx) identical with the 1st person plural pronoun (*māx), triggering its replacement by a new form (see Korn 2016b: 415–417).

6 Maybe -ak contains *-aka- with a second suffix.
7 In the plural, =êk is replaced by -ān. Kurmanji has an indefinite article -ek.

3 Grammaticalization of verbal categories

As in the nominal system, some verbal categories were lost, beginning in later Old Iranian. The Middle and New Ir. verb (Table 3) is based on the dichotomy of present stem (deriving from various present stem formations) vs. past stem (going back to the verbal adjective, see Section 3.1, or derived from the present stem by secondary suffixes).

Tab. 3: Examples of verbal stem formation.
verb | present stem | formation | past stem | formation
‘do’ (Balochi) | kan- | inherited | kurt | verbal adj. in *-ta-
‘build’ (Middle Persian) | dēs- | denominative | dēs-īd | present stem + *-ita-
‘believe’ (Parthian) | wurraw- | loanword | wurraw-ād | present stem + *-āta-

3.1 Transitivity

Transitivity is a field of high grammaticalization activity in Iranian. Already in inherited verbs, it may be marked by verbal stem formation, where inherited intransitives and original causatives are distinguished by root shape and stem suffix, e.g., Balochi suč- ‘burn (intr.)’ (*suč-a(ya)-) vs. sōč- ‘burn (tr.)’ (*sauč-aya-); Khotanese hamäh- ‘change (intr.)’ (*fra-miθa-) vs. hamīh- ‘change (tr.)’ (*fra-maiθaya-). Additional means of (de)transitivization arise from suffixes. In Middle Iranian, the inchoative suffix -s- (< PIE *-sḱe-) is used to convert transitive verbs into intransitives, e.g., Sogdian ywc- ‘teach’ → yγwsty ‘is taught, learns’. This might seem surprising, but one could say that “the inchoative is typically seen as an action that happens all by itself” (Stilo 2004: 240). Conversely, newly arising suffixes are used to derive transitives from intransitives, and causatives from transitives, e.g., Ormuri -āw-, Parachi -ēw-, Parthian, New Persian and Yaghnobi -ān-, Middle Persian and Balochi -ēn- (the latter also has a double causative in -āēn-) while Sorani combines -ān- and -ēn- in its causative paradigm.

Another strategy to express transitivity (as well as voice and aktionsart, as the case may be, see the sections below) is seen in the system of complex predicates, i.e., the combination of nominals (etc.) with semantically bleached verbs (“light verbs”). This phenomenon is extremely common in many New Ir. languages. The most common light verbs are ‘do’ (as, e.g., in Persian kardan, Ossetic kænyn, Wakhi tsar-, etc.) for complex predicates expressing [+control] or transitive, active and related meanings; and ‘become’ (Persian šodan, Ossetic uyn, Wakhi wots-, etc.) broadly for the meaning [+affected] or intransitive, mediopassive, etc. as in (4). In Zazaki, ‘do’ and ‘become’ are used with some preverbs in a similar way, e.g., /ā-kar-/ ‘open (tr.)’ vs. /ā-bī-/ ‘open (intr.)’. To this system further light verbs are added, chiefly, but by no means exclusively, in languages in contact with Persian (see Korn [2013: 50 f.] for discussion and references for the examples to follow).

(4) New Persian
a. penhān kardan       tarǰome kardan
   hiding do.          translation do.
   ‘to hide’           ‘to translate’
b. penhān šodan        tarǰome šodan
   hiding become.      translation become.
   ‘to hide’           ‘to be translated’

In a somewhat similar way, verbs may be combined with “vector verbs”, whose choice is triggered in a parallel way to that of light verbs (5).

(5)

Balochi (Eastern) (Bashir 2008: 74)
a. bākīγā āwār māl išt=ō dāθ-a
   rest. looted goods leave.= give.-
   ‘The rest of them abandoned the looted goods.’
b. darmān udarθ=ō šuθ-a
   powder explode.= go.-
   ‘The powder blew up.’

The prominence of the category of transitivity is also enhanced by the fact that the past stem is based on the integration of a nominal form (viz. the verbal adjective / “perfect passive participle” in *-ta-) into the verbal paradigm; this form has a passive meaning for transitive verbs and an active one for intransitive ones (parallel to English eaten vs. gone). While the intransitive perfect / past tense is expressed by the perfect participle / past stem with the copula (as in [6a] and [8a]) in most of Middle Iranian and later on, the transitive paradigm shows different patterns. Khotanese employs an enlarged form of the perfect participle in combination with the copula. Other Ir. languages use a transitive auxiliary.8 As is common for auxiliaries in grammaticalization processes, phonological reduction occurs; so the verb ‘hold / have’ that forms the transitive perfect in Sogdian (in a pattern entirely parallel to Germanic and Romance have vs. be perfects) merges with the past stem, cf. (6b) vs. (44) below.

8 The transitive pattern takes over in the course of Khotanese (enlarged past participle) and Chorasmian. In Sogdian, the ‘have’ pattern is generalized to intransitive unergative verbs in the later sources (Wendtland 2011b).


(6) Sogdian
a. ʾʾγt=ʾym
   come.=.1
   ‘I came’
b. xwrdʾr-y
   eat..hold-2
   ‘you ate’

In Ossetic, the origin of the transitive formation (7) is not transparent anymore; the suffix might derive from Ir. *dā- (< PIE *dheh1 ‘put’), which would make it parallel to Latin formations of the type rube-facio ‘I make red’ and also to the Germanic dental (“weak”) preterite, which “is usually assumed to be from the same root as the verb do” (Fortson 2004: 308, § 1.27).

(7) Ossetic
a. xiz-ɨn         xɨstæn                 xɨs-ton
   graze.-        graze...1              graze.-.1
   ‘to graze’     ‘I grazed (intr.)’     ‘I grazed (tr.)’
b. kal-ɨn         kald-ɨstɨ              kald-ton
   pour.-         pour.-.3               pour.-.1
   ‘to pour’      ‘they are poured’      ‘I poured’

Alternatively, an ergative pattern arises by the combination of an agent in the oblique case (which includes the pronominal clitics such as =t in [8b]) with the past stem, to which the copula or the verbal endings, agreeing with the patient (e.g., hēm in [8b]), are suffixed. This applies, e.g., to Bactrian, Parthian, Pashto, Middle Persian, Kurmanji as well as an older layer of Sogdian.

(8) Parthian (Korn 2008b: 268)
a. (az) āγad hēm
   I. come. .1
   ‘I have come’
b. u=t az hišt hēm sēwag
   and=2 I. leave. .1 orphan
   ‘… and you have left me as an orphan’


Consequently, ergativity in Ir. languages shows a split that agrees with the typological tendency observed by Trask (1979: 388): if there is a tense / aspect split in a given ergative system, it is the past tense / perfective aspect that shows ergativity while the present or imperfective domain patterns nominative-accusatively. As Ir. ergativity is of the morphological or “surface” type, Trask’s statement requires the following modification: it is the forms based on the past / perfect stem (“past domain”) that show ergativity, independent of their tense / aspect function (including modal forms). Conversely, the forms based on the present stem (“present domain”) may include past tenses (thus, e.g., in Sogdian and Yaghnobi) which pattern nominatively because of their morphology.

3.2 Agreement (subject/object agreement)

As mentioned in Section 2.1, animacy is an important category for case marking and agreement. In addition to case marking (suffixes or adpositions) being limited to certain types of objects in the present (non-ergative) domain (DOM, see Section 2.2), direct and indirect objects in the past domain are sometimes case marked in the way they would be in the present domain in some Ir. languages (see the summary in Table 4 below). As a result, case marking is rarely limited to “purely” ergative vs. nominative/accusative types, and exhibits all theoretically possible types of argument marking listed by Comrie (1978: 332), including the “double oblique” type with subject and object both in the oblique, with the verb agreeing (see Section 3.1) variously with the subject (as in Vafsi) or the object (as in Balochi), or with neither of them (as in Taleshi). In Balochi (and perhaps in Yaghnobi), verbal agreement is limited to the marking of number for a third plural patient (as in [48] below, agreeing with the plural ‘fish’), while other languages show agreement in person, cf. e.g., Parthian (8b) above, and gender (e.g., in Pashto and Zazaki). In some Ir. languages, agreement with the indirect object is favoured over agreement with the direct object if the former is human, as is typically the case for verbs such as ‘give’. This pattern appears to be regular in Bactrian (Sims-Williams 2011), cf. (9), where the verbal ending -ēd refers to the indirect object:

(9)

Bactrian (Sims-Williams 2011: 34)
ud māx lād-ēd ei xwēciyau
and we give.-2 this undertaking
‘and we gave you this undertaking’

The split ergative system is ousted in various Ir. languages. The transitive pattern is generalized in Pamir languages (all verbs patterning “ergatively”). Another pattern that one could term “ex-ergative” is the generalization of the (pronominal) agent clitics as agreement markers, as, e.g., in Semnani (10a): here, the past domain has two sets of verbal endings depending on transitivity, the transitive ones using the (former) pronominal clitics and the intransitive ones the inherited verbal endings. In the Jewish dialect of the city of Yazd (10b), the pronominal pro-clitics yield “conjugaisons préfixées” (Lazard 2005: 87).

Tab. 4: Summary of paths of development in the past domain (examples of some Iranian languages).
ergative | agent in the oblique (including pronominal clitics), patient in the direct case; verbal agreement with patient (or animate recipient) | Zazaki, Kurmanji, Pashto etc.
mix-ergative | DOM for patient: double oblique patterns; various verbal agreement (or none) | Vafsi, some of Balochi
↓ | loss of case distinction; agreement with patient (or none) | later Middle Persian, Parthian, Bactrian
↓ |
ex-ergative | reanalysis of verbal ending as pronoun; reanalysis of pronominal clitics as verbal ending | Sorani

(10)
a. Semnani (CLI 308)
   darviš-i bāt-eš
   dervish- say.-3
   ‘the dervish said’
b. Judeo-Yazdi (Lazard 2005: 86)
   iv-â-râ eš-raxt
   water-- 3-pour.
   ‘s/he poured water’

Persian has come full circle from the Old Ir. nominative/accusative pattern through ergative to a novel nominative/accusative system (and so have some other New Ir. varieties). The use of the pronominal clitics for the subject in the past domain (goft=eš ‘s/he said’, raft=eš ‘s/he went’) is the only reflex of the former ergative construction. In Sorani, which has lost case distinctions, agent clitics are obligatorily used in the (ex-)ergative domain and are in this sense agreement markers of transitive verbs (as are the verb affixes in Semnani) even if they can freely occur in various places of a clause (cf. [11], with agent clitic =y inside the verb, and =yān in [12]). The same development, though less systematic, has occurred in other Ir. languages in the context of loss of case. By a converse development, the inherited verbal endings develop into oblique pronouns (Jügel 2009): in (11), -im is the direct object, and it functions as possessive pronoun in (12):


(11) Sorani (Jügel 2009: 148)
kart=ī duwam la nāx=awa a=y-xwārd-ım=awa
part= second from inside= =3-eat.-1=
‘The second part was eating me up from the inside.’

(12) Sorani (Jügel 2009: 153)
lawē taqrīr=yān wargirt-im
there report=3 receive.-1
‘There they took my report.’

Diachronically as well, it seems that some pronominals derive from verbal endings. This explanation appears to be viable for pronominal clitics of the 1st person such as Balochi =ān, =un (vs. the 1st person ending -ān etc.), the 3rd person clitic =te in Laki and potentially some others (for discussion and references concerning the conversion of copula forms and pronouns, see Korn [2011]). Conversely, Iranian joins Hebrew, Arabic, Chinese, etc., in showing some copular forms arising from demonstratives. For the Sogdian demonstrative (ʾ)xw, the phenomenon might be due to language contact, since it is found in Buddhist texts (nearly all of which are translations from Chinese), and in texts from a Turkic milieu, as in (13), which even quotes a Turkish personal name.

(13) Sogdian
cnʾnklʾγy tmyr ʾwyzy nβʾnt ctβʾr krmyr 21 ʾspyty rγzy ʾʾsy xw
. . at four red 21 white (cloth) take..
‘In Čanglaγ (place name) four red and 21 white pieces of raγzi cloth are to be taken from Tämir-öz.’

However, other Ir. languages where an areal motivation is less evident likewise show copular forms that either synchronically or diachronically are pronominals. Diachronically, this applies to the Pashto copula of the 3rd person (masculine dǝy, feminine da, plural dī) deriving from the Old Ir. demonstrative *aita-, which is probably also the basis for the Ossetic copula 1st person dæn, 2nd person dæ (with verbal endings attached to the “stem” d-) and 3rd person u, i, is (probably deriving from three different demonstratives). Synchronically, “the pronominal clitics sometimes perform the copular function” in Wakhi (Bashir 2009: 841), as in (14):

(14) Wakhi (Bashir 2009: 841)
tu=t kūi
you.=2 who
‘who are you?’


3.3 Voice/valency

While Old Iranian expresses mood (see Section 3.5) and voice distinctions by suffixes, these formations are for the most part supplanted by analytical constructions in the subsequent stages. The inherited mediopassive (which includes not only the functions commonly seen for the middle voice crosslinguistically, but also the reflexive, reciprocal and passive) survives in the more archaic Middle Ir. languages (Khotanese, Sogdian), though even there, only a few verbs are used both in the active and the middle; it is lost everywhere else. Patterns coming close to the range of the Indo-European mediopassive include several of those mentioned for intransitives in Section 3.1: this applies to intransitives derived (diachronically or synchronically) from inchoatives as well as to the choice of light verbs in complex predicates (cf. [4]). The Old Ir. passive in -ya- is preserved to a slightly larger extent; it survives in Khotanese, Sogdian and Middle Persian as well as in some New Ir. varieties. Other Ir. languages have morphological passives as well, among these Sorani (suffix -rV-). Eastern Balochi has even acquired a new passive in -īǰ- (borrowed from Indo-Aryan). It is typologically noteworthy that such passives co-occur with ergative constructions in the same language (see Section 3.1). In analytical passives, the most commonly used auxiliary is *baw- ‘become’ (as for instance in Sogdian, Parthian, Middle Persian and Balochi). Otherwise verbs of movement are employed, e.g., *čyaw- ‘move forward’ in Ossetic, Pashto, some Pamir languages and New Persian (where šudan has shifted in meaning from ‘move forward’ to ‘become’),9 and ‘to come’ in Kurmanji and Caucasian Tat, see (40) below.

3.4 Tense, aspect and aktionsart

Originally, ancient Indo-European languages encode aspect by the choice of the verb stem, viz. present stem (imperfective), aorist stem (perfective), perfect stem (resultative). After the loss of the perfect and aorist, the newly arising opposition of present vs. past stem (see Table 3) indicates tense. A new aspect opposition is found in various New Ir. languages, often matching aspectual systems in neighbouring languages. It is chiefly adverbs and particles that are grammaticalized to mark aspect: in Persian the prefix mī- (deriving from Middle Persian hamē ‘always’) marks the imperfective aspect in the past tense while it has been generalized in the present tense (15), probably because the present tense is seen as an inherently imperfective category.10

9 Examples of a passive with this verb from Khotanese and Christian Sogdian cited in the literature should better be interpreted in other ways (Sims-Williams 2014: 101 f.).
10 The copula and the verb dār- ‘hold, have’ do not use mī-.

(15) New Persian
kard            mī-kard              mī-kon-ad
do.             -do.                 -do.-3
‘s/he did’      ‘s/he was doing’     ‘s/he is doing / s/he does’

In various other New Ir. languages, other prefixes (of uncertain origins) fill the same slot in the same or similar functions, such as di- in Kurmanji and a- in Sorani, Bashkardi (see [20] for additional formations) and some Balochi dialects. Some Eastern Ir. languages use prefixes for the perfective aspect, such as wə- in Pashto (while verbs with preverb are made perfective by shift of accent). Several preverbs are employed to convert verbs into perfective ones in Ossetic. While this system is strikingly parallel to the one found in Slavic languages, influence from Russian does not date back long enough to be the decisive factor, and it is rather language contact with Georgian and other Caucasian languages that may have triggered the Ossetic aspect system. The prefix be-, which in New Persian has modal values (see Section 3.5), has been analysed as marking “completive” aktionsart, perfective aspect or the result of an action, or “emphasis” in Middle Persian (cf. the survey in Jügel [2013: 30–33], who assumes several etyma with directional and emphatic meanings that came together in this prefix). At the same time, bi- forms a future in Classical Persian (Jahani 2008) and already in Early Judeo-Persian (16a). It also forms a “close future” in the variety spoken in Abyane in Isfahan province (16b).

(16)
a. Early Judeo-Persian (Paul 2013: 125)
   kw bʿd_ʾz psh … by ʾy(m)
    after Pessah …  come..1
   ‘that I (will) come after Passover’
b. Abyanei
   bim na-xös-ā al’ön be-vāǰ-ān
   me -hit-. now 2-say-1
   ‘Don’t hit me! I’ll say [it] at once.’11

11 Lecoq (2002: 172): « ne me frappe pas ! Je vais le dire tout de suite ». Note that this future (bekar-ān ‘I will do’) is different both from the present (a-kar-ān) and the subjunctive (ba-kar-ān).

The notions of imperfective or progressive, sometimes including nuances of prospective or future semantics, may also be expressed by analytic constructions in various stages of grammaticalization. A number of Ir. languages use locational expressions (‘be in [the state of] doing’). In Balochi, where the infinitive in the  is combined with the copula (17), the formation is synchronically transparent, while Jewish Tat uses the bare infinitive, and (as in Persian) the formation has become the general present tense (18), so that there is an aspect opposition only in the past tense.

(17) Balochi (Southern / Western) (Jahani and Korn 2009: 675)
man gušag-ā =un
I say.- =.1
‘I am saying’

(18) Caucasian Tat (Jewish) (Authier 2012: 192, 195)
a. soxden =um
   do. =.1
   ‘I do’
b. soxden bir-üm     vs.  soxd-um
   do. be.-1              do.-1
   ‘I was doing’          ‘I did’

The Taleshi present tense is a combination of a verbal noun in a locative expression plus the copula (19).

(19) Taleshi (Schulze 2000: 23 f.)
a. kārde-da-m
   do.--1
   ‘I do’
b. kārde-da bi-m     vs.  kārd-ǝm-e
   do.- be.-1             do.-1-.3
   ‘I was doing’          ‘I did’

These patterns also occur with the prefixes mentioned above: in addition to the present tense a- + present stem (see above), Bashkardi also shows a progressive with the preverbs be- and a-. As the pattern employs a verbal noun, it patterns nominatively (the copula agreeing with the agent) although it is based on the past stem (cf. Section 3.1). Muslim Caucasian Tat likewise shows such a pattern (20).

(20)
a. Caucasian Tat (Muslim) (CLI: 298)
   ba-bāftan =üm
   -weave. =.1
   ‘I am weaving’
b. Bashkardi (Southern) (Skjærvø 1989: 848)
   be-kert(-en) =īn
   -do.(-) =.1
   ‘I am doing’


c. Bashkardi (Northern) (Skjærvø 1989: 848)
   a-kerden =om
   -do. =.1
   ‘I am doing’

Other languages of the Caspian region have a local copula ‘be in’ (probably etymologically identical to dar ‘in’), such as “the Mazandarani verb dayyen (present stem: dar-) ‘to be’ (indicating location rather than existence)” (Jahani 2017: 264) in patterns such as (21).

(21) Mazandarani (Sari) (Šokrī 1995: 124 f.)
dar-eme vače ǰem xos-embe
be_in_place.-1 child with talk.-1
‘I am talking to the child.’

Likewise bordering on locational constructions is the use of the verb ‘stand’. The starting point is seen in Old Ir. uses of ‘stand’ such as (22), where the semantics tends towards an iterative or durative auxiliary (see Benveniste [1966], from where I also took most examples for this topic):

(22) Avestan
yō me duš.saŋhō hišt-aite
... 1 injuring... stand.-.3
‘who keeps injuring me (lit. who stands [as someone] injuring me)’

In Khotanese, the participle of ‘stand’ is added to a finite verb form to yield a meaning of imperfectivity (23).

(23) Khotanese (Emmerick 2009: 404)
u ttrāyi ṣṭāna vaṃña
and save..2 stand.. now
‘[you rescued previously] and you are now rescuing (lit. you are standing rescuing me)’

In Buddhist Sogdian, a particle deriving from ‘stand’ forms an imperfective (24).

(24) Sogdian (Buddhist)
wyn-ʾm ʾštn          vs.  wyn-ʾm
see-.1 stand.             see.-1
‘we are seeing’           ‘we see’

The end point is seen in Yaghnobi, where the particle has been reduced to a marker of the present tense that is suffixed to the verbal ending, as in (25) and (37a) below.


(25) Yaghnobi
wēn-om=išt
see-1=
‘I see’

The “continuous form” of Tajiki is likewise formed with ‘stand’; here, it is the synchronically existing and inflected verb in combination with the perfect participle (26). The predecessor of this pattern is seen in the Middle Persian “perfectum praesens” that uses ‘stand’ as an auxiliary (27).12

(26) Tajiki Persian (Rzehak 1999: 78)
man kitob xond-a istod-a=am
I book read.- stand.-=.1
‘I am reading a book’

(27) Middle Persian (Manichean) (Andreas and Henning [1933: 299–300], fragment M 9 II r, 16–18)
gyān (…) andar tan ā'ōn āmixt ud passaxt ud bast ēst-ēd (…)
soul in body thus mix. and mingle. and bind. stand.-3
‘the soul (…) is (lit. stands) so mixed, mingled and bound in the body …’

Furthermore, it is possible that the plural stem of the copula in Iron Ossetic, (y)st- (e.g., st-æm ‘we are’ etc.), might derive from ‘stand’ (see Bielmeier 1977: 162 f. and CLI: 477).13 While Buddhist Sogdian uses ‘stand’ (see [24]), the other dialects employ ‘remain’ to express imperfectivity (28).

(28) Sogdian (Manichean) (Gershevitch 1954: 100)
tʾš-nd=skwn
cut-.3=
‘they were cutting’

Parallel to the particles from ‘stand’ and ‘remain’ just mentioned, Sogdian and Chorasmian also show a future particle that is suffixed to the finite verb (29a). In Sogdian, it exists at the same time as a finite verb (29b).14

12 Cf. Durkin-Meisterernst (2014: 384 f.), who points out the resultative meaning of this pattern. It has also been suggested that the Persian forms of the structure dādast-īm contain a contracted form of ‘stand’ (Jeremiás 1993: 106 f.).
13 For an alternative explanation (3 from *asti, other plural forms based on the new stem st-), see Weber (1983). The plural copula in Digor might derive from the Old Ir. copula. For the singular forms, see 3.2.
14 See Korn (2017c) for future and prospective formations in Iranian.

Grammaticalization and reanalysis in Iranian


(29) Sogdian
a. ʾβyzy Lʾ βrt wn-ʾy=kʾm
bad  carry. do.-2=
‘you will not be able to bear the hardship’ (Sims-Williams 2007: 378)
b. Lʾ kʾm
not want..1
‘I do not want’ (Sims-Williams 1996: 182)

Similarly, the verb ‘hold’ forms a progressive in New Persian (30) and several other Ir. languages. Sogdian also has duratives composed of an *-aka-participle plus (transitive) δʾr-, (intransitive, passive) *ah- / wmʾt- / ʾskw- (Gershevitch 1954: 126). In Middle Persian, the pattern  + ‘hold’ has the meaning of “preservation of a state obtained after an event” (Henning 1934: 247).

(30) New Persian (Jahani 2008: 169)
quč‘alī če be mouqe‘ āmad-ī.
 what to moment come.-2
man dār-am mī-r-am.
I have.-1 -go.-1
xwāhar=at=rā tanhā na-gożār
sister=2= alone -leave..2
‘Quchali, how well on time you came. I am leaving (i.e., I intend to leave any moment). Don’t leave your sister alone.’

Verbs meaning ‘want’ are also used as auxiliaries in Persian and Kurdish (31), while continuing to be used as full verbs.

(31) a. Persian
xwāh-am raft
want.-1 go.
‘I will go’
b. Mukri Kurdish (Öpengin 2016: 83)
de=y-hewē bi-bār-ē
=3-want..3 -rain.-3
‘it is going to rain’

kām- is an auxiliary (used in combination with the past stem) in Abyanei (32) and yields a prefix in Sistani (33).15 Otherwise, inherited subjunctives (see Section 3.5) may be used to express future.

15 It has been suggested that the prefix k-, which is used with certain present stems in Balochi, also derives from kām.


(32) Abyanei16
čūn tūp sedāy ziād a … kömö tārsā
as cannon voice much .3 .3 fear.
‘since the noise of the cannon is enormous, he’ll be afraid’

(33) Sistani (Lazard 1974: 80)
kma- rasīdan-o
- arrive.-1
‘I will arrive’

3.5 Mood and modality

While some of the Old Ir. moods survive in Middle and New Iranian (the subjunctive is generally preserved in Middle Ir. languages and in Ossetic), modal categories are predominantly expressed by novel formations in other New Ir. languages. In most cases, this is achieved by the grammaticalization of particles such as Middle Persian hēb, see (43) below.

Interestingly, several prefixes mentioned in Section 3.4 as markers of imperfectivity are also found in modal function (subjunctive and/or conditional). This particularly applies to bi-.17 For the clitic a in Balochi and Bashkardi, the modal function appears to be the older situation ([34], paralleling the Persian subjunctive marker in [35]). Perhaps its reanalysis as a marker of imperfectivity and its generalization to mark present tense are due to Persian influence (this goes so far as the verb dār- ‘have, hold’ being an exception in not taking the prefix, cf. Buddruss [1988: 62]).18

(34) a. Balochi (Coastal, Iran)
man raw-ān āb dast =a kan-ān
I go.-1 water hand  do.-1
‘I am going to wash / to the toilet (lit. I go; I do the ablution).’
b. Bashkardi (Northern)
be-yår-ie ke gwar=e hamie kabåb-ōn a-xwar-om
-bring.-2  side=  meat- -eat.-1
‘Bring [the bread] so that I might eat it with the meat.’

In Persian, but also elsewhere, modality is chiefly expressed by analytic constructions with the main verb in the subjunctive and the modal element either an inflected verb (35a–c) or a fossilized verb form (35d).

16 Lecoq (2002: 221): « comme le bruit du canon est considérable, il aura peur ».
17 Some languages also show a subjunctive past: in Gilaki and Balochi, bi- is combined with -ēn- suffixed to the verb stem.
18 See note 10.


(35) Persian
a. mī-tavān-am be-xwān-am
-be_able.-1 -read.-1
‘I am able to read’
b. mī-xwāh-am be-xwān-am
-want.-1 -read.-1
‘I want to read’
c. del=am mī-xwāh-ad be-xwān-am
heart=1 -want.-3 -read.-1
‘I wish to read (lit. my heart wants to read)’
d. bāyad be-xwān-am
it_is_necessary -read.-1
‘I have to read’

Among the periphrastic modal constructions that can claim the longest traceable history and widest use in Iranian is the potential construction composed of the past stem or the perfect participle (see Section 3) plus a finite form of ‘to do’ as an expression for ‘to be able’. This pattern is attested already in Old Persian, and it is found in Chorasmian, Parthian, Khotanese and Sogdian, see also (29). It is still in use in several New Ir. languages today, among them the Eastern Ir. varieties Munji and Yaghnobi (36) as well as Balochi. The distribution of the auxiliaries ‘do’ for the transitive pattern vs. ‘become’ for intransitives may have provided a starting point for the use of these verbs in complex predicate pairs (cf. Section 3.1; see Korn [2013: 35–40] for more discussion and references on the potential construction).

(36) Yaghnobi
a. moγ na-žoyt kun-im=išt
we -read. do-1=
‘we cannot read’
b. be hamra na-ed višt
without comrade -go. become.2
‘one (lit. you) can’t go without a comrade’

It is only in Chorasmian that ‘do’ has been phonologically reduced as might be expected from grammaticalization processes, and yields a particle (=k-) that is suffixed to the past stem of the main verb and carries the verbal inflection (37).


(37) Chorasmian
kfʾmʾny prdki
ka=fa=ma ne= pard=k-i
for==1 = restrain.=-2
‘for you cannot restrain me’

Conversely, in Ir. languages where the particles and locational constructions have yielded a general present tense (see Section 3.4), the inherited present sometimes assumes a modal meaning, such as the subjunctive / imperative in Caucasian Tat (38).

(38) Caucasian Tat (Jewish) (Authier 2012: 175, 173)
a. xun-it kele kele
read.-2 big big
‘Read out aloud!’
b. čü sox-um me imohoy
what do.-1 I now
‘What should I do now?’

Synchronically, one might interpret these patterns as “old presents” in the sense of Haspelmath (1998). However, it is not the case that an inherited present was “pushed aside” into modal function by the rise of a new present formation: as shown by Middle Persian and Parthian, the present was already used in future (39) and modal functions long before these new presents came into being.

(39) Middle Persian (Manichean, verse)
nāz-ēnd awēšān kē griyīd hēnd
rejoice.-3 .  cry. .3
ud griy-ēnd imīn kē nūn xann-ēnd
and cry.-3 .  now laugh.-3
‘those who cried are now rejoicing and these who are now laughing will cry’19

4 Grammaticalization of complex constructions

In ancient Indo-European languages, there are few, if any, subordinating conjunctions, and inherited means of subordination include participle and infinitive constructions for complement and adverbial clauses. Even if the formation of verbal nouns has

19 Durkin-Meisterernst (2014: 374): ‘Sie frohlocken, die geweint haben. Und diese werden weinen, die jetzt lachen.’


Tab. 5: Subordinators etc. in Persian diachronically.19 Old Iranian

Middle Persian

New Persian

*kū

*kat

*kahya

*yat

‘where’

/. of / pronoun – ‘when’

 of  /  pronoun

/. of  pronoun



ka



ī

– ‘where’ –  – 

‘if, when’

, 

–  – 



ki > ke

=e

‘where’

: , , , ‘if’ etc.



changed over time, they continue to be employed in rather complex patterns containing what in other languages would be subordinate clauses in the form of participial or infinitive constructions. These patterns may even be strengthened in contact with languages that routinely use such patterns, namely Turkic and Indo-Aryan.

The inherited relative pronouns are generalized as subordinators in a number of Ir. languages. Subordination with conjunctions and finite verbs is largely parallel or even identical for relative, complement and adverbial clauses; i.e., many Iranian languages use the same subordinators for all of them (cf. English that as relative particle and complementizer). Table 5 shows the development seen in Persian. The New Persian clitic ke can be called a general subordinator, as it introduces complement clauses, relative clauses and quoted speech as well as adverbial clauses of various kinds, while Middle Persian uses the subordinator kū (as does Early Judeo-Persian, see [16a] above). The subordinators may be combined with nominals to yield subordinate clauses that essentially are relative or complement clauses, e.g., New Persian barā-ye īn ke, entirely parallel to, e.g., French par-ce que ‘lit. for that which, i.e., because’.

4.1 Complement clauses

Inherited means of expressing complement clauses include the use of verbal nouns, e.g., an abstract noun formed from the infinitive as in the second part of (40).

 See Öhl and Korn 2008 for more details on the development of this system.


(40) Caucasian Tat (Jewish) (Authier 2012: 230)
{{ ä=qäd en=u küšd-e omor-e } odomi-ho ye done
=inside =3 kill.- come.- person- one piece
ˁosir ne=debire-i=re=š } mi=danüsd-i
rich =be_in.-==also =know.-2
‘Did you know that there was not even one rich person among those who were killed?’

In many Ir. languages, a subordinator has developed, either a fossilized form of a relative or interrogative pronoun (see Table 5), or alternatively a connective particle such as the Sogdian clitic (ǝ)t(i), enclitic to the first noun phrase of the last clause in (41).

(41) Sogdian (Yoshida 2009: 315)
menu rti=šī xā xatēn māθ ǝti baγa
and=3  queen thus  lord. think..1
čan xwēr-baγī ǝti ǝβt čintāman ratni nīži
from sun-god.  seven  jewel. go_out..3
‘The queen said to him: “O Lord! I thought thus: from the sun god went out the seven cintāmaṇi jewels”.’

This element may at the same time be used to introduce quoted speech, as does the first instance of ǝti in (41). Conversely, the use of a quotative particle (probably a calque on Azeri Turkish) is one of the subordinating strategies in Caucasian Tat (42), while =ho seen in Section 4.2 may also be used in this way.

(42) Caucasian Tat (Jewish) (Authier 2012: 241)
danüsden=üm biror=me soq=i =gufdire
know.=.1 brother=1 safe=.3 =
‘I know that my brother is alive.’

4.2 Relative clauses

Relative clauses are inherited from Proto-Indo-European, and Iranian shows two inherited relative pronouns: the stems *ka- (which at the same time is the interrogative pronoun) and *ya- (see also Table 5). Within Middle Iranian, the relative pronoun becomes fossilized in the form of a relative particle, such as Middle Persian (uninflected) kē for animates vs. čē for inanimates (although the distinction is not always observed), deriving from the Old Ir. genitive *kahya, čahya. In a parallel way (and starting within Old Indo-Iranian), the neuter /. *yat is employed for relative clauses independent of agreement. It also yields the relative particle (Middle Persian) ī (43) and the “ezāfe” in New Persian (cf. Section 2.4).

(43) Middle Persian (Manichean) (Durkin-Meisterernst 2014: 270)
ud abar dāmān=iz { ī=šān pidēnag-ān } abaxšāyišn hēb kun-ēnd
and on being=also =3 meat_meal- mercy  do.-3
‘And they should practice mercy on the beings which are their meat meals.’

The more common strategy, however, is the use of the subordinator. In Sogdian, the subordinator seen in (41) is added to the relative pronoun (44).

(44) Sogdian (Yoshida 2009: 318)
yunē čakraβart čintāmani dārani ke=ti ǝzu parβērāt-dār-ām
this  spell = I. explain.-hold.-1
‘this Chakravart Chintamani spell which I explained’

In addition to the subordinator ki and participial relatives such as the clause with küšde omore in (40) above, Caucasian Tat also has a relativizing element =ho that attaches to the finite verb (45). It is likely to be identical in origin to the  marker -ho (Persian -hā), and might preserve a trace of the latter’s origin as an abstract suffix (see Section 2.3).

(45) Caucasian Tat (Jewish) (Authier 2012: 252)
eri vosdore { e=čum=yu xuš bi=yo-v=ho } či=re
for buy. =eye=3 well =come-3= thing=
‘(he had come) to buy something that pleased him’

In Persian (and, probably under Persian influence, to some extent also in Balochi), the clitic =ī (Balochi =ē), in all likelihood deriving from Middle Ir. ēw ‘one’ (cf. Section 2.5), is obligatorily used on the head noun of restrictive relative clauses (46) and thus marks the relative clause as restrictive.

(46) Persian (Windfuhr and Perry 2009: 503)
ān doxtar=ī { ke Alī dust dār-ad } īnǰā =st
 girl=   friend have.-3 here =.3
‘That girl who loves Ali is here.’

4.3 Adverbial clauses

Just like in the case of complement clauses (see Section 4.1), non-finite subordinates are used with verbal nouns of various types (including novel formations), such as the infinitives in (47) and (45) above.

20 Text published by Farrell (2008: 131), his translation.


(47) Balochi (Karachi)
allāh rabb-ul-izzat-ā
god (epithet)-
‘Allah the Lord of Glory,
{ insān-ē bīnāī-ē pač kanag-ē wāstā }
man- view- open do.- for
for the sake of opening the sight of humans,
{ insān-ē dimāg-ē pač kanag-ē wāstā }
man- mind- open do.- for
for the sake of opening the mind of humans,
{ insān-ārā samǰāinag-ē wāstā }
man- make_understand.- for
for the sake of explaining to humans,
dunyā-ē tōkā nabī dēm kut-a
world- in prophet forward do.-
sent prophets into the world.’20

As mentioned above, overt finite subordination is achieved with a subordinator (such as New Persian ki), with or without the addition of nouns (e.g., vaqt-ī ki ‘when [lit. the time that]’), so that this pattern is essentially a relative clause.

Another alternative, not studied until now, is the repetition of a clause, with the second one being liable to interpretation as a subordinate. In (48), literally ‘And our mother went to the wedding the second day. Our mother went to the wedding, and we ate the fish’, the second clause seems to function as a temporal subordinate. This use of repetition is a kind of “Tail-Head-Linkage” (apparently not observed yet for Indo-European).

(48) Balochi (Karewan, Iran)
o mē mā ēdga rōč-a šed-a sīr-a
and our mother other day- go.- wedding-
{ šed-a sīr-a mē mād } o mā māhī wārt-ã
go.- wedding- our mother and we fish eat.-3
‘… and our mother went to the wedding the second day. { [When] she [had] gone, } we ate the fish.’

Instances such as this are marked by a rising intonation of the clause that is liable to interpretation as a subordinate. This intonation alone may also mark subordination, without any repetition, such as in (49).


(49) Balochi (Konarak, Iran)
goš-ī { šēr ē šo } naparok=ē kāt
say.-3 lion  go. person= come.
‘They say: { [When] the lion had gone (lit. went), } some person came.’

There are also instances of tense switch (past vs. present) that might be interpreted as indicating subordination, as in (50), for which Jügel (2015: 86) suggests the possible interpretation ‘When he had put a crown of thorns on his head, they came to praise him (…)’.

(50) Parthian (Jügel 2015: 86)
xārtāg pad sar aweštād ō namāǰ ās-ēnd (…)
crown_of_thorns on head put. and praise come.-3
‘He (?) put a crown of thorns on [his] head, and they come [for his] praise (…)’

4.4 Clause chaining

There are additional strategies in Iranian for chaining sentences, yielding patterns that oscillate between juxtaposition of main clauses and subordination. One such strategy is the use of connectives. In Bactrian (ud [written οδο], as in [9] above, or the clitic =d [=δο]) and Sogdian (rti as in [41], alternatively the enclitic subordinator =(ǝ)t(i)), they are generalized to occur after the first word of virtually every clause. Note also the clause-introducing o ‘and’ in (48) and (50).

Another common strategy is the use of converb-like non-finite forms, frequently the past stem (which is identical to the 3) or the perfect participle. In (51) the oblique kiṭag-ā shows the appropriate case for the (transitive) second and third verbs, but not for the first (intransitive) one. The sentence may conveniently also be understood as containing a subordinate (‘After having gone to buy rice, the grasshopper brought it back.’) in the sense of the pattern mentioned in Section 4.3 above.

(51) Balochi (Karachi) (Farrell 2003: 203)
kiṭag-ā šu dān git ārt
grasshopper- go. grain seize. bring.
‘The grasshopper went, bought [some] rice [and] brought it [back].’

In Yaghnobi, it is also possible to use the bare  stem in this way, as in the case of the second and third verb in (52).


(52) Yaghnobi (Jügel 2015: 419)
man sitiriyon a-šaw-im
I day_before_yesterday -go.-1
xor=im xapar a-nos ki a-vow
sister=1 news -take.  -come.
‘Two days ago, I went, paid a visit to my sister and came [back].’

5 Summary

Iranian shows a number of grammaticalization processes well known cross-linguistically, such as the grammaticalization of adpositions as case markers in the nominal system, or the grammaticalization of auxiliaries for mood and voice categories in the verb system. Table 6 shows grammaticalization phenomena as found in Iranian, chiefly those discussed in this chapter, although, of course, many additional items could be cited. At the same time, there are also grammaticalization processes not shown in Heine and Kuteva (2002) (such as ‘come’ yielding a passive auxiliary, or the agent in Ir. ergativity coming from the  / ).

Notably, the same verbs that are used as auxiliaries (then showing typical grammaticalization phenomena such as phonological reduction etc.) are also the most important light verbs in complex predicates (Table 7); and their distribution is parallel to that of the auxiliaries (‘do’, ‘hit’, ‘hold/have’ etc. for [+control] and ‘be, become’ and verbs of movement for [+affected]). This even applies where the verbs are etymologically not related (there are various roots each for ‘do’ and ‘become’), suggesting that the rise of complex predicates is a process parallel to the grammaticalization of auxiliaries. The categories of transitivity, control and actionality thus form one cluster of particularly high grammaticalizational activity in Iranian.

Another cluster is the field of aspect, durativity and mood. Progressives (often becoming present tense formations) may be grammaticalized from particles or locational constructions. Auxiliaries are also found, among which, again, are ‘hold/have’ and the verb ‘stand’, the latter recalling the use of verbs of movement in the patterns just mentioned.

A third field of particular interest is animacy and person marking, which is seen in the nominal and verbal system. Differential marking of direct and indirect objects is common throughout Iranian (though not universal), and to some extent it extends to the ergative domain, where differential marking of agents is also found.

The verbs mentioned in Table 7 are not the only elements to be grammaticalized in multiple ways. In the nominal system, we find relative pronouns yielding the “ezāfe” (see Section 2.4) in Persian etc. and the definite article (Chorasmian, Bactrian). Demonstratives give not only articles, but are also found as copular forms, while pronominal clitics can be reanalysed as verbal endings and as agreement markers.


Tab. 6: Grammaticalization phenomena in Iranian according to the classification of Heine and Kuteva (2002). Source

Target

head forehead face breast

front

back behind

after

belly

in

bottom foot

down

footprint

behind

there

demonstrative

demonstrative

copula definite

pers-pron, third pers-pron, third plural thing piece

copula impersonal indefinite pronoun classifier

one

singulative

do

causative pro-verb

give beat keep copula, locative copula exist go

causative pro-verb continuous h-possessive obligation continuous change-of-state

stand

continuous copula

leave want in (spatial) locative dative possessive

permissive future continuous subordinator b-, h-possessive perfect

w-question relative

complementizer

VP-and or continuous future complementizer

subordinator s-question present epistemic modality purpose


Tab. 7: Auxiliary uses of common light verbs. verb

function

languages



 past

(generally)

*dār ‘hold, have’

 past 

Sogdian, Chorasmian Persian

*dā ‘put; give’

 past

Ossetic

verbs of movement



(many Ir. languages)

*kar ‘do’ *baw ‘become’

 potential  potential

(many Ir. languages) Khotanese, Sogdian, Balochi, Pashto

A topic for future research would be converse processes that can be described as degrammaticalization: verbal endings yield pronouns in Sorani Kurdish, and so do pronominal clitics (cf. Section 3.2). A number of affixes gain autonomy, such as case endings that occur in group inflection in many Ir. languages and sometimes follow the articles; the Kurdish articles are likely to come from a suffix deriving nominals. Similarly, some 3 modal endings are reinterpreted as mood affixes following the verbal ending in the Bactrian subjunctive (-ινδ-αδο -3-.3) and optative (-ινδ-ηιο -3-.3, also used for the 1), for which a parallel formation is found in Parthian.

Iranian is thus a convenient case study for long-term perspectives. The widely different characteristics among these languages may (also) to some extent be due to language contact.

Acknowledgements

I am grateful to Thomas Jügel for discussion of various points mentioned in this chapter, and to Murad Suleymanov and an anonymous reviewer for useful comments.

Abbreviations

 = ablative case,  = accusative case,  = article, CLI = Schmitt (ed.) 1989,  = complementizer,  = copula,  = dative case,  = deictic element,  = demonstrative pronoun,  = direct case,  = ezāfe,  = feminine,  = focus marker,  = future,  = genitive case,  = imperative,  = infinitive,  = instrumental case,  = interrogative,  = imperfect,  = imperfective, Ir. = Iranian,  = intransitive,  = locative case,  = masculine,  = middle,  = neuter,  = negation,  = nominative case,  = object marker,  = oblique case,  = pronominal clitic (enclitic pronoun),  = Proto-Indo-European,  = plural,  = name,  = potential,  = perfect participle,  = present (stem),  = participle,  = past stem,  = particle,  = quotative,  = relative,  = subjunctive,  = singular,  = specificity marker in the sense of Heine (1997: 72 f.),  = subordinator,  = transitive,  = vocative case

References

Andreas, Friedrich C. & Walter B. Henning. 1933. Mitteliranische Manichaica aus Chinesisch-Turkestan II. Sitzungsberichte der preußischen Akademie der Wissenschaften. 292–363 (= Walter B. Henning 1977. Selected Papers I (Acta Iranica 14). 191–260).
Authier, Gilles. 2012. Le judéo-tat (Langue iranienne des Juifs du Caucase de l’Est). Wiesbaden: Reichert.
Bashir, Elena. 2008. Some Transitional Features of Eastern Balochi: An Areal and Diachronic Perspective. In Carina Jahani, Agnes Korn & Paul Titus (eds.), The Baloch and Others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan, 45–82. Wiesbaden: Reichert.
Bashir, Elena. 2009. Wakhi. In Gernot Windfuhr (ed.), The Iranian Languages, 825–862. London & New York: Routledge.
Belyaev, Oleg. 2010. Evolution of Case in Ossetic. Iran and the Caucasus 14. 287–322.
Benveniste, Émile. 1966. Le verbe stā- comme auxiliaire en iranien. Acta Orientalia 30. 45–49.
Bielmeier, Roland. 1977. Historische Untersuchungen zum Erb- und Lehnwortschatzanteil im ossetischen Grundwortschatz. Frankfurt a. M.: Peter Lang.
Bossong, Georg. 1985. Empirische Universalienforschung. Differentielle Objektmarkierung in den neuiranischen Sprachen. Tübingen: Gunter Narr.
Buddruss, Georg. 1988. Aus dem Leben eines jungen Balutschen, von ihm selbst erzählt. Stuttgart: Steiner.
Cabolov, Ruslan L. 1978. Očerk istoričeskoj morfologii kurdskogo jazyka [Sketch of the historical morphology of Kurdish]. Moscow: Nauka.
CLI = Rüdiger Schmitt (ed.). 1989. Compendium Linguarum Iranicarum. Wiesbaden: Reichert.
Comrie, Bernard. 1978. Ergativity. In Winfried Lehmann (ed.), Syntactic typology: Studies in the Phenomenology of Language, 329–394. Sussex: The Harvester Press.
Durkin-Meisterernst, Desmond. 2009. Khwarezmian. In Gernot Windfuhr (ed.), The Iranian Languages, 336–376. London & New York: Routledge.
Durkin-Meisterernst, Desmond. 2014. Grammatik des Westmitteliranischen (Parthisch und Mittelpersisch). Vienna: Österreichische Akademie der Wissenschaften.
Emmerick, Ronald. 2009. Khotanese and Tumshuqese. In Gernot Windfuhr (ed.), The Iranian Languages, 377–425. London & New York: Routledge.
Farrell, Tim. 2003. Linguistic influences on the Balochi Spoken in Karachi. In Carina Jahani & Agnes Korn (eds.), The Baloch and Their Neighbours: Ethnic and Linguistic Contact in Balochistan in Historical and Modern Times, 169–211. Wiesbaden: Reichert.
Farrell, Tim. 2008. The Sweet Tongue: Metaphor in Balochi. In Carina Jahani, Agnes Korn & Paul Titus (eds.), The Baloch and Others: Linguistic, historical and socio-political perspectives on pluralism in Balochistan, 101–138. Wiesbaden: Reichert.
Fattah, Ismaïl Kamandâr. 2000. Les dialectes kurdes méridionaux. Étude linguistique et dialectologique (Acta Iranica 37). Leuven: Peeters.
Fortson, Benjamin. 2004. Indo-European Language and Culture: An Introduction. Chichester etc.: Wiley-Blackwell.


Gershevitch, Ilya. 1954. A Grammar of Manichean Sogdian. Oxford: Oxford University Press.
Gholami, Saloumeh. 2011. Definite Articles in Bactrian. In Agnes Korn, Geoffrey Haig, Simin Karimi & Pollet Samvelian (eds.), Topics in Iranian Linguistics, 11–22. Wiesbaden: Reichert.
Haspelmath, Martin. 1998. The Semantic Development of Old Presents: New Futures and Subjunctives without Grammaticalization. Diachronica 15. 29–62.
Heine, Bernd. 1997. Cognitive Foundations of Grammar. New York, Oxford: Oxford University Press.
Heine, Bernd & Tania Kuteva. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press.
Henning, Walter B. 1934. Das Verbum des Mittelpersischen der Turfanfragmente. Zeitschrift für Indologie und Iranistik 9. 158–253.
Jahani, Carina. 2008. Expressions of future in Classical and Modern New Persian. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian Linguistics, 155–176. Newcastle: Cambridge Scholars Publishing.
Jahani, Carina. 2017. Prospectivity in Persian and Balochi and the preterite for non-past events. In Agnes Korn & Irina Nevskaya (eds.), Prospective and Proximative in Turkic, Iranian and beyond. Wiesbaden: Reichert.
Jahani, Carina & Agnes Korn. 2009. Balochi. In Gernot Windfuhr (ed.), The Iranian Languages, 634–692. London & New York: Routledge.
JamaspAsa, Kaikhusroo. 1982. Aogəmadaēčā. A Zoroastrian Liturgy. Vienna: Österreichische Akademie der Wissenschaften.
Jeremiás, Éva. 1993. On the Genesis of the Periphrastic Progressive in Iranian Languages. In Wojciech Skalmowski & Alois van Tongerloo (eds.), Medioiranica. Proceedings of the International Colloquium organized by the Katholieke Universiteit Leuven from the 21st to the 23rd of May 1990, 99–116. Leuven: Peeters.
Jügel, Thomas. 2009. Ergative Remnants in Sorani Kurdish? Orientalia Suecana 58. 142–158.
Jügel, Thomas. 2013. The Verbal Particle BE in Middle Persian. Münchener Studien zur Sprachwissenschaft 67. 29–56.
Jügel, Thomas. 2015. Die Entwicklung der Ergativkonstruktion im Alt- und Mitteliranischen. Eine korpusbasierte Untersuchung zu Kasus, Kongruenz und Satzbau. Wiesbaden: Harrassowitz.
Korn, Agnes. 2008a. A New Locative Case in Turkmenistan Balochi. Iran and the Caucasus 12. 83–99.
Korn, Agnes. 2008b. Marking of arguments in Balochi ergative and mixed constructions. In Simin Karimi, Vida Samiian & Donald Stilo (eds.), Aspects of Iranian Linguistics, 249–276. Newcastle: Cambridge Scholars Publishing.
Korn, Agnes. 2011. Pronouns as Verbs, Verbs as Pronouns: Demonstratives and the Copula in Iranian. In Agnes Korn, Geoffrey Haig, Simin Karimi & Pollet Samvelian (eds.), Topics in Iranian Linguistics, 53–70. Wiesbaden: Reichert.
Korn, Agnes. 2013. Looking for the Middle Way: Voice and Transitivity in Complex Predicates in Iranian. Lingua 135. 30–55.
Korn, Agnes. 2016a. The languages, their histories and genetic classification: Iranian. In Hans Henrich Hock & Elena Bashir (eds.), The Languages and Linguistics of South Asia: A Comprehensive Guide (The World of Linguistics 7), 51–66. Berlin: Mouton de Gruyter.
Korn, Agnes. 2016b. A partial tree of Central Iranian: A new look at Iranian subphyla. Indogermanische Forschungen 121. 401–434. doi:10.1515/if-2016-0021.
Korn, Agnes. 2017a. Evolution of Iranian. In Jared Klein, Brian Joseph & Matthias Fritz (eds.), Comparative Indo-European Linguistics, 608–624 (Handbücher zur Sprach- und Kommunikationswissenschaft 41). Berlin: Mouton de Gruyter.
Korn, Agnes. 2017b. Notes on the Nominal System of Bashkardi. Transactions of the Philological Society 115. 79–97.


Korn, Agnes. 2017c. What to look out for: morphology of prospectives and futures in Iranian. In Agnes Korn & Irina Nevskaya (eds.), Prospective and Proximative in Turkic, Iranian and beyond, 35–48. Wiesbaden: Reichert.
Lazard, Gilbert. 1974. Morphologie du verbe dans le persan du Sistan. Studia Iranica 3. 65–85.
Lazard, Gilbert. 2005. Structures d’actances dans les langues irano-aryennes modernes. In Dieter Weber (ed.), Languages of Iran: Past and Present. Iranian Studies in memoriam David Neil MacKenzie, 81–93. Wiesbaden: Harrassowitz.
Lecoq, Pierre. 2002. Recherches sur les dialectes kermaniens (Iran central). Grammaire, textes, traductions et glossaires (Acta Iranica 39). Leuven: Peeters.
McCarus, Ernest. 2009. Kurdish. In Gernot Windfuhr (ed.), The Iranian Languages, 587–633. London & New York: Routledge.
Öhl, Peter & Agnes Korn. 2008. Performanzbasierte und parametrische Wandel in der linken Satzperipherie des Persischen. Der Subordinationsmarker ke und die Interrogativpartikel āyā. Die Sprache 46. 137–202 [2006].
Öpengin, Ergin. 2016. The Mukri Variety of Central Kurdish: Grammar, Texts, Lexicon. Wiesbaden: Reichert.
Paul, Ludwig. 2013. A Grammar of Early Judeo-Persian. Wiesbaden: Reichert.
Rzehak, Lutz. 1999. Tadschikische Studiengrammatik. Wiesbaden: Reichert.
Schmitt, Rüdiger (ed.). 1989. Compendium Linguarum Iranicarum. Wiesbaden: Reichert.
Schulze, Wolfgang. 2000. Northern Talysh. Munich: Lincom.
Sims-Williams, Nicholas. 1996. On the Historic Present and Injunctive in Sogdian and Choresmian. Münchener Studien zur Sprachwissenschaft 56. 173–189.
Sims-Williams, Nicholas. 2000–2012. Bactrian Documents from Northern Afghanistan (Corpus Inscriptionum Iranicarum II, III, 5). Oxford: Oxford University Press.
Sims-Williams, Nicholas. 2007. The Sogdian potentialis. In Maria Macuch, Mauro Maggi & Werner Sundermann (eds.), Iranian Languages and Texts from Iran and Turan. Ronald E. Emmerick Memorial Volume, 181–193. Wiesbaden: Harrassowitz.
Sims-Williams, Nicholas. 2011. Differential object marking in Bactrian. In Agnes Korn, Geoffrey Haig, Simin Karimi & Pollet Samvelian (eds.), Topics in Iranian Linguistics, 23–38. Wiesbaden: Reichert.
Sims-Williams, Nicholas. 2014. Biblical and other Christian Sogdian Texts from the Turfan Collection (Berliner Turfantexte 32). Turnhout: Brepols.
Skjærvø, Prods O. 1989. Bashkardi. Encyclopædia Iranica 3. 846–850.
Skjærvø, Prods O. 2009. Middle West Iranian. In Gernot Windfuhr (ed.), The Iranian Languages, 196–278. London & New York: Routledge.
Šokrī, Gītī. 1995. Gūyeš-e sārī (Māzanderānī) [The dialect of Sari (Mazenderani)]. Tehran: Pažūhešgāh-e ʿolūm-e ensānī va moṭālaʿāt-e farhangī 1374 h.š.
Stilo, Donald. 2004. Vafsi Folk Tales. Twenty-four Folk Tales in the Gurchani Dialect of Vafsi as Narrated by Ghazanfar Mahmudi and Mashdi Mahdi and Collected by Lawrence P. Elwell-Sutton. Wiesbaden: Reichert.
Stilo, Donald. 2009. Case in Iranian: From reduction and loss to innovation and renewal. In Andrej Malchukov & Andrew Spencer (eds.), The Oxford Handbook of Case, 700–715. Oxford: Oxford University Press.
Thordarson, Fridrik. 2009. Ossetic Grammatical Studies. Vienna: Österreichische Akademie der Wissenschaften.
Trask, Robert. 1979. On the origins of ergativity. In Frans Plank (ed.), Ergativity: Towards a theory of grammatical relations, 385–404. London: Academic Press.
Weber, Dieter. 1980. Beiträge zur historischen Grammatik des Ossetischen. Indogermanische Forschungen 85. 126–137.
Weber, Dieter. 1983. Beiträge zur historischen Grammatik des Ossetischen. Indogermanische Forschungen 88. 84–91.

498

Agnes Korn

Wendtland, Antje. 2009. The Position of the Pamir Languages within East Iranian. Orientalia Suecana 58. 172–188. Wendtland, Antje. 2011a. Die Entwicklung von Demonstrativpronomen zu Artikeln im Soghdischen. Wiesbaden: Harrassowitz. Wendtland, Antje 2011b. The Emergence and Development of the Sogdian Perfect. In Agnes Korn, Geoffrey Haig, Simin Karimi & Pollet Samvelian (eds.), Topics in Iranian Linguistics, 39–52. Wiesbaden: Reichert. Windfuhr, Gernot 1992. Case. Encyclopædia Iranica 5. 25–37. Windfuhr, Gernot & John Perry. 2009. Persian and Tajik. In Gernot Windfuhr (ed.), The Iranian Languages, 416–544. London & New York: Routledge. Yoshida, Yutaka. 2009. Sogdian. In Gernot Windfuhr (ed.), The Iranian Languages, 279–335. London & New York: Routledge.

Annie Montaut

11 Grammaticalization in standard Hindi/Urdu and Hindi dialects

Introduction

The aim of the present paper

The paper reviews the main grammaticalization processes in Hindi that can be traced on the basis of textual evidence spanning more than one millennium. A rich and practically uninterrupted literary tradition is available for the language, running from the Sanskrit corpus through its middle (Prakrit) developments, already regionally differentiated, to the modern regional languages which stemmed from them. Since the second millennium, this tradition has been associated with different language names, as the term ‘Hindi’ is hardly used for the language before the 18th century, and the categories ‘Old Hindi’, ‘High Hindi’ and ‘medieval Hindi’ conflate regional languages and dialects now registered as such. Neighbouring Indo-Aryan languages will be occasionally mentioned.

Hindi among Indo-Aryan languages

In its extended meaning, ‘Hindi’ refers to the whole Indo-Aryan linguistic continuum covering the area from West to East between Punjab and Bengal, and from South to North between Maharashtra, the Deccan and the Himalayan chain (see Figure 1).1 It claims nearly 450 million speakers (42 % of the population). In its restricted meaning, the term designates the modern standard variety spoken today (hereafter SH), with about 260 million speakers in India.2 This standardized variety is the official language of the country as well as of a dozen Central and Northern States. It is also the language of the widely diffused Bollywood films and has a recent (1880–) but rich literature. There are up to 331 so-called ‘dialects’ of Hindi (extended meaning), some of which are linguistically closer to Bengali, Nepali or Marathi than to standard Hindi. Those I used for this paper are Haryanvi or Bangaru (13 million speakers), spoken to the North West of Delhi; Marwari, the largest Rajasthani language (31 million); Garhwali and Kumaoni, spoken by about 3 million each in the new Northern State of Uttarakhand; Bundeli, spoken in South Madhya Pradesh (20 million); Awadhi, spoken around Benares; Bhojpuri (39 million), spoken in Western Bihar; Magahi (14 million), close to Maithili spoken in Eastern Bihar; and Chattisgarhi (18 million). In addition, there are at least two important lingua francas: the bazar Hindi called Dharavi Hindi, spoken in Mumbai/Bombay, and Sadri or Sadni/Sadani, used by the Adivasi or ‘Tribals’ from Bihar-Bengal-Orissa (1.8 million).3 The Southern varieties called Dakhini Hindi (or Urdu), spoken in the major cities of the Deccan by up to 45 % of the population in Hyderabad (less than 25 % in Mysore), are deeply marked by contact with Dravidian languages. Overseas varieties of Hindi (Mauritius, Fiji, Surinam) were mostly Eastern varieties (Bhojpuri or Calcutta Bazari Hindi, with the influence of Awadhi through the only book which the indentured labourers took with them, the Ramayana of Tulsidas, composed in Awadhi).

Urdu, although now written exclusively in the modified Arabic script, like Persian, shares with Hindi a common history and literary culture between the 14th and 18th centuries, and both are still mutually intelligible, even close to identical at the colloquial level. Lexical borrowing from Persian, however, ceased after the 19th century in Hindi, which developed an exclusively Sanskrit-based neology, whereas Urdu for its part continued borrowing from Persian or Arabic, with the addition of English.4

Fig. 1: Geographical Distribution of Hindi languages.

1 All the ‘Hindi’ varieties, standard and dialectal, have been classified as Indo-Aryan, a subgroup of Indo-European, as early as their first linguistic survey by Grierson ([1903–1928] 1967–1973), who identified two circles, corresponding to two successive immigration waves: the inner circle, corresponding to central languages supposedly closer to the ancestral languages, and the outer circle (on this “revisited” hypothesis see Southworth [2005]). Subgroupings are political and administrative as much as linguistic: since Maithili became a constitutional language (registered in the eighth schedule of the Constitution) in 1994, it stopped belonging to the Hindi sphere, but sister Bihari languages such as Bhojpuri still do. Chattisgarhi has recently obtained the status of State official language. For a history of the main classifications of Indo-Aryan, see Cardona and Jain (2003).

2 The Indian Census reflects a steady growth of the proportion of Hindi speakers since 1971. The criteria used for defining a “mother tongue”, and moreover the linguistic loyalties of the speakers questioned, however, make the statistics not entirely reliable. A large number of speakers in North India are bilingual with Hindi, because of school or professional requirements.

https://doi.org/10.1515/9783110563146-011

Typological characteristics

Hindi shares with all other Indian languages about a dozen typological characteristics which are considered defining for a consistent linguistic area.5 All are head-final languages (anteposition of adjectives and relative clauses, postpositions or suffixes instead of prepositions), all contrast retroflex and dental consonants, all have causative derivation, have more complex (N V or Adj V) than simplex predicates, use an invariable conjunctive participle for coordinating and subordinating predicates and clauses, and use semi-auxiliarized or vector verbs for aspectual and “attitudinal” specifications (1).

(1) a. Hi/Ur
vah nikal gayā
3. get.out go...
‘He left.’

b. Tamil
avan poy viṭṭān
3.. go leave..3..
‘He left.’

3 Hindi forms and examples quoted in the paper without further specifications are modern Standard Hindi/Urdu (Hi/Ur). The abbreviations used for the others are: Aw = Awadhi, Bgr = Bangaru/Haryanvi, Bhj = Bhojpuri, Bun = Bundeli, Chatt = Chattisgarhi, Dakh = Dakhini, Garh = Garhwali, Ku = Kumaoni, Mrw = Marwari, Sdn = Sadani/Sadri.

4 On the history of this common linguistic and literary culture, although more and more obliterated since the Partition between India and Pakistan, see Rai (1984) and Montaut (2012: Chapter 1).

5 There are more similarities uniting the four genetically unrelated major families (Indo-European, Dravidian, Austro-Asiatic, Sino-Tibetan) spoken on the Subcontinent than there are uniting, for instance, Bengali and English, although both belong to the same family (Emeneau 1980).

All use reduplication (see Montaut 2008) of all parts of speech in various grammatical and non-grammatical functions, and all display differential object marking with transitive verbs and differential subject marking (see Montaut 2013a) with experiential and cognition predicates (2):

(2) a. Hi/Ur
mujhe yah pasand hai; mujhe nahī̃ patā
1.. this liking.is 1.  know
‘I like it.’ ‘I do not know.’

b. Tamil
enakku idu piḍikkum; enakku teryadum
1.. this like 1. not.know
‘I like it.’ ‘I do not know.’

Other typological characteristics are specific to the Indo-Aryan family, such as aspirate voiced and voiceless stops, a pronominal system structured by the hierarchy of respect more than by number, a locally bound reflexive, and a clause linking system favouring correlation (3b) rather than subordination, on the Sanskrit model (3a), whereas Dravidian languages have relative participles (3c).

(3) a. Sk
yo[ah̩] naro[ah̩] gaccati tam̩ paśyati
... man... go..3.  see..3.
‘You see the man who goes.’

b. Hi/Ur
jo laṛkā kal āyā kyā (tum=ne) use dekhā?
 boy yesterday came  (2=) 3../ see.

c. Tamil
nēṟṟu vanda payyann.e paaṭṭ-inga-lā?
yesterday come.. boy. see.-2-
‘Did you see the boy who came yesterday?’

Finally, some typological features like grammatical gender and the ergative construction, which are specific to Western Indo-Aryan languages only, also characterize the Western dialects of Hindi, among which are Standard Hindi and Urdu (4a): perfect transitive verbs agree with their patient and require a marked agent also in Bundeli, Braj, Garhwali and Kumaoni, but no longer in Awadhi, Bhojpuri (4b) and Magahi. Marwari, despite being a Western language, displays double agreement in the perfect, with the patient on the participle and with the subject on the auxiliary (4c), which may suggest an attrition of ergativity.

(4) a. Hi/Ur
maĩ=ne tum=se yah bāt bār-bār kah-ī
1= 2= this thing time-time say.-.
‘I told you this (thing) many times.’

b. Bhj
ham torā se i bāt barabbar kaha-l-i
1 2  this thing often say--1
‘I told you this (thing).’

c. Mrw
mhe sītā=ne dekh-ī hū̃
1.. Sita= see.-. be..1
‘I have seen Sita.’

Phonology

Like all Indian languages, Hindi and its dialects clearly contrast dental and retroflex consonants, and like all Indo-Aryan languages they contrast aspirate and non-aspirate stops, with a noticeable erosion of the latter contrast in Dakhini. In Standard Hindi, vowels (20) contrast in length (with ATR), aperture and nasalization, but Eastern dialects have diphthongs and a weaker relevance of nasalization and length.

Noun and noun phrase

In the absence of the category article (occasional use of the numeral ek, eka ‘one’ as indefinite determiner, occasional use of the deictic as a definite article), a bare noun can be definite or generic, or sometimes indefinite. Determiners such as indefinites, deictics or possessives precede the adjective before the noun, which is always final in the phrase:

(5) Hi/Ur
ye sāmne baiṭhī sigreṭ pītī sundar laṛkiyā̃
. in.front seated cigarette drinking beautiful girl..
‘These beautiful girls in front smoking a cigarette’

Agreement is absent in Eastern dialects, which do not mark grammatical gender. Elsewhere it is inflectional for adjectives and participles (with a general opposition -a or -au, -o for masculine and -i for feminine), restricted to -a/au ending adjectives, and limited to three distinct forms, whereas the nominal paradigm is far more extended. Nouns vary in number and case; the oblique case is required for oblique and marked arguments (particularly before a postposition). Depending on their morphonological shape, feminine nouns combine with two sets of endings: SH -ī ending nouns like laṛkī ‘girl’ have a plural form laṛkiyā̃ ‘girls’ in the direct case, and the forms laṛkī and laṛkiyõ in the oblique case (Ø, -yā̃, Ø, -yõ), and those ending with another vowel or a consonant like mez ‘table’ combine with Ø, -ẽ, Ø, -õ. Masculine -ā ending nouns combine with a richer set of endings (-a, -e, -e, -õ), but all other masculines license only the -õ oblique plural suffix (from the Sk plural genitive case -anam). Western dialects display similar paradigms, mostly with the ending -au/o for masculine singular, whereas in Eastern languages nouns do not vary in case.

Deictic pronouns and determiners share a specific morphological paradigm, inherited from one of the various forms of the Sk deictics (sa-/ta- or es-, os-) redistributed into a twofold distinction (proximate e/ī, remote u/o) with a singular oblique form in -s in Western languages. This specific declension also characterizes relative pronouns (Sk ya > j-) and interrogatives or indefinites (k-). These pronominal bases provide adverbs of time (kab, jab ‘when’), location (yahā̃ ‘here’, vahā̃ ‘there’, kahā̃, jahā̃ ‘where’), and direction (idhar/udhar ‘towards here/there’, jidhar, kidhar). Apart from the correlative, as in (3), another strategy for relativizing a noun is the use of participial (present or past) forms of the verb (5), a strategy sometimes attributed to the influence of Dravidian languages, which have no relative pronouns.

Verb and verb phrase

Although Hindi is still characterized as a flectional language because of its ancestry, the verb is even less inflectional than the noun. By the end of the first millennium, the rich Sanskrit verbal paradigm was practically reduced to two finite forms (synthetic present for the whole domain of non-past, sigmatic future maintained only in certain regions) and the past participle used in a predicate for the whole past domain. The TAM categories were later renewed with various auxiliaries, derived from various forms of the verb ‘be’ (ho in SH, bha(y) in Braj for present, both from the Sk bhava- ‘be, become’; cha or ca in many dialects, sa (Bgr), all derived from Sk as ‘be’; thā in SH and Western dialects for past, from the Sk stha ‘be, remain’). Generally, tense is conveyed by the auxiliary and primary aspects by the present (-t/d-) or past participle of the main verb, whereas voice and secondary aspects such as progressive, continuative and frequentative are conveyed by distinct auxiliaries. The result is a highly analytical system with distinct words (or morphs, according to the languages) for each temporal or aspectual specification, and the inflected auxiliary in the final position (V voice A T [P]), with person endings restricted to the present tense in standard Hindi, whereas Eastern languages, with no gender/number agreement, display more person endings.6

(6) Hi/Ur
a. boltī jā rahī ho
speak..  .. .2
‘[you] keep on speaking.’

b. le jāyā jā rahā thā
take go.  .. ..
‘[He] was being brought.’

Agreement is with either the subject or, in ergative languages, the unmarked patient (cf. above); default agreement (masculine singular ending) occurs with marked objects in ergative alignments in SH, Braj and Bun, but not in the Rajasthani languages, where the verb agrees with marked patients. Verbs may, however, agree with two arguments (agent and patient, or beneficiary if human) in Magahi, as in Maithili, an atypical pattern in Indo-Aryan languages which is attributed to contact with the Austro-Asiatic languages spoken in this part of Bihar (Verma 1991).

Dependent clauses

As already mentioned, the inherited system is correlation, which involves no real hypotactic dependency, but the universal complementizer ki, borrowed from Persian (order: Main Clause −  − Dep Clause), provided the basis for a now well-developed system of subordination by means of compounding ki with various prefixes. Non-finite clauses, however, remain the favoured device for expressing dependency.

Grammaticalization of nominal categories

Number

As already mentioned, Hindi has inherited flectional markers for number, diversely renewed and simplified according to the gender/number pattern in the various dialects. Eastern dialects, as well as Bazari Hindi, which lost inflectional endings to a greater extent, grammaticalized a few nouns into plural markers: log ‘people’ (Bhj) or lokain (Mag), both from Sk loka ‘world, people’ (baccā log [child people] ‘children’), jan ‘human being’ (Mag), or man ‘human being’ (cauva man [child human.being] ‘children’), from Sk manuṣya ‘man’, for animates (Sdn, Chat), and the general quantifier sab ‘all’ for inanimates or animates.

6 Because of this high complexity of the verb phrase, the gloss will be simplified hereafter for most complex verb forms, and gender and number are indicated only once in the gloss, except when verb agreement is commented on.

Classifiers

As a category, classifiers are unknown to Indo-Aryan, but they exist in Bengali, probably due to contact with Sino-Tibetan languages, although some of them are also analyzed as plural markers (gulo, for inanimates, from kula ‘family’; jOn, from jān ‘living being’, for humans) or as a definite article (ṭo). The general classifier ṭho or go is required in Bhj with numerals (ek ṭho biḍī [one  cigarette] ‘one cigarette’, tini go/tini ṭho laïka [three  boys] ‘three boys’). Go (Chat goṭ, Sdn goṭk) has been related to the Sk word gr̥ta (> guta, gua) and to the Bengali classifier for round objects (goṭa, < guṭika ‘bunch’). The more frequent ṭho is even more obscure, like Bg ṭo (? vr̥tta > vatta as suggested by Chatterji [[1926] 1970: 884]), and is sometimes related to the verbal basis stha (Saxena [1937] 1971: 155). It is considered a borrowing from Austro-Asiatic languages by Acharya et al. (1987: 15), a plausible source since most languages with ṭo/ṭho also borrowed the Munda word for 20 (Bengali kuṛi, Sdn, Aw koḍi, but Bhj bīs) instead of having the inherited bīs as most IA languages do. In Sdn and Chat, goṭ is placed before the numeral one (goṭek ‘one, a single’), whereas it is placed after the numeral in all other cases.

Nominal derivation (‘Protector’ > noun of agent, relator)

Apart from the numerous Sanskrit suffixes and prefixes used in the official fabric of the modern standard languages, particularly technical languages, corresponding to the Persian and Arabic affixes used in Urdu neology, the only productive colloquial suffix which can be added either to adjectives or to nouns is vālā. It is sometimes also treated as an unbound morph. Vālā derives from the noun pālak > pāl, pālā ‘guardian’ and is now, with the bleached meaning ‘related to’, the all-purpose device for deriving nouns of agent out of verbal nouns (dekhnevālā ‘see-er, spectator’) or out of other nouns (ṭopīvālā laṛkā ‘the boy with the cap’, kapṛevālā ‘the cloth merchant’, cāyvālā ‘tea man’, vendor or maker). When used with adjectives, it indicates selection within a set: hārī saṛī ‘a/the green sari’, hārīvālī saṛī ‘the green sari’, hārīvālī ‘the green one’.

Personal pronouns

The notion of ‘self’ (āpan; from Sk ātmana ‘body centre, soul’, then reflexive) is used in many dialects as a first-person pronoun, and in some as an inclusive first person plural. The most important innovation, apart from the growing importance of honorific hierarchies superseding personal hierarchies, is the development of an inclusive 1st person plural on the Dravidian model, both in Dakhini and in the Bazari Hindi of Bombay (Dharavi Hi): āpan (< Sk ātmana ‘soul, body centre’) is used for the inclusive ‘we’ and ham for the non-inclusive ‘we’. Either form is also used for the singular ‘I’ in substandard Hindi (with verb agreement in the plural). The same notion of ‘body centre, soul, self’ became the second-person honorific pronoun in Hindi as in most Indo-Aryan languages, as well as the third-person honorific in Hindi. All IA languages have three forms for the second person pronoun (tū for intimate relations and tum for neutral relations, both inherited from the Sk paradigm of the second person singular), and most of them use āp or a reflex of āp for either the outsider or the respected addressee. This threefold distinction (intimate, neutral, respectful) reshaped the Sanskrit paradigm, since tum and āp are used for both singular and plural, discarding the former second person plural, and āp was extended to the third person, with highly respectful connotations, in addition to the standard respectful third person, which is conveyed by the plural.

‘King, lord’ > honorific pronoun

This is a relatively isolated path that can be found in Awadhi and some Bhojpuri dialects, where the word rau ‘king’ (< Sk rājā, Prk ran͂a > rana, rav, rao) is the current way of addressing respected persons. The genitive raura, raur is used as a possessive: raur larikā ‘your (honorific) boy, your son’.

Reflexive and focalizer

The grammaticalization of the Sk word ātmana, meaning originally ‘centre of the body, trunk’, then ‘soul, cosmic principle, absolute principle’, into a reflexive was already achieved in Sk, where it corresponds to the IE sva-. In Hindi and IA in general, this not only came to be the reflexive pronoun and adjective but also, in its reduplicated (apne āp) or simple (āp) forms, a focalizer (‘very X’, ‘N itself’) and, in its adjectival form, an emphatic of the possessor.

(7) Hi/Ur
a. maĩ (apne) āp jāū̃gī
1  go..1 .
‘I will go myself.’

b. vah apne (āp) ko dekh rahā thā
1   look  .3.
‘He was looking at himself in the mirror.’

c. ye merī apnī kitābẽ haĩ
1   books are
‘They are my own books.’

Numerous studies on reflexives and intensives, including Montaut (1997), point out the affinities between the two functions, often treating the latter as primary with respect to reflexivization, since the notion of emphasizing the most salient entity in the context also accounts for the syntactic coreference with the most important argument in the sentence. However, the IA data provide evidence for its grammaticalization first as a reflexive (classical Sanskrit), then as an honorific pronoun, with the use as intensifier and focalizer appearing only later in the texts. Significantly, the word ‘king, master’ has also been grammaticalized in both functions, as a reflexive and as an honorific pronoun, in Awadhi (rau, raura), but not as a focalizer.

Case and postposition

The impoverishment of nominal flection in Middle IA resulted in a syncretic use of the two or three (direct and oblique) cases present at the beginning of New Indo-Aryan in the 11th–14th centuries. During the next centuries new case markers developed, initially competing with the old inflectional system, from locational or body nouns as well as a few verbs. The distinction between case markers and adpositions is not straightforward in Hindi, since many case markers used for direct and oblique arguments are also used for other types of complements, for instance ke pās ‘close to, near’, which is used for possession. Given the extreme diversity of case markers for a given function within the Hindi speaking area, it will be clearer to present the data starting from the word grammaticalized rather than its function, except for the genitive, which is always renewed by means of a similar device.

Genitive: grammaticalization of ‘done (by)’

A striking characteristic of the genitive postposition in all Hindi (as well as IA) languages is that it agrees in case, number and gender with the head noun. This adjectival behaviour is accounted for by the origin of the marker, the past participle of the verb kar ‘do’, kr̥ta > kita, > kia > kaa > kā (Hi rām kā kamrā [Ram of room] ‘Ram’s room’, rām kī beṭī [Ram of daughter] ‘Ram’s daughter’), kau, ko (Br, Garh), or with palatalization before front vowels > cia > cā (Marathi and Western languages). The form ker/kera/kar, sometimes reduced to -ar, -er or -r, is preserved in Eastern languages (Mag, Aw, Bhj), and is derived from the gerund kāryā ‘to be done’ or a regional form of the participle kara. The form -ra (ro, ri), found in Rajasthani, is from the same origin (suwa ro kamro [sleep. of room] ‘sleeping room’). The suffix -na (< action noun -an/ana > nominalizer and neutral relator) is less frequent and is found marginally in Garhwali and widely in Gujarati (ram no ghar [Ram of house] ‘Ram’s house’).

‘Side, place’ > dative > accusative

The noun kakṣa ‘side, place’ (still in use in Standard Hi/Ur with the meaning ‘room, classroom’) grammaticalized into an allative, then dative marker with the forms kakh, kakhā̃, kākh in Garhwali/Kumaoni, and more reduced forms like kau, kū̃, ku, ko (Hi/Ur), kai, kaĩ, ke. Its allative origin (Strnad 2013: 325) is attested in the 13th century, as well as its emerging use as an accusative marker:

(8) Sant Basha7
̃ hai tīra patāla kū̃ gagana kū̃ sād mārai
aims arrow hell  sky / strike
‘He aims his arrow at the netherworld and strikes (at) the sky.’ (Kabir)

‘Ear’ > locative > dative > ergative, instrumental

The most current ergative marker in Hindi and IA is ne, nẽ, na, ni, and is derived from the Sanskrit noun karṇa ‘ear’ in the locative case karṇe (renewed as *karṇasmin by analogy). A form closer to the etymon is the Garhwali form kuṇī, used in certain dialects for dative/accusative (Chatak 1966). It was first traced as a locative (9a–b), then allative (9c–d) and dative marker in early Rajasthani, and more recently as an agent marker (9e), by Tessitori (1914–1916), with the following examples:

(9) Old Rajasthani
a. na jāṇaū̃ kihā̃ kan̩hī achaï
 know..1 where  be..3
‘I don’t know where he is.’

b. cārāï naï nirmala nīra
road  pure water
‘A limpid lake close by the road.’8

c. te savihū̃ naï karaũ paranām
3 all. / do..1 salutation
‘I bow to all of them.’ (in front of/for)

d. āṽyā rā kaṇhi
come.. king /
‘[They] went to the Raja (king).’

e. adiśvara naï dikṣā lidhi
Adishwara / consecration.. take..
‘The Adishvara took the consecration.’

7 The Sant Basha is a medieval language (13th–17th centuries) created as a kind of literary koine by the mystic wandering poets who propagated it, with various regional influences, and cannot be assigned to any particular region.

8 Another example with the meaning ‘near, close to’: mithyādr̥ṣṭhi loka kanhain srāvai vasirau nahī̃: ‘a shravaka (hermit) should not live near heretics’.

The word kaṇe is also found occasionally in Bhojpuri with the meaning ‘near’ (apnā bāp kāṇe gaile [ father near went] ‘he went near his father’). In Garhwali (as in Marathi) ne is reduced to -n/-na and is also grammaticalized as an instrumental marker and as an ergative marker (-na/-an): bhūkh-an māriau ‘died of hunger’, mi.na bolyau ‘I told’. In modern Marwari, as well as in Panjabi (reflex nū̃), it is grammaticalized as a dative/accusative marker (Ex. 4).

‘Company’ > comitative/instrumental, dative

The Sk noun sanga ‘company’ provided the marker for the comitative case (Chat sange ‘with’) and then the instrumental in many languages (san), but in Pahari languages it grammaticalized into a dative, and finally saṇī, haṇī are the most common markers of dative/accusative in Garhwali/Kumaoni (mi Rames saṇī jāndū̃ [1 Ramesh  know] ‘I know Ramesh’).

Deictic adverb > allative > dative

The forms taĩ, tai, derived from the locative of the resumptive pronoun tāvat (tāvati > tāvahĩ, tāmhĩ *taaĩ, *tannĩ, tāĩ) ‘so long, so far, up to, till’, grammaticalized as dative/accusative markers, as alternate forms for san̩ī in Garhwali. Another suggested etymology for the dative is from tarati ‘go through, cross’, with possible convergence with tavati ‘up to’, a convergence all the more convincing since the cultural connotations conveyed by the notion of crossing, and then of reaching beyond for salvation, were highly significant for the mystic poets of early New Indo-Aryan.

Verb ‘be’ > ablative/instrumental, dative

A number of participial forms of the two verbs ‘be’ provided various postpositions diversely used as case markers in Hindi dialects: the present participle of bhavati ‘be, become’, honti, gave other alternate forms of the dative in some Garhwali dialects (te, tī), used as ablative in other dialects, and the root stha ‘be, stand’ gave thai, thaĩ, also used as dative markers. The present participle of the root as (santo) provided the instrumental/ablative se in SH (dialectal reflexes saĩ, sã, sũ, si) according to Hoernle ([1880] 1973), although Tiwari (1961) and Chatak (1966) derive the Hindi se from samena, the instrumental case of the word ‘equal, even’. The Awadhi dative santī is from the present participle of Sanskrit as ‘be’.

Verb ‘touch’ > dative > ergative, instrumental

The verb lagnā ‘to touch, be lying’ is considered the origin of the postposition lai, used for the dative in Kumaoni, whereas its reflex le is the standard ergative and instrumental marker, as in Nepali, and is optionally an ergative marker in Garhwali (other proposed etymology: verb labh ‘acquire’). The longer form lage (lāgi, lagi), closer to the etymon, marks the dative in Awadhi, shortened to la in Chattisgarhi (bāp lā ‘to father’), where the longer form is used for the beneficiary (Chat lāgi ‘for the sake of, for’).

Verb ‘turn’ > ablative

An uncommon postposition used in Northern languages (Garh/Ku) is bāṭi, bāṭ, derived according to Sharma (1984) from the verb vr̥t, the same verb which provided the Bhj auxiliary and the noun ‘present’ borrowed by high-style Hindi from Sanskrit (vārtāman). Two other forms borrowed from Sanskrit may account for the semantic shift: vr̥tti ‘text commentary’, since commenting on a text implies starting from it, and vārtā(lāp) ‘dialogue’, which involves a movement to and fro. The Hindi word bāt ‘speech’, then ‘(abstract) thing’, also comes from this root, with a possible reflex in Bundeli bai, be marking dative/accusative.

Other

Verb ‘take’ > beneficiary

Most Hindi languages use ke lie to mark beneficiaries, a postpositional locution formed with the past participle of the verb le ‘take’ and the all-purpose postposition ke.

‘Praise’ > beneficiary

The word vanda ‘praise’ provided for the meaning ‘for’ and then the dative in Awadhi: raura badi, lit. ‘in praise of your lordship’ > ‘for You’, a grammaticalization similar to that of the Arabic borrowing khātir in Urdu and many Hindi dialects, which competes with the more usual ke lie in this meaning.

The Awadhi postposition ḍagar ‘through’, a word also meaning ‘way’ in the modern language as well as in Chattisgarhi (ḍagr.e ‘on the way’), is attributed to the substrate by Saxena (1971: 937) (“local word”), whereas Hindi has dvārā < Sk dvāra ‘door’ in the meaning of ‘by (means of)’. Note the presence of a comitative ḍagare in Garhwali, probably from the same etymon. The postposition ṭhan is occasionally used in the meaning ‘near’ (Chat), from the noun ṭhana ‘abode’.

A great number of complex postpositions are formed from adverbs, generally with the postposition ke: pās ‘nearby’, from Sk parśva ‘side’, gives rise to the postposition ke pās ‘near’; bād (from Arabic) ‘after’ provides the postposition ke bād; sāmne ‘in front’ gives ke sāmne ‘in front of’; likewise (ke) ūpar ‘above’, (ke) nīce ‘under’, etc. The comitative ke sāth represents the grammaticalization of the word ardha ‘half’, prefixed by sa- ‘with’: Sk sārdham ‘being by half with’, ‘together’.

Dative > accusative > experiencer subject

Differential object marking has no specific marker, contrary to Tamil, and is systematically expressed by the dative marker, in contrast to the nominative form of unmarked objects. This marking is still optional in Bhojpuri but fairly grammaticalized for human and specific inanimate objects in other languages. Differential subject marking, which grammaticalized at about the same period, also systematically uses the dative marker for experiencers, whatever the dative marking in the given language (see example [2]).

Conclusion

The great diversity of markers for a single function, varying according to languages and dialects, is a remarkable fact, as is the fact that the same word grammaticalized into many different cases (Montaut 2013a, 2016a). Although syncretism is now limited in a given language and variation is ruled out from standardized languages, there are still dialects like Bangaru which use the same case marker for ergative, dative/accusative and ablative (si in one of its dialects, nae in another, ti in a third one). On the whole, mā/mā̃/mẽ is the only consistent marker, used in all languages for the locative (< Sk madhya ‘middle’), with reflexes such as manye, majje.

Grammaticalization in standard Hindi/Urdu and Hindi dialects


Grammaticalization in the verbal category

Valency

Causative

Correlated intransitive and transitive or causative bases are usually inherited, with vowel umlaut for the causative, and suffix va (< Sk apaya) for double causative: nikal ‘leave’, nikāl ‘make leave, take out’, nikalvā ‘have X taken out’; dhul ‘be washed’, dho ‘wash’, dhulvā ‘have X washed’. But the massive renewal of the verb lexicon in the 16th–18th centuries by means of complex predicates formed out of nouns or adjectives resulted in a new function of the verbs ‘be’ and ‘do’ as markers of intransitive/medio-passive and transitive/causative respectively (intazām honā ‘be organized’, intazām karnā ‘organize’). These two light verbs have to a considerable extent lost syntactic variability in such combinations, and the noun has also been partly decategorized (it no longer allows for number marking, nor accusative marking). Following Brinton (2011: 560), one can assume that they are to a large degree grammaticalized, whereas the verb “eat”, used as a light verb contrasting with “give” to invert the diathesis, is probably too limited as a use pattern to be considered grammaticalized as a category (the pair occurs only with a few nouns such as ‘oath’, ‘blow’, ‘whip’, ‘fraud’).

Permissive: ‘give’

The verb de ‘give’ has been grammaticalized as an auxiliary of permissive causation (“allow X to”) in all Hindi dialects as well as many (if not all) other NIA languages.

Passive: ‘go’

The old synthetic passive in -ya (> ja) is preserved only in very few IA languages; most of them use the verb jā ‘go’ as a passive auxiliary with the past participle of the main verb (phonetic resemblance and reanalysis of the suffix ya > ja as the verb ‘go’?). Passive constructions are frequently endowed with modal meanings (ability in negative contexts, prescription), particularly with intransitive verbs as in (10b).

(10) Hi/Ur
a. cor pakṛā gayā
   thief seize go()...
   ‘The thief was taken.’


Annie Montaut

b. mujh=se calā nahī̃ gayā
   1= walk...  go()...
   ‘I could not (bring myself to) walk.’

Given the fact that the modal meaning is considered the basic use of the passive in medieval Hindi (Gaeffke 1967: 51–58), and that the causative alternation is the basic device for expressing the passive/active contrast, one can assume that the new periphrastic passive re-created a voice opposition while also retaining its basic modal meaning. In contrast with the intransitive with medio-passive meaning, the periphrastic passive is always dynamic and agented (even if the agent is not expressed), as shown by the fact that only agentive intransitive verbs can be passivized, as in example (10b). As a voice marker simply reversing the diathesis of a transitive predicate, its recent extension is often attributed to the influence of English (10a).

. Tense .. Present: ‘be’, ‘turn’ as copula, present participle As already mentioned, the renewal between 15th and 18th century of the verbal paradigm, out of the drastically reduced Sanskrit paradigm, resulted in a great number of compound forms, mainly with various forms of the ‘be’ verb. This analytical renewal includes even the general present in Hi/Ur. Present tense was expressed by the old synthetic form inherited from Sk, with only person endings suffixed to the verb (1 -asmi > -ahũ, -ū,̃ 2 -asi > -ai, 3 -ati > -ai, 6 -anti > -aĩ)) up to medieval times, also conveying potential and future meanings: (11) a. Sant Bhasa sasihara sūra grās-ai moon sun swallow-.3 ‘The moon swallows the sky.’ b. Old Aw ko kah-ai ?  say-3 ‘Who can tell?’ The modern forms in Hi/Ur represent a twofold renewal, first with the finite verb “be” added to the present participle (like the English progressive) in both meanings of habitual and specific present, then (18th–19th centuries), with a supplementary auxiliary for specifying the progressive meaning. The sentence in (12) had both meanings till mid-19th century but only habitual present now.


(12) 19th century Hi/Ur
   bacce khel-t-e haĩ
   child. play--. be.3
   ‘The children play/are playing.’ (from Kellogg 1876)

In Bhojpuri, present tense is conveyed by bāṭ (bān in the first person in most dialects), which is a unique case in IA languages, since it does not derive from a “be” verb but from the Sk verb vr̥t ‘turn, move’: Bhj ham dekhat bāṭ-ī ‘I (am) see(ing)’, with a first person ending -ī. The same verb bāṭ or its reflex bāṛ is also used as a copula. In a few Western dialects, the present is conveyed by the auxiliary sa ‘be’ (Kului, Bgr) as in Punjabi, also derived from Sk as, like ca. A few languages and dialects have a present inherited from the Sanskrit present participle -anta > -and > -nd, to which personal endings were added (Grh jā̃-ndũ ‘I go’). It should be noted that this participial formation, although not frequent today, was present in the old language (Sant Bhasha, old Awadhi mahī carata, earth. walk.. “walk on the earth”), along with the old synthetic present, already with a meaning excluding the non-present, and is probably the first step in the development of the modern Hi/Ur general present: Hi/Ur added an auxiliary, Garhwali person endings. In both cases, the old synthetic present became reanalysed as a mood.

Past (imperfect)

The tense marker for general past comes from a form of the verb “be” (thā in Hi/Ur, cha in Grh/Ku), not in itself a past form, and from the verb rah ‘stay’ in Aw/Bhj/Chat, added to the present participle of the main verb (Bhj jāt rahau ‘I was going/went’). These verbs also behave as a past copula, although they are not originally past.

Anterior and aorist: past passive participle > Ø

The definite past or anterior has the same form as the past participle (see [1a] and [4a]) except for an extra nasalization on the feminine plural ending. The grammaticalization from passive to resultative and to perfect and anterior is a common path (Wiemer 2004: 293). The western languages retained the old construction of the predicative participle with instrumental agent in Sk, which developed into ergative (see example [4a] above). The verb has a nominal morphology, agreeing in gender and number with the intransitive subject or transitive patient (in Western languages in contact with Panjabi -s at the second person). In most languages except Hi/Ur, there is a glide before the ending, which is sometimes analysed as a definite past tense


marker and which is present only with a vowel-ending basis in Hindi and only in the masculine singular. In Eastern languages, this development was further shifted to a nominative alignment, with suffixation of person endings in agreement with the subject. Formerly used for expressing the whole domain of past (13a), it is now restricted in all IA languages to the definite past (preterit), used also as an aorist to express eventuality (13c), and as an evidential to express mirative (13d) and polemic meanings (Montaut 2016b).

(13) a. Sk
   mayā tat kr̥tam
   1. this... done...
   ‘I have done/did it.’
b. Hi/Ur
   maĩ=ne yah kām kiyā (kiyā hai)
   1.= this work.. do[].. do.3.9
   ‘I did that work (have done).’
c. yadi vah so gayā to use na jagānā
   if 3 sleep go.[]. then 3..  wake.up
   ‘If he has fallen asleep, let him sleep.’
d. are, tum kaise āe!
   Oh 2. how come.[].
   ‘Hey, you here! / How come you are here?’

-l- as a definite past marker

Initially used in past participles and predicative participles in Eastern languages (first attested in Westernmost languages such as Marathi), the suffix -ila/ela/la is originally a nominal suffix, with a vaguely diminutive meaning, soon lost when used for deriving adjectives from nouns. It is now reanalyzed as a tense marker (anterior), and followed by person endings: Bhj tab dekhlān ‘then they saw’, with l as a past marker and -ān as a third person plural. Note that the same suffix is also used in certain languages like Kumaoni for future, although it has a quite different origin (see below). In Hindi creoles as well as in Dakhini, the main verb and the auxiliary have been fused and the segments -t- or -y- are now reanalysed as tense markers for present or past, with or without person endings. Eastern dialects such as Sdn have

 This tense has neither a morphological tense marker nor an aspect marker, which is why I add the gloss in square brackets. For the various labels it received during the 20th century, see Montaut (2016b).


for all persons āta ‘come’, āya ‘came’, in the same way as āba ‘will come’ with the inherited marker -b-; Dakhini with person endings has āt-ū̃ ‘I come’, at-o ‘you come’, āt-ā “s/he comes”. This reanalysis is convergent with certain uses of the present participle in -t- as a predicate for present in early stages of Hindi: karata pahūkar sanān [do.. sacred.lake bath] ‘they do the ritual bath in the sacred lake’ (Chand Bardai, Prithviraj Rasau, Rajasthan, 13th century).

‘Go’ > future

In Western languages, except Marwari which retained the Sk sigmatic future (ṣ > s, h), future is expressed by ga, the past participle of jā “go”, suffixed to what is now the subjunctive of the main verb: jāū̃gā ‘I will go’, with both person endings attached to the basis and gender/number endings attached to the future suffix. The form was not fused before the mid-19th century and can be found earlier with intervening particles (karū̃ hī gā ‘will surely do’). Besides the anomaly of the sequence (non-final person endings), the grammaticalization path also differs from the well-known scenario of the motion path (goal infinitive and present of ‘go’): the form gā, commonly related to the Sk participle gata > gaa, is better interpreted as having a relational meaning rather than a past (or resultative) one. In Konkani one finds a postposition gela, also derived from gata > gaa + el, as a marker for genitive, where the ‘go’ element has been semantically bleached to a simple relator ‘pertaining to’. The fact that the near future is achieved in Hi/Ur by a relator similarly meaning ‘pertaining to’ (vālā, see below) and copula may help interpret the path. However, the fact that the relator is suffixed to the finite form used to map the non-past domain, not to the base or a non-finite form, is a unique fact in the history of the renewal of the Standard Hi/Ur paradigm, yet to be explained.

Obligation > future

A more expected pattern of grammaticalization is the reinterpretation of the Sk gerundive (V-tavya), usually considered a passive participle conveying obligation when predicative, as a future marker. This gerundive (> avva > aba > ba) was originally constructed like the other passive/resultative participle, with an instrumental agent, an alignment which was shifted to a nominative one during the 14th–15th centuries with person endings added in agreement with the subject. This type of future occurs in all Eastern languages (Aw, Bhj, Mag, Mai) as in Bengali (see Montaut 2017).


‘Touch’ -l- > future

A third common marker of future is -l- (Grh/Ku), as in Nepali, homonymous with the past marker -l- but with a different origin: it is usually related to the verb lag “touch, be near”: Ku u yāĩ rɔ.l “he will remain here” (Grh reflexes rɔlo, rahlu).

Periphrastic futures: close and imminent future (‘be Verbal noun’, ‘be to V’)

The near future is expressed by the verbal noun with suffix -vālā and copula (jānevālā hai ‘is going to go’) and an imminent process is expressed by the verbal noun + ko + copula, ‘be to V’ = ‘be about to V’.

Aspect

The whole verbal paradigm, almost exclusively consisting of compound forms in Hi, is structured by the aspectual opposition conveyed by participles: the first one, inherited from the Sk present participle -anta (> -ata, -ta/da), conveys imperfect, and the second one, inherited from the Sk past passive participle -ita (> -ia > -a), conveys the so-called “perfective” aspect, which is better labelled “perfect” or “accomplished” since the perfective is expressed in other ways (see section 3.4.6). The old marker -ita was reduced to a Ø-morph, which grammaticalized to perfect aspect.

Progressive: ‘stay’

Progressive aspect is fully grammaticalized in Hindi/Urdu and marked by the verb rah ‘stay’ (V rah- T). The verb rah, which is also an independent verb, is itself derived from the root rah- ‘be isolated, alone, separate’ (cf. rahit ‘isolated, devoid’, virah ‘loneliness, separation, longing in separation’). Its past participle plus copula is added to the basis of the main verb:10

(14) a. Hi/Ur
   bacce ab khel rahe haĩ
   child. now play stay[.].. [= ] be..
   ‘The children are now playing.’
b. bacce roz khelte haĩ
   child. every.day playing.. be..
   ‘The children play every day.’

 Fully glossed in the following example (14). Further on, synthetically glossed after the lexeme in order to simplify the numerous and sometimes redundant morphological agreement markers.


Many languages have no distinct form for progressive and habitual present or imperfect, particularly those which maintained the (alternative) synthetic form for present longer: Kumaoni has -rya, of the same origin, as an optional suffixed marker (mẽ ghar janairyõ ‘I am going home’). Similarly, Garhwali uses either the simple form of the present jā̃ndū̃ or, depending on the dialects, the periphrastic present jāṇũ chũ or the suffixed form jāna-raya, which indicates that the grammaticalization of the category is still in process. A higher occurrence of a distinct progressive is not necessarily favoured by the complete loss of the simple present: Eastern languages, which have an auxiliated present (Aw dekhata hai “sees”, Bhoj dekhat bāṭ), seem not to have fully grammaticalized the opposition. Both use rah as a past tense marker (cf. section 3.3.2), either as a progressive (Aw maĩ bajār-ai jāt rahaũ [1 market- go. stay..1] ‘I was going to the market’) or as a habitual (Bhoj bhūkh peṭ jarat rahal ‘hunger was burning [his] belly’, u khāt rahele ‘he used to eat’).

Durative/continuative: ‘stay’, ‘go’, ‘come’

The simple expression of unbroken duration is conveyed by the auxiliary rah ‘stay’ with the present participle (as opposed to the progressive): boltā rahtā hai ‘keeps speaking’, boltā rahā ‘kept speaking’. Less grammaticalized and occurring only with stative intransitive verbs, the same auxiliary occurs in Hi with the past participle (baiṭhe rahe “they remained seated”). More marked forms of the continuative (sometimes with an increase of the process, and the suggestion of a never-ending process: “went on and on”) use the present participle with the verb jā ‘go’, mostly in the progressive, and the complex verbs calā jānā ‘walk go’ or calā ānā ‘walk come’, the latter when the process is represented as continuing till the speaker’s time.

Frequentative: ‘do’, ‘be, exist’

Apart from the general present and imperfect, which contrast with the progressive, Hi has a marked habitual formed with the verb karnā ‘do’ added to the past participle of the main verb, which remains invariable, a unique case in the Hindi verb paradigm. The form occurs not only in the past imperfect (āyā karte the [come... do ..] ‘they used to come’) but also in other tenses and moods (āyā karū̃gī [come... do...] ‘I will come regularly/repeatedly’). It has been suggested that the main verb is a nominal form, which would account for the uninflected form and for the semantic path (do the coming), but there are no other instances of this formation in Standard Hindi. Another grammaticalization, more limited (and with low frequency in Standard Hindi), is the auxiliation of the periphrastic present or imperfect of the verb ‘be’


(hotā hai, hotā thā), elsewhere expressing an existence “by nature” as opposed to the simple form (hai ‘is’, thā ‘was’) used as copula and auxiliary. This periphrastic present, added to the semi-inflected main verb, conveys the meaning of regularity, mostly with ‘whenever’, and is compatible with the progressive: (whenever I came), pī rahī hotī thī ‘she was drinking tea’.

Terminative: ‘pay off’

Completion is conveyed by the simple past (and with vector verbs), but emphasis on completion or on acquired experience has a specific marker in Hindi, cuk, from a verbal root (cukā ‘pay off’ rather than cūk/cuk ‘miss, fail, default’): khā cukā hū̃ ‘I already ate/I have finished eating’.

Inchoative: ‘touch’

The verb lagnā ‘touch’, constructed with a goal infinitive, is grammaticalized as a marker of inchoation (bolne lagī ‘she started speaking’). The fact that it is incompatible with negation, interrogation and progressive, like the terminative, may suggest that it is not yet fully grammaticalized. Note that in earlier stages of the language, it used to occur before the infinitive (lagā calne ‘he started walking’), which is a non-typical order for auxiliaries.

Perfective aspect: vector movement verbs

A dozen movement verbs (‘go’ is the most frequent in all Hindi and IA languages, as illustrated in example [1]) are used with all tenses and with non-finite forms as semi-auxiliaries which convey, along with perfectivity, various subjective meanings. They are sometimes treated as purely aspectual markers (Nespital 1997) since, contrary to simple verbs, they are not compatible with phrasal auxiliaries (progressive, inchoative, terminative), are rare in questions and negative clauses, and rule out a conative meaning of the verb: ā jāo [come go.] ‘come!’; vah baiṭh gayā [3 sit went] ‘he sat down’. However, used with a transitive verb, jānā ‘go’ emphasizes a sudden, uncontrolled action (pī gayā [drink went] ‘he drank hastily, gulped’). A number of supplementary meanings are conveyed by the various verbs used as vectors: with transitive verbs, le ‘take’ and de ‘give’ direct the process towards (15a) or away from the subject, whereas ḍāl ‘throw’ emphasizes a process performed with suddenness, and mār ‘strike’ a process performed in a brutal or rude way. The verb “sit” is added to transitive verbs to convey a negative judgement on the process as extremely inappropriate or foolish (15b).

(15) a. samjhā? – samajh liyā
   understand.[]. – understand take.[]..
   ‘Did you understand?’ – ‘(I) (fully) understood.’
b. tum kyā samajh baiṭhe
   2. what understand sat
   ‘What did you imagine (crazily / you totally misunderstood)?!’

Similar attitudes from the speaker’s viewpoint and added suddenness are conveyed by “rise” and “fall” with intransitive verbs (hãs paṛā [laugh fell] ‘burst into laughter’, ro uṭthā [cry rose] ‘burst into tears’).

Mood and modality

Indicative present > subjunctive present

The Hindi subjunctive or contingent mood is, as in most NIA languages, the result of a reanalysis of the old synthetic present (basis – person ending): it first conveyed what Bybee et al. (1994) labelled an “open meaning”, mapping the whole non-past notional domain; then, with the renewal of the present and future tenses, its meaning was restricted to the non-actualized and it was consequently reanalysed as a subjunctive (maĩ jāū̃? [1 go-1] ‘May I go?’). Other tenses of the subjunctive mood are compound forms with the ‘be’ verb in the subjunctive auxiliated to either participle. The only inherited finite form in Hi/Ur, the subjunctive displays person endings directly on the verb, inherited from the Sk present (see section 3.2.1).

Counterfactual: present participle

A reanalysis of the present (imperfect) participle contemporary to the previous one resulted in the grammaticalization of the aspectual marker -t- as a counterfactual in most IA languages (mai jata [1 go-t-.] ‘had I gone, if I had gone / I would have gone’). It should be noted that this form, analogous to the present participle except for the feminine plural, which adds a nasalization to the adjectival endings as in the noun paradigm, also has a meaning of habitual, markedly indefinite imperfect in Hi.

Presumptive: future of ‘be’

The use of hogā ‘will be’ after an inflected verb conveys doubt on the assertion, represented as a mere probability or supposition or inference: ve pahũce hõge [3.


reached will.be] ‘they probably arrived’. Sometimes the deontic modal cāhie ‘must’ is also used as a presumptive.

Obligation: ‘see > want’, ‘be’, ‘fall’

The default (general) obligation marker is the invariable cāhie, related to the verb cāh ‘want’, which originally derives from the Sk basis cakṣ ‘see, look for, expect’, with passive inflection (a similar etymology accounts for the Gujarati marker joy used for strong/constraining obligation). Specific deontic obligation is conveyed by the verb ‘be’ and constraining obligation (be obliged to, have to) is conveyed by the verb paṛ ‘fall’. The three markers require the nominal form of the main verb in a special construction with dative agent and agreement of the verbal noun with the object if the main verb is transitive (example [16c]).

(16) a. Hi/Ur
   baccõ ko jaldī sonā cāhie
   child..  early sleep. should
   ‘Children should go to bed early.’ (general advice)
b. tumko āj jaldī sonā hai /hogā
   2. today early sleep. is/will.be
   ‘You must go to bed early today.’
c. tumko yah ciṭṭhī dobārā likhnī paṛegī / paṛī
   2. this letter.. twice write... fall.../fall.[]..
   ‘You will have to / had to write this letter again.’

Possibility ‘power’ > ‘can/may’, ‘find’ > ‘can’, ‘come’ > ‘be able’

The grammaticalization of Sk śak > sak, originally meaning power (still available in modern languages in the noun śakti ‘power’) is the source of the default possibility marker, compatible with all syntactic and semantic contexts. The second marker pā, from a verb originally meaning ‘find’ and still available in the lexicon, is mainly used in negative contexts with a distinctive meaning of “succeed in, manage to”. A third way of expressing possibility, as mere ability close to knowledge and experience, is the verb ‘come’, which requires the same special construction as the deontic markers (dative agent, optional object agreement).

(17) a. Hi/Ur
   vah angrezī bol saktā hai
   3 English speak can .3.
   ‘He can speak English.’


b. usko sāikil calānī ātī hai
   3. bicycle.. drive... come .3.
   ‘He can/knows how to ride a bicycle.’
c. usko angrezī (bolnī) ātī hai
   3. English.. speak... come .3.
   ‘He knows (is able to speak) English.’

Intransitive verbs with a medio-passive meaning also express potential modality with negation and instrumental agent (same conditions as the “incapacitive” passive, cf. example [10]). Example (18a) conveys, without modal auxiliary and with a negated intransitive, a meaning roughly equivalent to (18b) with modal auxiliary and transitive verb. What accounts for the modal meaning is the construction, in the same way as in the modal passive.

(18) a. Hi/Ur
   mujh=se darvāzā khul nahī̃ rahā hai
   1= door.. be.open   be..3.
   ‘I cannot (manage to) open the door.’
b. maĩ darvāzā khol nahī̃ pātā
   1 door open  can
   ‘I cannot open the door.’

Other

Distinction between equative ‘be’ and locative/existential ‘be’

Sadani has grammaticalized the divergence between the two Sanskrit verbs ‘be’ (bhū ‘be/become’ and as ‘be’) in order to contrast equative and existential sentences: heke or hay/ay (< bhū), depending on dialects, is used in equative sentences, and ahe (< as) for locative sentences. The Halbi dialect of Sadani has the same distinction with different forms, using rah in the past for equative sentences and asot for locative and existential sentences. This distinction, also present in Sinhalese, Oriya and Bengali, but absent in Standard Hindi and Western dialects, is probably due to the contact with Dravidian languages.

A negative copula and auxiliary exist in Bhojpuri and Chattisgarhi: Bhj naikhe ‘not be’ corresponds to the positive copula hokhe ‘be’ (Tiwari 1966: 178), made by the fusion of negation na and copula hokhe in a reduced form: ham naikhī jāt ‘I do not go’, second person naikhā, plural naikhān. Chat has the equivalent nakhaū̃ ‘I am not’. This emergence of a negative copula is considered to result from the contact with Dravidian languages (Apte and Southworth 1967), which had two distinct verbs for the positive and negative copula (and two others for the existential/locative ‘be’) right from the beginning. However, the same process accounts for the standard Hindi negation nahī̃, which could be considered to behave as a negative “be” in the present tense (< na.asti) since the “be” auxiliary can be dropped:

(19) ve skūl nahī̃ jāte (haĩ)
   3 school  go... ..
   ‘They do not go to school.’

Agreement

Person endings are restricted to the present of the “be” verb in SH (subjunctive and compound tenses using the present of the verb “be”) but occur more frequently in other tenses in Eastern dialects (Bhj -ī for 1st person, -ā for 2nd singular, -ās for 2nd plural, -ū and -ān for 3rd singular and plural). Agreement generally occurs with a single argument (subject, or object in ergative constructions, see [4a]) except in Magahi, which has, like Maithili, a double agreement as a rule, and indexes all animate participants irrespective of their functions with a complex set of diversely fused affixes (20). This system of agreement, atypical for IA languages, is attributed to the contact with Mundari, which, like all Austro-Asiatic languages, indexes all animate participants on the predicate.

(20) Mag
a. ham oara paisa de-l-i-ai
   1. 3H. money give--1-3H
   ‘I gave him money.’
b. ham okar intajaar kai-l-i-ain
   1. 3H. waiting do--1-3
   ‘I waited for him (did his waiting).’
c. ukar naukar ai-l-ain
   3H. servant come--3H
   ‘His servant came.’

Marginally, predicates in ergative constructions can also display a kind of double agreement, either distributed on different segments of the predicate, like Marwari in (4c) or, as in Panjabi and Marathi, in the second person only. While ergative languages typically display only gender/number agreement in the past, non-ergative languages usually display person agreement: Bhj ham dekhlo ‘I saw’, tu dekhlas ‘you saw’, o dekhle/dekhlasi ‘he saw’, u dekhlān (dekhlāni) ‘they saw’; Chat ham dekhhyau, tu delhles, u dekhkis ‘I, you, he saw’ (plural dekhen, dekhew, dekhin).


Non-finite forms

Participles and aspect

As mentioned above (aspect, tense), both participles are bearers of aspect, although they are usually labelled respectively present participle, with the suffix -t- (< Sk -anta), which came to be reanalysed as a marker of imperfect aspect, and perfective participle, with zero marker (< Sk -ita), as a marker of definite past. Both forms added the gender/number inflexion required in the given languages.

Coverb or “conjunctive participle”: ‘do’

A third, invariable, non-finite form became a coverb used for coordination or subordination of predicates, hence the usual label of “conjunctive participle”. The device is inherited (presumably borrowed by Sanskrit from Dravidian) but most modern languages have renewed the form, originally ending in -i (< Sk -ya), by suffixing the basis of kar ‘do’ (allomorph ke in colloquial Hi/Ur, -k, -ik elsewhere) to the main verb basis. No distinct subject is allowed. Many languages (but not standard Hi/Ur) allow strings of coverbs before the main finite verb.

(21) a. Hi/Ur
   vah kamre mẽ ā-kar baiṭh gayā
   3 room  come- sit went
   ‘He came and sat down.’
b. Garh
   daur-ik vai=ka gala par lipt-ik chumyo
   run- 3= neck on entwine- kiss...
   ‘He ran [to him], embraced his neck and kissed [him].’

Infinitive and verbal noun

The old infinitive was lost in all IA languages except Marathi. New infinitives were consequently created, most of them by using the -an/ana/na suffix of action nouns (Standard Hindi jānā ‘to go’, khānā ‘to eat’). Only this form appears in SH, as well as Urdu, Garhwali/Kumaoni (jāna) and Punjabi; all other IA languages also have a -v- infinitive mainly used as a verbal noun (Br karava, Bhj, Mag karav ‘do’). The -v- verbal nouns derive from the Sk gerund in -tavya, which was widely used in predicative constructions expressing obligation, although basically it used to express the mere notion of the verb. This point, with the -b- future also related to the old gerund, is discussed in detail in Montaut (2017).


Conclusion

In the verbal category, the number of words involved in the various paths is limited (‘stay’, ‘do’, ‘be’, ‘go’), while the grammaticalization paths proliferate, particularly in Standard Hindi: constructions are more relevant than the lexical word used (‘be’, ‘stay’, ‘go’ are examples of multiple grammaticalizations, but in different constructions). While it is not surprising to find a wide range of grammaticalization paths for ‘be’, the grammaticalization of ‘go’ as a passive auxiliary is less frequent and accounts for the dynamic passive, never used for medio-passive meanings (see section 3.1.1 and section 3.1.3), not to mention the past form used for future.

Negation: not-be

While the Sanskrit negation na was retained for negating non-finite forms and modal finite verbs, the common negation with finite indicative verbs, nahī̃, which is also the word meaning ‘no’, results from the reanalysis of the periphrastic expression nasti (= na asti) [.be.3.] ‘is not’ as a simple negation ‘not, no’. The original meaning as a negative copula and auxiliary is still found in Eastern languages (see section 3.4.6).

Grammaticalization of complex constructions

Correlation and conjunctive subordination

As noted above, the original system for complex sentences was the correlation (correlative theme in ya- … resumptive theme in ta-, which is the remote deictic basis), as in (3), with various endings conveying adverbial meanings (temporal, locational, manner, conditional, etc.). Conjunctive subordination is new: the universal complementizer ki was borrowed from Persian (although it is also sometimes related to the Sk interrogative kiṃ ‘what’). To this ki various adverbs were added (interrogative: kyõ-ki [why-that] ‘because’, relative: jab-ki [when-that] ‘whereas’, borrowed elements: halā̃ki ‘although’, tāki ‘so that’, cū̃ki ‘since’). However, this hypotactic system, which allows the order Main – Dep (whereas the Sanskrit correlative system had the order Rel [Dep] – Corr [Main]), was transformed into a new correlative-like system by means of resumptive cataphoric pronouns. Such correlativation of the complementizer, optional with finite verbs, is required with non-finite verbs:

(22) a. Hi/Ur
   maĩ=ne (yah) sunā ki vah ānevālā hai
   1= (this) hear.PST that 3 come-vala is
   ‘I have heard that he would be coming.’


b. yah sunkar / yah sunne par ki vah ānevālā hai…
   this hear. this hear. on that 3 come-vala is
   ‘Hearing that / on hearing that he is going to come …’

This suggests an incipient grammaticalization of the proximate deictic as a resumptive (cataphoric) pronoun in the subordination system. During the last few centuries, relatives without correlative became a regular device for non-restrictive relativization, resulting from a truncation of the correlative diptych, sometimes with the ki conjunction suffixed to the relative in colloquial language (jo ki ‘who/which’). Apart from the correlative and conjunctive systems, the preferred device is the non-finite dependent clause, which requires the order Dep – Main, an order that conforms better to the modern head-final order. All postpositions can be added to verbal nouns (allowing a distinct subject in the genitive case) and express the required dependent relation (cause: infinitive + ke kāraṇ/kī vajah se ‘because’, purpose: infinitive + ke lie ‘in order to’, ke binā ‘without’, kī vajāy ‘instead of’, etc.). A few of them can also be added to inflected participles (see section 4.4).

4.2 Resumptive > relative participle

Dakhini, spoken in a Dravidian environment, has replaced the indigenous correlative clause by a participial-like construction based on the Dravidian model, except for the resilience of the resumptive pronoun and the finite verb. The finite verb is placed in the position of a qualifier before the noun phrase:

(23) a. Dakh ap kharide so ghar mere ko pasand ai
        2 bought  house 1  liking is
        ‘I like the house which you bought.’ (I like this you-bought-house)
     b. Hi/Ur jo ghar āp=ne kharīda vah mujh=ko pasand hai
         house 2= bought  1= liking is

The form so, the old form of the deictic that corresponds to modern standard vah and was used as a resumptive in earlier stages of Hindi, has thus become a kind of determiner in a simple sentence in the absence of a relative pronoun, and the finite verb is reanalysed as a qualifier.¹¹ This evolution is convergent with the preferred structure of Standard Hindi with the adjectival participle, yet the presence of the deictic/resumptive (ruled out in Hi/Ur) exactly parallels the Dravidian relative participle, while the fact that the verb is finite proves that the initial basis was a correlative. The form so is reanalysed as a relative participle marker, which transforms a finite verb into a qualifying expression used for relativizing the noun.

(24) Dakh = āyā-so laṛkā        Tamil vanda payyan
            come. boy                 come.. boy
     ‘The boy who came.’

¹¹ This example is also a good illustration of the influence of contact at other levels: de-aspiration (hai > ai), loss of the ergative construction (apne kharida > ap kharide), loss of idiosyncratic inflectional obliques (mujhe > mere ko).

4.3 Emphatic restrictive particle hī: ‘as soon as’

This particle, which is also a focus marker, is added to the manner correlative jaise … vaise or jyõ … tyõ ‘as/as’ to convey temporal coincidence: jaise hī … vaise hī, jyõ hī … tyõ hī ‘as soon as’. Added to present participles, it also conveys the meaning ‘as soon as’: (X) ke āte hī ‘as soon as X arrived’ (where X represents a subject distinct from the main subject).

4.4 ‘also’ > even if, although (+ participle, + infinitive locative)

Added to a present participle or to an infinitive, bhī ‘also’ conveys a concessive meaning, in alternation with the concessive conjunctions hālā̃ki or yadyapi: X ke āne par bhī [X  come. on bhī] ‘despite X’s coming, although X came’; X ke āte hue bhī [X  coming been bhī], with the same meaning (see section 5.2).

4.5 The infinitive and participial clauses: ko > complementizer?

To express an order (‘asked X to do’), Hindi uses an infinitive followed by the dative postposition ko (substandard Hi/Ur ke lie ‘for’, used with other verbs like ‘oblige’ to construct the infinitive complement in Standard Hi/Ur).

(25) maĩ=ne sītā se kitāb lauṭne ko kahā
     1= Sita  book return. ko say.[]..
     ‘I told Sita to give back the book.’

One may also consider ko as a postposition simply marking the goal (verbal noun) of the order. The whole sentence is a succession of nominal phrases with a final finite verb, which converges with the Dravidian pattern: only one finite verb is allowed in a sentence, except with quotative expressions. Similarly, clauses complementing verbs of perception (see, hear X do) have the dependent verb as an adjunct to the object (usually marked by ko). The fact that the coverb is preferred to conjunctive coordination for coordinating clauses with a common subject, even with distinct complements, and that participial adjuncts are preferred to conjunctive subordination, points in the same direction.

4.6 Say > Quotative > subordination marker

Ancient and Middle IA languages do have a quotative (iti), used only for closing a quote or discourse segment uttered by a distinct speaker/thinker. This quotative of obscure origin is not related to the verb ‘say’ and has never been used as a complementizer. In certain modern IA languages like Marathi and Bengali, a verb ‘say’ has been grammaticalized as a quotative, a new development usually attributed to the Dravidian substrate, since in Dravidian languages the quotative is the only way of introducing a distinct finite verb in the sentence. This also happened in Dakhini, where the coverbs ‘speak’ or ‘say’ have been reanalyzed as a quotative and complementizer (‘that’), with conditional meaning when suffixed by the particle to ‘then’.

(26) a. Dakh o sabā̃ ātū̃ ka-ke (bolke) bolyā
        3 tomorrow come..1 say- speak. speak...
        ‘He said that he would come (lit. I come saying) tomorrow.’
     b. tū ātū̃ ka-to mai bī ātū̃
        2 come..1 say-then 1 too come..1
        ‘If you come, I too will come.’

For the reanalysis of the particle to as a postponed marker of conditional clauses and as a topic particle in Standard Hindi, see below. Garhwali developed another type of quotative marker from the same verb, bal (< bol), but that one is rarely used as a complementizer. With the correlative base jãn- ‘as’, the compound jãnbolyā conveys the meaning ‘as if’. However, the “quotative” is mostly used as an evidential marker, conveying hearsay, scepticism or intensity depending on its position in the sentence and on intonation (27).

(27) a. Garh mantrī=jī tumrā gaũmā āyā̃ bal
        Minister= your village came bal
        ‘(We heard that/one says) The minister came to your village.’

     b. mantrī=jī cha bal āyā̃ tumrā gaũmā
        Minister= aux bal came your village
        ‘The Minister has come to your village?! (surprised: really?)’
     c. mī=na apnī ā̃khyũ dekhī bal
        1= refl eyes. see. bal
        ‘I saw it with my own eyes, true!’

It is also used in tales (ek cha bal rājā [one was bal king] ‘Once upon a time there was a king’). Kumaoni uses the two verbs kah ‘say’ and bal ‘speak’ to convey intensity/admiration and hearsay, respectively.

4.7 Order: Dep Main without linking device as a means of subordination

The ordering of the dependent clause before the main clause, as in (26), may constitute a way of hierarchizing both clauses without any marker except intonation and order, a process which Heine and Kuteva (2005: 113) consider highly relevant to grammaticalization:

(28) Hi/Ur vah kal āegā yah khabar ab-hī milī
     3 tomorrow come..3. this news now-just be.got.[]..
     ‘(We) just got the news that he will come tomorrow.’

In sentences of this type, the resumptive yah ‘this’ has sometimes been analysed as a complementizer (see the reanalysis of the postposition ko in section 4.5), but the pattern is closer to a complex sentence of the type yah khabar milī ki … ‘got this news that …’, with ki, and not yah, as the complementizer. It nevertheless raises the problem of how to analyse the “correlativization” of the borrowed conjunctive model with added resumptive pronouns (Montaut 2013b).

4.8 Other

‘Somewhere’ > unless, lest

The indefinite/interrogative kahī̃ ‘somewhere’ is used as a complementizer of verbs of fear, doubt, etc. It is also used to convey a meaning of negative teleonomy in simple sentences, with negation: kahī̃ vah na āe ‘(let’s hope) he does not come’ (whether or not this is a case of insubordination is open to discussion).

‘First’ > before (adverb) > temporal clause

Like many postpositions, se pahle ‘before’ (< pahlā ‘first’) is used in non-finite clauses, conveying anteriority (there is no conjunctive alternative): X ke āne se pahle [X  come. before] ‘before X came’.

5 Discourse particles and connectors

In Hindi/Urdu, as well as in other IA languages, the number of words which grammaticalized into discourse particles is rather limited, but the frequency of each is very high, particularly at the colloquial level, where hardly any speech sequence is devoid of particles, particularly to.

5.1 Deictic > topic particle, argumentative particle

The connector to (ta Gh/Ku, tau Br ‘then’) is the grammaticalization of a Sk adverb tavat (> tava > tau), itself the ablative of the deictic ‘that’. This deictic first grammaticalized in most IA languages as a connector when clause- or sentence-initial (29a) and as a resumptive pronoun in the correlative system, particularly after conditionals (13c). Very often the ‘if’ conjunction (agar, yadi) is dropped, particularly in the spoken register, and the correlative, which is always required, remains, along with intonation, the only marker conveying the conditional meaning. As a result, an incomplete sentence is then marked as a condition by the resumptive to, which is never dropped even when the main clause is absent, and which occurs at the end of the ‘if’ clause, being intonationally part of it (29b). Following the now classical analysis of conditionals as topics, one can understand how to came to be reanalysed as a topic particle (29c).

(29) a. to ham bhī jāẽge
        then 1 too go..
        ‘Then we too will go.’
     b. yah pahle hī jagī huī to … ?
        this first just be.awaken be.[] to
        ‘(And) if she is already awake … ?’
     c. yah to ham sab jante hai
        this to() 1 all know be...
        ‘This, we all know.’

It further underwent a process of subjectification when, as a clitic first grammaticalized into a (contrastive) topic particle, it became a discourse particle which has scope over the whole clause (in various positions). The whole sentence is presented as a counterargument or a contrasting viewpoint of the speaker, with various argumentative meanings (denial of relevance of a previous argument, objection by a new unpredicted argument, countering a supposed reluctance on the part of the addressee, etc.; see Montaut [2016c]).

(30) are! bātāo to!
     hey tell. to
     ‘Oh, will you tell it in the end/what are you waiting for telling it!’

5.2 ‘too’ > ever, even > intensifier

The Sk particle api (‘even’/indefinite/concessive/‘too’), suffixed with the intensifier hu, produced the modern bhī, which is used in the meaning ‘too’ as a coordinator (with two coordinates, postposed), as a marker of concession (with participles or verbal nouns: na cāhte bhī [not wanting bhī] ‘although not wanting’) and as a marker of indefiniteness when added to pronouns (koī bhī ‘anybody’, jahā̃ bhī ‘anywhere’). The Sk form has been analysed by Emeneau (1980) as a grammatical calque from the Dravidian suffix -um, which corresponds to replication in Heine and Kuteva’s (2005) terms, but the fact that similar “disparate” constellations, as he calls them, exist elsewhere (Forker 2016), with Latin =que for instance, rather suggests that the initial meaning ‘too’ grammaticalized into concession and indefiniteness. It also shares discursive and polemical uses with to, although it is less frequent than to in speech (āo bhī! [come.] ‘Why don’t you come, do come!’). Certain languages, such as Aw or Bhj, maintain the older form in -o (whose etymology is as complex as that of bhī but almost consensual according to Emeneau [1980] and Tiwari [1961]).

5.3 Restrictive particle > focus marker

The restrictive particle hī (do hī ‘only two’) has become the standard focus marker, particularly when the focus position, which is preverbal, is already assigned to the focused entity: ham usī ko cunā hai ‘it is him whom we have chosen’/‘we have chosen him’.

 Other . Alternative question marker The complementizer ki ‘that’ is used as an alternative question marker: Hi kaunsā, rāmeś ki rājeś? ‘Which one, Ramesh or Rajesh?’ Chat to.la khaw ki tor bardai.la? ‘Should I eat you or your cow?’ It is in particular obligatory when the second term is the negation nahī ̃ “no” (Hi rukoge ki nahī ̃ ‘Will you wait or not?’).

6.2 Attenuative/intensive suffix -sā < ‘see’, ‘seem’

The suffix -sā, agreeing in gender and number, is widely used in colloquial and Standard Hindi, but not in formal or scientific discourse, after nouns, adjectives and participles. It usually conveys attenuation (pīlā-sā ‘vaguely yellow, yellowish’, kuttā-sā ‘something like a dog, sort of a dog’, sisaktā-sā ‘somewhat sobbing’), but with dimensional adjectives it behaves, on the contrary, as an intensifier (bahut-se log ‘very many people’, choṭā-sā ‘quite small’). This twofold function is comparable with that of reduplication, which in similar contexts also expresses low or high degree. Its origin can be traced to an abbreviation of jaisā, the comparative/manner adverb ‘like, in the same manner, as’, itself derived from the combination of the relative and the verb ‘see’ (yadr̥śya).

6.3 Reduplication > high or low degree (Adj), simultaneity, cause-effect, avoidance (participle)

Reduplicating a word or an onomatopoeic syllable is a common tool for enriching the lexicon (phaṛphaṛ ‘rustling sound’, bār-bār [time-time] ‘often’). It grammaticalized in Hindi as a marker of high degree with adjectives, as in most creoles, with additional affective connotations of ‘nicely/suitably [Adj]’, while conveying attenuation with colour adjectives (hārā-hārā ‘greenish’). Reduplicated present participles transform the relation with the main verb into a relation of cause-effect, simultaneity or, less frequently, avoidance of the process, depending on the semantic class of both verbs:

(31) bāriś āte āte rah gaī
     rain coming coming stay go.[]..
     ‘It nearly rained.’

6.4 Echoing reduplication > ‘and so on’, notional enlargement, notional fragmentation

Echo formation is a pan-Indian morphonological process which operates by modifying the first consonant of the reduplicated word. In Hi, the first consonant is replaced by v-, in Garhwali and Kumaoni by ś- or h-. The resulting compound is said to have an extended meaning: cāy-vāy ‘tea and the like’ (cāy ‘tea’), pen-ven ‘pen and other writing material’, but it is also very often derogatory (paṇḍit-vaṇḍit ‘those/stupid pandits’), and sometimes, particularly with verbs, leads to the fragmentation of the notion: paṛh-vaṛh ‘make random readings’.
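The Hindi echo-formation rule just described can be sketched procedurally. The Python snippet below is only an illustration under stated assumptions: the function name echo_word, the romanized vowel inventory, the treatment of a following h as part of an aspirated first consonant, and the behaviour with vowel-initial words (the echo onset is simply prefixed) are all hypothetical choices made for the sketch and are not taken from the description above.

```python
import unicodedata

# Romanized vowel letters used to find the first consonant (an assumption
# of this sketch, not an exhaustive inventory).
VOWELS = set("aāiīuūeoãĩũẽõ")


def echo_word(word: str, onset: str = "v") -> str:
    """Return base + echo, replacing the first consonant by `onset`.

    `onset` defaults to Hindi v-; the text mentions ś- or h- for
    Garhwali and Kumaoni.
    """
    w = unicodedata.normalize("NFC", word)
    cut = 0
    if w and w[0] not in VOWELS:
        cut = 1
        # Assumption: a following "h" marks aspiration and belongs
        # to the first consonant (e.g. kh, bh).
        if len(w) > 1 and w[1] == "h":
            cut = 2
    return f"{w}-{onset}{w[cut:]}"


print(echo_word("cāy"))   # cāy-vāy
print(echo_word("pen"))   # pen-ven
print(echo_word("paṛh"))  # paṛh-vaṛh
```

The sketch only models the segmental replacement; the semantic effects (extension, derogation, fragmentation of the notion) are not something a string rule can capture.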

6.5 ‘First’ > before

The ordinal ‘first’ pahla grammaticalized in most Hindi and IA languages to the meaning ‘before’. It probably originates from pratham ‘first’ (now pradhan ‘foremost, first’) > pardhan. With the restrictive particle, it forms pahle hī ‘already’.

6.6 ‘One’ > approximation

After cardinals, the numeral ek ‘one’ makes the count approximate: bīs ek ‘about twenty’.

6.7 Additional coordinator: ‘other’

The Sanskrit indefinite apara > avara ‘other’ grammaticalized into the standard coordinator aur (aura, au, o) between nouns, verbs and clauses.

7 Summary (uncommon grammaticalization paths in bold)

– “people, human being” > plural (section 2.1)
– “all” > plural (section 2.1)
– “bunch” > classifier (section 2.2)
– “protector” > nomen agentis, relator (section 2.4)
– reflexive > first person pronoun, inclusive plural, honorific pronoun (section 2.4)
– “king” > honorific second person plural (section 2.4.1)
– “body centre, soul” > reflexive, focalizer, emphatic of possessor (section 2.5)
– “done” > genitive (section 2.6.1)
– “side, place” > allative > dative > accusative (section 2.6.2)
– “ear” > locative > allative > dative > accusative, locative > ergative (section 2.6.3)
– “company” > comitative, dative > accusative (section 2.6.4)
– deictic > allative > dative (section 2.6.5)
– “being” > dative, ablative/instrumental (section 2.6.6)
– “touch” > dative, ergative/instrumental (section 2.6.7), future (section 3.2), inchoative (section 3.3.5)
– “turn” > ablative (section 2.6.8), present auxiliary (section 3.2.1), past auxiliary (section 3.2.2)
– dative > accusative > experiencer marker (section 2.6.9), complementizer (?) (section 4.5)
– “praise” > for, to (section 2.6.9)
– “way” > through, with (section 2.6.9)
– “door” > by (means of) (section 2.6.9)
– “half” > comitative (section 2.6.9)
– light verbs “be”/“do” > voice markers (section 3.1.1)
– “give” > permissive (section 3.1.2)
– “go” > passive (section 3.1.3)
– “be” > tense auxiliary (section 3.2)
– “stay” > past auxiliary (section 3.2.2), durative (section 3.3.2)
– past participle > anterior, aorist (section 3.2.3)
– diminutive suffix > definite past > mirative (section 3.2)
– “gone” > future (section 3.2.5)
– obligative gerund > future (section 3.2.6)
– “do” > frequentative (section 3.3.3), “do” > coverb (section 3.5.1)
– “pay off” > terminative (section 3.3.4)
– movement verbs > perfective (section 3.3.6)
– indicative present > subjunctive (section 3.4.1)
– present participle > habitual present (section 3.2.1), counterfactual (section 3.4.2)
– “will be” > presumptive (section 3.4.3)
– “see > want” > obligation (section 3.4.4)
– “fall” > obligation (section 3.4.4)
– “power” > possibility (section 3.4.5)
– “find/get” > possibility (section 3.4.5)
– “come” > ability (section 3.4.5)
– negated existential verb > negative copula (section 3.4.6), negation (section 3.6)
– resumptive pronoun > relativizer (section 4.2)
– dative > complementizer (?) (section 4.5)
– reordering > subordination (section 4.7)
– “say”, “speak” > quotative > complementizer (section 4.6), evidential (section 4.6)
– deictic adverb > topic particle, argumentative particle (section 5.1)
– “too” > coordinator, indefiniteness, concession, discourse particle (section 5.2)
– restrictive particle > focus marker (section 5.3)
– subordinative conjunction > alternative question marker (section 6.1)
– ‘see’ > ‘seem’ > attenuative suffix (section 6.2)
– reduplication > high/low degree (Adj), cause-effect, avoidance (participles) (section 6.3)
– echo-formation > notional extension, derogative meaning (section 6.4)
– “first” > before (section 6.5)
– “one” > approximation (section 6.6)
– “other” > and (section 6.7)

Grammatical abbreviations (other than Leipzig gloss)

PPL = participle, HON = honorific, RESUM = resumptive pronoun.

Language abbreviations

Aw = Awadhi, Bgr = Bangaru/Haryanvi, Bhj = Bhojpuri, Br = Braj, Bun = Bundeli, Chatt = Chattisgarhi, Dakh = Dakhini, Hi = Hindi, Garh = Garhwali, Ku = Kumaoni, Mag = Magahi, Mai = Maithili, Mrw = Marwari, Sdn = Sadani/Sadri, Sk = Sanskrit, Ur = Urdu.

References

Acharya, Katti P., Rekha Sharma, Sam Mohan Lal & K. S. Rajyashree. 1987. Pidgins and creoles as languages of wider communication. Mysore: Central Institute of Indian Languages.
Apte, Mahadeo & Franklin Southworth. 1967. Contact and convergence in South Asian languages (Special publication of the International Journal of Dravidian Linguistics). Trivandrum: International Journal of Dravidian Linguistics.
Brinton, Laurel. 2011. The grammaticalization of complex predicates. In Heiko Narrog & Bernd Heine (eds.), The Oxford handbook of grammaticalization, 559–569. Oxford: Oxford University Press.
Bybee, Joan, Revere D. Perkins & William Pagliuca (eds.). 1994. The evolution of grammar, tense, aspect and modality in the languages of the world. Cambridge: Cambridge University Press.
Cardona, George & Danesh Jain (eds.). 2003. The Indo-Aryan languages. London: Routledge.
Chatak, Govind. 1966. Madhyapahari ka bhashashastrîa Adhyayan [Linguistic study of central Pahari]. Delhi: Radhakrishna Pr.
Chatterji, Suniti Kumar. 1970 [1926]. The evolution of the Bengali language. Delhi: Rupa.
Emeneau, Murray. 1980. Essays on linguistic area. Stanford: Stanford University Press.
Forker, Diana. 2016. Towards a typology for additive markers. Lingua 180. 69–100.
Gaeffke, Peter. 1967. Untersuchungen zur Syntax des Hindi. The Hague: Mouton.
Grierson, Sir George Abraham. 1967–1973 [1903–1928]. Linguistic survey of Indian Languages (11 vol.). Delhi: Motilal Banarsidass.
Heine, Bernd & Tania Kuteva. 2005. Language contact and grammatical change. Cambridge: Cambridge University Press.
Hoernle, A. F. Rudolf. 1973 [1880]. A comparative grammar of the Gauḍian (Aryo-Indian) languages: With special reference to the Eastern Hindi. London: Trübner.
Kellogg, Samuel H. 1876. A grammar of the Hindi language. Allahabad: Mission Press.
Montaut, Annie. 1997. Pronoms, réfléchis et marqueurs de focus dans les langues indiennes. In Anne Zribi-Hertz (ed.), Les Pronoms, 101–128. Vincennes: Presses de l’Université de Vincennes.
Montaut, Annie. 2008. Reduplication and echo-words in Hindi. Annual Review of South Asia Languages and Linguistics. 21–61.
Montaut, Annie. 2012. Le Hindi. Louvain: Peeters (Collection Langues du monde).
Montaut, Annie. 2013a. The rise of non-canonical subjects and semantic alignments in Hindi. In Leonid Kulikov & Ilja Seržant (eds.), Diachronic typology of non-canonical subjects, 92–117. Amsterdam & New York: Benjamins.
Montaut, Annie. 2013b. De l’anaphore à la corrélation. In Olga Inkova & Pascale Hadermann (eds.), Corrélation. Aspects syntaxiques et sémantiques, 193–207. Geneva: Droz.
Montaut, Annie. 2016a. Why the ergative case in modal (in)transitive clauses? The historical evolution of aspect, modality, ergative and locative in Indo-Aryan. In Eystein Dahl & Krzysztof Stroński (eds.), Indo-Aryan ergativity in typological and diachronic perspective, 135–167. Amsterdam: Benjamins.
Montaut, Annie. 2016b. The verbal form (y)ā in Hindi/Urdu: An aorist with aoristic meanings. In Zlatka Guentchéva (ed.), Aspectuality and temporality: Empirical and theoretical issues, 413–446. Amsterdam: Benjamins.
Montaut, Annie. 2016c. Deixis et particule énonciative en hindi: l’exemple de to. Faits de langue 45. 35–83. (English shorter version 2015: The discourse particle to and word ordering in Hindi. In M. M. Jocelyne Fernandez-Vest & Robert D. Van Valin (eds.), Information structure and spoken language from a crosslinguistic perspective, 263–282. Berlin & Boston: Mouton de Gruyter.)
Montaut, Annie. 2017. The grammaticalization of participle and gerunds: preterite, future, infinitive. In Andrej Malchukov & Walter Bisang (eds.), Unity and diversity in grammaticalization scenarios, 97–136. Berlin: Language Science Press.
Nespital, Helmut. 1997. A dictionary of Hindi verbs. Allahabad: Lokbharati Prakashan.
Rai, Amrit. 1984. A house divided. Origin and development of Hindi/Hindavi. Oxford: Oxford University Press.
Saxena, Ram Baburam. 1971 [1937]. Evolution of Awadhi. Delhi: Motilal Banarsidass.
Sharma, D. D. 1984. The formation of Kumaoni language. Delhi: Bahri Publications.
Southworth, Franklin. 2005. Linguistic prehistory of India. London: Routledge-Curzon.
Strnad, Jaroslav. 2013. Morphology and syntax of Old Hindī. Amsterdam: Brill.
Tessitori, Luigi. 1914–1916. Notes on the grammar of the Old Western Rajasthani, with special reference to Apabhramsha and to Gujarati and Marwari. Indian Antiquary. 42–44.
Tiwari, Udayan N. 1961. Hindi Bhasha ka udgam aur uska vikas [The origin of the Hindi language and its development]. Prayag: Bharati Bhandar.
Tiwari, Udayan N. 1966. The development of Bhojpuri. Calcutta: The Asiatic Society.
Verma, Mahindra K. 1991. Exploring the parameter of agreement in Magahi. Language Science 13(2). 125–143.
Wiemer, Björn. 2004. The evolution of passives as grammatical constructions in Northern Slavic and Baltic languages. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What is grammaticalization? A look from its fringes and components, 271–331. Berlin & New York: Mouton de Gruyter.

Guillaume Jacques

12 Grammaticalization in Japhug

1 Introduction

Japhug and other Gyalrong languages are among the languages with the richest morphology in the whole Sino-Tibetan family. While the ultimate lexical origin of most grammatical markers in Japhug is unknown and probably unrecoverable, many affixes are nevertheless analyzable as being derived from other independent words or from other grammatical markers (for instance, from denominal prefixes to voice markers). Although the latter are not cases of grammaticalization stricto sensu, they are nevertheless highly relevant to grammaticalization theory, as they potentially provide examples of ‘missing links’ in pathways of grammaticalization. Therefore, they are systematically included in this survey. All grammatical elements whose origin can be traced without undue speculation are treated in this paper. First, I discuss the noun phrase, in particular nominal morphology, pronouns, case markers and discourse markers. Second, I analyze the verbal template. Third, I present the historical origin of a selection of complex constructions. Fourth, I study two cases of degrammaticalization in Japhug.

2 Nominal categories

The inflection of Japhug nouns is quite limited in comparison with that of the verbs. There are no genders or noun classes, number (section 2.1) and case (sections 2.3 to 2.7) are marked by clitics, and the only inflectional category is that of possession (see section 2.2). The only productive nominal derivation not involving compounding is that of comitative adverbs (section 2.5). Topic and focus markers are also discussed in this section (2.8), as their scope is nearly always on a noun phrase rather than a verb phrase.

2.1 Number

Japhug has two number markers, dual ni and plural ra. These clitics are not obligatory for non-singular arguments (even in the case of human referents), and do not necessarily trigger plural or dual agreement on the verb. The dual ni is obviously related to the numeral ʁnɯz < *q-nis ‘two’, exemplifying the well-attested pathway from the numeral ‘two’ to a dual marker. The etymology of the plural marker ra is unknown; a potential cognate exists in Pumi (=ɹə, cf. Daudey [2014, 135]; Jacques [2017a]; Japhug -a regularly corresponds to Pumi -ə in the native vocabulary).

https://doi.org/10.1515/9783110563146-012

The plural marker ra can express plurality or collective meaning, as in example (1); it is not incompatible with numerals, as shown in (2).

(1) ɯ-kʰa ra nɯ-mɤ-kɤ-sɯz nɤ ɯʑo kɯ qajɣi χsɯm
    3.-house  3.---know  3  bread three
    lo-βzu
    -make
    ‘She made three pieces of bread without her relatives knowing.’ (the raven, 108)

(2) rɟɤlpu nɯ kɯ kɯki tɯrme kɯtʂɤɣ ra nɯ-ɕki to-ti
    king   this people six  3- -say
    ‘The king told these six men.’ (Liuhaohan zoubian tianxia, 200)

In addition, the marker ra can indicate approximative time or location (see section 2.7).

2.2 Possession and pronouns

Japhug nouns can be divided into two main categories, inalienably possessed nouns and alienably possessed nouns. The former must take a possessive prefix, even when the possessor is non-specific. In that case, the indefinite possessor prefixes tɯ-/tɤ- or the generic possessor prefix tɯ- are used. Pronouns and possessive prefixes are very similar (see Table 1), but it appears that in Japhug pronouns are derived from possessive prefixes rather than the opposite: pronouns other than 3DU and 3PL are built by combining the possessive prefix with the root -ʑo, originally meaning ‘self’ (a common source for pronouns, see Heine & Song [2011]). Japhug thus exemplifies a pathway from possessive prefix to pronoun. It is not, however, a case of degrammaticalization in the strict sense, since the bound pronominal prefixes have not become free morphemes by themselves. The 3DU and 3PL pronouns are formed differently from the rest of the pronouns, by combining the status constructus form ʑɤ- of the pronominal root /ʑo/ with the dual and plural markers ni and ra, with regular vowel harmony ɤ → a / _Ca in the case of ʑara ‘they’.

Tab. 1: Pronouns and possessive prefixes.

Free pronoun    Prefix      Person
a-ʑo, aj        a-          1SG
nɤ-ʑo, nɤj      nɤ-         2SG
ɯ-ʑo            ɯ-          3SG
tɕi-ʑo          tɕi-        1DU
ndʑi-ʑo         ndʑi-       2DU
ʑɤ-ni           ndʑi-       3DU
i-ʑo            i-          1PL
nɯ-ʑo           nɯ-         2PL
ʑa-ra           nɯ-         3PL
                tɯ-, tɤ-    indefinite
tɯ-ʑo           tɯ-         generic
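The formation of free pronouns described above (possessive prefix plus the root -ʑo, and ʑɤ- plus a number marker with vowel harmony for the third person dual and plural) can be restated as a short Python sketch. This is an illustration only: the names PREFIXES, harmonize and free_pronoun, the person labels used as dictionary keys, and the simplification that "C" in the harmony rule is a single consonant letter are assumptions of the sketch, not part of the description.

```python
VOWELS = set("aeiouɯɤ")   # vowel letters needed for this sketch only
ROOT = "ʑo"               # pronominal root, originally 'self'
CONSTRUCT = "ʑɤ"          # status constructus form of the root

# Possessive prefixes; the person labels are assumed for illustration.
PREFIXES = {
    "1SG": "a", "2SG": "nɤ", "3SG": "ɯ",
    "1DU": "tɕi", "2DU": "ndʑi",
    "1PL": "i", "2PL": "nɯ",
    "GENERIC": "tɯ",
}
NUMBER_MARKERS = {"3DU": "ni", "3PL": "ra"}


def harmonize(form: str) -> str:
    """Apply ɤ → a / _Ca, treating C as one consonant letter."""
    chars = list(form)
    for i, ch in enumerate(chars):
        if (ch == "ɤ" and i + 2 < len(chars)
                and chars[i + 1] not in VOWELS and chars[i + 2] == "a"):
            chars[i] = "a"
    return "".join(chars)


def free_pronoun(person: str) -> str:
    """Build the free pronoun for a person label."""
    if person in NUMBER_MARKERS:
        joined = harmonize(CONSTRUCT + NUMBER_MARKERS[person])
        cut = len(CONSTRUCT)
        return f"{joined[:cut]}-{joined[cut:]}"
    return f"{PREFIXES[person]}-{ROOT}"


print(free_pronoun("1SG"))  # a-ʑo
print(free_pronoun("3DU"))  # ʑɤ-ni
print(free_pronoun("3PL"))  # ʑa-ra
```

The short variants aj and nɤj are not modelled, since the text does not describe how they are derived.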

2.3 Beneficiary

The genitive ɣɯ (borrowed from Amdo Tibetan), in addition to marking the possessor, is the normal way to mark beneficiaries and can be used to mark the recipient of some ditransitive verbs such as kʰo ‘give, pass over’, as in (3).

(3) ki kɯra ɲɯ-kʰam-a tɕe ki nɤki nɯ aʑɯɣ nɯ-kʰɤm.
    this : -give[III]-1  this   1: -give[III]
    ‘I will give you this, and you will give me that.’ (slobdpon, 130)

Alternatively, when the predicate is a transitive verb with an overt object, the beneficiary can be marked as a possessive prefix on the object, as a-tɯ-ci 1-.water ‘water for me’ in (4) (a case of NP-internal beneficiary, see Lehmann et al. [2004, 80] and Malchukov et al. [2010, 15]).

(4) χsɤr khɯtsa ɯ-ŋgɯ nɯ tɕu a-tɯ-ci ci
    gold bowl 3-inside   1-.-water 
    tɤ-rke ma wuma ɲɯ-ɕpaʁ-a
    -put.in[III] because really -be.thirsty-1
    ‘Pour some water for me in the golden bowl, I am very thirsty.’

2.4 Dative

The dative ɯ-ɕki, used to mark the recipient of indirective ditransitive verbs as in (5), derives from a relator noun meaning ‘side’, a meaning still marginally present in Japhug in examples like (6).

(5) tɤ-pɤtso ra kɯ nɯ-sloχpɯn ɯ-ɕki to-tʰu-nɯ
    .-child   3.-teacher 3- -ask-
    ‘The children asked their teacher.’ (Looking at the snow, 11)

(6) ɯ-rte nɯ ɯ-rna ɯ-ɕki pɯ-kɯ-ɴqoʁ nɯnɯ
    3.-hat  3.-ear 3- :-:S/A-hang 
    pjɤ-mɟa tɕe ɯ-ku ɯ-taʁ to-ta
    :-take  3.-head 3-on -put
    ‘He took the hat that was hanging on his ear and put it on his head.’ (140505 liuhaohan zoubian tianxia, 164)

Japhug thus attests a grammaticalization pathway from a relator noun meaning ‘side’ to a dative marker.

2.5 Comitative adverbs

Japhug and other Gyalrong languages have a productive derivation whereby a comitative adverb can be derived from a noun by removing all possessive prefixes, adding the prefix kɤɣɯ- and partially reduplicating the last syllable of the noun stem, as in tɤ-rte ‘hat’ → kɤɣɯ-rtɯ~rte ‘together with (his) hat’, as illustrated by (7).

(7) kɤɣɯ-rtɯ~rte ʑo kha ɯ-ŋgɯ lɤ-tɯ-ɣe
    -hat  house 3-inside -2-come[II]
    ‘You came inside the house with your hat.’ (You were expected to take it off before coming in)
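The comitative derivation just described (drop the possessive prefix, add kɤɣɯ-, partially reduplicate the last syllable) can be written out as a small Python sketch. It rests on assumptions that go beyond the description: the function name comitative is hypothetical, the possessive prefix is taken to be whatever precedes the first hyphen, the reduplicant is assumed to consist of the onset of the last syllable followed by ɯ (as in rtɯ~rte), and the onset is approximated as the consonant run before the last vowel.

```python
VOWELS = set("aeiouɯɤ")  # vowel letters needed for this sketch only


def comitative(noun: str) -> str:
    """Derive a comitative adverb from a hyphen-segmented possessed noun."""
    # Remove the possessive prefix (assumed: everything before the first hyphen).
    stem = noun.split("-", 1)[-1]
    # Locate the last vowel of the stem.
    last_vowel = max(i for i, ch in enumerate(stem) if ch in VOWELS)
    # Assumed onset: the run of consonants immediately before that vowel.
    start = last_vowel
    while start > 0 and stem[start - 1] not in VOWELS:
        start -= 1
    onset = stem[start:last_vowel]
    # Reduplicant = onset + ɯ, placed before the last syllable.
    return f"kɤɣɯ-{stem[:start]}{onset}ɯ~{stem[start:]}"


print(comitative("tɤ-rte"))   # kɤɣɯ-rtɯ~rte
print(comitative("tɤ-rɟit"))  # kɤɣɯ-rɟɯ~rɟit
```

For polysyllabic stems the onset approximation may wrongly include the coda of the preceding syllable, so the sketch should only be trusted for monosyllabic stems such as those shown.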

As shown in Jacques (2017b), these adverbs originate from the combination of the S/A-participle kɯ- with the denominal derivation prefix aɣɯ-, which derives proprietive stative verbs from nouns, as shown in Table 2. Comitative adverbs are actually homophonous with the S/A-participle of such verbs, as shown by examples (8) and (9), which present a minimal pair contrasting the comitative adverb ‘with his/her children’ on the one hand and the participle ‘having many children’ on the other (both derived from the possessed noun tɤ-rɟit ‘child’, and both pronounced kɤɣɯrɟɯrɟit).

Tab. 2: The denominal prefix aɣɯ-.

Base noun   Meaning              Denominal verb    Meaning
tɯ-ɣli      excrement, manure    aɣɯ-ɣli           producing a lot of manure (of pigs)
tɤ-lu       milk                 aɣɯ-lu            producing a lot of milk (of cows)
tɯ-mɲaʁ     eye                  aɣɯ-mɲaʁ          having a lot of holes
tɯ-ɕnaβ     snot                 aɣɯ-ɕnɯ~ɕnaβ      be slimy
ɯ-mdoʁ      colour               aɣɯ-mdoʁ          having the same colour

(8) iɕqha tɕʰeme nɯ kɯ-ɤɣɯrɟɯrɟit ci
    the.aforementioned woman  :S/A-have.many.children 
    pɯ-ŋu
    .-be
    ‘This woman had a lot of children.’

(9) kɤɣɯ-rɟɯ~rɟit ʑo jo-nɯ-ɕe-nɯ
    -children  --go-
    ‘She/They went back with their children.’

Ambiguous sentences like (10) actually constitute the pivot constructions that allowed reanalysis in contexts where both a proprietive (‘having X’) and a comitative (‘with X’) interpretation were possible.

(10) si kɤɣɯrtɯrtaʁ ɲɯ-ɕar-nɯ
     tree :S/A:have.many.branches//:branch -search-
     ‘They search for a tree having a lot of branches’ → ‘They search for a tree and/with its branches’

This is thus a particular instance of a pathway from proprietive to comitative, which may be attested in other language families (Stassen 2000; Stolz et al. 2006; Arkhipov 2009).

2.6 Comparee and standard

In the comparative construction, both the comparee and the standard are marked, respectively by the postpositions kɯ and sɤz (example 11). The marker kɯ on the comparee is obligatory only if the standard is not overt; otherwise it is optional.

(11) [ɯ-ʁi]standard sɤz [ɯ-pi
     3.-younger.sibling . 3.-elder.sibling
     nɯ]comparee kɯ mpɕɤr
      . be.beautiful:
     ‘The elder one is more beautiful than the young one.’ (elicited)

The marker kɯ on the comparee is etymologically related to the ergative kɯ (borrowed from Amdo Tibetan). The complex grammaticalization pathway leading from ergative to comparee marker is presented in Jacques (2016a). This unusual pathway  →  rather than the more common  →  is all the more surprising as the ergative kə/ɣə in Amdo Tibetan, from which the Japhug ergative kɯ was borrowed, is used for the standard in the comparative construction. The marker sɤz contains the locative suffix -z (which also appears as a tautosyllabic postposition zɯ), but the etymology of the first element sɤ- is unknown.

Location

In Japhug, there are four distinct (not mutually exclusive) ways of marking locative adjuncts. First, locative and temporal adjuncts are commonly left unmarked. Second, they can take the locative postpositions zɯ or tɕu. Third, relator (possessed) nouns such as ɯ-ŋgɯ ‘inside’, ɯ-taʁ ‘on’, ɯ-pa ‘under’, ɯ-rkɯ ‘side’ can be used for more specific locations. They can be followed by the locative zɯ or tɕu as in (12).

(12) kha ɯ-rkɯ zɯ nɯnɯ qajɯ pɯ-nnɯ-ŋu, tɯrdoʁ
    house 3.-side   worm .--be grain
    kɯ-fse, tɤ-rɤku pɯ-kɯ-ʁndɤr
    :S/A-be.like .-cereals -:S/A-be.spilled
    kɯ-fse nɯra ɣɯ-tu-ndze ɲɯ-ŋu
    :S/A-be.like : --eat[III] -be
    ‘(During winter,) it comes near the house to eat worms or grains that have been spilled (on the ground).’ (23-pGAYaR, 94)

These postpositions are however optional, as shown by the following example from the same story as (12).

(13) kha ɯ-rkɯ kɤ-ɣi wuma ʑo rga
    house 3.-side -come really  :fact
    ‘It likes to come near the house.’ (23-pGAYaR, 95)

Fourth, the plural marker ra can indicate approximate location, as in (14). This use of ra is reminiscent of plural markers in Kirghiz and Old Japanese, which combine collective, hypocoristic and approximate locative meanings (see Antonov 2007, 195).

(14) tɯ-zda nɯ ma kɯmaʁ tɯrme a-pɯ-me
    .-companion  apart.from other people --not.exist
    tɕe, kha ra aʁɤndɯndɤt ɲɯ-ɤnɯɣro ɲɯ-ŋu ɲɯ-ti.
     house  everywhere -play -be -say
    ‘He says that (the young monkey) would play everywhere in the house whenever there are no other people (apart from members of the family).’ (19 GZW2, 10)


Topic and focus

Topic and focus markers do not strictly belong to nominal markers, since they can have scope over verb phrases or even entire sentences, but since they are mainly used with noun phrases, they are nevertheless included in this section. Three of the discourse markers in Japhug have clear etymologies: the delimitative topic pɯpɯŋunɤ (‘as for …’), the aforementioned topic iɕqha and the unexpected focus rcanɯ.

Delimitative topic

The delimitative topic marker pɯpɯŋunɤ ‘as for …’ is transparently derived from the conditional past imperfective form of the verb ‘be’ (as in 15), meaning originally ‘if it/he/she is …’.

(15) pɯ~pɯ-ŋu nɤ
    ~-.-be if
    ‘If it is …’

However, the grammaticalized status of this marker is obvious when the element marked with pɯpɯŋunɤ ‘as for …’ is a first or second person pronoun, as in example (16).

(16) nɤʑo pɯpɯŋunɤ, ɬɤndʐi ra ɣɯ nɯ-kɯ-βʁa,
    2 as.for demon   3.-:S/A-be.victorious
    nɯ-rɟɤlpu tɯ-ŋu
    3.-king 2-be:
    ‘You, you are the king of the demons.’ (hist140512 fushang he yaomo, 61)

Here, the form of the topic marker remains pɯpɯŋunɤ, without agreement: if this were still a conjugated verb, the second person singular form shown in (17) would have been expected instead.

(17) a-slama pɯ~pɯ-tɯ-ŋu nɤ
    1.-student ~-.-2-be if
    ‘If you were my student …’

This shows that pɯpɯŋunɤ is no longer a proper verb form, and has been fully grammaticalized.


Aforementioned topic

The temporal adverb jiɕqha ‘just before’ has become grammaticalized as a pre-nominal determiner ‘the aforementioned’ expressing that the nominal in question has been referred to previously in the discourse, though not in the last few sentences. In example (18), for instance, the leaf is mentioned four sentences before.

(18) tɯmɯ ci tɕhɤrnaʁ ci tɕhɯmtɕhɯm ko-lɤt. tɕendɤre
    rain  rain  .II:little.rain -throw 
    jiɕqha tɤ-jwaʁ nɯ pjɤ-nɯndzom tɕe,
    the.aforementioned .-leaf  :-flow.along 
    ɯ-ʁi ɯ-kɯr ɯ-ŋgɯ nɯ tɕu
    3.-younger.sibling 3.-mouth 3-inside  
    tɯ-ci χsɯ-ntɕhaʁ jamar pjɤ-ɕe.
    .-water three-drop about :-go
    ‘There was a little rain, and (the water) flowed along the leaf (that the elder brother had placed) and three drops of water flowed into his younger brother’s mouth.’ (Smanmi 11, 61)

Unexpectedness

The marker rcanɯ topicalizes the preceding noun phrase and emphasizes the unexpectedness of the situation or event described by the phrase that follows, as in (19), where the blackening of the sparrows surprised (and amused) the person telling the story.

(19) tɕendɤre thɯ-kɤ-βlɯ nɯ ɲɯ-ɕti tɕe, ɯ-ŋgɯ
     -:P-burn  -be:  chimney 3-inside
    ɲɯ-ɲaʁ rcanɯ kumpɣɤtɕɯ ra ɲɤ́-wɣ-sɯɣ-ɲaʁ-nɯ ʑo
    -be.black  sparrow  ---be.black- 
    ‘Because there has been burning going on, the inside of the chimney is black, and it made the sparrows (who had built a nest inside it) become (completely) black!’ (22 kWmpGAtCW, 72)

When it occurs before an adjectival verb, whether in finite or non-finite form as kɯdɯ~dɤn ‘numerous’ in (20), or before an ideophone (21), rcanɯ indicates high degree. Adjectival verbs in this case often have emphatic reduplication.

(20) tɕe nɯ ɕoŋtɕa rcanɯ kɯ-dɯ~dɤn ʑo
      wood  :S/A-~-be.a.lot 
    pjɤ-sɯ-phɯt-nɯ.
    --chop-pl
    ‘And they had (people) chop quite a lot of wood (for them).’ (28 qAjdo, 103)


(21) ɯ-phoŋbu nɯ rcanɯ ʁɲɟliʁɲɟli ʑo ɲɯ-pa
    3.-body   :II:huge;massive  -
    ‘Its body is huge.’ (20 sWNgi, 16)

This marker is derived from the possessed noun ɯ-rca ‘following, together with’ (see example 22) together with the distal demonstrative nɯ ‘that’.

(22) aʑo a-rca kɤ-ɣi mɤ-tɯ-cha
    1 1-following -come -2-can:
    ‘You cannot come with me.’

The evolution from ‘following, together with’ to ‘unexpectedness’ is not completely straightforward. The pathway  →  (non-scalar additive) →  (scalar additive) →  can be proposed.¹ It involves intermediate stages that are all attested: in particular, many languages have the same morpheme for scalar and non-scalar additives (for instance Karbi, see Konnerth [2014]), and the directionality is clearly from non-scalar to scalar additives. However, this hypothesis can only be confirmed if traces of the proposed intermediate stages are discovered in other Gyalrong languages.

 Verbal categories . Person indexation Japhug and other Rgyalrongic languages have a polypersonal indexation system that includes several morphemes with cognates in Kiranti languages (for instance, the second person *tə- prefix, the inverse *wə- and the direct third person object *-w), and appears to be at least in part of proto-Sino-Tibetan origin, though this issue is controversial (see DeLancey 2011; Jacques 2012a), and thus cannot be included in this paper. I focus here on generic person markers and portmanteau prefixes for local scenario 1 → 2 and 2 → 1, which have relatively transparent etymologies, and briefly discuss possible origins for the inverse prefix.

Generic person prefixes

Japhug has a system of generic person marking with ergative alignment (Jacques 2012b), in which generic S/P are marked by the prefix kɯ- (examples 23 and 24), while generic A is marked by the prefix wɣ- (example 25).

¹ This pathway was suggested by Nat Krause and Linda Konnerth.


(23) tɕeri tɤ-pɤtso pɯ-kɯ-ŋu tɕe, nɯ kɤ-ndza wuma ʑo
    but .-child .-:S/P-be   -eat really 
    pɯ-kɯ-rga.
    .-:S/P-like
    ‘When (we) were children, (we) liked it a lot.’ (12: ndZiNgri, 135)

(24) tɕe ʁja nɯnɯ tɯ-qhoχpa a-mɤ-thɯ-ɕe
     rust  .-inner.organ --:-go
    ra ma tu-kɯ-ɕɯ-ngo ɲɯ-ɕti
    have.to:  -:S/P--be.sick -be:
    ‘Rust should not go into one’s organs, otherwise it would cause one to get sick.’ (30: Com, 86)

(25) tɯrme kɯ tú-wɣ-ndza mɤ-sna.
    people  --eat -be.good:
    ‘It is not edible.’ (11: paRzwamWntoR, 39)

The generic human S/P prefix kɯ- is homophonous with the S/A participle prefix, illustrated by examples (26) and (27).

(26) spjaŋkɯ kɤ-kɯ-nɯʑɯβ
    wolf -:S/A-sleep
    ‘The wolf which had fallen asleep.’

(27) ɯ-kɯ-ndza
    3-:S/A-eat
    ‘The one who eats it.’

The generic human kɯ- most probably arose through the reanalysis of participles as finite verbs. The exact scenario for this grammatical change is too complex to be presented here in detail (see Jacques [2018] for a complete account), but the general lines are as follows. There is evidence that the ancestor of kɯ- in proto-Gyalrong could be used to mark generic human for all of S, A and P. First, there are two irregular verbs in Japhug, ti ‘say’ and sɯz ‘know’, which form their generic A with kɯ- rather than wɣ-. Second, other Gyalrong languages (Tshobdun and Situ, see Sun [2014b]) use the cognates of kɯ- for all generic human core arguments. The ancestor of Japhug kɯ-, proto-Gyalrong *kə-, was a general nominalizer that could be used for S-, A- and P-participles, and which was reanalyzed as a generic human marker for these three syntactic roles already in proto-Gyalrong. In Japhug, kɯ- was replaced by the inverse wɣ- to mark generic human in A function.² Japhug thus attests the following two paths of grammatical change:

(28)  > 
(29)  >  A
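The distribution of the generic person prefixes described above can be restated as a small lookup. The following Python sketch is purely illustrative: the function name `generic_prefix`, the constant names and the role labels are assumptions introduced here, and only the prefix shapes and the two irregular verbs are taken from the text.

```python
# Illustrative sketch only; all names are hypothetical.
GENERIC_S_P = "kɯ-"  # generic human S/P prefix
GENERIC_A = "wɣ-"    # generic human A prefix, homophonous with the inverse

# Irregular verbs that keep kɯ- for generic A: ti 'say' and sɯz 'know'.
KEEP_S_P_PREFIX_FOR_A = {"ti", "sɯz"}


def generic_prefix(role: str, verb_root: str) -> str:
    """Return the generic person prefix for a syntactic role (S, A or P)."""
    if role not in {"S", "A", "P"}:
        raise ValueError(f"unknown role: {role!r}")
    if role == "A" and verb_root not in KEEP_S_P_PREFIX_FOR_A:
        return GENERIC_A
    return GENERIC_S_P


print(generic_prefix("S", "ŋu"))    # kɯ-
print(generic_prefix("A", "ndza"))  # wɣ-
print(generic_prefix("A", "ti"))    # kɯ-
```

The exception set is kept separate from the rule so that the ergative split (S/P versus A) stays visible as the default, with the two archaic verbs as the only residue of the older pattern in which kɯ- covered all three roles.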

Portmanteau prefixes

The Japhug transitive conjugation includes two portmanteau prefixes for local scenarios, ta- 1 → 2 and kɯ- 2 → 1. The non-local forms taking these prefixes in Gyalrong languages have suffixes coreferent with the P, as illustrated by examples (30) and (31).

(30) ku-kɯ-nɤjo-a
    -2 → 1-wait-1
    ‘Wait for me!’ (heard in context)

(31) maka ʑo mɤ-ta-βde-ndʑi
    at.all  -1 → 2-leave-
    ‘I will never abandon you two.’ (140507 tangguowu, 166)

The portmanteau prefixes for 1 → 2 and 2 → 1 are nearly identical in Situ, Tshobdun and Zbu, as presented in Table 3 (data from Lín 1993, 218; Sun & Shidanluo 2002; Jacques 2012a; Gong 2014; Zhang 2019). Other Gyalrong languages only differ from Japhug in two regards: Japhug does not have the inverse wɣ- prefix in the 2 → 1 form, and Zbu and Tshobdun allow an alternative form with the second person prefix tə- and the inverse prefix. In all four languages, the verb receives suffixes coreferent with the patient (second person in 1 → 2 and first person in 2 → 1).³

² The generic human A prefix wɣ- in Japhug is homophonous with the inverse marker wɣ-, and presents exactly the same morphological alternations (in particular, it is one of the very few prefixes to attract stress and to be infixable with the progressive asɯ-). It is possible to synchronically analyze it as a particular instance of the use of the inverse, supposing an Empathy Hierarchy where generic humans are lower than inanimates (see Jacques [2010a, 2012b] for more details on the use of the inverse in Japhug).
(1) 1/2 > 3 animate > 3 inanimate > 3 generic human.

³ All languages apart from Situ allow double suffixation in 2 → 1, with the dual or plural suffixes stacked after the first person, as in Japhug ɲɯ-kɯ-mbi-a-nɯ -2 → 1-give-1 - ‘you (will have to) give (her) to me.’


Tab. 3: Local scenario prefixes in Rgyalrong languages.

            1 → 2    2 → 1
Japhug      ta-      kɯ-
Tshobdun    tɐ-      kə-o-, tə-o-
Zbu         tɐ-      kə-w-, tə-w-
Situ        ta-      kə-w-

A possible explanation for the 1 → 2 prefix is a combination of the second person prefix tɯ- and the agentless passive a-, which yields the expected form in all four languages (Jacques 2018). In this view, a form such as ta-no-n 1 → 2-chase-2 ‘I will chase you’ (Lín 1993, 219) would have developed through the following stages:⁴
– *tə-ŋa-naŋ-nə 2--chase-2 ‘you will be chased’ (passive form)
– *ta-naŋ-nə 2:-chase-2 (regular phonological fusion between the person marker and the passive prefix, attested in all four Rgyalrong languages)
– *ta-naŋ-nə 1 → 2-chase-2 ‘I will chase you’ (reanalysis of the fused form as a portmanteau prefix; the unspecified agent of the passive construction is construed as being first person)
– ta-no-n 1 → 2-chase-2 ‘I will chase you’ (regular sound changes)

In the case of 2 → 1, the phonetic identity of this prefix with the nominalizer/generic in all four languages is striking. If, as suggested above, the grammaticalization of the nominalizer kə- as a generic person marker goes back to the common ancestor of all four Rgyalrong languages and not simply that of Japhug and Tshobdun, we may interpret the origin of a form such as kə-w-no-ŋ ‘you will chase me’ in the following way:
– *kə-wə-naŋ-ŋa --chase-1 ‘someone will chase me’ (generic form, with inverse since the SAP argument is patient)
– *kə-wə-naŋ-ŋa 2 → 1--chase-1 ‘You will chase me’ (reanalysis of the fused form as a portmanteau prefix; the unspecified agent of the passive construction is construed as being second person, i.e., the SAP participant not otherwise indexed in the verb form)
– kə-w-no-ŋ 2 → 1-chase-1 ‘You will chase me’ (regular sound changes)

Note that in the Situ Gyalrong language, unlike Japhug and Tshobdun, nominalized forms in kə- (the cognate of Japhug kɯ-) are compatible with person affixes in particular conditions (see Sun & Lin 2007, 11–12), as in (32), where the verb ‘come’ bears the dual suffix -ntʃ. It is impossible to nominalize a verb in this way in the other Gyalrong languages.

⁴ Proto-Rgyalrong follows the preliminary sound laws presented in Jacques (2004).

(32) tərmî to-kə́-pə-ntʃ=tə tʂaʃī nɐrə ɬamō na-ŋôs-ntʃ
    person --come:-= Trashi and Lhamo .-be:-
    ‘The people who came were Trashi and Lhamo.’

In this view, the absence of inverse marker in the 2 → 1 form in Japhug is an innovation, which can be explained by the fact that the inverse is redundant in this form. This redundancy is solved in a different way in Zbu and Tshobdun, where at least some speakers accept forms replacing the portmanteau kə- by the second person tə- (see Sun & Shidanluo 2002 and Gong 2014).

Inverse prefix

The inverse prefix wɣ- (proto-Gyalrong *wə) has cognates in many other languages of the family (Jacques 2012a), in particular Kiranti (Bantawa ɨ-, Doornenbal 2009), and is not a Gyalrong-specific innovation. Given its antiquity, attempts at etymologizing this marker are necessarily speculative. There are two possibilities to account for the origin of this prefix, if it is indeed etymologizable.

First, it could derive from the verb ‘to come’ (Japhug ɣi < *wi, a verb widely attested in the Sino-Tibetan family), through the well-established pathway  >  >  >  (Jacques & Antonov [2014]; see also Konnerth [2015] for a potential counterexample).

Second, it could originate from the third person possessive marker: although in Japhug the two prefixes are dissimilar (wɣ- vs ɯ-), they are homophonous in all other languages (for instance, in Bantawa and Zbu). Such grammaticalization could have taken place through a nominalized form without nominalization prefix taking a third person possessive prefix, used in subordinate clauses that are later reinterpreted as main clauses. Japhug indeed has a non-finite verb form of this type, the bare infinitive (a form discussed in particular in Jacques [2014b]), which could be analogous to the hypothesized construction from which the inverse could have been developed.

These hypotheses must be considered preliminary until full reconstructions of proto-Gyalrongic and proto-Kiranti become available.

Associated motion

Japhug has a simple associated motion system, with one translocative / andative prefix ɕɯ- and a cislocative / venitive prefix ɣɯ- transparently grammaticalized from the verbs ɕe ‘go’ and ɣi ‘come’ respectively. These prefixes are morphologically fully integrated, as illustrated by example (33), where the translocative (in the allomorph ɕ-) appears closer to the root than the negation marker, and cannot bear any TAM or person marker.

(33) ma-ɕ-thɯ-tɯ-ʑɣɤ-βde ma nɤ-wa ɲɯ-ɤkhu
    ---2--throw because 2.-father -call
    ‘Don’t throw yourself (in the river), your father is calling you.’

In Situ, Lín (2003) notes that the cislocative has been further grammaticalized as marking prospective aspect.

Grammaticalization of motion verbs as prefixes is unexpected in a strict verb-final language like Japhug, especially since purposive complements of motion verbs are always preverbal. These prefixes therefore originate from a construction where the motion verbs appeared before the main verb, either in a serial verb construction or simple parataxis (Jacques 2013b; Jacques et al. forthcoming).

Voice

The main sources for voice markers in Japhug are denominal prefixes. Five of the voice derivation prefixes, namely the Antipassive, the Applicative, the Causative, the Passive and the Deexperiencer, are homophonous with denominal derivations with similar meanings, as shown in Table 4.

Tab. 4: Voice markers and corresponding denominal derivations.

Form      Voice                Corresponding denominal prefix
rɤ-       Antipassive          rɤ- (intransitive dynamic verbs)
nɯ(ɣ)-    Applicative          nɯ(ɣ)- (transitive dynamic verbs)
sɯ(ɣ)-    Causative            sɯ(ɣ)- (verb meaning ‘use X’ or ‘cause to have X’)
a-        Agentless Passive    a- (stative verb)
sɤ-       Deexperiencer        sɤ- (stative verb expressing a property)

These five voice derivations and their corresponding denominal origin are discussed in the following. In addition, voice derivations originating from markers other than denominal prefixes (in particular, the reflexive ʑɣɤ-) are briefly analyzed.

Antipassive

The relationship between voice and derivation prefixes was first explained in the case of the Antipassive prefix rɤ- (Jacques 2014b), a prefix attested only in Gyalrong languages and not even found in Khroskyabs, their closest relative (Lai 2017).


The Antipassive derives from the intransitive denominal prefix rɯ-/rɤ- through a two-stage pathway. First, an action (or patient) nominal is derived from a transitive verb. This action nominal has the same form as the bare root of the verb, but is a possessed noun requiring a possessive prefix. For instance, from ɕphɤt ‘patch’ one derives the possessed noun -ɕphɤt ‘a patch’, which, in the absence of a definite possessor, must occur with the indefinite possessor prefix tɤ- (tɤ-ɕphɤt). Second, intransitive derivation in rɯ-/rɤ- is applied to this possessed noun, yielding the form rɤ-ɕphɤt ‘to do patching’. Following the regular pattern, possessive prefixes are lost during denominal derivations, so that a form such as *rɤ-tɤ-ɕphɤt with the indefinite possessor prefix would not be expected.

The end form rɤ-ɕphɤt ‘to do patching’ can then be reanalyzed as being directly derived from the base transitive verb ɕphɤt ‘patch’, and since the S of this intransitive verb corresponds to the A of the transitive verb ɕphɤt, and the P is lost, this originally denominal prefix is reinterpreted as being an Antipassive marker. Then, this prefix is overgeneralized to most transitive verbs. This reanalysis probably occurred recently in Gyalrongic, as forms such as rɤ-ɕphɤt are still synchronically ambiguous between an Antipassive and a Denominal verb.

While antipassive derivations are attested in other languages of the Sino-Tibetan family (Jacques forthcoming), Japhug and other Gyalrong languages are unique in having developed an antipassive in this fashion. Further evidence for this pathway can be found in irregular nominal forms, as there are several verbs for which a semantic or morphological irregularity is shared between the Antipassive verb and the corresponding action/patient noun, but not the base transitive verb, showing that the Antipassive form derives from the noun. For example, the intransitive verb rɤ-nŋa ‘owe money’ is an irregular Antipassive form of ŋa ‘owe X’; the additional -n- is also found in the noun -nŋa ‘debt’, showing that this irregular Antipassive historically derives from the noun -nŋa ‘debt’ rather than directly from the transitive verb ŋa ‘owe X’.⁵

The pathway presented here can be summarized as (34):

(34)   of transitive verb +    → 

The general mechanism is that the action nominalization neutralizes the transitivity of the base verb, and that a new transitivity and argument structure is allocated by the denominal prefix.
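The two-stage derivation just described can be traced mechanically on the two verbs cited in the text. The Python sketch below is only an illustration under simplifying assumptions: the function names and the dictionary of irregular noun stems are hypothetical, the allomorph rɯ- is ignored, and only the forms given above (ɕphɤt, tɤ-ɕphɤt, rɤ-ɕphɤt, ŋa, -nŋa, rɤ-nŋa) are taken from the source.

```python
# Illustrative sketch only; names and data layout are hypothetical.
INDEF_POSS = "tɤ-"      # indefinite possessor prefix
DENOMINAL_INTR = "rɤ-"  # intransitive denominal prefix (rɯ- allomorph ignored)

# Action/patient nouns whose stem differs from the bare verb root.
IRREGULAR_NOUN_STEMS = {"ŋa": "nŋa"}  # ŋa 'owe X' -> -nŋa 'debt'


def action_noun_stem(verb_root: str) -> str:
    """Stage 1: derive the possessed action/patient noun stem from the verb."""
    return IRREGULAR_NOUN_STEMS.get(verb_root, verb_root)


def citation_form(noun_stem: str) -> str:
    """A possessed noun without a definite possessor takes the indefinite prefix."""
    return INDEF_POSS + noun_stem


def antipassive(verb_root: str) -> str:
    """Stage 2: denominal derivation applies to the bare noun stem,
    because possessive prefixes are lost in denominal derivation."""
    return DENOMINAL_INTR + action_noun_stem(verb_root)


print(citation_form(action_noun_stem("ɕphɤt")))  # tɤ-ɕphɤt
print(antipassive("ɕphɤt"))                      # rɤ-ɕphɤt
print(antipassive("ŋa"))                         # rɤ-nŋa
```

Routing the derivation through `action_noun_stem` rather than prefixing the verb root directly is what lets the irregular -n- of -nŋa surface in rɤ-nŋa, which mirrors the argument that the Antipassive is built on the noun.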

⁵ See Jacques (2014b) for additional examples of common idiosyncrasies between action noun and antipassive verb.


The same two-step pathway of grammaticalization proposed to account for the origin of the antipassive prefix can also be applied to four other voice derivation prefixes: the causative sɯ-, the deexperiencer sɤ-, the passive a- and the applicative nɯ-. Although for these derivations, unlike the antipassive case, we lack common (semantic or morphological) irregularities between action nouns and derived verbs, the semantics of the voice markers and of the corresponding denominal derivations are very close.

Causative

The causative prefix sɯ(ɣ)- is one of the most productive derivation prefixes in Japhug, and can be applied to nearly all transitive and intransitive verbs (the detailed meaning of this prefix and the constructions in which it can be used are described in Jacques [2015b]). The homophonous denominal prefix sɯ(ɣ)- derives verbs meaning ‘use X’ or ‘cause to have X’, such as sɯ-ɕtʂi ‘cause to sweat’ (from the possessed noun -ɕtʂi ‘sweat’), or sɯɣ-tshaʁ ‘to sieve (= to use a sieve)’ from tshaʁ ‘sieve’. The causative sɯ(ɣ)- can thus be explained as the result of reanalysis from the denominal derivation ‘cause to X’ from a possessed action nominal deriving from the base verb:

(35)   of verb +    → 

Deexperiencer

The deexperiencer prefix sɤ- derives stative verbs from intransitive verbs whose S is an experiencer or any non-agentive semantic role. The S of the deexperiencer verb corresponds to the stimulus. Examples include rga ‘like’ → sɤ-rga ‘be lovable’ or ŋgio ‘slip (of a human)’ → sɤ-ŋgio ‘be slippery (of the ground)’ (Jacques 2012b). There are a few examples of a denominal prefix sɤ- expressing a property related to the base noun, such as -ndɤɣ ‘poison’ → sɤ-ndɤɣ ‘be poisonous’ or -mbrɯ ‘anger’ → sɤ-mbrɯ ‘be angry’. The semantics of the deexperiencer derivation is closely related to that of the verb sɤ-ndɤɣ ‘be poisonous’: the property of an object that has effects on surrounding people. Here again, the deexperiencer can be hypothesized to derive from the denominal derivation in sɤ- of a possessed action nominal deriving from the base verb, as in (36).

(36)   of verb +    → 


Passive

The passive a- is an agentless passive, which derives intransitive verbs whose S corresponds to the P of the base verb, as a-ta ‘be put on/in’ from ta ‘put’. The corresponding denominal prefix a- is used to derive a stative verb describing a shape related to the noun, or a visible/perceptible concrete property, as in -ci ‘water’ → aci ‘be wet’, ʑɤwu ‘lame’ → aʑɤwu ‘be lame’ or scaʁa ‘magpie’ → ascaʁa ‘be white and black (like a magpie)’.⁶ The passive is mainly used in the text corpus with concrete action verbs (a-ta ‘be put on’, a-rku ‘be (put) in’, a-mphɯr ‘be wrapped’ etc.), which are generally used (though not exclusively) with a resultative meaning, thus basically stative like the denominal in a-. Hence, as in the case of all preceding derivations, it is possible to hypothesize that the passive originates from the reanalysis of the denominal derivation in a- of the action nominalization of a transitive verb, following the pathway indicated in (37).

(37)   of transitive verb +    →   

Applicative

The applicative nɯ(ɣ)- derives a transitive verb from an intransitive one; unlike in the causative derivation, the A of the applicative verb corresponds to the S of the intransitive one, and a P argument is added (Jacques 2013a). The P of applicative verbs refers to either the stimulus in the case of cognition verbs (mu ‘be afraid (intr)’ → nɯɣ-mu ‘fear (tr)’) or the addressee (akhu ‘shout (intr)’ → nɯ-ɤkhu ‘invite (from ‘shout at’) (tr)’). The corresponding denominal derivation nɯ(ɣ)- has many different meanings, but its most productive one is to create a transitive verb from a noun, especially when one has a pair with an intransitive verb in rɯ-. For instance, from a noun such as ftɕaka ‘manner’ one can derive the intransitive rɯftɕaka ‘make preparations’ and the transitive verb nɯftɕaka ‘prepare (vt)’. Supposing an action noun such as ‘fear’ from the verb mu ‘be afraid’, applying this nɯ(ɣ)- derivation would predictably yield a transitive verb with the meaning ‘be afraid of, fear’ of the applicative verb nɯɣmu. It is thus possible here again to suppose that the applicative derivation in Japhug came into being through the pathway in (38).

(38)   of intransitive verb +    → 

⁶ The last two examples are nouns borrowed from Tibetan, showing that this derivation is fully productive.


Although for the causative, applicative, passive and deexperiencer derivations, no common irregularities between action noun and derived verb have been brought to light up to now, the case for reanalysis of the denominal marker as a voice marker is strong, as they not only have compatible semantics and phonological shape, they also share identical allomorphs (nɯ-/nɯɣ-/nɤ-, sɯ-/sɯɣ-/sɤ- and a-/ɤ-, see Jacques [2013a, 2015b]; Jacques & Chen [2007] for more details).

Reflexive

The reflexive ʑɣɤ- (and its cognates in other Gyalrong languages) differs from all other derivations in that it does not derive from a denominal prefix. Two hypotheses have been proposed to account for its origin. Jacques (2010b) proposed that ʑɣɤ-, from proto-Gyalrong *wjɐ-, results from the incorporation of the third person full pronoun *wəjaŋ (Japhug ɯʑo) with phonological reduction. Sun (2014a) argued that it originates from the fusion of the pronominal root *-jaŋ with the verb stem, to which the inverse prefix *wə- is added. These two hypotheses agree in any case that this prefix is partly derived from the bound pronominal root *-jaŋ ‘oneself’, and that its shape in proto-Gyalrong was *wəjaŋ; the disagreement between them concerns the interpretation of the nature of the element *wə- in this form, since both the inverse marker and the third person singular possessive prefix have the same shape.

Incorporation

Japhug has an incorporation-like construction in which noun-verb nominal compounds are turned into verbs by means of a denominal prefix (Jacques 2012c). For instance, from the noun cɯ ‘stone’ and the verb phɯt ‘pluck, take out’ one can derive an action nominal cɯphɯt ‘clearing the stones (from a field, before ploughing)’, which can in turn be made into an incorporating verb by denominal derivation ɣɯ-cɯphɯt ‘take out stones (out of the field)’.

(39) a. cɯ nɯ-phɯt-a
        stone -take.out-1
     b. cɯ-phɯt nɯ-βzu-t-a
        stone-clearing -do--1
     c. nɯ-ɣɯ-cɯ-phɯt-a
        --stone-take.out-1
        ‘I cleared the stones.’ (from the field)


The construction (39c) has further become a full incorporating construction in the closely related Khroskyabs language, where the denominal prefix has in some cases disappeared due to phonological attrition (Lai 2017, 388–409). Gyalrongic languages thus offer a third possible origin for incorporating constructions, in addition to coalescence of noun and verb and backformation (Mithun 1984): reanalysis of denominal verbs derived from noun-verb nominal compounds.

TAME

Tense-Aspect-Modality-Evidentiality in the Japhug verb is mainly marked by orientation prefixes and stem alternations. The diachronic origin of stem alternations is completely opaque, so that the present section focuses on orientation prefixes. In addition, I discuss the progressive prefix asɯ-, for which a Japhug-internal etymology can be proposed.

Orientation prefixes

With one exception (the non-past factual), all finite verb forms in Japhug obligatorily take one and only one orientation prefix. As shown in Table 5, orientation prefixes encode seven different directions, and come in four distinct sets, here marked as A to D. Finite verb forms are built by combining a specific orientation prefix with the appropriate verb stem, as indicated in Table 6. With the exception of motion verbs and concrete action verbs, which are compatible with all directions, most verbs have only one or two lexically determined orientations, which appear in the Imperfective, Past Perfective, Past Inferential, Irrealis and Imperative. Thus for instance the verb ndza ‘eat’ appears with the ‘upwards’ orientation prefixes, as shown by the 3 → 3 Imperfective tu-ndze ‘He eats it’ (series B prefix, ndza → stem 3 ndze) or the imperative 2 → 3 tɤ-ndze ‘Eat it!’ (series A prefix).

Tab. 5: Orientation prefixes in Japhug Rgyalrong.

                perfective (A)    imperfective (B)    perfective 3 → 3’ (C)    inferential (D)
up              tɤ-               tu-                 ta-                      to-
down            pɯ-               pjɯ-                pa-                      pjɤ-
upstream        lɤ-               lu-                 la-                      lo-
downstream      thɯ-              chɯ-                tha-                     chɤ-
east            kɤ-               ku-                 ka-                      ko-
west            nɯ-               ɲɯ-                 na-                      ɲɤ-
no direction    jɤ-               ju-                 ja-                      jo-


Tab. 6: Finite verb categories in Japhug Rgyalrong.

                                  prefixes
Non-past Factual                  no prefix
Non-Past Imperfective             B
Past Perfective                   A or C
Past Imperfective                 pɯ-
Past Inferential Perfective       D
Past Inferential Imperfective     pjɤ-
Sensory Imperfective              ɲɯ-
Egophoric Present Imperfective    ku-
Irrealis                          a- + A
Imperative                        A

However, three TAM categories, namely Egophoric, Sensory and Past Imperfective, always require the same orientation prefix (respectively ‘towards east’ (B) ku-, ‘towards west’ (B) ɲɯ- and ‘downwards’ (A, D) pɯ-/pjɤ-), regardless of the orientation lexically selected by the verb in question. For instance, the sensory form of ‘eat’ is ɲɯ-ndze sens-eat[III] ‘he/it eats it’ with the ‘towards west’ series B orientation prefix ɲɯ- instead of an ‘upward’ orientation prefix. The pathway of grammaticalization  →   is relatively straightforward and has been the topic of a specific study which does not need to be repeated here (Lin 2011). Given the fact that the ‘downwards’ orientation prefix is used to build the past imperfective category in all Gyalrong languages, grammaticalization most probably took place at the proto-Gyalrong stage.
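The interaction between lexically selected orientation and TAM-determined prefixes can be summarized as a lookup. The following Python sketch is a simplification offered for illustration only: the names `SERIES`, `FIXED`, `LEXICAL` and `orientation_prefix` are assumptions introduced here, the prefix shapes are those read from Table 5, and only the categories explicitly discussed in this paragraph and the preceding one are covered.

```python
# Illustrative sketch only; names are hypothetical and coverage is partial.
DIRECTIONS = ["up", "down", "upstream", "downstream", "east", "west", "no direction"]

# Orientation prefix series as read from Table 5.
SERIES = {
    "A": ["tɤ-", "pɯ-", "lɤ-", "thɯ-", "kɤ-", "nɯ-", "jɤ-"],
    "B": ["tu-", "pjɯ-", "lu-", "chɯ-", "ku-", "ɲɯ-", "ju-"],
    "C": ["ta-", "pa-", "la-", "tha-", "ka-", "na-", "ja-"],
    "D": ["to-", "pjɤ-", "lo-", "chɤ-", "ko-", "ɲɤ-", "jo-"],
}

# Categories whose prefix is fixed regardless of the verb's lexical orientation.
FIXED = {
    "Egophoric": ("B", "east"),
    "Sensory": ("B", "west"),
    "Past Imperfective": ("A", "down"),
}

# Categories that use the verb's lexically selected orientation.
LEXICAL = {"Imperfective": "B", "Imperative": "A"}


def orientation_prefix(category: str, lexical_direction: str) -> str:
    """Return the orientation prefix for a TAM category and a verb's orientation."""
    if category in FIXED:
        series, direction = FIXED[category]
    else:
        series, direction = LEXICAL[category], lexical_direction
    return SERIES[series][DIRECTIONS.index(direction)]


# ndza 'eat' lexically selects the 'up' orientation.
print(orientation_prefix("Imperfective", "up"))  # tu-
print(orientation_prefix("Imperative", "up"))    # tɤ-
print(orientation_prefix("Sensory", "up"))       # ɲɯ-
```

Keeping the fixed categories in their own table makes the point of the paragraph explicit: for Egophoric, Sensory and Past Imperfective the lexical orientation argument is simply ignored.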

3.5.2 Egophoric and Sensory evidential prefixes

For the remaining two categories, Sensory and Egophoric, note that in other Gyalrong languages, including Situ, the ‘towards east’ ko- and ‘towards west’ nə- prefixes are etymologically related to Japhug ku- and ɲɯ- respectively, and that they also have the Egophoric and Sensory Evidential functions attested in Japhug (Lin 2002). These data could seem to provide evidence for the pathways indicated in (40).

(40) towards east → egophoric
     towards west → sensory

Yet, the functional link between evidentiality and the solar orientation system is not obvious, and it is by no means certain whether the orientation system of proto-Gyalrong, when the Sensory and Egophoric markers were grammaticalized, indeed included an East/West solar-based dimension, or whether the prefixes ancestral to Japhug ɲɯ- and ku- expressed something different. It should be noted that the prefixes ɲɯ- and ku- in Japhug and their cognates in other Gyalrong languages have meanings other than ‘towards west’ and ‘towards east’. In particular, Lín (1993, 228–9) argues that in Situ, the ‘towards west’ prefix nə- expresses in some cases ‘centrifugal’ or ‘towards outside’ directions (离心 向外扩散), while the ‘towards east’ prefix ko- expresses ‘centripetal’ direction (向心). There is some evidence that the same is true in Japhug too; for instance, in example (41) we see that the imperfective of the verb ‘to separate’ (in this particular context, ‘spread wings’) takes the prefix ɲɯ- ‘towards west’ (thus expressing motion away from oneself), while that of the verb ‘to put together, to gather’ (here ‘to fold wings’) takes ku- ‘towards east’ (motion towards oneself).

(41) ji-kɯ-nɯqambɯmbjom tɤ-kɯ-rɤŋgat nɯ kɯ-fse,
     -:S/A-fly -:S/A-prepare  :-be.like
     ɯ-ʁar nɯ ɲɯ-qɤt nɤ ku-wum,
     3.-wing  :-separate  :-put.together
     ɲɯ-qɤt nɤ ku-wum ŋu
     :-separate  :-put.together be:
     ‘(When it tweets), it does as if it were about to fly, it spreads its wings and then folds them, spreads its wings and then folds them.’ (24-ZmbrWpGa, 121)

It makes more sense that centripetal orientation, rather than ‘towards east’ direction, would be grammaticalized as an egophoric marker. In Japhug, the egophoric indicates that the speaker has an intimate knowledge of a state of affairs due to his direct participation in the event, as in example (42). It is mainly restricted to first person forms in assertive sentences, though it is also compatible with third persons in the case of third person referents possessed by the first person (‘my son’, ‘my work’ etc.).
(42) ɯ-spa ci ku-taʁ-a
     bag 3.-material  :-weave-1
     ‘I am weaving (cloth to make) a bag.’ (conversation, 14.10)

In interrogative sentences, due to the rule of anticipation (using the TAM category one expects the addressee will employ in his answer, see Tournadre & LaPolla [2014]), Egophoric marking appears in second person forms, or in third person forms in the case of referents possessed by a second person (see example 43 below). While no direct pathway of grammaticalization centripetal/cislocative → egophoric has ever been reported, there are clear cases of cislocatives becoming 2/3 → 1 portmanteau person markers, in particular in hierarchical indexation systems (see Jacques & Antonov 2014). While the exact pivot construction which could allow reanalysis from centripetal/cislocative to egophoric is still unclear, this hypothesis is worth exploring in more detail.

Accounting for the pathway centrifugal → sensory might be more difficult at first glance. However, it should be noted that the Sensory evidential is used in direct opposition to the Egophoric in most contexts. In assertive sentences, it is very rarely used with a first person verb form (only if the speaker forgot something or lost consciousness at a certain stage) and is mainly restricted to second and third person forms. In interrogative sentences, it appears with first or third persons, and rarely with second persons. There is almost complete complementary distribution with the Egophoric. Sentences (43) and (44) illustrate the difference in use between the Egophoric and Sensory forms in present third person forms, the only context where the two TAM categories commonly contrast with each other. These questions expect answers in the Egophoric and Sensory forms respectively. Question (44) was asked when I phoned from my parents’ home (when I came for the holidays). The Sensory is used because I only seldom meet with my parents, and the expectation is that I had just realized whether or not they were well after having arrived at their place. Question (43), on the other hand, which asks about my son, expects an answer in the Egophoric: since I live with him in the same house, I always know whether he is fine or not (I did not ‘discover’ whether he was fine at a certain point). No other TAM category could be appropriate in this context.7

(43) nɤ-tɕɯ ɯ-kú-pe?
     2.-son --be.good
     ‘Is your son well?’ (conversation 2014.02)

(44) nɤ-mu nɤ-wa ni ɯ-ɲɯ́-pe-ndʑi?
     2.-mother 2.-father  --be.good-
     ‘Are your parents well?’ (conversation 2014.12)

The Egophoric and the Sensory are thus in near-complementary distribution, and in the few cases where both are possible with a verb in the same person form, the contrast is nearly always binary.
The opposition between Egophoric (personally experienced knowledge) and Sensory (knowledge mediated through observation or second-hand report) thus appears to have been grammaticalized as a metaphorical extension of that between motion towards vs away from the speaker.

7 For instance, the factual could only be used for states of affairs that are part of commonly accepted knowledge.

3.5.3 Progressive

The progressive asɯ- / ɤsɯ- / az- / ɤz- differs from most TAM markers in being disyllabic (at least in some of its allomorphs) and in the fact that two verb prefixes (the inverse and the autobenefactive, see example 45 and Jacques [2015c]) can be infixed within it, suggesting that this prefix should be etymologically analyzed as a combination of two elements.

(45) tɕe pjɤ-ɣi tɕe qala kɯ pjɤ-k-ɤ́z-nɤjo-ci tɕe,
      :-come  rabbit  .--wait- 
     ‘(The leopard) came down, and the rabbit was waiting for him there.’ (The smart rabbit.2014, 60)

It can only be used with transitive verbs, and removes all markers of morphological transitivity (stem 3 alternation, past 1/3 → 3 -t suffix) on the verb forms, as in (46), where stem 3 ndɤm, rather than stem 1 ndo, would be expected in the factual. The verb nevertheless remains syntactically transitive, and the A still takes the ergative marker (as in example 45).

(46) sɯjno ɯ-mdoʁ ʑo asɯ-ndo.
     grass 3.-colour  -hold:
     ‘It has the colour of grass.’ (25 rtchWRjW, 69)

These two features can be accounted for by assuming that asɯ- originates from the combination of the agentless passive a- (on which see section 3.3.4) with the causative sɯ- (section 3.3.2). First, the causative derivation was applied (the sɯ- element is closer to the verb stem). The causative turned the transitive base verb into a ditransitive one. Then, the passive turned it back to two-argument valency, suppressing the causer and removing all morphological transitivity marking. In addition, as mentioned in section 3.3.4, the passive in Japhug has a stative overtone, which, applied to a dynamic transitive verb, yielded a progressive reading. The combination of passive and causative became common enough to change from a combination of derivations into an inflexional marker.
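The order of derivations proposed for asɯ- can be traced step by step in a short Python sketch. This is a toy model under stated assumptions rather than a formalization from the source: the `Verb` record, its two fields and the function names are hypothetical, and valency is reduced to a simple argument count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verb:
    form: str
    valency: int               # number of core arguments
    transitive_marking: bool   # stem 3 alternation, past -t suffix, etc.


def causative(verb):
    """sɯ- adds a causer, raising valency by one."""
    return Verb("sɯ-" + verb.form, verb.valency + 1, verb.transitive_marking)


def agentless_passive(verb):
    """a- suppresses the highest argument and strips transitivity marking."""
    return Verb("a-" + verb.form, verb.valency - 1, False)


# Base transitive verb ndo 'hold' (stem 1), as in example (46).
base = Verb("ndo", 2, True)

# Causative applies first (closer to the stem), then the passive.
progressive_source = agentless_passive(causative(base))
print(progressive_source)
```

The result has the same valency as the base verb but no morphological transitivity marking, which is the combination of properties the hypothesis is meant to explain.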
This hypothesis does not account for all the data; in particular, it does not explain the presence of the ergative on the A. According to the model proposed here, the A of the base transitive verb, turned into a causee by the causative derivation, should be changed to an S by the passive one and would not be expected to take the ergative: the resulting verb form should have a zero-marked S corresponding to the A of the base verb, and a zero-marked adjunct, not indexable in the verb morphology, corresponding to the P of the base verb. It is possible that sentences such as (46), with zero-marked 3 → 3 form and non-overt A, were the pivot allowing reinterpretation from a stative intransitive construction into a syntactically transitive construction. Suppose that we accept the historical hypothesis proposed above. At an earlier stage,8 in the progressive construction, the referent corresponding to the A of the basic transitive construction was not marked with the ergative, and the one corresponding to the P of the basic transitive construction could not be indexed on the verb. A sentence such as (46), where the first referent is not overt (and thus the presence or absence of ergative not explicitly manifested), and the second referent is third person (zero-marked), could be reinterpreted as a syntactically transitive one by analogy with other transitive constructions, keeping the surface form but modifying the underlying analysis.

8 Cognates of the asɯ- prefix are found only in Tshobdun and Zbu, not in Situ; this is probably a Northern Gyalrong innovation. This earlier stage would correspond to the exclusive common ancestor of Japhug, Tshobdun and Zbu (see the comparative discussion in Gong [2018, 294–298]).

4 Complex constructions

The present section focuses on a selection of complex constructions involving linking elements for which a straightforward etymology can be proposed. Very few constructions are discussed in this section; those borrowed from Tibetan (such as the conditional in nɤ ‘if’), subordinate clauses with a finite main verb and no overt subordinator (most complement clauses, some relative clauses, see Jacques [2016b]) and subordinate clauses with a main verb in participial form are not treated here.

4.1 Alternative

Japhug has a conjunction me ‘whether … or’ repeated after each noun or phrase in the alternative correlative construction, as in example (47).

(47) saɕɯ nɯnɯ ɯ-qa me, ɯ-ru me, ɯ-jwaʁ
     larch  3.-root whether 3.-trunk whether 3.-leaf
     me nɯra tɯrgi cho naχtɕɯɣ
     whether : fir  be.similar:
     ‘Whether its root, its trunk or its leaves, the larch is identical to the fir.’ (08 saCW, 5)

This conjunction is obviously grammaticalized from the negative existential copula me ‘not exist’ through an alternative concessive conditional ‘whether … exists or not’, originally involving the affirmative and negative existential verbs tu vs me, as in (48).

(48) tɤ-ʁa me tɕe, nɯ pɯ-nɯ-tu
     .-free.time not.exist:   .--exist
     pɯ-nɯ-me kɯ-khɯ nɯ kɯ-rga me.
     .--not.exist :S/A-be.possible  :S/A-like not.exist:
     ‘(Nobody gathers wild strawberries), because (we) don’t have time, it does not matter whether there are (strawberries) or not, nobody likes it.’ (11 paRzwamWntoR, 92)

In its grammaticalized form, me has lost all person and TAME marking. In (49), we see that the conjunction me does not take first or second person singular indexation when used with a pronoun, as would be expected if it still were a verb and the construction an alternative concessive conditional.

(49) aʑo me, nɤʑo me, ɯʑo me, kɤsɯfse ɕe-j ra
     1 whether 2 whether 3 whether all go:-1 have.to:
     ‘Whether I, you or he, we all have to go.’ (elicited)

4.2 Adversative

There are four different adversative constructions in Japhug whose meanings can all be translated as English ‘not only / not just … but also’, and which appear to be semantically very close and interchangeable. All four constructions are recently grammaticalized and etymologically transparent. First, mɤra ma ‘not just, …’ occurs after either noun phrases (example 50) or clauses. It is grammaticalized from the negative form of the verb ra ‘need, have to’ followed by the linker ma, a construction that still exists in the language, as in example (51).

(50) nɤʑo mɤrama rɟɤlpu ɕɯŋarɯra kɯ ta-thu-nɯ
     2 not.just king each.better.than.the.other  :3 → 3’-ask-
     ɕti ri, mɯ-tɤ-nɤla-j ɕti tɕe
     be.:  --agree-1 be.: 
     mɤ-jɤɣ
     -be.possible:
     ‘Not just you, many kings, each better than the other (came) to ask (for our daughter in marriage), but we did not agree, so it is not possible.’ (The fox, 72–73)

(51) tɯ-nɯzdɯɣ-nɯ mɤ-ra ma a-βlu tu
     2-worry.about:- -have.to  1.-idea exist:
     ‘You don’t need to worry about that, I have an idea.’ (hist140505 liuhaohan zoubian tianxia, 217)

Second, the linker ɯtɤjɯ ‘not only …’, mainly used after finite verbs, as in (52), is originally a relator noun meaning ‘something added’, ‘some more …’, as in (53). It can be compared to constructions such as English ‘in addition to being X, it is also Y’.

(52) ɕɯ-mŋɤm ɯtɤjɯ ɲɯ-sɤzoŋzoŋ ʑo ŋu
     -hurt: not.only -tingle  be:
     ‘Not only does it (nettle) hurt, it also causes a tingling sensation.’ (hist140428 mtshalu, 6)

(53) ki nɤ-ŋga ɯ-tɤjɯ a-pɯ-ŋu ma tɯ-nɤndʐo
     this 2.-clothes 3.-added --be  2-feel.cold:
     ‘Have some more clothes, otherwise you will be cold.’ (Jacques 2015a)

Third, the form mɤkɯjɤɣ kɯ ‘not only’, used after finite verbs (example 54), is the negative S/A participle of the verb jɤɣ ‘be possible, be allowed’ followed by the ergative kɯ. Example (55) illustrates the same verb form in its non-grammaticalized use.

(54) tɤ-mthɯm nɯra tu-ndze mɤkɯjɤɣ kɯ, ɯ-di
     .-meat : -eat[III] not.only 3.-smell
     ɲɯ-ɕɯmnɤm
     -cause.to.have.a.smell
     ‘(The mouse) does not only eat meat, it also makes it stinky.’ (27spjaNkW, 198)

(55) kɯ-ra tɕe pjɯ́-wɣ-sat mɤ-kɯ-jɤɣ
     protect :S/A-have.to  --kill -:S/A-be.allowed
     ŋu
     be:fact
     ‘It has to be protected and is not to be killed.’ (27-kikakCi, 88)

Fourth, the linker ʁo alala ri ‘not only’, used after noun phrases as in (56), is the combination of the adversative marker ʁo (倒 dào), the adverb alala ‘of course’ and the locative ri.

(56) tɕe nɯnɯra ʁo alala ri ɯʑo sɤz kɯ-xtɕi pɣa nɯra kɯnɤ
      : not.only 3  nmlz:S/A-be.small bird : also
     ku-ndɤm qhe tu-ndze
     -catch[III]  -eat[III]
     ‘In addition to these, it also eats birds that are smaller than itself.’ (19-qandZGi, 58)

4.3 Expression of degree

Gradable predicates, including adjectival stative verbs, can be used in Japhug in the degree construction, in which the verb is nominalized (with a tɯ- nominalization prefix and a possessive prefix coreferent with the S/A, see Jacques [2016a]) and combined with a finite stative verb indicating the degree. Among the verbs of degree that occur in this construction, sɤre (originally a verb meaning ‘be funny’) functionally corresponds to an intensifier such as ‘very, extremely’, as can be seen in example (57).

(57) wo ɯ-tɯ-mna ɲɯ-sɤre, wuma ʑo
      3-:-be.better -be.funny really 
     ɲɯ-pe ri
     -be.good but
     ‘(My disease) feels much better, it is very nice, but …’ (28-smAnmi, 132)

This is not a completed grammaticalization (since sɤre in this construction is still conjugated), but it illustrates the same grammaticalization pathway as that found in French drôlement: be funny → intensifier.

4.4 Purposive clauses

There are three purposive constructions in Japhug, and their origin is transparent. First, purposive clauses can be built with a verb in participial form followed by the linker ɯ-spa, as in (58). The form ɯ-spa is etymologically a noun meaning ‘matter’ (at an earlier stage the irregular oblique participle of the verb pa ‘do’). This construction is one more example of the well-attested pathway matter → purposive (Heine & Kuteva 2002, 212).

(58) paʁ nɯra ʁo lɯski, ɕa kɤ-ndza ɯ-spa
     pig :  of.course meat :P-eat 3.-purposive
     ku-χsu-nɯ pjɤ-ŋu
     -raise- .-be

Second, the verb nɯmga ‘do (on purpose), (to have)’, either in a finite or infinitive form (as kɤ-nɯmga in example 59), can mark a purposive clause.

(59) tɕe kɯpɤz nɯ mɯ-ɲɯ-kɤ-βzu kɤ-nɯmga, iʑɤra,
      type.of.bug  ---grow -do.on.purpose 1
     ji-mthɯm nɯra tu-χtɯ-j tɕe nɯ ɯ-ŋgɯ
     1.-meat : fridge -buy-1   3.-inside
     ri pjɯ-nɯ-rku-j ɕti ma, maka ɯ-pɕi
      --put.in-1 be.:  at.all 3.-outside
     tú-wɣ-ɕɯɴqoʁ qhe, ŋotɕu nɯ́-wɣ-tɯ~ta ʑo kɯpɤz
     --hang  where --~put  type.of.bug
     ɲɯ-βze ɲɯ-ɕti
     -grow -be.
     ‘In order not to have bugs, our meat, we buy refrigerators and put it in there, as if one hangs it outside, bugs will grow wherever you put it.’ (28-kWpAz, 46–48)

Third, Core Gyalrong languages, including Japhug (Jacques 2014a) and Tshobdun (Sun 2012), have purposive converbs in sɤ-, mainly used in the negative form as in (60), which appear to be derived from the oblique participle sɤ- (on which see Jacques 2016b).

(60) [kɯ-lɤɣ acɤβ nɯ kɯ ɯ-mɤ-sɤ-jmɯ~jmɯt,]
     :S/A-herd Askyabs   3--:-forget
     ɯ-phɯŋgɯ nɯ tɕu rdɤstaʁ-pɯpɯ tɕhɯrdu ci ɲɤ-rku,
     3.-inside.clothes   stone-little pebble  -put.in
     ‘The cowboy Askyabs put a little pebble inside his clothes so that he would not forget it.’ (The frog, 166)

4.5 Causal clauses

Among the causal clauses in Japhug (Jacques 2014a), one presents a clear grammaticalization pathway: causal clauses in núndʐa ‘for this reason’, as in (61). This linker is the fusion of the demonstrative nɯ ‘that’ with the possessed noun ɯ-ndʐa ‘reason’, a word borrowed from Tibetan ɴdra.

(61) [tɕe ɯ-mtɯ ɣɤʑu] tɕe, tɕe núndʐa qapɣɤmtɯmtɯ
      3.-crest :exist   for.this.reason hoopoe
     tu-ti-nɯ ɲɯ-ŋu
     -say- -be
     ‘It has a crest, and this is the reason why it is called ‘hoopoe’.’ (Hoopoe, 20)

4.6 Other constructions

The verb me ‘not exist’ is also used in a construction expressing that an action is futile, that is, that the result will be the same whether or not it takes place. As shown by example (62), in this construction the verb me ‘not exist’ follows two bare infinitives of the same verb, in affirmative and in negative forms, and me agrees with the P if the verb is transitive.9

(62) ndza mɤ-ndza me-a
     eat:. -eat:. not.exist-1
     ‘(I am so lean that) whether you eat me or not, the result will be the same.’

9 This construction is very rare and is not attested with intransitive verbs in the corpus; according to some consultants, in the case of intransitive complement verbs there is no agreement on me.

5 Degrammaticalization

This overview of grammaticalization in Japhug would not be complete without an account of attested cases of degrammaticalization, which include a suffix becoming an independent word, and a relator noun of location (equivalent to a postposition) becoming a common noun.

5.1 Suffix to clitic

The locative postposition zɯ in Japhug is related to the allative suffix -s found in Situ (Lín 1993, 330). Yet, the phonetic correspondence of Japhug z to Situ s is anomalous: in word-initial position without cluster, Situ s- always corresponds to Japhug s- (Jacques 2004, 317–8). However, a sound change common to Japhug, Tshobdun and Zbu is the voicing of final *-s → -z.10 The Japhug form zɯ can be accounted for in the following way. After the regular sound change *-s → -z, the locative suffix -z was degrammaticalized as an enclitic (a case of deinflectionalization, see Norde [2009, 152]), and an epenthetic vowel ɯ was added to it as in all case marking clitics (Ergative kɯ, Genitive ɣɯ). The opposite possibility, namely that the allative marker was an independent word or clitic in proto-Gyalrong and that it became phonologically fused with the preceding noun in Situ, cannot account for the presence of voicing in the Japhug form zɯ. Moreover, there is evidence that the proto-Gyalrong allative suffix reconstructed here as *-s is cognate to the -s element found in several case markers in Tibetan (on which see Hill 2012).
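The two steps of this scenario can be written out as an ordered derivation in Python. This is a minimal sketch under stated assumptions: the function names are hypothetical, forms are plain strings without the leading hyphen, and the sound change is reduced to a replacement of final s by z.

```python
def voice_final_s(form):
    """Regular sound change *-s → -z, applied to a form ending in s."""
    if form.endswith("s"):
        return form[:-1] + "z"
    return form


def add_epenthetic_vowel(clitic):
    """Case-marking clitics take an epenthetic ɯ (cf. Ergative kɯ, Genitive ɣɯ)."""
    return clitic + "ɯ"


def derive_locative(proto_suffix):
    """Apply voicing first, then epenthesis after the suffix becomes a clitic."""
    return add_epenthetic_vowel(voice_final_s(proto_suffix))


print(derive_locative("s"))
```

The ordering matters for the argument: voicing applies while the element is still a word-final suffix, which is why the reverse scenario (an independent word fusing in Situ) would not yield a voiced form.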

5.2 Relator noun of location to common noun

The Japhug noun ɯ-thoʁ ‘the ground’ is highly anomalous in having an obligatory third person singular possessive prefix ɯ-. In addition, this word has no known cognates in other Gyalrongic languages, while it is a perfect match for being a borrowing from a Tibetan word with the shape thog (compare the other borrowed noun thoʁ ‘thunder’ from Tibetan thog). To account for the etymology of this word, I propose the following scenario in four stages.

10 In several Gyalrong languages including Japhug and Tshobdun, final -z is realized as voiced except utterance-finally and when preceding a word beginning with an unvoiced obstruent, as was first recognized by Sun (2005) for Tshobdun.

First, Japhug borrowed the Tibetan relator noun thog(tu) ‘on’ as ɯ-thoʁ *‘on’ (not attested), adding a third person possessive prefix as with all relator nouns (see section 2.7). This relator noun was in competition with the existing native equivalent ɯ-taʁ ‘on’.11 Second, it became restricted to the collocation *sɤtɕha ɯ-thoʁ zɯ ‘on the ground’ (not attested), with the native locative zɯ and the noun of Tibetan origin sɤtɕha ‘earth, ground, place’. Third, the collocation *sɤtɕha ɯ-thoʁ zɯ ‘on the ground’, felt as redundant or tautological, was reduced to ɯ-thoʁ zɯ ‘on the ground’ (attested). Fourth, the noun ɯ-thoʁ ‘ground’ was created by backformation from the locative phrase ɯ-thoʁ zɯ ‘on the ground’. The fact that the locative postposition zɯ is always optional (section 2.7) undoubtedly made this step easier. Thus, Japhug attests an example of degrammation (see Norde 2009, 135) from a relator noun meaning ‘on’ (with or without motion) to a common noun meaning ‘ground’.

 Discussion . The verbal template Japhug and other Gyalrong languages have elaborate verbal templates, with more than ten prefixal slots (Jacques 2013b). It is possible to find complex verb forms with eight prefixes and an incorporated noun, such as (63). (63) a-mɤ-ɕ-tɤ-tɯ́ -wɣ-z-nɯ-snɯ-ɲaʁ ra ----2----heart-black have.to: ‘Don’t let them go and harm you!’ Some of these prefixes are of proto-Sino-Tibetan provenance (see DeLancey 2011, 2014; Jacques 2012a), but others are Gyalrong innovations, grammaticalized from either denominal prefixes (see section 6.2), nouns, pronouns, adverbs or verbs that can still be identified. The only verbal prefixes coming from verbs in Japhug are the associated motion prefixes (3.2). This grammaticalization is shared by all core Gyalrong languages (Japhug, Tshobdun, Zbu and Situ, see also Gong [2018, 201] and Jacques et al. forthcoming), but not attested in Khroskyabs and Stau (Lai 2017). It is unclear whether

 It is not surprising in Japhug to have several competing relator noun for the same functional slot. The dative ɯ-ɕki discussed in section 2.4 is itself in competition with another form ɯ-phe, probably borrowed from another Gyalrong variety.

Grammaticalization in Japhug

569

these languages never grammaticalized associated motion prefixes, or whether these prefixes were lost without trace. The reflexive ʑɣɤ- is the only prefix of pronominal origin (3.3.6). This grammaticalization is a common innovation of Core Gyalrong; Khroskyabs has innovated a slightly different reflexive prefix (Lai 2017). While Japhug and all other Gyalrongic languages have incorporation, there are very few prefixes whose origin can be unambiguously traced to nouns. The orientation prefixes (section 3.5.1) can be either from locative nouns or locative adverbs. The date of the grammaticalization of these prefixes is a vexed matter, as similar systems are found in the region. Sun (1983) proposed the existence of a ‘Qiangic’ subgroup of Sino-Tibetan on the basis of the presence of these prefixes, but the fact that some varieties of Tibetan have developed orientation prefixes too (Sun 2007) shows that this typological feature is of little value for establishing the phylogeny. The orientation systems of Core Gyalrong languages show too many commonalities to be the result of independent grammaticalization, and which cannot be due to language contact. In particular, all Core Gyalrong languages except Zbu (which has a simplified orientation system, and probably lost many features) have developed a present egophoric marker from the orientation prefix meaning ‘toward east’, probably through grammaticalization of its use as a centripetal motion marker (section 3.5.2).

6.2 Denominal derivations

Denominal verbalizing derivations are involved in the pathways of grammaticalization of many nominal and verbal affixes in Japhug, in particular valency-changing prefixes (antipassive, causative etc., see section 3.3), incorporation (section 3.4) and also comitative adverbs (section 2.5). Grammaticalization pathways based on denominal derivation are in many ways comparable to pathways based on light verbs (as in the case of the Antipassive, cf. Creissels [2012]). The only real difference is in fact that the historical origin of denominal verbalizing prefixes in Japhug is unknown. It is conceivable that they originate from verbs too, but the grammaticalization took place in a past so remote that it may not be recoverable, as cognates of these prefixes exist in other branches of the Sino-Tibetan family. If they indeed originate from verbs, it is interesting that they are prefixes and not suffixes, in a language with strict verb-final word order (on which see Jacques 2013b).

7 Conclusion

The study of grammaticalization in Gyalrong languages and Japhug in particular is a very rich topic, and the present paper is but a mere sketch of the most obvious phenomena observed in this language. While some of the grammaticalization pathways found in Japhug are quite common crosslinguistically ( → ,  → /), others appear to be unique to Japhug or Gyalrong languages, for instance  →  →  or   → .

Acknowledgments

I would like to thank Alec Coupe, Linda Konnerth, Nat Krause, Alexis Michaud, Mark W. Post, Amos Teo and an anonymous reviewer for comments on earlier versions of this work. The examples are taken from a corpus that is progressively being made available on the Pangloss archive (Michailovsky et al. 2014, http://lacito.vjf.cnrs.fr/pangloss/corpus/list_rsc.php?lg=Japhug). This research was funded by the HimalCo project (ANR-12-CORP-0006) and is related to the research strand LR4.11 “Automatic Paradigm Generation and Language Description” of the Labex EFL (funded by the ANR/CGI).

Abbreviations

Abbreviations follow the Leipzig glossing rules. Additional abbreviations include  – conative,  – evidential,  – generic,  – ideophone,  – inferential,  – inverse,  – linker,  – sensory

References

Antonov, Anton. 2007. Le rôle des suffixes en /+rV/ dans l’expression du lieu et de la direction en japonais et l’hypothèse de leur origine altaïque. Paris: INALCO dissertation.
Arkhipov, Alexandre. 2009. Comitative as a cross-linguistically valid category. In Patience Epps & Alexandre Arkhipov (eds.), New challenges in typology: transcending the borders and refining the distinctions, 223–246. Berlin, New York: Mouton de Gruyter.
Creissels, Denis. 2012. The origin of antipassive markers in West Mande languages. In 45th Annual Meeting of the Societas Linguistica Europaea, Stockholm.
Daudey, Henriëtte. 2014. A grammar of Wadu Pumi. Melbourne: LaTrobe University dissertation.
DeLancey, Scott. 2011. Notes on verb agreement prefixes in Tibeto-Burman. Himalayan Linguistics Journal 10(1). 1–29.
DeLancey, Scott. 2014. Second person verb forms in Tibeto-Burman. Linguistics of the Tibeto-Burman Area 37(1). 3–33.
Doornenbal, Marius. 2009. A Grammar of Bantawa: Grammar, paradigm tables, glossary and texts of a Rai language of Eastern Nepal. Leiden: Leiden University dissertation.
Gong, Xun. 2014. Personal agreement system of Zbu rGyalrong (Ngyaltsu variety). Transactions of the Philological Society 112(1). 44–60.

Gong, Xun. 2018. Le rgyalrong zbu, une langue tibéto-birmane de Chine du Sud-ouest : une étude descriptive, typologique et comparative. Paris: Institut national des langues et civilisations orientales dissertation.
Heine, Bernd & Tania Kuteva. 2002. World Lexicon of Grammaticalization. Cambridge: Cambridge University Press.
Heine, Bernd & Kyung-An Song. 2011. On the grammaticalization of personal pronouns. Journal of Linguistics 47(3). 587–630.
Hill, Nathan W. 2012. Tibetan -las, -nas, and -bas. Cahiers de Linguistique Asie Orientale 41(1). 3–38.
Jacques, Guillaume. 2004. Phonologie et morphologie du japhug (Rgyalrong). Paris: Université Paris VII − Denis Diderot dissertation.
Jacques, Guillaume. 2010a. The inverse in Japhug Rgyalrong. Language and Linguistics 11(1). 127–157.
Jacques, Guillaume. 2010b. The origin of the reflexive prefix in Rgyalrong languages. Bulletin of the School of Oriental and African Studies 73(2). 261–268.
Jacques, Guillaume. 2012a. Agreement morphology: the case of Rgyalrongic and Kiranti. Language and Linguistics 13(1). 83–116.
Jacques, Guillaume. 2012b. Argument demotion in Japhug Rgyalrong. In Katharina Haude & Gilles Authier (eds.), Ergativity, Valency and Voice, 199–226. Berlin: Mouton De Gruyter.
Jacques, Guillaume. 2012c. From denominal derivation to incorporation. Lingua 122(11). 1207–1231.
Jacques, Guillaume. 2013a. Applicative and tropative derivations in Japhug Rgyalrong. Linguistics of the Tibeto-Burman Area 36(2). 1–13.
Jacques, Guillaume. 2013b. Harmonization and disharmonization of affix ordering and basic word order. Linguistic Typology 17(2). 187–217.
Jacques, Guillaume. 2014a. Clause linking in Japhug Rgyalrong. Linguistics of the Tibeto-Burman Area 37(2). 263–327.
Jacques, Guillaume. 2014b. Denominal affixes as sources of antipassive markers in Japhug Rgyalrong. Lingua 138. 1–22.
Jacques, Guillaume. 2015a. Dictionnaire Japhug-Chinois-Français, version 1.0. Paris: Projet HimalCo. http://himalco.huma-num.fr/.
Jacques, Guillaume. 2015b. The origin of the causative prefix in Rgyalrong languages and its implication for proto-Sino-Tibetan reconstruction. Folia Linguistica Historica 36(1). 165–198.
Jacques, Guillaume. 2015c. The spontaneous-autobenefactive prefix in Japhug Rgyalrong. Linguistics of the Tibeto-Burman Area 38(2). 271–291.
Jacques, Guillaume. 2016a. From ergative to comparee marker: multiple reanalyses and polyfunctionality. Diachronica 33(1). 1–30.
Jacques, Guillaume. 2016b. Subjects, objects and relativization in Japhug. Journal of Chinese Linguistics 44(1). 1–28.
Jacques, Guillaume. 2017a. The morphology of numerals and classifiers in Japhug. In Picus Sizhi Ding & Jamin Pelkey (eds.), Sociohistorical Linguistics in Southeast Asia, 135–148. Leiden: Brill.
Jacques, Guillaume. 2017b. The origin of comitative adverbs in Japhug. In Walter Bisang & Andrej Malchukov (eds.), Unity and diversity in grammaticalization scenarios, 31–44. Berlin: Language Science Press.
Jacques, Guillaume. 2018. Generic person marking in Japhug and other Rgyalrong languages. In Fernando Zúñiga & Sonia Cristofaro (eds.), Diachrony of hierarchical systems, 403–424. Amsterdam: John Benjamins.
Jacques, Guillaume. forthcoming. Antipassive derivations in Sino-Tibetan/Trans-Himalayan and their sources. In Katarzyna Janic, Denis Creissels & Alena Witzlack-Makarevich (eds.), The multifaceted nature of antipassive. Amsterdam: Benjamins.
Jacques, Guillaume & Anton Antonov. 2014. Direct / inverse systems. Language and Linguistics Compass 8/7. 301–318.

Jacques, Guillaume & Zhen Chen. 2007. Chápǔhuà de bùjíwù qiánzhuì jí xiāngguān wèntí 茶堡话的不及物前缀及相关问题 [The intransitive prefix in Japhug and related problems]. Language and Linguistics 8(4). 883–912.
Jacques, Guillaume, Aimée Lahaussois & Shuya Zhang. forthcoming. Associated Motion in Sino-Tibetan/Trans-Himalayan. In Antoine Guillaume & Harold Koch (eds.), Associated motion. Berlin: Mouton de Gruyter.
Konnerth, Linda. 2014. Additive focus and additional functions in Karbi (Tibeto-Burman) =tā. In Kayla Carpenter, Oana David, Florian Lionnet, Christine Sheil, Tammy Stark & Vivian Wauters (eds.), Proceedings of the 38th Annual Meeting of the Berkeley Linguistics Society, 206–222.
Konnerth, Linda. 2015. A new type of convergence at the deictic center: Second person and cislocative in Karbi (Tibeto-Burman). Studies in Language 39(1). 24–45.
Lai, Yunfan. 2017. Grammaire du khroskyabs de Wobzi. Paris: Université Paris III dissertation.
Lehmann, Christian, Yong-Min Shin & Elisabeth Verhoeven. 2004. Direkte und indirekte Partizipation. Zur Typologie der sprachlichen Repräsentation konzeptueller Relationen. Arbeitspapiere des Seminars für Sprachwissenschaft der Universität Erfurt 13.
Lin, Youjing. 2002. A Dimension Missed: East and West in Situ rGyalrong Orientation Marking. Language and Linguistics 3(1). 27–42.
Lin, Youjing. 2011. Perfective and imperfective from the same source: directional “down” in rGyalrong. Diachronica 28(1). 54–81.
Lín, Xiàngróng (林向榮). 1993. Jiāróngyǔ yánjiū 嘉戎語研究 [A study on the Rgyalrong language]. Chengdu: Sichuan minzu chubanshe.
Lín, Yòujīng. 2003. Tense and Aspect Morphology in the Zhuokeji rGyalrong Verb. Cahiers de Linguistique − Asie Orientale 32(2). 245–286.
Malchukov, Andrej, Martin Haspelmath & Bernard Comrie. 2010. Ditransitive constructions: a typological overview. In Andrej Malchukov, Martin Haspelmath & Bernard Comrie (eds.), Studies in Ditransitive Constructions: A Comparative Handbook, 1–64. Berlin: De Gruyter Mouton.
Michailovsky, Boyd, Martine Mazaudon, Alexis Michaud, Séverine Guillaume, Alexandre François & Evangelia Adamou. 2014. Documenting and researching endangered languages: the Pangloss Collection. Language Documentation and Conservation 8. 119–135. Mithun, Marianne. 1984. The Evolution of Noun Incorporation. Language 60(4). 847–894. Norde, Muriel. 2009. Degrammaticalization. Oxford: Oxford University Press. Stassen, Leon. 2000. AND-languages and WITH-languages. Linguistic Typology 4(1). 1–54. Stolz, Thomas, Cornelia Stroh & Aina Urdze. 2006. On Comitatives and Related Categories: A Typological Study With Special Focus on the Languages of Europe. Berlin, New York: Mouton de Gruyter. Sun, Hongkai. 1983. Liujiangliuyu de minzu yuyan jiqi xishu fenlei 六江流域的民族语言 其 分类 [Minority languages of the Six River Valley and their genetic classification]. Minzu xuebao 3. 99–273. Sun, Jackson T.-S. 2005. Jiāróngyǔzǔ yǔyán de yīn’gāo: liǎnggè gè’àn yánjiū 嘉戎語組語 言的音高:兩個個案研 [On Pitch in the rGyalrongic Languages: Two Case Studies]. Yuyan yanjiu 語言研 25(1). 50–59. Sun, Jackson T.-S. 2007. Perfective stem renovation in Khalong Tibetan. In Roland Bielmeier & Felix Haller (eds.), Linguistics of the Himalayas and Beyond, 323–340. Berlin, New York: Mouton de Gruyter. Sun, Jackson T.-S. 2012. Complementation in Caodengr Gyalrong. Language and Linguistics 13(3). 471–498. Sun, Jackson T.-S. 2014a. Sino-Tibetan: Rgyalrong. In Rochelle Lieber & Pavol Štekauer (eds.), The Oxford Handbook of Derivational Morphology, 630–650. Oxford: Oxford University Press. Sun, Jackson T.-S. 2014b. Typology of Generic-Person Marking in Tshobdun Rgyalrong. In Richard VanNess Simmons & Newell Ann Van Auken (eds.), Studies in Chinese and Sino-Tibetan

Grammaticalization in Japhug

573

Linguistics: Dialect, Phonology, Transcription and Text, 225–248. Taipei, Institute of Linguistics, Academia Sinica. Sun, Jackson T.-S. & Youjing Lin. 2007. Constructional Variation in rGyalrong Relativization: How To Make a Choice? In Pre-Conference Proceedings of the International Workshop on Relative Clauses, 205–226. Taipei: Institute of Linguistics, Academia Sinica. Sun, Jackson T.-S. & Shidanluo. 2002. Cǎodēng Jiāróngyǔ yǔ rèntóng děngdì xiāngguān de yǔfǎ xiànxiàng 草 嘉戎語 認同等第 相關的語法 象 [Empathy Hierarchy in Caodeng rGyalrong grammar]. Language and Linguistics 3(1). 79–99. Tournadre, Nicolas & Randy LaPolla. 2014. Towards a new approach to evidentiality. Linguistics of the Tibeto-Burman Area 37(2). 240–262. Zhang, Shuya. 2019. From proximate/obviative to number marking: Reanalysis of hierarchical indexation in Rgyalrong language. Journal of Chinese Linguistics 47(1). 125–150.

Seongha Rhee

13 Grammaticalization in Korean

1 Introduction

1.1 The aim of the present paper

In the present paper, I review prominent grammaticalization processes as attested in the history of Korean in nominal and verbal domains and further in complex constructions. The paper also discusses prominent grammaticalization patterns that deserve attention, and reviews grammaticalization phenomena from areal and typological perspectives.

1.2 Location, genetic affiliation, and historical documentation

Korean is spoken by about 77.2 million people (cf. Ethnologue; Simons and Fennig 2018) in and around the Korean peninsula: South Korea, North Korea, China, Japan, and other Korean communities around the world. It was formerly regarded as an Altaic language, as the eminent Finnish Altaic linguist Gustaf Ramstedt proposed in 1939 (Ramstedt [1939] 1997), but in the modern linguistic research tradition the Altaic hypothesis is contested (see Kim [1992]; Sohn [1999]; Song [2005], among others, for discussion of the issue). Ethnologue (Simons and Fennig 2018) lists it as a Koreanic language along with Jejueo, a language variety spoken in Jeju Island. Most recently, it has frequently been discussed under the geographically motivated classificatory term ‘Transeurasian languages’ (Johanson and Robbeets 2010), often along with Japanese since the two languages share many structural characteristics (Narrog and Rhee 2013; Narrog, Rhee, and Whitman 2018). The historical depth of texts written in Hangeul (or Hankul), the Korean alphabet, goes back to the 15th century CE (1443). The invention of this alphabetic-featural script is a landmark event that divides Early Middle Korean (EMK) and Late Middle Korean (LMK), and a large body of texts has been compiled through government-led projects such as the 21st-Century Sejong Project.1 Prior to the invention of Hangeul, Chinese characters were used. Since Chinese is a vastly different language with a different word order and without grammatical markers corresponding to Korean postpositions and inflections, the way the characters were used had to be modified to suit Korean. Accordingly, several different writing systems were invented, such as Itwu, Hyangchal, and Kwukyel, that made use of Chinese characters for their meaning (semantogram) or sound (phonogram) to represent Korean words or grammatical morphemes, thus creating problems for modern scholars in interpreting such texts. Recently, much progress has been made in deciphering such texts found in poems, tombstone inscriptions, ledgers, administrative reports, pedigree records, and the like, but much remains controversial. Identifiably Korean texts written in Chinese characters date back to the 5th century CE (Whitman 2015), notably a 1,800-character inscription on a stele commemorating King Kwanggaeto of the Koguryo Kingdom, erected in 414 CE.

1 The following historical period labels are used: OK: Old Korean (~9th century); EMK: Early Middle Korean (10th–15th centuries); LMK: Late Middle Korean (15th–16th centuries); EMoK: Early Modern Korean (17th–19th centuries); MoK: Modern Korean (20th–21st centuries); and PDK: Present-Day Korean (21st century).

https://doi.org/10.1515/9783110563146-013

1.3 Typological characteristics

Typologically Korean is an agglutinating, head-final, SOV language. In line with the SOV word order characteristics in morphology, it is suffixal and postpositional. Prepositions do not exist and prefixes are relatively unproductive. Therefore, in most grammaticalization scenarios independent lexemes develop into suffixes and postpositions as bound morphemes. Multiple suffixes and postpositions may occupy slots of a relatively fixed order. Argument NPs are often omitted and NPs may occur without their postpositional particles, especially in speech, and the interpretive flexibility thus created often affects grammaticalization. NPs are strictly head-final, thus all nominal modifiers occur before the modified noun. Further, there is no article among the nominal modifiers, and plural marking is optional. Syntactic and thematic functions of NPs are marked with case markers and postpositions. With these role markers, NPs may occur rather freely without functional confusion, even though there is a canonical and preferred word order. In terms of verbal morphology, finite verbs are marked for tense, aspect, mood, and modality, whereas non-finite verbs are often marked with ‘converb’-markers (converbalia, Ramstedt [1903]), whose main function is adverbializing a non-finite verb (Haspelmath 1995).2 Finite verbs occurring sentence-finally are further marked by a sentence-type marker, such as declarative, imperative, interrogative, hortative, exclamative, etc., all of which form a complex paradigm further modulated by the degree of honorification and politeness, commonly known as hwakyey ‘speech level’ (see 3.5). Morphological complexity as a result of extensive agglutination of nominal and verbal suffixes that encode diverse grammatical notions can be illustrated in (1):3

2 The notion of converb was first introduced in Korean linguistics by Ramstedt (1997), and is called pwutongsa (Lee [1961] 1998: 23, Ahn 1967) among Korean linguists.

3 Transliteration is based on the Extended Yale System (Rhee 1996). Also note that periphrastic forms are written with a dot which indicates the word boundary. When two or more English words in the gloss correspond to a single morpheme in examples, the words are written with a dot between them.

(1) a. ku-nun caki kyoswu-nim-tul-hanthey-kkaci-to mwulyeyha-ta
he- self professor----- be.rude-
‘He is rude even to his professors. (lit. … rude even to the point of (being rude) to self’s honorable professors.)’
b. pelsse kanguy-lul kkuthna-y-e.peli-si-ess-keyss-ta-te-kwun-yo
already lecture- finish---------
‘(I) recall (they told me) that (the professor) must have finished the lecture (by then).’

1.3.1 Phonology

Korean is a non-tonal, syllable-timed language. It has nineteen consonants, ten vowels, and two semivowels in the phonemic inventory. Consonants exhibit a three-way contrast (lax-aspirate-tense) in stop consonants, a two-way contrast (lax-tense) in the (alveo-)dental fricatives, and no contrast in the glottal fricative (Sohn 1999: 14; Song 2005: 24–32), as shown in Table 1, taken from Song (2005: 26):

Tab. 1: Consonants in Korean.

| | Stop | Fricative | Nasal | Lateral |
| Bilabial | p, pp, ph | | m | |
| Dental | t, tt, th | s, ss | n | l |
| Palatal | c, cc, ch | | | |
| Velar | k, kk, kh | | | |
| Glottal | | h | | |

Consonant clusters are not allowed in word-initial position; they may occur in word-final position, but only one consonant is pronounced. Certain phonological constraints, such as avoidance of the lateral in word-initial position, are disappearing, presumably due to foreign language influences.

1.3.2 Word classes and constructional morphology

Korean linguists and grammarians use commonly attested word classes such as noun, pronoun, verb, adjective, numeral, adverb, determiner, and particle, even though the number of word classes and their membership vary by individual researchers and frameworks. Adjectives inflect with respect to tense, aspect, mood, and modality, and thus are often regarded as stative verbs. There are, however, a handful of adjectives that are not inflected and are used only as modifiers in the prenominal position. Korean has a rich system of derivational morphology, and compounding is also productive both in noun and verb derivation. Among the idiosyncrasies in word class is the presence of a large number of ideophones. For instance, an authoritative dictionary (Phyocwun Kwuke Taysacen, 1999) contains 9,964 ideophone headwords (Shon 2012) and a lexicographers’ comprehensive list contains about 29,600 ideophones (Koo and Rhee 2018). The significance of ideophones in Korean is that they are systematically used in color description (thus 752 color terms in the above-mentioned dictionary, Hong 2015: 13) and taste description (thus 268 gustatory terms in Rhee and Koo 2017), among others.

1.3.3 Nouns and noun phrases

Nouns in a sentence may occur before a particle for case or other functional marking, and they may occur after a determiner, such as a demonstrative, qualifier, or quantifier. They constitute the largest word class with native Korean and Sino-Korean members (Sohn 1999: 204), accounting for 58 % of the 50,000 headwords of an official dictionary (NIKL). In sentences, nouns may be quantified using a classifier (see 2.1). There is a large number of defective nouns that have lost the features characteristic of nouns, in particular syntactic autonomy, for which reason they occur only with a modifier (see 4.4).

1.3.4 Verbs and clauses

Verbs must be inflected for tense, aspect, and modality, which is a defining characteristic of the verb category that separates it from all others. As alluded to in 1.3.2, however, most adjectives in Korean also share this characteristic, and thus these two categories are often collectively referred to as predicatives or yongen ‘a declinable word; a conjugative word’, both marked with the infinitive marker -ta in dictionary entries. Verbs constitute the second largest category, accounting for 21.9 % of the 50,000 headwords of an official dictionary (NIKL). Individual verbs behave clearly with respect to the transitive-intransitive distinction, but there are a few verbs that can be used as either. There is a large number of verbs that developed into auxiliary verbs (see 3.1–3.3).

2 Grammaticalization of Nominal Categories

2.1 Class/Gender

Korean has no system of grammatical gender and no such system has been attested in its history. There is no sign of an emergent system of gender marking in Modern Korean. However, Korean has a highly developed numeral classifier system with about three dozen classifiers in common use, about half of which are of Chinese origin.4 The most widely used classifier is kay for individuated non-human objects, both tangible and intangible. Some of the common Sino-Korean (SK) and native Korean (NK) classifiers are as shown in (2):

(2) Nominal classifiers
a. kay: SK, individuated non-human objects (apple, candy, idea, proposal)
b. tay: SK, vehicles and other mechanical units (car, crane, bicycle, air-conditioner)
c. myeng: SK, humans [±] (student, member, population)
d. pwun: SK, humans [+] (teacher, parent, old person)
e. can: SK, liquids in a glass or cup (coffee, juice, liquor)
f. calwu: NK, long, thin objects (pencil, broom, shovel)
g. katak: NK, very thin and long object (thread, hair, noodle)
h. mali: NK, animals (cow, cat, bird, insect, fish)
i. kulwu: NK, plants (tree, rose-bush, cactus)
j. kulus: NK, food in a bowl (steamed rice, noodle, soup)

It is notable that many instances of grammaticalization of numeral classifiers show ‘divergence’ (Hopper 1991; Hopper and Traugott [1993] 2003). For instance, calwu is still used as a regular noun denoting ‘a long handle’, katak for ‘stream, streak or thread’, kulus for ‘bowl, container’, etc., in addition to their classifier usage. An interesting development is found with the classifier mali for counting animals. According to Koo (2009), Korean mali is a historical variant of meli ‘head’ in the 16th century data, but from the 18th century it developed into a full-fledged classifier for non-human animals. In MoK, the historical variants have different specialization, i.e., meli as a body part noun (human and non-human) and mali as a numeral classifier for non-human animals (Koo 2009).
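The classifier inventory in (2) can be pictured as a small lookup table. The following Python sketch is only an illustration: the class name `Classifier` and the helper functions are hypothetical, and the data are limited to the forms, origins, and sample nouns listed in (2), so it makes no claim about actual classifier selection in Korean beyond that list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classifier:
    form: str        # Yale romanization, as given in (2)
    origin: str      # "SK" (Sino-Korean) or "NK" (native Korean)
    examples: tuple  # sample nouns listed for the classifier in (2)


# Inventory transcribed from (2); names of the class and helpers are illustrative.
CLASSIFIERS = (
    Classifier("kay", "SK", ("apple", "candy", "idea", "proposal")),
    Classifier("tay", "SK", ("car", "crane", "bicycle", "air-conditioner")),
    Classifier("myeng", "SK", ("student", "member", "population")),
    Classifier("pwun", "SK", ("teacher", "parent", "old person")),
    Classifier("can", "SK", ("coffee", "juice", "liquor")),
    Classifier("calwu", "NK", ("pencil", "broom", "shovel")),
    Classifier("katak", "NK", ("thread", "hair", "noodle")),
    Classifier("mali", "NK", ("cow", "cat", "bird", "insect", "fish")),
    Classifier("kulwu", "NK", ("tree", "rose-bush", "cactus")),
    Classifier("kulus", "NK", ("steamed rice", "noodle", "soup")),
)


def forms_by_origin(origin):
    """Return the classifier forms of a given origin, in the order of (2)."""
    return [c.form for c in CLASSIFIERS if c.origin == origin]


def classifiers_for(noun):
    """Return every classifier whose sample nouns in (2) include `noun`."""
    return [c.form for c in CLASSIFIERS if noun in c.examples]


if __name__ == "__main__":
    print(forms_by_origin("NK"))
    print(classifiers_for("noodle"))
```

In this toy table, a noun such as ‘noodle’ is listed under both katak and kulus in (2), which shows that the same noun can be counted with different classifiers depending on how the referent is construed (as a thin strand or as a bowl of food).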

4 According to Woo (2001: 115–116), 40 % of classifiers listed in Wulimal Khunsacen (1992) are native Korean words, 23 % Sino-Korean words, and 37 % foreign borrowings. Woo’s (2001) study analyzes 465 native and Sino-Korean numeral classifiers, 52 % of which are of native Korean origin and 48 % of Sino-Korean origin (p. 116).

From a historical linguistic perspective, classifier development seems to have been strongly influenced by Chinese (see [6] below). Chae (1982) notes that classifiers proliferated from EMoK, many of which are of Chinese origin. Many Sino-Korean classifiers replaced native Korean counterparts. For instance, the most common classifier kay (個) in MoK was first introduced in EMoK in commentaries and translations of Chinese texts and coexisted with the native Korean classifier nas/nach, which became defunct in MoK (Chae 1996). After borrowing occurs, the usage of the classifier over time may not coincide with that in the source language. For instance, Kuo (1995) notes that when the classifier cang (張) was first borrowed in EMoK, it was used for such objects as chair, bow, paper, etc. In MoK, however, it is only used for thin, flat objects such as paper, sheet-metal, plywood board, etc. When native and Sino-Korean classifiers are compatible with a certain class of objects, the native Korean classifier is typically used in informal registers whereas its Sino-Korean counterpart is used in formal registers, exhibiting register ‘specialization’ (Hopper and Traugott 2003), a phenomenon commonly attested across languages.

2.2 Number

Number marking is optional in Korean. Plurality is optionally marked with the plural suffix -tul, as shown in (3):

(3) haksayng(-tul) sey myeng
student(-) three 
‘three students’

The plural marker developed from the OK noun tl(h) which denoted the presence of others of a similar kind or collective reference to the listed items it follows, e.g., A, B, C tlh meant ‘A, B, C, and kindred others’ or ‘A, B, C, these (three)’ (Rhee 2018: 223). The form gradually became a suffix denoting multiplicity of the host noun as early as LMK. In MoK, the plural marker is also used in non-pluralizing contexts, a usage often called Extrinsic Plural Marking (EPM, Song 1997). It functions as a discourse marker signaling diverse speaker stances such as mirativity, irritation, protest, friendliness, solidarity, solicitation, etc. The discourse marker usage of the plural marker -tul is exemplified in (4) below (note that the discourse marker tul is still glossed as PL):

(4) a. tul way tul ila-y tul?
PL why PL do.this- PL
‘Why are you guys doing this?’ [irritated protest] (Rhee 2018: 227)
b. mom tul cosim tul ha-ko cal tul iss-e tul
body PL care PL do-and well PL exist- PL
‘Take care of yourselves and stay in peace.’ [friendly well-wishing] (Rhee 2018: 227)

2.3 Possession

Korean has attributive and predicative means of encoding possession. Attributive possession makes use of two grammatical formants, -s and postpositional -uy, and predicative possession makes use of periphrases including the verbs of existence iss- ‘exist’, eps- ‘not exist’, manh- ‘exist abundantly’, cek- ‘exist scantily’. Their usage is partially illustrated in the following:

(5) a. honca-s mal [one.person- word] ‘monologue’
b. John-uy cip [John- house] ‘John’s house’

(6) John-i ton-i iss-ta
John- money- exist-
‘John has money.’

The lexical sources of -s in (5a), attested as early as in EMK (11th century) and -uy, attested in the form of -uy or -y in EMK (10th century), are unclear (cf. Park 1998). It is notable, however, that -uy/y was also used for locative marking. From LMK the two possessives were differently specialized, i.e., -s for [+] possessor and -uy/y for [–] possessor. Therefore, both the attributive and predicative possession markers in Korean have been grammaticalized based on the ‘Location schema’ (Heine 1993, 1997) as their conceptual basis.

2.4 Determiner

Korean has three demonstratives that function as noun determiners depending on the distance from the interlocutors, i.e., i for speaker-proximal, ku for speaker-distal, and ce for mutually-distal. These demonstratives are attested in the earliest extant data as i, ku, and tye, but their lexical sources have not been established. They also function as pronouns in certain contexts (see 2.7).

2.5 Case

Korean has a large inventory of postpositional case particles that signal the syntactic relations of NPs to the predicate, as shown in part in (7), with lexical sources indicated in parenthesis after the ‘

zAm > Am > m] (see Rhee 2008a: 242).

The number of defective nouns varies greatly depending on the researcher because the notion of ‘defective’ is subjective. Huh (1995) lists as many as 99 defective nouns and Lee (2009) lists 74, whereas Ko (1989) and a few others list around 50. Some researchers use the label ‘formal nouns’, ‘bound nouns’, or ‘dependent nouns’ instead of ‘defective nouns’.

(11) cikyeng ‘domain’, nolus ‘role’, phan ‘venue/situation’
seym ‘calculation’, the ‘lot/foundation’, moyang ‘appearance’
palam ‘wind’, tey ‘place’, cek ‘time’
pep ‘law’, nawi ‘margin’, chek ‘pretense’

The most productive usage of defective nouns is that in the so-called ‘mermaid constructions’ (see 4.4).

2.7 Personal pronoun

The Korean pronominal system is not well developed due to social factors and idiosyncratic traits of language use, i.e., extensive use of regular nouns and titles as vocatives, and pro-drop (Koo 2016; Rhee 2019). Historical data from OK and EMK suggest that Korean had a simple pronominal system (na for 1, ne for 2, i, ku and tye for 3, and ce for reflexive) and lacked honorific pronouns. The lexical sources of these pronouns have not been identified, except that 3 pronouns are identical with speaker-proximal, speaker-distal, and mutually-distal demonstratives (see 2.4). In LMK, there appeared a new 2. pronoun kutuy (< ku tuy ‘that place’) and a number of reflexive honorific pronouns, i.e., ckya (< ‘self’s house’), tangsin (< ‘the body concerned; applicable body’), caki (< ‘self’s body; self’s self’), ckuy (< ‘self’s body; self’s self’), and caney (< ‘persons like you’, ‘self’s person’(?); Suh 2003: 479–480) (Rhee 2019). Peculiar changes that occurred in history include reference extension: the person-neutral reflexive ce came to be used also for humiliative 1 in LMK, and the reflexive tangsin extended its referential function to 2. in EMoK, and to 3. in MoK. This multifunctional tangsin is used for diverse, genre-sensitive references in PDK. Rhee (2019) notes that as the society becomes increasingly complex, pronouns and referential expressions with the [+] feature have been actively innovated throughout history by recruiting certain lexical and grammatical items. Therefore, most instances of innovation involve upward modification for second person pronouns and downward modification for first person pronouns in terms of the honorification hierarchy.

For instance, when the LMK . tangsin was recruited as 2 in EMoK it was for highly honorific reference, but it was partially demoted when the still more honorific terms, e.g., imca (< ‘owner’) and tayk (< ‘honorable house’), were innovated in MoK. In turn, all these 2. forms were partially demoted when still more honorific terms, e.g., elun (< ‘superior person’) and elusin (< ‘honorable superior person’), were innovated in PDK. Similar patterns of demotion of existing forms and innovation of honorific forms are found with caney (< ‘self’s person, persons like you’) and kutuy (< ‘that place’) for 2.. On the other hand, OK/EMK 1 was na with no plain/humiliative distinction. In LMK new 1SG forms, e.g., s(y)oin (< ‘small person’) and sosin (< ‘small servant’), were innovated. Then in EMoK another new 1 humiliative form was innovated by recruiting  ce (see Rhee [2019] for more detail).

3 Grammaticalization of verbal categories

3.1 Voice/Valency

Modern Korean has a number of causative and passive markers. Many of them have undergone grammaticalization and are fully morphologized, whereas some of them are still periphrastic, as indicated by a dot between words, some involving the converbs -key and -e, as shown in Table 2. As is evident in Table 2, Korean has a number of morphemes that carry functional ambiguity between causative and passive, i.e., the i-type causatives and the i-type passives. Therefore, certain derived words are ambiguous; e.g., wul-li- (ring/) can mean ‘make (a bell) ring’ or ‘(a bell) is made to ring’. The lexical sources of the primary causative/passive morphemes are not identified, but those of the less grammaticalized, periphrastic forms are obvious from the presence of lexical morphemes, i.e., the causatives -key.ha- (< ha- ‘do’), -key.sikhi- (< sikhi- ‘make (someone do x)’), -key.mantul- (< mantul- ‘make, create’), and the passives -key.toy- (< toy- ‘become’), -e.ci- (< ci- ‘fall’), -tangha- (< SK tang ‘experience’), -pat- (< pat- ‘receive’), and -mek- (< mek- ‘eat’), with the last three affixed to nouns and all others to verbs or adjectives.10

Tab. 2: Causative and Passive Markers in Korean (Rhee and Koo 2014: 311).

| | Affixes (variants) | Periphrases |
| Causative | i-Type (-i-, -hi-, -li-, -ki-); wu-Type (-wu-, -kwu-, -chwu-) | -causative: -key.ha-; -causative: -key.sikhi-, -key.mantul- |
| Passive | i-Type (-i-, -hi-, -li-, -ki-) | -passive: -key.toy-; -passive: -e.ci-; -passive: -tangha-; -passive: -pat-; -passive: -mek- |

10 For instance, hayko ‘dismissal’ becomes a passive verb haykotangha- ‘be dismissed’; sang ‘prize’, sangpat- ‘be awarded’; yok ‘n. curse’, yokmek- ‘be cursed’, etc. (Rhee and Koo 2014). As Haspelmath (1990: 41) notes, these newly derived verbs carry adversative and beneficial flavors.

. Aspect .. Progressive Progressive is marked by a number of periphrases. As is the case with most other aspectual markers, progressives originated from serial verb constructions involving converb markers, notably -e/-a and -ko. The only exception is -nun.cwungi- (12i) of nominal (and copular) origin from SK cwung ‘middle, center’. By virtue of the presence of the verbs, which often have fully lexical usage, individual progressive markers often carry meanings related to their lexical usage in addition to the aspectual meaning. The most widely used progressive is -ko.iss- (< is-/isi- ‘exist’) which developed at the turn of the 20th century (Huh 1987), but there are many others that developed more recently, as listed in (12) (see Lee [1988] 1993; Rhee 1996; Kim 2011; Koo and Rhee 2016; Ho 2003, among many others):11 (12) a. b. c. d. e. f. g. h. i.

-ko.iss-ko.kyeysi-ko.cappacyess-ko.ancass-e.ka-e.o-e.naka-e.ci-nun.cwungi-

< is-/isi- ‘exist’ < kyeysi- ‘exist.’ ([+] agent) < cappaci- ‘fall back’ (Strongly pejorative, Repetitive) < anc- ‘sit’ (Pejorative, Repetitive) < ka- ‘go’ (Andative, often Negative) < o- ‘come’ (Venitive, often Positive, Repetitive) < naka- ‘go out’ (Gradual) < ci- ‘fall’ (Gradual, Facilitative, Inchoative, Passive) < cwung SK ‘middle, center’ (Emphatic)

11 This section is largely based on the author’s collaborative work with Hyun Jung Koo. Korean aspectual markers often carry elaborate meanings in addition to regular aspectual meanings (cf. ‘semantically elaborate categories’ Kuteva [2009]), often including the speaker’s evaluative stance (Rhee 1996).

3.2.2 Habitual, repetitive, iterative

The aspects of habitual, repetitive, and iterative are mostly marked with auxiliary verbs developed from verb serialization. The most frequently used form, -kon.ha-, however, has an intervening topic/contrast marker -n. Some of the habitual/repetitive/iterative markers are as listed in (13):

(13) a. -kon.ha- < ha- ‘do’
b. -ko.ancass- < anc- ‘sit’ (Pejorative, Repetitive)
c. -ko.cappacyess- < cappaci- ‘fall back’ (Strongly pejorative, Repetitive)
d. -e.tay- < tay- ‘touch’ (Irritated, Pejorative)
e. -e.ssah- < ssah- ‘pile up, accumulate’ (Irritated, Strongly pejorative)
f. -e.pelusha- < pelusha- ‘do habitually’ (Pejorative)
g. -llakmallakha- < mal- ‘stop’ (Repeated, Avertive)
h. -takan < taku- ‘draw near’ (Premonitive connective)

As is evident in comparison with the progressive markers, certain forms belong to both categories, which is expected considering the conceptual affinity between continuation and repetition. Also noteworthy is that many markers of this aspectual category signal the speaker’s negative stance.
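The overlap between the two marker inventories can be made explicit with a simple set comparison. The Python sketch below is purely illustrative: the variable names are hypothetical, and the sets contain only the forms listed in (12) and (13), written in the transliteration used there.

```python
# Forms transcribed from (12) and (13); variable names are illustrative only.
PROGRESSIVE = {
    "-ko.iss-", "-ko.kyeysi-", "-ko.cappacyess-", "-ko.ancass-",
    "-e.ka-", "-e.o-", "-e.naka-", "-e.ci-", "-nun.cwungi-",
}
HABITUAL = {
    "-kon.ha-", "-ko.ancass-", "-ko.cappacyess-", "-e.tay-",
    "-e.ssah-", "-e.pelusha-", "-llakmallakha-", "-takan",
}

# Markers that appear in both lists, i.e. forms shared by the two categories.
shared = sorted(PROGRESSIVE & HABITUAL)

if __name__ == "__main__":
    print(shared)
```

On the data in (12) and (13), the shared forms are the two posture-verb periphrases -ko.ancass- and -ko.cappacyess-, both of which are annotated as pejorative and repetitive in the two lists.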

3.2.3 Completive

There are multiple forms layered in the function of completive aspect marking. Like other aspect markers, these are developed from verb serialization and carry diverse stance-related, intersubjective meanings, as listed in (14):

(14) a. -ko.na- < na- ‘exit’ (commonly as connectives -konani and -konase)
b. -ko.mal- < mal- ‘stop’ (often Intentional; Undesirable incidence)
c. -koya.mal- < mal- ‘stop’ (Intentional)
d. -e.mek- < mek- ‘eat’ (Negative)
e. -e.nay- < nay- ‘pull out’ (With much effort)
f. -e.twu- < twu- ‘place, store’ (Purposive, Preparatory)
g. -e.peli- < peli- ‘throw away’ (often Undesirable, Malefactive)
h. -e.chiwu- < chiwu- ‘throw away, displace’ (strongly Undesirable)
i. -e.noh- < noh- ‘put down, release’ (Purposive, Preparatory; Connotation of helplessness when used as connectives -e.nohase, -e.nohuni)

3.2.4 Resultative

Resultative, also commonly known as ‘persistence-of-result’ or ‘state-persistence’ among Korean linguists, is marked by the auxiliaries developed from the verbs of existence through verb serialization.12 The primary marker is -e.iss- which further developed into the past tense marker -ess- in the 17th century (Rhee 1996). Resultatives are as shown in (15) and exemplified in part in (16):

(15) a. -e.iss- < iss- ‘exist’
b. -e.kyeysi- < kyeysi- ‘exist.’ ([+] subject)
c. -ko.iss- < iss- ‘exist’ (limited to ‘wear’ ‘ride’ ‘know’, etc.)
d. -ko.kyeysi- < kyeysi- ‘exist.’ ([+] subject)

(16) ku-nun ankyeng-ul ssu-ko.iss-ta
he- glasses- wear--
‘He wears glasses.’ (Lit. ‘He put on glasses and remains (in that state).’)13

12 The verb kyeysi- is the honorific counterpart of iss- ‘exist’. The role of the converbs in the construction is crucial. Koo (1987) contrasts them by regarding -e as a consolidating connective and -ko as an isolating connective, and following Koo (1987), Rhee (1996) attributes the differential grammaticalization of an identical source lexeme to this functional difference.

3.2.5 Inchoative

Inchoative is marked by a few markers developed from verb serialization, with one peculiarity that -key.toy- and -key.sayngki- make use of the converb marker -key, typically used for marking mode or manner. Inchoative markers are partially listed in (17):

(17) a. -key.toy- < toy- ‘become’ (often unplanned chance result; also for Passive)
b. -key.sayngki- < sayngki- ‘come into being’ (often unplanned chance result)
c. -e.ci- < ci- ‘fall’ (often fast and facile change)
d. -e.tul- < tul- ‘enter’ (Ingressive, Reductive)
e. -e.na- < na- ‘exit’ (Egressive, Ampliative)

3.2.6 Avertive

Korean has a number of avertive markers, which signal that an event advanced to a great extent but failed to reach completion contrary to expectation (‘Avertive’, Kuteva [2009]; ‘Action narrowly averted’, Kuteva [1998]; ‘Antiresultative’, Malchukov [2004]; ‘Proximative’, Kuteva, Heine et al. [2019]), as partially listed in (18) and exemplified in (19):

(18) a. -l.ppen.ha- < ha- ‘do’ (Averted)
b. -taka.mal- < taku- ‘draw near’, mal- ‘stop’ (Disrupted action)
c. -lkka.malkkaha- < mal- ‘stop’, ha- ‘do’ (Repeated but frustrated)
d. -ltong.mal.tongha- < mal- ‘stop’, ha- ‘do’ (Repeated but frustrated)
e. -ltus.mal.tusha- < mal- ‘stop’, ha- ‘do’ (Repeated but frustrated)
f. -llak.mallakha- < mal- ‘stop’, ha- ‘do’ (Repeated but frustrated)

13 This example is ambiguous between the resultative reading (‘he wears’) and the progressive reading (‘he is putting on’) since -ko.iss- is also a marker of the progressive aspect (see 3.2.1).

(19) a. tol-ey kelli-e nemeci-l.ppen.ha-yss-ta
stone-at be.caught- fall---
‘(I) almost tripped over a stone.’
b. sem-i poi-llak.mallakha-n-ta
island- be.seen---
‘The island is barely visible.’
c. pi-ka o-lkka.malkkaha-n-ta
rain- come---
‘The rain seems likely to come but it doesn’t.’

Even though avertive is commonly associated with a past event (Kuteva, Aarts et al. 2019) as in (19a), most Korean avertives can be used with respect to a presently ongoing situation as exemplified in (19b) and (19c). The avertive meaning of ‘a barely visible island’ in (19b) comes from ‘an island that nearly materializes in one’s visual field but then fails to do so repeatedly’.14

. Modality .. Deontic modality Korean deontic modality markers signal prohibition, obligation, permission, and suggestion. Prohibitive modality is marked periphrastically involving an explicit marker of negation, e.g., an ‘not’ or mal- ‘stop’, as shown below with their verbal sources: (20) a. -myen.an.toyb. -esenun.an.toyc. -ci.mal-

< toy- ‘become, be good’ (Prohibitive) < toy- ‘become, be good’ (Prohibitive) < mal- ‘stop’ (Prohibitive)

Prohibition meaning in (21b) below is compositionally derived from (21a) through morpho-syntactic reanalysis and functional reinterpretation: (21) a. keki ka-myen an toy-e there go-if not be.good- ‘(Things) will not be as good (as they should be), if (you) go there.’

14 The meaning encoded by (18c)–(18f) is a combination of iteration and ‘frustrated completion’ (Kuteva, Aarts et al. 2019).


     b. keki ka-myen.an.toy-e
        there go-
        ‘(You) must not go there.’

Obligation is also signaled by a few periphrastic markers whose composition is diverse. The most common forms, (22a) and (22b), make use of the strong conditional -ya ‘if and only if’.

(22) a. -eya(man).ha- < ha- ‘do, be’ (Obligative)
     b. -eya(man).toy- < toy- ‘become, be good’ (Obligative)
     c. -l.swupakkey.eps- < swu ‘way’, pakk ‘outside’, eps- ‘not exist’ (Obligative)
     d. -nun.pepi- < pep ‘law’, i- ‘be’ (Obligative)

Permission is marked by two markers whose meaning is derived from the particle -to ‘also, even’ and the verbs toy- ‘become, be good’ and coh- ‘be good’. Their literal source meanings are ‘even when (one) does x, (things) will be good.’

(23) a. -eto.toy- < toy- ‘become, be good’ (Permissive)
     b. -eto.coh- < coh- ‘be good’ (Permissive)

Suggestion is a deontic modality in the sense that the speaker is expressing obligation in a more mitigated form, a common strategy in Korean, in which an impositive speech act is often avoided. These forms also make use of toy- ‘become, be good’ and coh- ‘be good’.

(24) a. -myen.toy- < toy- ‘become, be good’ (Suggestive)
     b. -nun.ke-y.coh- < coh- ‘be good’ (Suggestive)
     c. -nun.phyeni.coh- < phyen ‘side’, coh- ‘be good’ (Suggestive)

3.3.2 Epistemic modality

Epistemic modality markers signal varying degrees of the speaker’s certainty about the proposition, including possibility, probability, certainty, and impossibility. Epistemic modality is marked by a few forms listed below:

(25) a. -l.swu(to).iss- < swu ‘way’, iss- ‘exist’ (Possibility)
     b. -l.ci(to).molu- < molu- ‘not know’ (Possibility)
     c. -key.sayngkyess- < sayngki- ‘come into existence’ (Possibility-Probability)
     d. -(nu)n/l.moyangi- < moyang SK ‘shape, appearance’, i- ‘be’ (Possibility-Probability)
     e. -l.pepha- < pep SK ‘law’, ha- ‘do, be’ (Probability)
     f. -mcikha- < ha- ‘do, be’ (Probability)
     g. thullimeps- < thullim ‘error’, eps- ‘not exist’ (Certainty; takes a nominalized subject)
     h. -ko(to).nam- < nam- ‘remain, surplus exists’ (Certainty)
     i. -l.swu(ka).eps- < swu ‘way’, eps- ‘not exist’ (Impossibility)
     j. -l.lika.eps- < li ‘reason’, eps- ‘not exist’ (Impossibility)
     k. -l.theki.eps- < thek ‘threshold’, eps- ‘not exist’ (Impossibility)

A large body of literature (Bybee and Pagliuca 1985; Bybee, Perkins, and Pagliuca 1994, among many others) addresses polyfunctionality of certain modality markers that can mark both the deontic and epistemic modality, e.g., the English must in You must go home (deontic) and The story must be true (epistemic). However, this oft-cited deontic-epistemic syncretism is not found in Korean. The polyfunctionality of a form for agentive and root modalities is also known to exist in many languages, e.g., the English can in Carol can read cuneiform (ability) and I think there’s a place where I can get a cheap kettle (possibility), taken from Bybee and Pagliuca (1985: 65).15 The Korean modality marker -l.swu.iss- ‘can’ exhibits this polyfunctionality for ability and possibility marking.

3.3.3 Boulomaic modality

Boulomaic modality markers encode the speaker’s intention and wish. Like other markers of modality, boulomaic modality markers are also periphrastic, suggesting a relatively low level of grammaticalization. Some such markers are listed in (26):

(26) a. -lkka.po- < -kka ‘’, po- ‘see’ (Tentative intention)
     b. -l.they- < the ‘ground’, i- ‘be’ (Strong intention)
     c. -eyakeyss- < -kyess- ‘’ < iss- ‘exist’ (Strong intention)
     d. -ko(ya).malkeyss- < mal- ‘stop’ (Strong intention)
     e. -l.ke(s)i- < ke(s) ‘thing’, i- ‘be’ (Intention)
     f. -lyeko.ha- < -lye(ko) ‘’, ha- ‘do’ (Intention)
     g. -ko.siph- < siph- ‘want, think’ (Desiderative)
     h. -myen.ha- < -myen ‘if’, ha- ‘do’ (Desiderative)

15 Bybee, Perkins, and Pagliuca (1994: 191–199) propose the grammaticalization channel of [Ability > Root possibility > Epistemic possibility].

3.3.4 Evidentiality

Korean has a system of evidentiality marking for quotative (identified author), reportative (unspecified author; hearsay; common knowledge), inferential (inferred information), and retrospective (first-hand information recollected). First-hand information is marked by the retrospective -te-, largely translatable as ‘as I recall it’. Quoting and reporting are usually marked by way of complementizers (see 4.1 for more discussion). The inferential evidential, with the meaning ‘it seems that’, is marked by various constructions, some of which are shown in (27):

(27) a. -na/ka/kka/ci.siph- < -na/ka/kka/ci ‘’, siph- ‘want, think’ (Inferential)
     b. -(nu)n/l.tus.siph- < tus ‘appearance’, siph- ‘want, think’ (Inferential)
     c. -(nu)n/l.seng.siph- < seng ‘nature’, siph- ‘want, think’ (Inferential)
     d. -(nu)n/l.tus.ha- < tus ‘appearance’, ha- ‘do, be’ (Inferential)
     e. -(nu)n/l.kes.kath- < kes ‘thing’, kath- ‘be like’ (Inferential)
     f. -l.kesi- < kes ‘thing’, i- ‘be’ (Inferential)
     g. -l.thei- < the ‘ground’, i- ‘be’ (Inferential)

. Tense .. Past Even though -te- seems to have indicated the ‘pastness’ of an event in OK data (AD 977) (Lee 1998: 59) (note that this is the Retrospective marker in MoK), Korean does not seem to have had a fully grammaticalized marker for the past tense until the advent of -es- in the 17th century. The serial verb construction consisting of the converb -e and the verb of existence is- ‘exist’ is attested in the 15th century data in the form of -e.is- and -eys-. The construction had the resultative or state persistence meaning (see 3.2.4). In the 17th century the latter form was further reduced to -eswhich acquired the past/perfect meaning. In MoK -e.iss- continues to mark resultative or state persistence, and -ess- marks past/perfect, a clear case of divergence (Rhee 1996).

3.4.2 Future

In MoK the future tense is marked by -li-, -keyss-, and -l.ke(s)i-, the oldest of which is -li-, attested in OK without a known lexical origin. On the other hand, the other two are modern innovations that developed from -key-hay-e-iss- [-do--exist] in the early 19th century (Huh 1987; Rhee 1996) and from -l-kes-i- [.thing-be] in the 20th century, respectively.16 MoK -keyss- carries other futurity-related meanings, e.g., Conjectural, Intentional, Hypothetical willingness, Ability, Possibility, Predestination, etc. (Rhee 1996: 121–131), and -l.ke(s)i- marks Inferential (see 3.3.4).

16 The converb -key, glossed as , distinctively marks the mode or manner (see 3.2.5). Its mode-manner meaning is primarily responsible for the emergence of the futurity meaning in -keyss- (Rhee 1996: 109–121).

3.4.3 Present

Even though there is controversy, -n- in the EMK text dating from AD 977 seems to indicate the ‘present’ tense. Without doubt this form is the predecessor of the MoK present tense marker -nun- for verbs (N.B. verbs of existence/non-existence, the copula, and adjectives do not use -nun- for the present; they are ø-marked). The lexical origin of -n-, however, cannot be determined for lack of historical data.

3.5 Mood

Korean has a well-developed system of marking grammatical mood (sentence-type or speech-act type), but its classification has long been a subject of controversy, with proposals ranging from as few as four classes (Nam 2001) to eight (Ko 2008) to ten (Kim 1960). These mood markers are also modulated along four to seven speech levels, a number that likewise varies with the researcher. For these reasons a detailed discussion of the grammaticalization of these markers is beyond the scope of the present paper, and the exposition is necessarily brief. Some of the mood markers are listed below in descending order of honorification and politeness, largely following the scale of: Deferential − Polite − Blunt − Familiar − Intimate − Plain (cf. Sohn 1999: 355):

(28) a. Declarative: -(su)pnita, -eyo, -(s)o, -ney, -e, -(n)ta
     b. Interrogative: -(su)pnikka, -eyo, -(s)o, -na/-nunka, -e, -ni/-nunya
     c. Imperative: -sipsio, -eyo, -o, -key, -e, -ela
     d. Hortative: -sipsita, -eyo, -psita, -sey, -e, -ca
     e. Exclamative: (none), -kwunyo, (none), (none), -kwun, -kwuna
     f. Promissive: (none), -lkeyyo, (none), -msey, -lkey, -ma

Most of these mood markers do not have known lexical sources, except that the fused morpheme -sup- (and its eroded form -p-), found mostly at the Deferential level, originated from the verb of locution slp- ‘speak (to an honorable person)’ in OK (Lee 1956: 39). Another interesting category of mood marking is Apprehensive (or ‘Preventives’; Ramstedt [1997]; Kim [1960]; Ko [[1970] 1989]), the primary function of which is to warn the addressee of potentially harmful consequences of an action (‘lest’), as in (29a) below, or to present reasons of the speaker’s action or mental state of fear (‘for fear that’), as in (29b), taken from Rhee and Kuteva (2018).17

(29) a. pelley tuleo-lla changmwun tat-ala
        insect come.in- window close-
        ‘Close the window lest insects come inside.’
     b. cencayng-i na-lkkapwa cam-i an o-n-ta
        war- come.out- sleep- not come--
        ‘(I) cannot sleep for fear of a war.’ (lit. ‘Sleep does not come for fear that a war may break out.’)

Apprehensives have developed mostly from uncertainty of the future, i.e., their source constructions involve a future-marker, a question marker, a cognitive verb ‘fear’ or ‘do not know’, or a mode- or purpose-adverbializer, or a combination of two or more of them, as shown below with their literal constructional meanings:

(30) a. -lkka ‘lest’ <  +  ‘will (it)?’
     b. -lkkapwa ‘for fear of’ <  +  + see +  ‘as (I) see if X will’
     c. -lkka.mwusewe ‘for fear of’ <  +  + be fearful +  ‘as (I) fear X will’
     d. -lkka.molla ‘for fear of; (I) fear that’ <  +  + not know + / ‘(as) (I) don’t know if X will’
     e. -lla ‘lest; (I) fear that’ <  + / ‘(as) X will’
     f. -lseyla ‘for fear of; (I) fear that’ <  +  ‘as X will be of Y-ing’
     g. -ci.molla/moluni ‘for fear of’ <  + not know + as ‘as (I) don’t know if X will Y’
     h. -ci.moshakey ‘lest’ <  +  + do +  ‘so that X cannot Y’
     i. -ci.moshatolok ‘lest’ <  +  + do +  ‘in order that X cannot Y’
     j. -ci.anhkey ‘lest’ <  +  + do +  ‘so that X may not Y’
     k. -ci.anhtolok ‘lest’ <  +  + do +  ‘in order that X may not Y’

As Sohn (1999: 357) notes, there are many other idiosyncratic sentence-enders that are used to mitigate or strengthen the speaker’s assertion. The functions of individual sentence-enders are very complex because many connectives have come to function as sentence-enders through insubordination (see 5.1 below), and the initially context-induced, pragmatically inferred meanings have become semanticized in the meaning of the mood marker (Rhee 2012).

17 The apprehensive connectives -lla and -lkkapwa in the examples can be used as main-clause mood markers as well. Many, if not all, apprehensive markers can be used either as subordinators (‘lest’, ‘for fear that’) or as sentence-enders (‘I fear that’) (Rhee and Kuteva 2018).

3.6 Agreement

Korean does not require person/number/tense agreement on the verb as English does. However, as alluded to in 3.5, Korean has a fully grammaticalized honorification and politeness system. Politeness marking expresses the speaker’s politeness toward the addressee, and is usually realized by the particle -yo, whose lexical origin is not known. Honorification is more complicated as it involves multiple considerations, i.e., subject honorification, addressee honorification, object honorification, and suppression of honorification. Subject honorification is marked by -si- on the verb, presumably developed from the verb of existence isi- ‘exist’ (Yang 1939: 126; Lee 1956: 49). Object honorification is marked when the object of an action denoted by the predicate has the [+honorific] feature. It was formerly marked with -sp-, -zp-, -cp-, and -p-, developed from the verb slp- ‘speak to (an honorable person)’, like the Deferential mood marker (see 3.5 above). In MoK, however, object honorification is only lexically marked, i.e., in place of regular verbs inherently [+honorific] verbs are used, e.g., poyp- instead of po- ‘see’, tuli- instead of cwu- ‘give’, mosi- instead of teyli- ‘accompany’, etc. Addressee honorification is marked with varying levels of speech as discussed in 3.5. Honorification suppression is required when a [+honorific] subject and a [++honorific] addressee are involved. For instance, when a boy speaks to his [++honorific] grandfather (addressee) about his [+honorific] father (sentential subject), honorification marking must be suppressed, i.e., not used, as shown in (31b):18

(31) a. emma apeci o-si-ess-e-yo
        mom father come-[]--
        ‘Mom, Father has come (home).’
     b. halapeci apeci o-{ø, *si}-ess-supni-ta
        grandpa father come-{ø, *[]}---
        ‘Grandpa, Father has come (home).’

18 Korean prescriptive grammar requires honorification agreement along all these parameters. The complexity of the honorification system is such that it is the area in which speakers make mistakes most frequently due to momentary confusion in formulating properly honorific-marked utterances.


4 Grammaticalization of Complex Constructions

4.1 Complement clauses

Korean complement clauses are marked by complementizers, and as noted in 3.3.4, complementizers vary according to the mood of the embedded clause, i.e., declarative (-tako), imperative (-lako), interrogative (-nyako), and hortative (-cako). They developed in the 18th and the 19th centuries (Rhee 2008b). The constructions involved in this development were a string of a mood marker, the locution verb ha- ‘say’, and the connective -ko ‘and’, from which the locution verb disappeared as a result of phonological erosion, as shown in (32):

(32) a. -ta/-la + ha ‘say’ + -ko > -tako/-lako
     b. -la + ha ‘say’ + -ko > -lako
     c. -nya + ha ‘say’ + -ko > -nyako
     d. -ca + ha ‘say’ + -ko > -cako

Historically, the declarative-based complementizer -tako came into being first, and then others followed this trail-blazer (Rhee 2008b). This suggests that grammaticalization may be actuated by a structural analogy whereby members of an entire paradigm may follow the one member that leads the grammaticalization process, thus creating a whole new paradigm in a short period. There are other complementizers of various origins, as listed in (33), developed from constructions involving a general noun functioning as a nominalizer (33a, 33b), a nominalizer (33c, 33d), or an interrogative sentence-ender (33d, 33e):

(33) a. -(nu)n/l.kes < kes ‘thing; ’
     b. -(nu)n/l.sasil < sasil ‘fact’
     c. -m < -m ‘’
     d. -(nu)n/l.ci < -ci ‘; .’
     e. -lkka < -l-kka ‘-.’

4.2 Relative clauses

When a noun accompanies a clausal modifier, the clause is marked by a relativizer. In Korean, a relativized clause is an adnominalized constituent, and thus must be marked by an adnominalizer. There are three types of adnominalizers, commonly called ‘participles’ in typological literature, depending on the relative time of the event or state denoted by the modifier clause and that of the main clause, i.e., -l for prospective, -n for anterior, and -nun for simultaneous (see 2.6 above).

(34) onul cenyek phathi-ey o-{l, n, nun} salam
     today evening party-to come-{., ., .} person
     ‘the person who {will come, has come, is coming} to the party tonight’

One interesting historical fact is that the MoK adnominalizers -l and -n were once nominalizers, and the historical vestiges of the former function are found in the 15th century data (Rhee 2008a). However, the lexical origins of these adnominalizers/relativizers have not yet been identified.

4.3 Adverbial clauses

The markers of adverbial clauses do not form a uniquely identifiable category in terms of form and meaning. In terms of semantics they mark cause/reason, condition, concession, contingency, time, purpose, result, etc., and in form they appear as diverse particles or constructions. Some of the clausal adverbializers are listed in (35) with their source meaning/function:

(35) a. -nikka < -ni-kka ‘-’ (Cause/Reason)
     b. -myen < -myen ‘’ (Conditional)
     c. -eto < -e-to ‘-’ (Concessive)
     d. -ntey < -n-tey ‘.-place’ (Cause/Reason; Background)
     e. -ese < -e-se ‘-’ (Cause/Reason; Consequential)
     f. -n.taum(ey) < -n-taum-(ey) ‘.-next-at’ (Consequential)
     g. -n.hwu(ey) < -n-hwu-(ey) ‘.-after-at’ (Consequential)
     h. -n.twi(ey) < -n-twi-ey ‘.-after-at’ (Consequential)
     i. -le < -le ‘’ (Purposive)
     j. -tolok < -tolok ‘’ (Purposive)

4.4 Mermaid constructions

Korean features so-called ‘mermaid constructions’ in which a noun performs the dual function as a noun and as a part of a predicate, the latter when it is followed by the copula (see 2.6), hence the label ‘mermaid constructions’ (Tsunoda 2013). The nouns participating in mermaid constructions tend to be semantically light, and thus they are also often considered nominalizers (Rhee 2008a). Korean has a large inventory of such nouns (Kim 2013; Narrog, Rhee, and Whitman 2018). For instance, Kim (2013) analyzes more than seventy nouns in the category. A historical survey suggests that these mermaid constructions grammaticalized mostly in the 20th century, the earliest one being cikyeng (< tikyeng) in the 19th century. Some such nouns are shown in (36), taken from Narrog, Rhee, and Whitman (2018: 175) with modifications, and an example of the mermaid construction involving cikyeng is given in (37):

(36) Form      Source meaning    Function in Mermaid Construction
     seym      ‘calculation’     be like; copulative; equivalence
     cikyeng   ‘domain’          be in the state of; undesirable situation
     nolus     ‘role’            be in the state of; undesirable situation
     cham      ‘point in time’   be about to; proximative aspect
     pep       ‘law’             be obliged to; deontic obligation

(37) phokcwuk soli-lo kwi-ka mekmekha-l.cikyengi-ta
     firecracker sound- ear- be.deafened--
     ‘(I) am nearly deafened by the firecracker noise. (lit. … in the state of becoming deafened …)’

5 Other prominent patterns of grammaticalization and reanalysis

5.1 Insubordination

Among numerous noteworthy instances of grammaticalization in Korean are those that involved the ellipsis of a main clause, whereby former connectives for the embedded clause occur at the utterance-final position and become reanalyzed as sentence-enders. This ongoing phenomenon has received much attention from grammaticalization researchers (Lee 1982; Sohn 1995; Koo 1998; Kim 1998; Koo and Rhee 2001; Rhee 2002), and has been given various characterizations and labels, e.g., ‘from silence to grammar’ (Rhee 2002), ‘main clause ellipsis’ (Sohn 1995; Koo and Rhee 2001; Sohn 2003; Rhee 2012), ‘insubordination’ (Evans 2007; Malchukov 2013; Evans and Watanabe 2016), ‘suspended clause’ (Ohori [1995] for Japanese). The extent of insubordination and its impact on Korean grammar can be easily surmised from the fact that among the 381 sentence-enders in MoK listed in Kim (2001: 147–151), about 170 of them involve ‘functional shift’, which is nearly always that of insubordination. New sentence-enders usually carry rich intersubjective meanings, as shown in part in the following examples, taken from Rhee (2002):

(38) From Connective to Sentence-Ender
     a. -ketun: Hypothetical conditional, Comparative conditional → Topic presentation, Reason, Incidentality
     b. -nikka: Cause/Reason, Contingency, Adversative → Addressee reconfirmation, Protest, Assertion
     c. -myense: Concurrence, Contrast → Addressee confirmation, Challenge, Derisive
     d. -(nu)ntey: Background, Adversative → Surprise, Reluctance, Reason, Background
     e. -key: Mode → Exhortative, Dubitative

5.2 Light verb

Korean has a light verb that actively participates in grammaticalization. As briefly alluded to in 1.3.2 and elsewhere, the light verb ha-, traceable to h- ‘say, do’ in LMK, is commonly used in deriving an adjective or verb from various source category words, including regular nouns (e.g., kumci-ha- ‘prohibit’ < kumci ‘prohibition’), adjectives (e.g., coh-aha- ‘like’ < coh- ‘be good, be likable’), loan words (e.g., haynsem-ha- ‘be handsome’ < E. handsome), ideophones (e.g., kkwang-ha- ‘make an exploding noise’ < kkwang ‘bang’), etc. We have also seen many instances of ha- in grammaticalization, e.g., De-verbal postposition (see 2.5), Causative (see 3.1), Deontic modal (see 3.3.1), Future (see 3.4.2), complementizer (see 4.1), and many others. The verb ha- is light not only semantically but also phonetically, which makes it vulnerable to reduction and loss, as seen in its erosion in the development of complementizers (see 4.1). Therefore, a large number of particles and suffixes in MoK, numbering in the hundreds, have no visible trace of the ha- that was present in their source constructions. One of the notable consequences of this erosion is that a construction that once contained ha- can no longer be analyzed by morphosyntactic rules, which prompts language users to reanalyze the construction as something else, thus creating entirely novel sub-paradigms in the domains of sentence-final particles, connectives, adnominalizers, etc. (Rhee 2009). For instance, the complementizers -tako, -lako, -nyako, and -cako developed from the constructions -ta-ha-ko, -la-ha-ko, -nya-ha-ko, and -ca-ha-ko, respectively, each containing the verb ha- ‘say, do’ (see 4.1). After the phonological erosion of ha-, the remaining forms could not be effectively analyzed, and were thus reinterpreted as introducers of embedded clauses, i.e., complementizers. This is illustrated with the complementizer -tako in the following:

(39) a. kunye-nun ku-ka tochakha-yss-ta ha-ko na-ykey malha-yss-ta
        she- he- arrive-- say-and I- say--
        ‘She said to me, ‘He arrived.’ (lit. She said to me, saying ‘He arrived.’)’
     b. kunye-nun ku-ka tochakha-yss-tako na-ykey malha-yss-ta
        she- he- arrive-- I- say--
        ‘She said to me that he had arrived.’


6 Comparative outlook

Historical and comparative research on Korean has tended to focus on the relationship between Korean and the putative ‘core’ Altaic families, i.e., Mongolic, Tungusic, and Turkic, largely in the phonological and lexical domains. More recently, Korean has been discussed within the context of Transeurasian languages and Northeast Asian languages. Narrog, Rhee, and Whitman (2018: 116–167) note that certain features shared by Korean and Japanese, such as rich use of converbs with grammatical functions, are areal phenomena found in some Northeast Asian languages such as Nivkh and Yukaghir.

Of particular interest from the comparative perspective is the influence of Japanese and Chinese. Narrog, Rhee, and Whitman (2018) note that Japanese and Korean share certain grammaticalization features, such as fitting well with ‘reductionist’ approaches to grammaticalization (e.g., [word/construction > (particle) > suffix > inflection]), having abundant examples of grammaticalizations in the interpersonal domain (intersubjectification), and exhibiting influence from written Chinese, among others. Located close to China and Japan, Korea has maintained long contact with them and served as a transmitter of Chinese culture to Japan throughout history. Thus, Korean has had more influence from Chinese, mostly written, than from Japanese. The contact between Korean and Chinese was limited in terms of the size of the population in contact, but was most prominently motivated by ‘cultural prestige or attractivity’ (Bisang 2006: 89), owing to China’s dominant cultural-political role in the region.

Influences of Chinese with respect to grammaticalization are evident in the development of numeral classifiers, de-verbal postpositions, and mermaid constructions. As noted in 2.1, about half of the numeral classifiers are derived from Chinese. It is likely that the borrowing occurred in the course of translation of Chinese texts (Chae 1982, 1996; Kuo 1995).
Narrog, Rhee, and Whitman (2018: 179) note that in Chinese itself, numeral classifiers have increased in number and obligatoriness over time; in Old Chinese, numeral quantification was possible with bare numerals. Incidentally, the presence of numeral classifiers of Chinese origin is also a characteristic of Japanese. Noting that numeral classifiers are only marginally attested in Altaic (Janhunen 2000) but are robustly present in Nivkh (Nedjalkov and Otaina 2013), Narrog, Rhee, and Whitman (2018: 179), following Janhunen, speculate that the numeral classifiers are an archaic trait in Northeast Asia, best preserved in the peripheral languages Nivkh, Japanese, and Korean. It was noted in 2.5 that Korean has a large number of de-verbal postpositions, both of native and Sino-Korean origin. Since many de-verbal postpositions involve Chinese characters, their grammaticalization may be an influence of Chinese. This hypothesis is supported by the two facts that in such cases the Chinese character is almost always one character, which is not used alone elsewhere, and that some of


Japanese de-verbal postpositions are entirely calques from Chinese (Yamada 1935; Chen 2005, as cited in Narrog, Rhee, and Whitman 2018). As noted in 4.4, Korean has a large number of mermaid constructions involving nouns that are semantically light. A large number of these nouns are of Chinese origin. This characteristic is shared with Japanese but is not attested in many other Northeast Asian and Transeurasian languages, a state of affairs leading Narrog, Rhee, and Whitman (2018) to conclude that the presence of a large number of mermaid constructions in Korean and Japanese is the result of (written) Chinese influence.

7 Summary and Conclusion

This paper briefly presented some of the characteristics of Korean grammar, and described noteworthy instances of grammaticalization in nominal and verbal categories as well as in complex constructions. It further addressed the ongoing development of connectives into sentence-enders through insubordination as a prominent pattern of grammaticalization found in Korean, and the active role of the light verb ha- in the creation of novel paradigms, and discussed a few aspects from a comparative perspective.

Korean has a relatively short history of language documentation; records before the 15th century are limited in number and, despite phenomenal progress in recent years, remain unclear in much of their interpretation. Research on texts from the past 600 years has revealed a large number of noteworthy grammaticalization instances. The shallow historical depth is the major problem for tracing grammaticalization trajectories of many grammatical forms, and no reasonable chronology of grammaticalization can be drawn from the current level of understanding. Such limitations notwithstanding, the following can be said with respect to the general patterns.

Among nominal morphologies, case markers and delimiters have old grams (e.g., nominative -i, accusative -l, vocative -(h)a, topic -istn, additive -to, distributive -mata, and a few others) attested in OK whose lexical origins are unknown as well as newer grams whose lexical origins, mostly involving nouns and verbs, are relatively transparent (see 2.5). Nominalizers and pronouns are in similar situations. Many pronouns were innovated in MoK often involving place nouns and demonstratives (Sohn 1999; Rhee 2019). The plural marker -tul/tl(h) was grammaticalized in LMK from a noun, but recently it expanded its function into the discourse domain and functions as a stance-marker.
There were a few nominal classifiers in the 15th century (LMK), e.g., ca for length, nath for individuated objects, but they increased in number from the 16th century and proliferated in EMoK. Among verbal morphologies, some modality markers date from LMK, e.g., resultative (-e.is-), continuative (-e.ka-), experiential-attemptive (-e.po-), desiderative (-e.sikpu-), obligative (-eza.h-), inferential (-ka.sikpu-), and many others were innovated at later times. The origins of most mood markers and honorification agreement markers can also be traced back to LMK (and some to OK, e.g., -ta/-cye for declarative, -(k)o/(k)a for interrogative, -sye for imperative, exclamative -ye, object honorification -slp-, subject honorification -si-, addressee honorification -ngi-, among others; see Lee [1998]; Park [1998]), but their change at later times has not been extensive compared to other categories. Some of the valency-changing grams are attested in OK (causative -hai- and -i-), but their paradigms were enriched with innovated members at later times (see 3.1). More recent instances of grammaticalization include the development of complementizers that emerged in EMoK even though their precursor form was attested as early as the 16th century (see 4.1). Some LMK predicatives involve defective nouns and the copula, thus matching the template of mermaid constructions (e.g., those involving LMK defective nouns t, s, li, etc.; Jeong [2003]). Unlike these defective nouns, whose lexical origin and morphosyntactic composition are rather opaque now, a large number of mermaid constructions actively in use in contemporary Korean are recent innovations of the 20th century (cf. tikyeng ‘domain, boundary’, which is attested in the 19th century). Insubordination, responsible for the emergence of a large number of sentence-final particles, is also a recent phenomenon only attested in MoK. Of particular significance are the influence of written Chinese, a point presented in Narrog, Rhee, and Whitman (2018); extensive use of mermaid constructions, which recruit semantically weak nouns for predicative uses; a large number of classifiers, many of Chinese origin; and the rich development of de-verbal postpositions, many also of Chinese origin.
In addition, the present study highlights the role of pragmatic inference that operates over the sentence fragments ending with a clausal connective when the main clause is elided, i.e., insubordination. Such inference involves reanalysis of the former connective as a sentence-ender. Since the sentence-enders in Korean are the primary elements that carry the interactional, interpersonal relationship information, the development of sentence-enders through insubordination necessarily involves intersubjectification in the acquisition of meaning and function.

Abbreviations  = additive,  = adnominal,  = addressee-honorific,  = anterior,  = apprehensive,  = avertive,  = causative,  = classifier,  = complementizer,  = conditional,  = conjectural,  = connective,  = converb,  = dative,  = declarative,  = emphatic,  = sentence-ender,  = evidential,  = future,  = honorific,  = hortative,  = imperative,  = instrumental,  = interrogative (=),  = mermaid construction,  = mode,  = limit,  = negation,  = native Korean,  = nominative,  = nominalizer,  = passive,  = perfective,  = plural,  = polite,  = possessive,  = postposition,  = present,  = prohibitive,  = prospective,  = past,  = purpose,


 = question/interrogative (=),  = reflexive,  = resultative,  = retrospective,  = singular,  = subject-honorific,  = simultaneous,  = Sino-Korean,  = topic.

References

Ahn, Byung Hee. 1967. Mwunpepsa [History of grammar] (Enemwunhaksa, Vol. 5 of Hankwukmwunhwasa Taykyey). Research Institute of Korean Studies, Korea University.
Bisang, Walter. 2006. Contact-induced convergence: Typology and areality. In Keith Brown (ed.), Encyclopedia of Language & Linguistics, 2nd edn, vol. 3, 88–101. Oxford: Elsevier.
Bybee, Joan L. & William Pagliuca. 1985. Crosslinguistic comparison and the development of grammatical meaning. In Jacek Fisiak (ed.), Historical semantics, historical word formation, 59–83. Berlin: Mouton de Gruyter.
Bybee, Joan L., Revere Perkins & William Pagliuca. 1994. The Evolution of grammar: Tense, aspect, and modality in the languages of the world. Chicago: The University of Chicago Press.
Chae, Wan. 1982. Kwuke swulyangsakwuuy thongsicek kochal [A historical investigation of Korean quantifier phrases]. Chintanhakpo 53/54. 155–170.
Chae, Wan. 1990. Thukswucosa [Special particles]. In The Korean Language Research Group of Seoul National University Graduate School (ed.), Kwukeyenkwu Etikkayci Wassna [How far Korean language studies have advanced], 263–70. Seoul: Dong-A Publishing.
Chae, Wan. 1996. Kwuke pwunlyusa kay-uy chayong kwacengkwa uymi [On borrowing of the Korean classifier -kay and its meaning]. Chintanhakpo 82. 193–215.
Chen, Chun-hui. 2005. Bunpōka to shakuyō – Nihongo ni okeru dōshi no chūshikei o fukunda kōchishi ni tsuite [Grammaticalization and borrowing: postpositions in Japanese composed from verbs in ren’yō or -te forms]. Nihongo no Kenkyū 1(3). 123–138.
Evans, Nicholas. 2007. Insubordination and its uses. In Irina Nikolaeva (ed.), Finiteness: Theoretical and empirical foundations, 366–431. Oxford: Oxford University Press.
Evans, Nicholas & Honoré Watanabe (eds.). 2016. Insubordination. Amsterdam: John Benjamins.
Haspelmath, Martin. 1990. The grammaticalization of passive morphology. Studies in Language 14(1). 25–72.
Haspelmath, Martin. 1995. The converb as a cross-linguistically valid category. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective: Structure and meaning of adverbial verb forms − adverbial participles, gerunds, 1–55. Berlin: Mouton de Gruyter.
Heine, Bernd. 1993. Auxiliaries: Cognitive forces and grammaticalization. Oxford: Oxford University Press.
Heine, Bernd. 1997. Possession. Cambridge: Cambridge University Press.
Ho, Kwangsu. 2003. Kwuke Pocoyongen Kwuseng Yenkwu [A study on Korean auxiliary verb constructions]. Seoul: Yeklak.
Hong, Jongseon. 1983. Myengsahwa emiuy pyenchen [Historical change of nominalizing particles]. Kwukekwukmwunhak 89. 31–89.
Hong, Ki-Moon. 1957. Cosenelyeksamwunpep [Historical linguistics of Korean]. Pyongyang: Kwahakwen.
Hong, Seok-jun. 2015. A lexico-morphological study on color adjectives in the Korean language. Seoul, Korea: Seoul National University dissertation.
Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth C. Traugott & Bernd Heine (eds.), Approaches to grammaticalization, vol. 1, 17–35. Amsterdam: John Benjamins.
Hopper, Paul J. & Elizabeth C. Traugott. 2003 [1993]. Grammaticalization. 2nd edn. Cambridge: Cambridge University Press.

Grammaticalization in Korean

605

Huh, Woong. 1987. Kwuke Ttaymaykimpepuy Pyenchensa [A developmental history of Korean tense system]. Seoul: Saem Publishing. Huh, Woong. 1995. 20-seyki Wulimaluy Hyengthaylon [The 20th century Korean morphology]. Seoul: Saem Publishing. Janhunen, Juha 2000. Grammatical genders from east to west. In Barbara Unterbeck (ed.), Gender in grammar and cognition, 689–707. Berlin: Mouton de Gruyter. Jeong, Howan. 2003. Hankwukeuy Paltalkwa Uyconmyengsa [The development of Korean and dependent nouns]. Seoul: Ihoy. Johanson, Lars & Martine Robbeets. 2010. Introduction. In Lars Johanson & Martine Robbeets (eds.), Transeurasian verbal morphology in a comparative perspective: Genealogy, contact, chance, 1–5. Wiesbaden: Harrassowitz. Kang, Eun Kook. 1993. Cosene Cepmisauy Thongsicek Yenkwu [A diachronic study of Korean suffixes]. Seoul: Sekwanghakswulcalyo Publishing. Kim, Joungmin. 2013. Mermaid construction in Korean. In Tasaku Tsunoda (ed.), Adnominal clauses and the ‘Mermaid Construction’: Grammaticalization of nouns, 249–296. Tokyo: NINJAL. Kim, Min-Soo. 1960. Kwuke Mwunpeplon Yenkwu [A study of Korean grammar]. Seoul: Jipmundang. Kim, Minju. 2011. Grammaticalization in Korean: The evolution of the existential verb. London: Saffron Books. Kim, Nam-Kil. 1992. Korean. In William Bright (ed.), International encyclopedia of linguistics, 282–286. Oxford: Oxford University Press. Kim, Tae-yop 1998. The functional shift of endings from nonfinal to final. Eonehag: The Journal of the Linguistic Society of Korea 22. 171–189. Kim, Tae-yop 2001. Kwuke Congkyelemiuy Mwunpep [A grammar of Korean sentence-enders]. Seoul: Kwukhakcalyowen. Ko, Yong-Kun. 1989 [1970]. Kwuke Hyengthaylon Yenkwu [A study of Korean morphology]. Seoul: Seoul National University Press. Ko, Yong-Kun. 2008. Hankwukeuy Sicey Sepep Tongcaksang [Korean Tense, Aspect, and Modality]. Seoul: Thayhaksa. Koo, Hyun Jung. 1987. 
Ssikkuth -a, -key, -ci, ko-uy ssuimkwa uymi [The usage and meaning of suffixes, -a, -key, -ci, and -ko]. Konkuk Emwunhak 11–12. 167–188. Koo, Hyun Jung. 1998. Grammaticalization of conditional markers in Modern Korean. Journal of Sangmyung Language and Literature 8. 1–13. Koo, Hyun Jung. 2009. Body in the language: A case with Korean body-part term ‘head’. Language and Linguistics 46. 1–27. Koo, Hyun Jung. 2016. On change of the terms of address between couples during the 70 years of post-colonization as reflected in mass media. Korean Semantics 51: 85–110. doi: 10.19033/ sks.2016.03.51.85. Koo, Hyun Jung & Seongha Rhee. 2001. Grammaticalization of a sentential end marker from a conditional marker. Discourse and Cognition 8(1). 1–19. Koo, Hyun Jung & Seongha Rhee. 2016. Pejoratives in Korean. In Rita Finkbeiner, Jörg Meibauer & Heike Wiese (eds.), Pejoration, 301–323. Amsterdam: John Benjamins. Koo, Hyun Jung & Seongha Rhee. 2018. Ideophones and attenuatives in Korean. Paper presented at the 51st Societas Linguistica Europaea (SLE) Conference, University of Tallinn, Estonia, 29 August – 1 September 2018. Kuo, Chiu Wen. 1995. A study of Korean classifiers. Seoul, Korea: Sungkyunkwan University dissertation. Kuteva, Tania. 1998. On identifying an evasive gram: Action narrowly averted. Studies in Language 22(1). 113–160.

606

Seongha Rhee

Kuteva, Tania. 2009. Grammatical categories and linguistic theory: Elaborateness in grammar. In Peter K. Austin, Oliver Bond, Monik Charette, David Nathan & Peter Sells (eds.), Proceedings of Conference on Language Documentation and Linguistic Theory 2, 13–28. London: SOAS. Kuteva, Tania, Bas Aarts, Gergana Popova & Anvita Abbi. 2019. The grammar of ‘non-realization’. Studies in Language 43(4): 850–897. doi: 10.1075/sl.18044.kut. Kuteva, Tania, Bernd Heine, Bo Hong, Haiping Long, Heiko Narrog & Seongha Rhee. 2019. World lexicon of grammaticalization (2nd revised edition). Cambridge: Cambridge University Press. Lee, Hee-Seung. 1956. On the word itta (to be). Journal of Seoul National University 3. 17–47. Lee, Hyeon-hie 1982. Kwuke congkyelemiuy paltaley tayhan kwankyen [Thoughts on the development of sentence-final particles in Korean]. Journal of Korean Linguistics 11. 143–63. Lee, Ju-Haeng. 2009. Hankwuke Uyconmyengsa Yenkwu [A study of Korean dependent nouns]. Seoul: Hankook Publisher. Lee, Ki-Moon. 1998 [1961]. Kwukesa Kaysel [An introduction to Korean historical linguistics] (revised ed.). Seoul: Thayhaksa. Lee, Ki-Moon & S. Robert Ramsey. 2011. A history of the Korean language. Cambridge: Cambridge University Press. Lee, Seung-Jae. 1998. Kotay kwuke hyengthay [Morphology of Old Korean]. In National Institute of the Korean Language (ed.), Kwukeuy Sitaypyel Pyenchen Yenkwu 3: Kotay Kwuke [A study on the historical change of Korean, Vol. 3: Old Korean], 41–75. Seoul: National Institute of the Korean Language. Lee, Tae Yeong. 1993 [1988]. Kwuke Tongsauy Mwunpephwa Yenkwu [A study of the grammaticalization of Korean verbs]. Seoul: Hanshin Publishing. Malchukov, Andrej L. 2004. Towards a semantic typology of adversative and contrast marking. Journal of Semantics 21. 177–198. Malchukov, Andrej L. 2013. Verbalization and insubordination in Siberian languages. 
In Martine Robbeets & Hubert Cuyckens (eds.), Shared grammaticalization: With special focus on the Transeurasian languages, 177–208. Amsterdam: John Benjamins. doi: 10.1075/ slcs.132.14mal. Nam, Ki-Shim. 2001. Hyentaykwuke Thongsalon [Modern Korean syntax]. Seoul: Thayhaksa. Narrog, Heiko & Seongha Rhee. 2013. Grammaticalization of space in Korean and Japanese. In Martine Robbeets & Hubert Cuyckens (eds.), Shared grammaticalization: With special focus on the Transeurasian languages, 287–315. Amsterdam: John Benjamins. doi: 10.1075/ slcs.132.21nar. Narrog, Heiko, Seongha Rhee & John Whitman. 2018. Grammaticalization in Japanese and Korean. In Heiko Narrog & Bernd Heine (eds.), Grammaticalization from a typological perspective, 166–188. Oxford: Oxford University Press. Nedjalkov, Vladmimir P. & Galina A. Otaina. 2013. A syntax of the Nivkh language. The Amur dialect. (Translated and edited by Emma Š. Geniušien, edited by Ekaterina Gruzdeva). Amsterdam: John Benjamins. NIKL (The National Institute of the Korean Language) https://krdict.korean.go.kr/statistic/dicStat. (accessed January 2019). Ohori, Toshio. 1995. Remarks on suspended clauses: A contribution to Japanese phraseology. In Masayoshi Shibatani & Sandra A. Thompson (eds.), Essays in semantics and pragmatics in honor of Charles J. Fillmore, 201–218. Amsterdam: John Benjamins. Park, Jinho. 1998. Kotay kwuke mwunpep [Old Korean grammar]. In National Institute of the Korean Language (ed.), Kwukeuy Sitaypyel Pyenchen Yenkwu 3: Kotay Kwuke [A study on the historical change of Korean, Vol. 3: Old Korean], 121–205. Seoul: National Institute of the Korean Language. Phyocwun Kwuke Taysacen [A Standard Korean dictionary]. 1999. Seoul: The National Institute of the Korean Language.

Grammaticalization in Korean

607

Ramstedt, Gustaf J. 1903. Über die Konjugation des Khalkha-Mongolischen. Helsingfors: Société Finno-Ougrienne. Ramstedt, Gustav John. 1997 [1939]. A Korean grammar. Helsinki: Suomalais-Ugrilainen Seura. Rhee, Seongha 1996. Semantics of verbs and grammaticalization: The development in Korean from a cross-linguistic perspective. The University of Texas at Austin dissertation. Seoul: Hankook Publisher. Rhee, Seongha. 2002. From silence to grammar: Grammaticalization of ellipsis in Korean. Paper presented at the New Reflections on Grammaticalization II Conference, University of Amsterdam, The Netherlands, 3–6 April. Rhee, Seongha. 2008. On the rise and fall of Korean nominalizers. In Maria José López-Couso & Elena Seoane (eds.), Rethinking grammaticalization: New perspectives, 239–264. Amsterdam: John Benjamins. Rhee, Seongha. 2008b. Subjectification of reported speech in grammaticalization and lexicalization. Harvard Studies in Korean Linguistics 12. 590–603. Rhee, Seongha. 2009. Consequences of invisibility: Paradigm creation from an eroded light verb. Paper presented at the 19th International Conference on Historical Linguistics (ICHL), Radboud University, Nijmegen, The Netherlands, 10–14 August. Rhee, Seongha. 2011. Nominalization and stance marking in Korean. In Foong Ha Yap & Janick Wrona (eds.), Nominalization in Asian languages: Diachronic and typological perspectives, 393–422. Amsterdam: John Benjamins. Rhee, Seongha. 2012. Context-induced reinterpretation and (inter)subjectification: The case of grammaticalization of sentence-final particles. Language Sciences 34(3). 284–300. doi: 10.1016/j.langsci.2011.10.004. Rhee, Seongha. 2018. Grammaticalization of the plural marker in Korean: From object to text to stance. Journal of Language Sciences 25(4). 221–249. doi: 10.14384/kals.2018.25.4.221. Rhee, Seongha. 2019. Politeness pressure on grammar: The case of first and second person pronouns and address terms in Korean. Russian Journal of Linguistics 23(4). 
950–974. doi: 10.22363/2312-9182-2019-23-4-950-974. Rhee, Seongha & Hyun Jung Koo. 2014. Grammaticalization of causatives and passives and their recent development into stance markers in Korean. Poznań Studies in Contemporary Linguistics 50(3). 309–337. doi: 10.1515/psicl-2014–0018. Rhee, Seongha & Hyun Jung Koo. 2017. Multifaceted gustation: Systematicity and productivity of taste terms in Korean. Terminology 23(1). 38–64. Rhee, Seongha & Tania Kuteva. 2018. Apprehensive markers in Korean. Paper presented at the Workshop on Apprehensive Markers, the 51st Annual Meeting of Societas Linguistica Europaea, Tallinn University, Estonia, 29 August – 1 September. Shon, Dal-lim. 2012. A study on the morphological and phonological characteristic of onomatopoeia in Modern Korean. Seoul, Korea: Ewha Woman’s University dissertation. Simons, Gary F. & Charles D. Fennig (eds.). 2018. Ethnologue: Languages of the world, twenty-first edition. Dallas, Texas: SIL International. Online version: http://www.ethnologue.com. Sohn, Ho-Min. 1999. The Korean language. Cambridge: Cambridge University Press. Sohn, Sung-Ock S. 1995. On the development of sentence-final particles in Korean. Japanese/ Korean Linguistics 5. 219–234. Sohn, Sung-Ock S. 2002. The grammaticalization of honorific particles in Korean. In Ilse Wischer & Gabriele Diewald (eds.), New Reflections on Grammaticalization, 309–325. Amsterdam: John Benjamins. Sohn, Sung-Ock S. 2003. On the emergence of intersubjectivity: An analysis of the sentence-final nikka in Korean. Japanese/Korean Linguistics 12. 52–63. Song, Jae Jung. 1997. The so-called plural copy in Korean as a marker of distribution and focus. Journal of Pragmatics 27. 203–224.

608

Seongha Rhee

Song, Jae Jung. 2005. The Korean language: Structure, use and context. London: Routledge. Suh, Jung-Bum. 2003. Kwuke Ewen Sacen [Korean etymological dictionary]. Seoul: Bogosa. Tsunoda, Tasaku (ed.). 2013. Adnominal clauses and the ‘Mermaid Construction’: Grammaticalization of nouns. Tokyo: NINJAL. Whitman, John. 2015. Old Korean. In Lucien Brown & Jae Hoon Yeon (eds.), The handbook of Korean linguistics, 422–438. London: Wiley-Blackwell. Woo, Hyeongshik. 2001. Hankwuke Pwunlyusauy Pemcwuhwa Kinung Yenkwu [A study on the categorizing functions of Korean classifiers]. Seoul: Pagijong Press. Wulimal Khunsacen [A comprehensive Korean dictionary] (by Hankulhakhoy). 1992. Seoul: Emwunkak. Yamada, Yoshio. 1935. Kanbun no kundoku ni yorite tutaeraetaru gohō [Words and grammar passed on through the Japanese reading of Chinese texts]. Tokyo: Hōbunkan. Yang, Joo-Dong. 1939. Hyangkacwuyeksanko [Thoughts on hyangka commentaries]. Chintanhakpo 10. 110–133.

Linlin Sun and Walter Bisang

14 Grammaticalization changes in Chinese

1 Introduction

This paper aims to outline important grammaticalization changes in Chinese, reviewed on the basis of an uninterrupted historical record of over three millennia. Four main historical periods of Chinese are involved: (i) Old Chinese (the language between the 12th and 3rd centuries BC; known as 上古汉语 Shànggǔ hànyǔ), with the Han dynasty (206 BC–220 AD) widely regarded as the time of transition from Old Chinese to the next period. The term Classical Chinese is generally used to refer to the language from the 5th century BC to the 2nd century AD (cf. Norman 1988). (ii) Middle Chinese (from the 3rd to the 13th century; known as 中古汉语 Zhōnggǔ hànyǔ). Most of the grammaticalization processes discussed in this paper go back to this period. (iii) Pre-Modern Chinese (from the 14th to the 19th century; known as 近代汉语 Jìndài hànyǔ) and (iv) Modern Chinese (现代汉语 Xiàndài hànyǔ), which generally refers to Chinese as it has been spoken since the 19th/20th century.

Our paper is divided into two main parts: the part on nominal categories (section 2) and the part on categories at clause level (section 3). In the nominal part, we will focus on the grammaticalization of numeral classifiers (section 2.1) and on relational nouns for situating events in space and time (section 2.2). The part on clause-level categories ranges from markers whose structural scope is confined to the verb up to markers operating at the clause level. It includes aspectual suffixes, directional verbs and modal auxiliaries (section 3.1), modality markers (section 3.2), the passive marker 被 bèi (section 3.3), the disposal marker 把 bă (section 3.4), the copula 是 shì (section 3.5), verbs in the function of prepositions (coverbs) (section 3.6) and elements used for clause combining (conjunctions, subordinators) (section 3.7).
Our description of grammaticalization concentrates on Standard Chinese or Mandarin (also known as 现代标准汉语 Xiàndài biāozhŭn hànyŭ, 普通话 Pŭtōnghuà or 国语 Guóyŭ), which is the native language of about 70 % of the Chinese-speaking population. In addition to Standard Chinese or Mandarin, there are eight major groups of Sinitic languages officially identified in China, i.e., Jin, Wu, Hui, Gan, Xiang, Min, Hakka/Kejia, and Yue. It is important to be aware that there is a considerable degree of variation across these Sinitic languages as well as within them (as an example, consider the different varieties of Mandarin itself, i.e., Northeastern Mandarin, Southwestern Mandarin, Ji-Lu Mandarin, etc.; cf. Szeto, Ansaldo, and Matthews 2018). The differences are not limited to the level of phonology; they also manifest themselves in morphology and syntax (Chappell 2015; Szeto, Ansaldo, and Matthews 2018). This variation is closely associated with processes of diversification and migration through the history of China (Branner 2000; Handel 2015). In this paper, we concentrate on phenomena of grammaticalization in Mandarin that are also found in many other Sinitic languages. Needless to say, the situation beyond Mandarin cannot be presented in detail in a survey paper. Moreover, more research will be needed: currently, we are far from knowing the overall degree of cross-linguistic variation in Sinitic.

https://doi.org/10.1515/9783110563146-014

2 Grammaticalization within the nominal domain

2.1 Numeral classifiers

Like most other languages of East and mainland Southeast Asia, Chinese has numeral classifiers. Thus, count nouns like xìn ‘letter’ cannot be combined directly with numerals (1a); they need an additional marker, which is 封 fēng in the case of xìn ‘letter’ (1b):

(1) a. * sān   xìn
         three letter
         [Intended meaning:] ‘three letters’

    b.   sān   fēng xìn
         three CL   letter
         ‘three letters’

Words like fēng are called ‘classifiers’ (量词 liàngcí in Chinese). They generally highlight or create a unit for making it accessible to counting, or, as Greenberg (1972: 10) puts it, “all the classifiers are from the referential point of view merely so many ways of saying ‘times one’”. In spite of this general function, there is a wealth of terminology for different types of classifiers in Chinese grammars. Zhang (2013: 46) lists seven subtypes, which are presented below in (2). Subtype (i) counts intrinsic units of individual concepts, while subtypes (ii) to (vi) provide different categories of extrinsic units for measuring the concept expressed by the noun (also cf. the nine types in Chao [1968: 595–631] and Lü [1999: 14]). Kind classifiers of subtype (vii) differ from the other subtypes inasmuch as they count different types or categories to which a given concept belongs rather than units for quantifying them.

(2) Seven subtypes of Chinese classifiers (Zhang 2013: 46)
    (i)   Individual classifiers: 三颗西瓜 sān kē xīguā [three CL watermelon] ‘three watermelons’
    (ii)  Individuating classifiers: 三滴西瓜汁 sān dī xīguā-zhī [three drop watermelon juice] ‘three drops of watermelon juice’
    (iii) Standard measure: 三公斤西瓜 sān gōngjīn xīguā [three kilo watermelon] ‘three kilos of watermelons’
    (iv)  Container measure: 三箱西瓜 sān xiāng xīguā [three box watermelon] ‘three boxes of watermelons’
    (v)   Partitive classifiers: 三片西瓜 sān piàn xīguā [three slice watermelon] ‘three slices of watermelon’
    (vi)  Collective classifiers: 三堆西瓜 sān duī xīguā [three pile watermelon] ‘three piles of watermelon’
    (vii) Kind classifiers: 三种西瓜 sān zhŏng xīguā [three kind watermelon] ‘three kinds of watermelon’

The classifiers which are of particular typological interest belong to subtype (i), individual classifiers, because they do not exist in all languages, while the other types are cross-linguistically very common. It is this type that Lyons (1977) refers to as “sortal” classifiers. Our presentation is limited to this type, which is represented by an impressive inventory in Chinese. Table 1 lists the frequently used classifiers and some semantically more specific ones. In each case, the source concept is indicated in the third column, followed by a fourth column which presents the semantic criteria. As in many languages with nominal classification systems, semantic consistency varies. The classifiers are divided into the semantic classes of animacy (human, animal, plant), shape (one-/two-/three-dimensional), size, instrument (man-made objects), place (locations, buildings, etc.),1 and special classifiers. The general classifier 个 gè, which can be combined with a large number of nouns, is listed first. Remarkably enough, gè is the default classifier for individual persons (e.g., 四个学生 sì gè xuéshēng [four CL student] ‘four students’).

Before moving on to the diachronic perspective of grammaticalization, some further specifications on the exact grammatical functions of individuating classifiers and their description in different linguistic approaches are needed. Since Greenberg (1972), the function of individual classifiers (henceforth ‘simple classifiers’) has been associated with individuation: a nominal concept has to be individuated before it can be quantified by a numeral. From the perspective of formal semantics, Chierchia (1998) argues that nominal concepts have to be atomized for counting, i.e., they have to be singled out as individual atoms by a numeral classifier. Individuation and atomization are both based on assumptions about the lexical properties of nouns in numeral classifier languages. From the perspective of individuation, all nouns

1 The categories of animacy, shape, size and location are basically from Allan (1977), who subsumes animacy under the more general term of ‘material’ with its subcategories of animate, inanimate and abstract. The category of ‘instruments’ can be seen in the light of Denny’s (1976) notion of functional interaction, which looks at how a given object is made use of by the members of a speech community. Finally, the differences in social hierarchy presented in the context of human animate nouns are reflected in Denny’s (1976) category of social interaction. The category of ‘specific’ is generally used in Chinese linguistics for referring to criteria which cannot be defined in terms of semantically more abstract notions.

Tab. 1: List of classifiers in Chinese.

Nr. | Classifier* | Lexical source | Semantic function
1 | 个 (個/箇) gè | N: ‘bamboo tree’ | General classifier.
2 | 位 wèi | N: ‘position’ | Human: The speaker intends to honor the referent of the human-denoting noun.
3 | 名 míng | N: ‘name’ | Human: The noun referred to is part of a list of names.
4 | 只 (隻) zhī | N: ‘a bird’ | Animate: Animals; is also used for single items of pairs such as eyes, hands.
5 | 头 (頭) tóu | N: ‘head’ | Animate: Herded cattle; is also used for garlic.
6 | 口 kŏu | N: ‘mouth’ | Animate: Cattle; is also used for other objects associated with a mouth: language/utterances, wells.
7 | 棵 kē | N: ‘stalk, stem’ | Animate: Plants, trees; is also used for clusters of plant material (e.g., clusters of grass).
8 | 朵 duŏ | N: ‘blossom’ | Animate: Flowers; is also used for objects with flower/blossom-like shape like clouds.
9 | 条 (條) tiáo | N: ‘branch’ | One-dimensional: Long flexible objects like lines, wires, ropes, cigarettes, rivers; is also used for arms, tongues, fish and for the rainbow.
10 | 根 gēn | N: ‘root’ | One-dimensional: Long roundish objects like bamboos, poles, wires, ropes, cigarettes, hairs.
11 | zhī | N: ‘twig’ | One-dimensional: Objects like writing brushes, fountain pens, bamboo flutes, arrows, chopsticks, rifles/guns.
12 | 张 (張) zhāng | V: ‘spread, stretch’ | Two-dimensional: Sheet of paper, painting, ticket, table (surface), bed, face, plough.
13 | 块 (塊) kuài | N: ‘lump, chunk’ | Three-dimensional: Earth/land, stone, watermelon, soap; is also used for wristwatches and money/currency units.
14 | 粒 lì | N: ‘grain’ | Small size: Grain-shaped objects.
15 | 枚 méi | N: ‘stalk, stem’ | Small size: Coins and other small items.
16 | 颗 kē | N: ‘bean’ | Small size and three-dimensional: Soya beans, pearls, bullets.
17 | 把 bă | V: ‘hold’; N: ‘bundle, handle’ | Instrument: Objects with a handle (hoe, knife, axe, comb) or objects that fit into a hand (chopsticks, key).
18 | 件 jiàn | V: ‘distinguish’; N: ‘piece’ | Instrument: Garments above the waist and social matters and occasions.
19 | 架 jià | N: ‘frame, rack, stand’ | Instrument: Things with support (piano) and large engines (planes).
20 | 台 (臺) tái | N: ‘deck, platform, terrace’ | Instrument: Electronic equipment (e.g., computers, TV sets); also used for theatrical performances.
21 | 所 suŏ | N: ‘place’ | Place: Buildings (house), institutions (school, hospital).
22 | 间 (間) jiān | N: ‘space, interval’ | Place: Rooms (bedroom, hall).
23 | 座 zuò | N: ‘seat’ | Place: Big locations like mountains, palaces and large Buddha statues.
24 | 本 běn | N: ‘root’ | Special: Books.
25 | 匹 pĭ | N: ‘unit of length’ | Special: Cloth, horses and other hoofed animals.
26 | 封 fēng | N: ‘seal’ | Special: Letters, postal items.
27 | 辆 (輛) liàng | N: ‘chariot with two wheels’ | Special: Vehicles.
28 | 盏 (盞) zhăn | N: ‘bowl’ | Special: Lamps.
29 | 身 shēn | N: ‘body’ | Special: Suits.
30 | 首 shŏu | N: ‘head’** | Special: Poems, songs.

* The characters presented in brackets correspond to the old form used before 1957.
** While tóu (cf. Nr. 5) is still used in Modern Chinese, shŏu is the old word for ‘head’, which today only occurs in idiomatic expressions.

are mass nouns in this type of language, and the classifier profiles a unit that is intrinsic to the concept expressed by the noun in order to make it accessible to counting. In Chierchia’s (1998) analysis, bare nouns in numeral classifier languages denote kinds (also cf. Krifka 1995), and the classifier returns a subset of atomic individuals when it is applied to them. The idea that nouns in languages with numeral classifiers differ lexically from nouns in languages without numeral classifiers is not uncontroversial in linguistics. Thus, Cheng and Sybesma (1999: 517) argue that “the fact that all nouns require a classifier does not mean that all nouns are mass nouns”. In their view, there is a count/mass distinction in Chinese, but that distinction “is not visible at the noun level […], it is reflected at the classifier level” (Cheng and Sybesma 1999: 519).

In more recent times, the interaction between the lexical meaning of the noun and the function of the classifier has been described not by single features but by combinations of features. One approach concentrates on the properties of the classifiers (Li, X. 2013), the other focuses on the properties of nouns in their interaction with the classifier (Zhang 2013). Without going into too much detail,2 the difference between the two approaches can be illustrated by the analysis of the following example with the classifier duŏ ‘blossom, flower’ (Nr. 8 in Table 1):

(3) a. sān   duŏ huā
       three CL  flower
       ‘three flowers’ (Zhang 2013: 43)

    b. sān   duŏ yún
       three CL  cloud
       ‘three pieces of cloud’

In Zhang’s (2013) noun-oriented view, the ambiguity of the two examples is characterized by two different classifier functions, i.e., the ambiguity is seen as the result of the multifunctionality of the classifier in its interaction with the lexical properties of the noun. In the case of (3a), the classifier duŏ matches the natural unit denoted by the non-mass noun huā ‘flower’ and thus functions as an individual classifier (cf. type [i] in [2]). In (3b), the same classifier interacts with the conceptually non-discrete mass noun yún ‘cloud’ and functions like an individuating classifier (cf. type [ii] in [2]).3 In Li, X.’s (2013) analysis, the classifier duŏ has only one function, which he defines by the two features [+counting] and [−measuring]. The core use of this type is to “spell out the inherent counting unit intrinsic to the sets of discrete entities”. As Li, X. (2013) points out, the classifier does not necessarily pick out sets of discrete entities as in the core use. In a more general definition, it “picks out a set of contextually relevant minimal entities which instantiate the kind denoted by the noun” (Li, X. 2013: 145). In the case of duŏ, it focuses on the blossoms as they occur in the discrete atomic entity of a flower (3a) as well as on blossom-like characteristics as they can be associated with clouds (3b).

The fact that one and the same classifier can be combined with more nouns than the ones of type (i) in (2), as illustrated above with duŏ, is a characteristic of Chinese classifiers which is not equally prominent in non-Sinitic classifier languages. Most items in Table 1 have this property, among them kŏu ‘mouth’ (Nr. 6), kē ‘stalk, stem’ (Nr. 7), duŏ ‘blossom’ (Nr. 8), zhāng ‘spread, stretch’ (Nr. 12) and kuài ‘lump, chunk’ (Nr. 13). In non-Sinitic classifier languages, individual classifiers strongly tend to have only that function. Thus, the modeling of Chinese classifier systems needs a more complex inventory of features than is the case in other classifier languages. The studies of Li, X. (2013) and Zhang (2013) are good examples of how to model classifiers which cover more than the function of type (i) in (2).
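
The contrast between the two analyses of (3a) and (3b) can be made concrete with a small sketch. The Python below is purely illustrative and is not part of either author's formalism: the class and function names are hypothetical, Zhang's two-feature system is simplified to the single feature [±delimitable] that the text reports for these two nouns, and the only values encoded are those stated above ([+counting, −measuring] for duŏ in Li's analysis).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    form: str
    gloss: str
    delimitable: bool  # Zhang's [±delimitable], as reported for (3a)/(3b)


@dataclass(frozen=True)
class Classifier:
    form: str
    counting: bool   # Li's [±counting]
    measuring: bool  # Li's [±measuring]


# Values taken from the discussion of example (3); nothing else is assumed.
HUA = Noun("huā", "flower", delimitable=True)
YUN = Noun("yún", "cloud", delimitable=False)
DUO = Classifier("duŏ", counting=True, measuring=False)


def zhang_function(noun: Noun) -> str:
    """Noun-oriented view: the subtype in (2) depends on the noun."""
    if noun.delimitable:
        return "individual classifier (type i)"
    return "individuating classifier (type ii)"


def li_function(clf: Classifier) -> str:
    """Classifier-oriented view: one feature bundle, whatever the noun."""
    counting = "+" if clf.counting else "-"
    measuring = "+" if clf.measuring else "-"
    return f"[{counting}counting, {measuring}measuring]"


def analyses(clf: Classifier, noun: Noun) -> dict:
    """Put both analyses of one numeral-classifier-noun phrase side by side."""
    return {
        "phrase": f"sān {clf.form} {noun.form}",
        "zhang": zhang_function(noun),
        "li": li_function(clf),
    }


if __name__ == "__main__":
    for noun in (HUA, YUN):
        print(analyses(DUO, noun))
```

Run on both nouns, the sketch yields two different labels in the noun-oriented column but the same feature bundle in the classifier-oriented column, which is the point of the comparison: the first view locates the variation in the noun, the second keeps the classifier's function constant.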

2 Zhang (2013) describes the Chinese classifier system in terms of the two features of [±delimitable] and [±numerable], while Li, X.’s (2013) analysis is based on the features of [±counting] and [±measuring].

3 In Zhang’s (2013) analysis, the noun in (3a) is a non-mass noun with the feature [+delimitable]. In (3b), it is a mass noun with the feature [−delimitable].


There are two criteria which generally hold for determining the degree of grammaticalization of individual classifiers and a third one which can be used for Chinese numeral classifiers. The first one is obligatoriness. As was shown in (1) above, numeral classifiers are obligatory in combination with numerals.4 The second one is semantic generality or arbitrariness. If grammaticalization is associated with semantic bleaching, it is to be expected that classifiers increasingly combine with nouns belonging to different semantic categories until they end up as general classifiers, which can be used with basically any count noun (cf. the classifier gè). The third criterion, which is limited to languages in which a given classifier can represent more than one subtype in (2), is the functional reduction to individuation or atomization. Thus, classifiers which belong to more than one subtype are less grammaticalized than the ones which are limited to the subtype of individual classifiers.

The fact that most Chinese classifiers belong to more than one subtype may also be related to the historical development of classifiers in Chinese. As was suggested in Bisang (1999), there are two different pathways for the development of classifiers. In the category-oriented process, classifiers develop from systems of compounding, in which the morphological head is in a taxonomic or metonymic relation to the remaining elements of the compound. In an example like Thai dɔ̀ɔk kulàap ‘rose’, the first component expresses the taxonomically general meaning of ‘flower’, which is further specified by the second element, which together with the morphological head produces the meaning of ‘rose’. In processes of grammaticalization, the first element is reanalysed as a numeral classifier, which is repeated in the postnominal classifier position as in dɔ̀ɔk kulàap sǎam dɔ̀ɔk [rose three CL] ‘three roses’. In Chinese, grammaticalization of individual classifiers prominently follows the second, item-oriented process, which starts out from individual head nouns which use lexical items, mostly nouns (but cf. the verbal origins of Nr. 12, Nr. 17 and Nr. 18 in Table 1), in already existing syntactic positions for measuring classifiers (mainly individuating, container, collective). If this analysis is correct, the possibility of individual classifiers belonging to more than one subtype can be linked right back to their historical origins.

The first examples of classifier constructions go right back to the period of oracle-bone inscriptions between the 13th and the 11th centuries BC. At that time, the classifier construction was by no means obligatory. In fact, numerals occurred much more frequently without a classifier in the two construction types of [NUM N] and [N NUM] (for some statistics, cf. Djamouri [1987]). Moreover, the classifier construction of that period differed from the modern construction [NUM CL N] inasmuch as the sequence [NUM CL] followed the noun, i.e., [N NUM CL].

4 We do not discuss idiomatic expressions and compounds, in which numerals often occur without a classifier. In many cases, these constructions reflect the Chinese of older periods in which classifiers did not exist or were used in contexts in which the noun is non-referential.


The use of classifiers was clearly item-oriented in the sense that it was restricted to nouns denoting objects of particular cultural value like slaves, captives, chariots and certain domesticated animals (e.g., horses). In example (4), we find the noun 人 rén ‘man’ in the classifier position for counting members of the Qiang, a tribe dominated by the Chinese:

(4) Oracle-bone inscriptions (Heji 26910; from Yang-Drocourt 2004: 50)
    其 羌十人
    qí yòu [qiāng shí rén] wáng shòu yòu.
    3 offer :Qiang 10 :man king receive help
    ‘When one offers ten Qiang people, the king will be helped.’

To what extent these examples can be compared to fully-fledged constructions of individual classifiers remains an open question. It is also questionable to what extent there exists a direct historical link to the period when classifiers became more frequent and started developing into a more elaborate system at some time in the Han dynasty (206 BC–220 AD) and the early medieval period (220–589). In the classical period between the 5th and the 3rd centuries BC, we sporadically find classifiers like liàng (Nr. 27) and 乘 chèng for chariots (mostly war chariots), méi (Nr. 15) for wooden objects, pĭ (Nr. 25) for horses and gè (Nr. 1) for arrows. This situation led to different assumptions concerning the origin of Chinese classifiers. Some researchers date its beginning before the 5th century BC (Dobson 1962), others see its origin in the classical period (5th–3rd centuries BC) (Wang 1958), while a third group argues for the Han dynasty (Peyraube 1998; Zhang 2012a, 2012b). What can be seen clearly from the documentation of Liu (1965) is that the number of different classifiers massively increases in the Wei-Jin and the Northern and Southern dynasties (220–589). But this fact does not say anything about obligatoriness, which is certainly an important indicator of a high degree of grammaticalization and of a fully developed classifier system. In this context, Yang-Drocourt (2004: 115–140) presents important new insights. As she reports, the degree of obligatoriness differs depending on text type. In texts which are closer to the spoken language and which were found in the process of excavations in the ruins of Loulan and Niya,5 the use of classifiers is systematic and obligatory.
While these texts can be dated between 252 and 330 AD, somewhat later texts of the 5th century AD (e.g., the 世說新語 Shìshuō Xīnyŭ by 劉義慶 Liu Yiqing, 403–444 AD) show no obligatory use of classifiers.6 Based on these findings, one may argue that the spoken language had a grammaticalized obligatory classifier system that was only partly reflected in the more canonical texts. This implies that such a fully grammaticalized system existed earlier than

5 In Chinese characters: Loulan [樓蘭], Niya [尼雅].
6 Yang-Drocourt (2004: 121) only finds classifiers in 9 % of the numeral constructions attested in the Shìshuō Xīnyŭ (27 out of 187 instances).

Grammaticalization changes in Chinese


after the Tang dynasty (618–907) as is generally recognized by the majority of Chinese linguists.

The item-oriented process of development can be observed in all periods of Chinese classifier systems. It can be observed first at the time of oracle-bone inscriptions, when classifiers were used with culturally valuable objects (cf. above), and it can be observed again in the Han dynasty (206 BC–220 AD) and in the intensive classifier development of the subsequent period between 220 and 581 AD, when a considerable number of classifiers mentioned in Table 1 already existed (among them the classifiers for animals: Nr. 4 zhī, Nr. 5 tóu and Nr. 6 kŏu; the two classifiers denoting branches/twigs Nr. 9 tiáo and Nr. 11 zhī and others like Nr. 12 zhāng, Nr. 18 jiàn, Nr. 21 suŏ, Nr. 24 běn and Nr. 26 fēng). In each case, the use of these classifiers was limited to a few semantically closely related objects before some of them started extending their compatibility. A good example is the classifier tiáo ‘branch’, which is first attested with branches of trees and then expands to concrete objects like certain fruits, ropes and garments (5) until it finally ends up being used with some abstract nouns (6) (cf. Erbaugh 1986; Tai and Wang 1990; Bisang 1999; Yang-Drocourt 2004).

(5) tiáo with garments (Tripitaka; from Yang-Drocourt 2004: 129)
    七條
    zuò [qī tiáo yī]
    make 7  jacket
    ‘to make seven jackets’

(6) tiáo with an abstract concept (Shishuo Xinyu, 403–444 AD; from Yang-Drocourt 2004: 130)
    增法 十條
    yòu zēng [fă wŭshí tiáo]
    moreover add law 50 paragraph
    ‘Moreover, (he) added fifty paragraphs of law.’

The two classifiers méi ‘stalk, stem’ (Nr. 15) and gè ‘bamboo tree’ (Nr. 1) had a similar history. The classifier méi was the general classifier in the period between 220 and 581 AD, when it quickly became compatible with a vast number of nouns until it was steadily replaced by the classifier gè in the Tang dynasty (618–907) (Wang 1989; Zhang 2012a, 2012b). The first nouns occurring with méi denoted fine, subtle varieties of stems or stalks, while gè was first used with bamboo trees and arrows (Bisang 1999).

2.2 Marking space or time with postpositions (relational nouns)

Chinese has several postnominal markers for anchoring objects in space and in time (Chao 1968; Li and Thompson 1981; Fang 2007). In their simplest form, they are monosyllabic like the following: 上 shàng ‘upper side, on top of’, 下 xià ‘under’, 里 lĭ ‘inside, within, in’,7 外 wài ‘outside, out of’, 前 qián ‘in front of’ and 后 hòu ‘behind’. In their more complex form, each of these monosyllabic forms can be extended by suffixing the markers 边 biān ‘edge, side’, 面 miàn ‘face, surface’ or 头 tóu ‘head’ (often with neutral tone) as in shàngbiān, shàngmiàn or shàngtou ‘on top of, above’ or lĭbiān, lĭmiàn or lĭtou ‘in, within, inside of’. The monosyllabic form as well as the bisyllabic forms follow the noun for situating it in space. While the monosyllabic form is a suffix (often with neutral tone), the bisyllabic forms are independent words which either follow the noun directly or with additional use of the modification marker 的 de. Table 2 illustrates the three constructions for shàng ‘upper side’, lĭ ‘inside’ and qián ‘in front of’.

Tab. 2: Relational nouns.

上 shàng ‘upper side’
  (i) Monosyllabic: 桌子上 zhuōzi-shang [table-on] ‘on the table’
  (ii) Bisyllabic, without de: 桌子上边/上面/上头 zhuōzi shàngbiān/shàngmiàn/shàngtou ‘on the table’
  (iii) Bisyllabic, with de: 桌子的上边/上面/上头 zhuōzi de shàngbiān/shàngmiàn/shàngtou ‘on the table, on top of the table’

里 lĭ ‘inside’
  (i) Monosyllabic: 教室里 jiàoshì-li [classroom-in] ‘in the classroom’
  (ii) Bisyllabic, without de: 教室里边/里面/里头 jiàoshì lĭbiān/lĭmiàn/lĭtou ‘in the classroom’
  (iii) Bisyllabic, with de: 教室的里边/里面/里头 jiàoshì de lĭbiān/lĭmiàn/lĭtou ‘in the classroom’

前 qián ‘in front of’
  (i) Monosyllabic: 食堂前 shítáng-qián [canteen-front] ‘in front of the canteen’
  (ii) Bisyllabic, without de: 食堂前边/前面/前头 shítáng qiánbiān/qiánmiàn/qiántou ‘in front of the canteen’
  (iii) Bisyllabic, with de: 食堂的前边/前面/前头 shítáng de qiánbiān/qiánmiàn/qiántou ‘in front of the canteen’

The suggestion of treating postpositions as an adpositional word category in Chinese, along with prepositions, is supported by Chao (1968), Peyraube (1980), Ernst (1988), Djamouri and Paul (2012) and Paul (2015), among others. Given that the above markers have lost most of their nominal properties (with the exception of the optional use of the modification marker de with bisyllabic forms), there are good reasons for considering them as postpositions. Historically, these postpositions are probably derived from genitive constructions, in which they take the position of the head noun. The monosyllabic structure of the type 山上 shān shàng/shang [mountain upper.side] ‘on the mountain’ is old and goes back at least to Classical Chinese (5th–3rd centuries BC). At that time, shàng and the other markers of that type were independent words which were able to occur with the genitive marker 之 zhī as in

7 Also see the more specific markers for expressing the notion of being within a given special structure: 内 nèi ‘inside, interior’ and 中 zhōng ‘center, middle, midst’.


山之上 shān zhī shàng [mountain GEN upper.side] ‘on the mountain’. Given the nominal origin of these markers and the option of linking them with a modification marker in Classical Chinese (之 zhī) and in Putonghua (的 de), they are also called “relational nouns” in Western grammars (cf. e.g., Li and Thompson 1981).

The lexical meaning of the markers that get grammaticalized into postpositions of this type is relatively straightforward and is reflected in the translations given above. In spite of this, it is quite hard to determine the parts-of-speech status of the lexical sources involved. This is due to the absence of parts-of-speech specification in lexical items of Classical Chinese (on this topic, cf. Bisang [2008]; Zádrapa [2011]; Sun [2015], forthcoming) and it is reflected later on in many lexicalized structures, i.e., compounds and idiomatic constructions. Thus, the lexical item 面 miàn translated as ‘face’ so far can also function as a verb with the meaning of ‘to face (towards e.g., the north or a river)’ up to currently spoken Mandarin. Similarly, the word 上 shàng translated above as ‘upper side’ can be used as a verb with the meaning of ‘ascend’ both in Classical Chinese and in Mandarin (cf. section 3.1.2).8 Thus, 上山 shàngshān [ascend mountain] means ‘go up a hill or a mountain’. In other constructions, the same word is used as a modifier as in the following lexicalized forms of Mandarin: 上级 shàngjí [upper level/rank] ‘higher authorities/ranks’, 上衣 shàngyī [upper garment] ‘upper outer garment, jacket’ or 上肢 shàngzhī [upper limb] ‘upper limbs [of the human body]’. In idiomatic sayings, the monosyllabic forms can occur alone in the argument positions of verbs as in the following example of the existential verb 有 yŏu ‘there is’:

(7) 上有天堂，下有苏杭
    shàng yŏu tiāntáng, xià yŏu Sū Háng.
    above there is paradise below there is Suzhou Hangzhou
    ‘Above there is the paradise, down below there are Suzhou and Hangzhou.’

Since each relational noun has its own diachronic profile with its own specific properties, a survey like this cannot be the place for a detailed analysis. In any case, the lexical items involved in the formation of relational nouns are good examples of how grammaticalization and lexicalization follow similar pathways in Chinese and how grammatical markers get fossilized in lexical structures (Xing 2013, 2015).

In addition to their spatial meaning, most of the above monosyllabic elements are also involved in the expression of time. The lexical items 前 qián ‘in front of’

8 The character 上 with the verbal meaning ‘ascend’ was probably pronounced differently in Classical Chinese. It is claimed that 上 had two different readings up to Middle Chinese: shăng (in Baxter’s [1992] transcription) when used as a verb, and shàng when used as a noun. However, this distinction associated with two different parts of speech was neutralized due to the 浊上变去 zhuóshàngbiànqù rule in Middle Chinese, through which the third tone (e.g., ă) usually became the fourth/departing tone (e.g., à) when the onset was a voiced fricative, plosive or affricate consonant.


and 后 hòu ‘at the back of’ can occur independently in idiomatic constructions of Mandarin. In their temporal interpretations, the two markers follow the metaphor that time moves from back to front. Therefore, hòu ‘back’ refers to past events, while qián ‘front’ is associated with future events, as illustrated in the following example:

(8) 凡事要向前看，不要向后看
    fánshì yào xiàng qián kàn, bú yào xiàng hòu kàn.
    everything must toward front look,  must toward back look
    ‘(One) should always look to the future but not look back (to the past).’

In their function of expressing time, the markers hòu and qián can occur alone or they are expanded to bisyllabic markers. In each case, they take the perspective of the events they mark and situate them relative to the next event. The bisyllabic markers are either formed with the Classical Chinese modification marker 之 zhī (zhīhòu ‘after’, zhīqián ‘before’) or with the Classical Chinese verb 以 yĭ ‘take, use, hold’ (yĭhòu ‘after’, yĭqián ‘before’). In the first example (9a) below, the marker follows a noun; in the second example (9b), it occurs with a clause:

(9) a. 那次假期之后
       nà cì jiàqī zhīhòu
       that  holiday after
       ‘after that holiday’

    b. 他过完春节以后 …
       tā guò-wán Chūnjié yĭhòu
       he cross-finish Spring Festival after
       ‘After he had the Spring Festival, …’

While the above bisyllabic markers with hòu and qián are mostly used in the domain of temporal functions, other bisyllabic markers formed by zhī and yĭ can be used in the domain of space or in both domains. The combinations with shàng ‘upper’ and xià ‘under’ belong to the domain of space (zhīshàng and zhīxià refer to spatial positions above or below a point of reference expressed by the noun; yĭshàng and yĭxià are used for indicating positions above or below in a sequence of items/positions). Other bisyllabic forms can express both functions (yĭnèi ‘within a certain domain of space or time’, yĭwài ‘without/beyond a certain domain of space or time’; similarly, zhīnèi and zhīwài).
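The distribution of the bisyllabic markers over the two domains can be summarized as a small lookup table. The following Python sketch is only an illustration of the description above: the names `PREFIXES`, `BASES`, `bisyllabic_markers` and `markers_for` are hypothetical, and the markers with hòu and qián are simplified to "time" only, although the text says they are merely "mostly" temporal.

```python
from itertools import product

# First element of the bisyllabic marker (spelling as in the text).
PREFIXES = ["zhī", "yĭ"]

# Second element and the domain(s) in which the combination is used.
# Simplifying assumption: hòu/qián are treated as purely temporal.
BASES = {
    "hòu": {"time"},
    "qián": {"time"},
    "shàng": {"space"},
    "xià": {"space"},
    "nèi": {"space", "time"},
    "wài": {"space", "time"},
}


def bisyllabic_markers():
    """Map every prefix+base combination to its set of domains."""
    return {
        prefix + base: domains
        for prefix, (base, domains) in product(PREFIXES, BASES.items())
    }


def markers_for(domain):
    """Return all markers that can be used in the given domain."""
    return sorted(m for m, d in bisyllabic_markers().items() if domain in d)


print(markers_for("time"))
print(markers_for("space"))
```

The table does not capture the finer distinction between zhīshàng/zhīxià (position relative to a reference point) and yĭshàng/yĭxià (position in a sequence); that would require an additional attribute per marker.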

3 Grammaticalization within the domain of categories at clause level

3.1 Aspect

Chinese does not have a salient tense marking system, while a variety of aspectual markers play a crucial role in identifying temporal relationships in this language.


The formation of aspectual markers constitutes a central component of Chinese grammaticalization. The present section is divided into five subsections: the first one (section 3.1.1) is devoted to the development of resultative aspectual marking as it is expressed by the V1V2 resultative construction. The second one (section 3.1.2) discusses the development of directional verbs into different aspect markers. The remaining three sections outline the development of the four most important aspect markers in Chinese, i.e., the perfective suffix 了 -le and the perfect particle 了 le (section 3.1.3), the durative suffix 着(著) -zhe (section 3.1.4) and the progressive auxiliary 在 zài (section 3.1.5).

3.1.1 The resultative construction

The resultative construction is one of the most commonly used types of verb compounds in Chinese. It consists of a sequence of two adjacent verbs V1V2, in which the second verb signals the result of the action or process denoted by the first verb. Since V2 provides an endpoint to V1, it makes the whole construction telic and aspectually bounded (Li and Thompson 1981: 54; Bisang 2010). In the resultative construction of example (10) below, the state expressed by the second verb 断 duàn ‘be broken’ represents the result of V1 摔 shuāi ‘fall’. Thus, V2 duàn ‘break’ adds a terminal boundary to the whole construction in the sense that the activity of falling (V1) ends up in the result of the undergoer argument (‘leg’) being broken (V2).

(10) 他摔断了腿
     tā shuāi-duàn -le tuǐ.
     he fall-break  leg
     ‘He fell and broke his leg.’

Resultative constructions may function either intransitively or transitively, depending mostly on the semantics of the V1 component. Moreover, resultative constructions vary greatly in terms of lexicalization and conventionalization. Some of them are well-established lexical items, while others are syntactically more transparent. This depends on the frequency of occurrence of the construction in language use as well as the semantic generality and the productivity of a given V2, i.e., the number of V1 with which it can be combined. Examples of the most frequently used and the most productive V2 components in Mandarin include 完 wán ‘finish, complete’ (e.g., 写完 xiě-wán ‘write-finish’, 做完 zuò-wán ‘make-finish’, 吃完 chī-wán ‘eat-finish’), 坏 huài ‘break, spoil’ (e.g., 弄坏 nòng-huài ‘make-break’, 用坏 yòng-huài ‘use-break’), 破 pò ‘break, damage’ (e.g., 打破 dǎ-pò ‘hit-break’, 攻破 gōng-pò ‘attack-break’), 开 kāi ‘open’ (e.g., 拉开 lā-kāi ‘draw-open’, 弄开 nòng-kāi ‘make-open’) etc.

The resultative construction represents an integrated whole in terms of its meaning and its argument structure as it is related to the argument structures of


the component verbs. Moreover, V1 and V2 form an inseparable unit. The only elements that can be inserted between V1 and V2 are the potential marker 得 de (< dé ‘to obtain, to acquire’) and the negation 不 bù ‘not’. In the case of de insertion, the resulting construction (V1-de-V2) expresses affirmative potential meaning in the sense that the activity denoted by V1 can have the result expressed by V2, while the construction resulting from bù insertion (V1-bù-V2) conveys negative meaning in the sense that the activity expressed by V1 cannot have the result denoted by V2 (Li and Thompson 1981: 56).

The historical development of the V1V2 resultative construction has been widely discussed in the literature. Most researchers agree that the resultative construction was fully developed in Middle Chinese (e.g., Wang 1958; Xu 2006). According to Mei (e.g. 1981, 1991, 2008), Jiang (2000) and many others, the rise of the resultative construction was primarily due to the loss of phono-morphological mechanisms for transitivizing verbs into causatives in Old Chinese.9 With this loss, it was no longer possible to express causatives by a single verb, i.e., a single verb was no longer able to jointly express an action and its achievement. This situation led to the emergence of a new disyllabic strategy, along with some other structural strategies, in which what was originally conveyed by a single monosyllabic causative verb in Old Chinese is expressed by two verbs (V1V2), the first denoting the action and the second its resulting state (Xu 2006: 156–164). However, the extent to which the emergence of the resultative construction was really due to the loss of morphology is questioned by some scholars such as Bisang (2010) and Yao (2013) (see below).
If the disappearance of morphology took place earlier than the actual development of the resultative construction and if the relevant morphology was not productive and lexically determined, some additional factors must have been at work, among them the disappearance of certain syntactic structures (cf. Bisang 2010). Zhuang (2014) observes that there were syntactic restructuring processes involved in the formation of resultative constructions. At a first stage, causative meaning was expressed by a syntactic construction of the type ‘V-NPU-V’, in which the undergoer argument NP (NPU) of the first verb was followed by the second verb. At a second stage, the V-NPU-V construction was replaced by the VV-NPU construction, in which NPU follows the VV sequence as a whole (also cf. Feng 2002).10 The examples in (11) illustrate the co-occurrence of the V-NPU-V construction and the VV-NPU construction in Dazhuangyanlunjing (ca. 3rd century AD). In both examples, the first verb is dǎ 打 ‘hit’ and the second verb is pò 破 ‘break’.

9 On the relevance of the loss of verbal morphology for the emergence of the bèi passive, also cf. sections 3.3 and 3.4.
10 But also note that the VV sequence itself was not an outcome of the loss of causative verbs; it already existed in Old Chinese.


(11) V-NPU-V and VV-NPU constructions in Dazhuangyanlunjing (Zhuang 2014: 580)
     a. 打瓨破
        dǎ hóng pò
        hit jar break
        ‘break the jar’
     b. 打破水甕
        dǎ-pò shuǐwèng
        hit-break water jar
        ‘break the water jar’

The transition from the V-NPU-V to the VV-NPU construction indicates that the VV sequence stopped consisting of two separate verbs and became an integrated unit. Mei (1991) takes the view that VV sequences like (11a) underwent reanalysis as a new construction, i.e., the V1V2 resultative construction characterized by the following structural constraints: (i) V1 is transitive and V2 is intransitive, and (ii) the construction as a whole functions transitively. In Xu’s (2006) account, the contrast between V1 and V2 is not based on transitivity values, but rather on the lexical-semantic distinction between action (V1) and result (V2) as well as on inherently different temporal properties: While V1 is an active dynamic verb, V2 is a stative verb expressing the result or the accomplishment of an activity.

The development of verbs in the V2 slot of the resultative construction from full verbs to resultative aspectual components can be illustrated by another look at the verb 破 pò ‘break’, which is one of the most frequent V2-verbs in Middle Chinese (Cheng 1992; He 2005; Song 2007; Li, Y 2013). According to Xu (2006), in Early Middle Chinese, the verb pò was still used either as an active dynamic verb meaning ‘break, damage’ or as a stative verb with the meaning of ‘broken, damaged’ denoting the accomplishment of that action. Later on, however, pò gradually shifted into an endpoint-prominent verb marking the accomplishment or the result of a given action. Parallel to this, with the decline of the V-NPU-V construction, VV-NPU sequences became more frequent and were ultimately reanalyzed in terms of the new V1V2 resultative construction.
During that period, pò was fully developed into a stative verb which consistently took the V2 slot in the V1V2 construction for indicating the resulting state of V1. The most common resultative constructions with pò as V2 found in Late Middle Chinese include 擊破 jī-pò ‘beat-break’, 攻破 gōng-pò ‘attack-break’, 敲破 qiāo-pò ‘knock-break’ and 摔破 shuāi-pò ‘fall-break’, etc.

Researchers like Yao (2013) suggest that the resultative construction might have appeared much earlier and that it arose through clause integration. The process of clause integration already appeared in Old Chinese, when a series of juxtaposed clauses expressing temporally successive actions or events (with either intransitive or transitive argument structure) were integrated into a sequence of verbs (verb serialization) through the omission of shared nominal arguments and, as the case may be, the omission of markers of coordination. At a subsequent stage, when sequences of verbs were used more frequently and regularly, the verbs became more tightly integrated and finally inseparable. The result of this was a well-established resultative verb compound. Yao (2013) sees the emergence of the resultative construction in the broader framework of disyllabification, which is widely believed to have started in the early phase of the Han dynasty (about 200 BC to the first years AD). During


that process, many originally monosyllabic words, syntactic constructions and word strings were replaced by disyllabic units (Cheng 1992; Dong 2012). In (12) below from Zuozhuan (ca. 4th century BC), the events expressed by the two verbs involved took place one after the other. First, there was the activity of attacking Kuai (denoted by 攻 gōng ‘attack’), which was followed by a second event (as a result of that attacking activity) in which Kuai broke down (denoted by 潰 kuì ‘break down’). In Yao’s analysis, such serialised clauses have the potential to give rise to the formation of the resultative compound 攻潰 gōng-kuì ‘attack-break down’ by a paratactic tightening process.

(12) 丙寅，攻蒯，蒯潰。 (Zuozhuan – Zhao 23)
     bǐngyín, gōng Kuǎi, Kuǎi kuì.
     the 27th day attack Kuai (place) Kuai (place) break down
     ‘On the 27th day, (Yinxin) attacked Kuai, Kuai was (then) broken down.’

3.1.2 Directional verbs

3.1.2.1 General remarks

Chinese directional verbs are part of the directional construction consisting of the basic meaning component (V, basically a movement or ‘displacement verb’ in terms of Li and Thompson [1981])11 followed by two positions (Vd1 and Vd2) for indicating the directionality of the action expressed by the verb in the basic meaning component (i.e., in the form of VVd1Vd2). In the following example, the motion verb 走 zǒu ‘walk’ has the function of V, while jìn ‘enter’ and lái ‘come’ take positions Vd1 and Vd2, respectively.

(13) 张三走进来了
     Zhāngsān zǒu-jìn-lái le.
     Zhangsan walk-enter-come /
     ‘Zhangsan entered (toward the speaker).’ (Li and Thompson 1981: 61)

The position of Vd2 can only be filled by the two verbs of lái ‘come’ and qù ‘go’, which are called ‘deictic verbs’ by Lamarre (2008) and indicate the movement of

11 Li and Thompson (1981: 58) discuss three types of displacement verbs. The first type includes verbs signaling motion, such as zǒu ‘walk’, pǎo ‘run’ or fēi ‘fly’. The second type contains the action verbs which inherently imply that their direct object undergoes a change of location, such as ná ‘bring, take’, bān ‘remove’, or rēng ‘throw’. The third type also involves action verbs, but as compared to the second type, this type particularly implies that its direct object may be caused to undergo displacement. In the case of dǎ ‘hit, beat’, this implies that the action of hitting/beating will possibly cause the undergoer in the object position to be displaced.


the actor (or sometimes the theme, if other verbal arguments are involved) relative to a given deictic center. In (13), the speaker is located at the place Zhangsan was entering. The position of Vd1 is taken by a core group of the following six ‘non-deictic verbs’: 上 shàng ‘move up, ascend’, 下 xià ‘move down, descend’, 进 jìn ‘move into, enter’, 出 chū ‘move out, exit’, 回 huí ‘move back, return’, and 过 guò ‘move across, cross’. The members of the two sets of verbs can be combined productively, yielding the following twelve combinations: shàng-lái ‘ascend-come’, xià-lái ‘descend-come’, jìn-lái ‘enter-come’, chū-lái ‘exit-come’, huí-lái ‘return-come’, guò-lái ‘cross-come’ and shàng-qù ‘ascend-go’, xià-qù ‘descend-go’, jìn-qù ‘enter-go’, chū-qù ‘exit-go’, huí-qù ‘return-go’, guò-qù ‘cross-go’.

Constructions of the type VVd1Vd2 as illustrated by (13) present a maximal pattern. Shorter patterns in which either Vd1 or Vd2 is missing are also possible (V-Vd1, V-Vd2). The grammaticality of these patterns and the presence of additional nominal items depend on the semantic properties of V and its argument structure. For the sake of brevity, the present survey will only focus on the directional verbs (Vd1 and Vd2) without paying specific attention to the complexity of possible patterns if additional positions for nouns are included (cf. Lamarre [2008] for a more detailed discussion). Finally, each of the above eight directional verbs can also occur independently in the function of a full verb. Similarly, the twelve combinations of Vd1Vd2 mentioned above can be used alone for expressing movement (e.g., tā jìn-lái-le [3. enter-come-] ‘s/he entered.’).

The combination of V with elements of Vd1 and Vd2 as illustrated by (13) constitutes a directional-resultative construction in the terminology of Li and Thompson (1981: 58–65), who treat these constructions as a subtype of the resultative compounds discussed in the previous section (also cf. Wang 1958: 403–409).
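The slot structure of the directional construction can be made concrete with a short sketch. The Python code below is purely illustrative and not part of the authors' analysis: the names `VD1`, `VD2`, `directional_compounds` and `directional_construction` are hypothetical, and the functions only concatenate forms without checking the grammaticality of a given pattern, which, as noted above, depends on the semantics and argument structure of V.

```python
from itertools import product

# Non-deictic verbs (slot Vd1) and deictic verbs (slot Vd2), as listed in the text.
VD1 = [
    ("shàng", "ascend"), ("xià", "descend"), ("jìn", "enter"),
    ("chū", "exit"), ("huí", "return"), ("guò", "cross"),
]
VD2 = [("lái", "come"), ("qù", "go")]


def directional_compounds():
    """Return every Vd1-Vd2 combination as a (form, gloss) pair."""
    return [
        (f"{form1}-{form2}", f"{gloss1}-{gloss2}")
        for (form1, gloss1), (form2, gloss2) in product(VD1, VD2)
    ]


def directional_construction(v, vd1=None, vd2=None):
    """Build V(-Vd1)(-Vd2) by concatenation; empty slots are skipped."""
    return "-".join(part for part in (v, vd1, vd2) if part)


compounds = directional_compounds()
print(len(compounds))                                  # six Vd1 times two Vd2
print(directional_construction("zǒu", "jìn", "lái"))   # maximal pattern, cf. (13)
print(directional_construction("zǒu", vd2="lái"))      # shorter pattern V-Vd2
```

Treating the two slots as independent lists mirrors the statement that the members of the two sets combine productively; any restrictions on individual combinations would have to be added as an explicit filter.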
As can be seen from example (14), the directional verb can also be used to express resultative aspectual functions by adding a terminal boundary to the displacement-activity expressed by the main verb. Thus, shàng indicates that the activity of turning off the light (expressed by guān) ends up in the light being gone.

(14) 他关上了灯
     tā guān-shàng -le dēng.
     he turn.off-ascend  light
     ‘He turned off the light.’

Since Talmy (1985, 2000), there have been various attempts to determine the typological status of Chinese (and various mainland Southeast Asian languages) as a verb-framed language which encodes the PATH of motion in the lexical meaning of the main verb of a clause (e.g., Spanish) or as a satellite-framed language which elaborates on PATH information by adding verb-external satellites (e.g., English). Due to notorious problems with that dichotomy in languages like Chinese, Slobin (2004: 228) suggested the equipollent type as a third type in which the MANNER of


movement expressed by the main verb and the PATH information expressed by the satellite(s) “are equal in formal linguistic terms” (also cf. Chen and Guo 2009 for a similar view). As is shown for Modern Chinese, none of these types convincingly accounts for Chinese (Lamarre 2008; Shi and Wu 2014). From a diachronic perspective, these findings are in contradiction to Peyraube (2006), who claims that Chinese has fully shifted from a verb-framed type in Old Chinese to a satellite-framed type in Modern Chinese. Given current research on the diachronic development of Chinese directionals, Shi and Wu’s (2014) structural and usage-based analysis of text samples from Old, Middle, pre-Modern and Modern Chinese possibly presents the most plausible conclusion. Chinese started out as a rather strongly verb-framed language in the period of Old Chinese and has been moving toward the satellite-framed type ever since. Currently, it favors the side of the satellite-framed type without being a typical satellite-framed language.

Given the overall complexity of the development of directional verbs with their slots in the directional construction, the present paper will limit itself to the exemplary discussion of the two individual directional verbs of 来(來) lái ‘come’ (section 3.1.2.2) and 过(過) guò ‘cross, go through, pass through’ (section 3.1.2.3). Since the historical formation of the V1V2 resultative construction was presented in section 3.1.1, the remainder of this section focuses on the grammaticalization processes which led from a lexical motion verb to an aspect marker.

3.1.2.2 Grammaticalization of the directional verb 来(來) lái

In Old Chinese, the word 來 (来) lái originally denoted actions or motions of the type ‘(cause to) come (to the speaker or deictic center), (cause to) arrive (at the location of the speaker or deictic center)’, either intransitively or transitively. In both examples of (15) from Lunyu (ca. 3rd century BC), lái denotes movement through space.

(15) a. Intransitive interpretation of lái ‘come, arrive’
        有朋自遠方來 (Lunyu)
        yǒu péng zì yuǎn fāng lái.
        have friend from distant place come
        ‘There are friends who come from afar.’
     b. Causative interpretation of lái ‘cause to come, cause to arrive’
        既來之則安之 (Lunyu)
        jì lái zhī zé ān zhī.
         cause.to.come  then be.settled/content 
        ‘Since (we) have made them come here, (we should) content them.’

According to Liu (2012: 94–95), during the Han dynasty (206 BC–220 AD) (transition time from Old Chinese to Middle Chinese), lái developed into a directional verb, one that takes a locative object-NP indicating the goal of the movement. In most examples found of that period, the locative object of directional lái is a locative pronoun


(e.g., 此 cĭ ‘here’) or a relational noun (e.g., 前 qián ‘front, before’; cf. section 2.2). This can be illustrated with (16) from Shiji (1st century BC), with lái ‘come’ being followed by xià ‘underside, under’.

(16) 神來下教我 (Shiji)
     shén lái xià jiāo wǒ.
     divine-being come underside teach me
     ‘A divine being came below to teach me.’ (Liu 2012: 94)

In Middle Chinese, the directional verb lái further developed several grammatical functions, most of which are related to the notion of aspect (cf. e.g., Cao 1995; Wu 1996; Liu 2012). One of the most well attested aspectual functions of lái is found in the V1V2 resultative construction, in which lái as V2 serves to add a directional and/or resultative reading to the construction as a whole. This is illustrated by (17) from Fobenxingjijing (6th century), in which lái (V2) expresses the completion of the downward movement denoted by xià ‘descend’ (V1):

(17) 复有 虛空 复有 來 碎末 (Fobenxingjijing)
     fù yǒu zhù-zài xū-kōng bú xià, fù yǒu xià-lái zì-rán suì-mò
     also exist stay-be.in air  descend also exist descend-come naturally small-pieces
     ‘There were also some of them which stayed in the air without descending, and also some others which broke into small pieces at the moment they descended.’ (cf. Liu 2012: 97)

Another aspecto-temporal function of lái is that of the perfect as it developed in Middle Chinese. While lái in the V1V2 resultative construction immediately follows the main verb, perfect-marking lái normally takes the sentence-final position, indicating that an event of the past is still currently relevant. This is illustrated by (18) from Fajupiyujing (4th century):

(18) 吾已 來, 須復辦 (Fajupiyujing)
     wú yǐ shí lái, bù xū fù bàn
     I already eat   need again prepare
     ‘I have already taken a meal; there is no need to prepare one again.’ (Liu 2012: 104)

According to Liu (2012: 100–106), the perfect function of lái originated from the P+N/V(+CONJ)+lái construction.
This construction was used in Old Chinese for expressing a directional movement in a spatial sense, with lái as an independent verb
(cf. zì yuǎn-fāng lái [from far-direction come] ‘come from afar’ in [15a]). In the course of time, this construction underwent both semantic and structural change. On the semantic side, it was extended to temporal concepts, with lái gradually losing its concrete verbal content and becoming an aspecto-temporal marker. On the syntactic side, the construction was reduced to N/V+lái, with both the preposition () and the conjunction () being omitted after having been optional for a while. In its early stages, lái as a perfect marker often had the function of a conjunction as in (18) above. Between the 12th and the 15th centuries, lái was fully developed into a sentence-final perfect particle (Chao 1968; Sun 1996: 99). This is illustrated by (19) from Piaotongshi (15th century): (19) 哥哥那裡 來 (Piaotongshi) gēge nălĭ qù lái brother where go  ‘Where have you been, brother?’ (Sun 1996: 98) Wang (1958) observes that the aspectual meaning of perfect was exclusively expressed by the sentence-final particle 矣 yǐ in Old Chinese, while this meaning was conveyed by a set of sentence-final elements in Middle Chinese, including 矣 yǐ, yĕ and 來 lái. By the 15th century, lái became the dominant perfect marker (Sun 1996: 101–103). By the 18th century, however, the perfect use of lái disappeared (cf. section 3.1.3 on the perfect particle 了 le).

... Grammaticalization of the directional verb 过(過) guò In Old Chinese, the word 過(过) guò expressed a spatial motion event of crossing or passing. This can be seen from example (20) from Mengzi (4th century BC). In Mandarin, guò still retains that meaning as in 过马 guò mǎlù ‘cross a road’, but it can also be interpreted in an abstract, temporal sense as in 过 子 guò rìzi [cross days.of.life] ‘live, lead a life’. (20) 禹 於外 過其門 (Mengzi – Tengwengong) Yǔ bā nián yú wài, sān guò qí mén ér bù rù. Yu eight years  outside three.times pass by  door   enter ‘Yu was outside for eight years; (during this period, he) passed by his home three times but never entered.’ Since Late Old Chinese, the word guò has developed several functions. On the one hand, it has developed into a polysemous word and can additionally be used in the sense of ‘surpass’, ‘exceed’, ‘excessive’, ‘excessively’, ‘mistake’, ‘blame’, etc. (depending on the context and its collocation with other words). On the other hand, just like most other directional verbs, guò (as well as the compounds guò-lái and

guò-qù) can occur as V2 in the V1V2 resultative construction for expressing direction and/or result (Chao 1968; Li and Thompson 1981; Tang 1992; Wu 2003). This is illustrated by (21), in which guò (V2 ) indicates that the action of jumping (V1 tiào) ends up in the result of the river being crossed. (21) 他 过那 河了 tā tiào-guò nèi tiáo hé le. I jump-cross that  river  ‘He jumped over that river.’ (Li and Thompson 1981: 60) A more abstract grammatical function of guò is its occurrence as an experiential aspect suffix (often written with a hyphen as -guo in the pinyin transcription), indicating that an event denoted by the verb has taken place in the past, in particular, the event has been experienced at least once (cf. e.g., Chao 1968; Li and Thompson 1981; Yeh 1996; Wu 2003). The experiential function of guò is presented in (22) below, in which guò is attached directly to the resultative compound shuāi-duàn ‘fallbreak’. (22) 我摔断过腿 wǒ shuāi-duàn-guò tuǐ. I fall-break- leg ‘I fell and broke my leg once.’ (Li and Thompson 1981: 227) According to Wang (1958: 312), the experiential function of guò emerged during the period of the Tang dynasty (618–907) and was used frequently in the Song dynasty (960–1279). An example from Zhuziyulei (1270) is given below. (23) 字字為咀嚼過。 (Zhuziyulei) zì-zì wéi jǔjué-guò. every word  mull.over- ‘Every word has been mulled over.’ Wu (2003) sees the experiential meaning of guò in the light of a metaphorical extension from guò as a directional-resultative complement (V2 ) with its spatial meaning of ‘some Ground being physically crossed’ to ‘some event being physically or even mentally experienced’.

3.1.3 Perfective suffix 了 -le and perfect particle 了 le

In Modern Chinese, 了 le is traditionally analyzed as a marker which expresses perfective aspect as well as perfect. In its perfective function, which also includes past
meaning (Wu 2004), it is directly suffixed to the verb. For that reason, it is also called a verbal aspect suffix (written with a hyphen as -le in the pinyin transcription). In its function of a perfect, le occurs at the end of a sentence, hence its name of sentence-final le (written as a separate word). In its perfect function, le indicates that an event has happened before speech time and is still relevant beyond that moment (Chao 1968; Li and Thompson 1981; Li, Thompson, and Thompson 1982; Sun 1996). Example (24) illustrates both functions. In its first occurrence suffixed to xiě-wán ‘write-finish’, -le is identified as a perfective aspect marker highlighting the terminal temporal boundary of that event (Bisang 2004), while the second le in the sentence-final position marks perfect in the sense of a currently relevant state. (24) 他写 了今 的 业了。 tā xiě-wán -le jīntiān de zuòyè le. he write-finish  today  homework  ‘He has finished writing his homework today.’ Some more recent studies by authors such as Cao (1995), Yang (2003), Ma (2006) and Yap, Yang, and Wong (2014) do not distinguish perfective -le from perfect le. These authors argue that le is a single aspect marker with two different discourse functions. In our present discussion, we maintain the distinction between perfective and perfect for being able to include that perspective. For that purpose, we use le1 for the perfective function and le2 for the perfect function. The origin of perfective le1 is related to the verb 了 liǎo ‘complete, finish, stop’ by Wang (1958: 305). It is already attested in the texts of the Han dynasty (206 BC– 220 AD). In Modern Chinese, liǎo still exists with the same meaning but its use is restricted to compounds such as 了断 liǎo-duàn ‘end, stop [V/N]’). 
In Middle Chinese, around the Tang dynasty (618–907), the verb liǎo started expressing the accomplishment of an action or event in the V2 position of the V1-NPU-V2 construction, preceded by the undergoer-NP of V1 (i.e., V1-NPU-liǎo). This construction is illustrated with (25) from Wuzixubianwen (Tang dynasty), in which liǎo signals the accomplishment of the event shā Zǐxù [kill Zixu].

(25) 殺子胥了, 越從吳貸粟四百萬石 (Wuzixubianwen)
shā Zǐxù liǎo, Yuè cóng Wú dài sù sìbǎiwàn dàn.
kill Zixu complete Yue state from Wu state borrow grain four million Dan (measure unit)
‘After having killed Zixu, the Yue state borrowed four million dan of grain from Wu.’

As discussed in section 3.1.1, the V-NPU-V construction was steadily replaced by the VV-NPU construction between the 4th and 13th centuries and was ultimately
reanalysed as a resultative construction (cf. e.g., Zhuang 2014). In line with this, the V-liǎo-NPU construction gradually replaced the V-NPU-liǎo construction until it was fully established around the Yuan dynasty (1271–1368). By that time, liǎo had shifted from a verb into a resultative/completive aspect marker, maybe even with phonological reduction to le (Wang 1958: 307; Sun 1996: 91–92). Consequently, as a result of both reanalysis and phonological reduction, the perfective aspect marker le was formed. In (26), an example from the Ming dynasty (1368–1644), perfective le occurs immediately after the verb yǔ ‘give’ and is followed by both the indirect object tā ‘him’ and the direct object liǎng wán yào ‘two pills’ of that verb. (26) 我 了他兩 藥 (Yuanquxuan) wǒ yǔ le tā liǎng wán yào. I give  him two  pill ‘I gave him two pills.’ The rise of perfect le2 has been the subject of two different hypotheses. One of them is described in section 3.1.2.2 above on the perfect function of 來 lái ‘come’ in Middle Chinese, illustrated by (19) (Chao 1968; Sun 1996). Sun (1996: 101) assumes that during the later phase of this development, perfect lái underwent phonological reduction to [lə], the basis of Modern Chinese le2 . Sun’s (1996) assumption seems to go well with the observations that the earliest occurrence of le2 as a perfect aspect marker was found between the 13th and 15th centuries (Ota [1958] 2003; Cui 1984), and that the co-occurrence of le1 and le2 did not become common until the 18th century, when the perfect use of lái completely disappeared (Zhang 1986). The second hypothesis claims that perfect le2 is derived in the same way as perfective le1 from the verb liǎo ‘complete, finish, stop’. 
Yap, Yang, and Wong (2014) argue that by Late Middle Chinese, structures of clause combining and verb serialization gave rise to the syntactic relabeling of the verb liǎo as a tense-aspect marker – either as a perfective suffix (le1 ) or as a sentence-final perfect particle (le2 ). This relabeling took place at two levels. At the first level of bi-clausal constructions expressing temporal sequences of events in the form of ‘(NPA) V-NPU-liǎo, (NPA) VP’,12 clause-final liǎo came to be reinterpreted as an anterior or perfect aspect marker and was phonologically reduced to le [lə] (cf. [25]). At the second level of the transition from V-NPU-V to VV-NPU as discussed in section 3.1.1, liǎo in the position of the second verb was reinterpreted first as a resultative or completive marker, then as a perfective verbal suffix. Due to the combination of the developments at both levels, le in Modern Chinese can either be suffixed to a verb as a perfective marker or serve as a perfect particle in the sentence-final position.

12 The actor argument NPA may be dropped and is therefore put in brackets.

.. Durative aspect marker 着(著) -zhe In Modern Chinese, the marker 着(著) -zhe is often referred to as a durative aspect suffix (but see Yeh [1993] on its function as an imperative marker), as it must be directly attached to a verb host (i.e., V-zhe vs. *VO-zhe). The durative marker -zhe either signals a continuous state or a progressive activity as a background for another event. Thus, it can not only be suffixed to activity verbs, but also to non-activity verbs (e.g., zuò ‘sit’, 站 zhàn ‘stand’, 躺 tǎng ‘lie’) and to verbs that can express both functions (e.g., 拿 ná ‘take/be holding’, 挂 guà ‘hang/be hanging’, 穿 chuān ‘put on/be wearing’) (Li and Thompson 1981: 219–222). In (27), -zhe is suffixed to tǎng ‘lie’ and signals that the posture of lying is going on. (27) 他 床上躺着。 tā zài chuáng-shàng tǎng-zhe. 3 at bed-on lie- ‘He is lying on the bed.’ (Li and Thompson 1981: 220) According to Wang (1958: 308), durative -zhe originates from the verb 著 zhuó used in the sense of ‘to adhere, to attach (to somewhere)’ in Old Chinese,13 as illustrated by the following example from Zuozhuan (about 4th century BC): (28) 風 著於 (Zuozhuan – Zhuang 22) fēng xíng ér zhuó yú tǔ. wind walk  adhere  earth ‘The wind blows and then adheres to earth.’ In Middle Chinese, since the Tang dynasty (618–907), the occurrence of zhuó (possibly also pronounced as zháo) as the second verb in the VV-NPU construction became widely used (Wang 1958). In line with what was observed for the VV-NPU construction in the context of the resultative construction (section 3.1.1), V-zhuó/zháo-NPU was used for expressing either durative or resultative/completive aspectual meaning (depending on the semantics of the first verb in the construction). As a matter of fact, the two distinct aspectual meanings of zhuó/zháo co-existed for several centuries, even though its durative use was always more common (Wang 1958: 311). 
As can be seen in example (29) from the Yuan dynasty (1271–1368), zhuó/zháo continued to be used with both meanings. In its first occurrence, zhuó/zháo functions as a completive marker denoting the accomplishment of the action expressed by the verb zhuàng ‘come upon, run into’, while it marks durativity in its second occurrence with the verb dān ‘carry’.  Notice that the character 著 is also pronounced zhù when it occurs with meanings like ‘remarkable, marked’, ‘to show’ or ‘to mark’.

(29) 撞著八個大漢, 擔著一對酒桶 (Jingbentongsuxiaoshuo)
zhuàng-zhuó/zháo bā gè dàhàn, dān-zhuó/zháo yī duì jiǔtǒng
come upon- eight  fellow carry- one pair barrel
‘(He) came upon eight burly fellows, (they were) carrying a pair of barrels.’

Wang (1958: 311) observes that it was not until the end of the Ming dynasty (1368–1644) that zhuó/zháo fully developed into a durative aspect marker, most likely with the reduced pronunciation zhe as in Modern Chinese. The resultative/completive function, with the phonologically more substantial pronunciation zhuo or zhao (sometimes also zhe), still exists in many Chinese dialects and varieties today.

.. Progressive aspect marker 在 zài The progressive aspectual function is expressed by zài ‘be at, live at, exist’ in preverbal position as illustrated by (30). Unlike the durative marker zhe (section 3.1.4), progressive zài often co-occurs with dynamic activity verbs such as 打 dǎ ‘hit’, 跑 pǎo ‘run’ or tiào ‘jump’. Often, the adverbial 正 zhèng ‘just, precisely’, which itself has also evolved into a preverbal progressive marker (cf. Zhang 2002; Lei and Hu 2010 and others), is combined with progressive zài to form 正 zhèngzài for emphasising that the event under discussion is going on. In general, zài, 正 zhèng and 正 zhèngzài can all be used for marking progressive. (30) 他在打篮 tā zài dǎ lánqiú. he  play basketball ‘He is playing basketball.’ Historically, the character zài existed already in the oracle bone inscriptions (about 12th–13th centuries BC), where it was used either as a verb expressing existence or for introducing a locative NP in the [zài + NP] construction (Djamouri and Paul 1997, 2009; Paul 2015). Both uses of zài can be seen in example (31) from Shijing (7th–11th centuries BC), in which zài in its first occurrence functions as a verb with the meaning of ‘live’, while zài in its second occurrence serves to introduce the spatial NP zǎo ‘aquatic plants’ as a locative (on the adpositional function of zài, cf. section 3.6). (31) 魚在在藻 (Shijing) yú zài zài zǎo. fish live  waterweed ‘The fish lives (where?) (It is) in the waterweed.’

The [zài + VP] construction already appeared in Late Old Chinese. However, as observed by Wang (2015), this construction exclusively expressed the existence of a given state at that time. It never marked progressive as zài does today. It was much later in the Song dynasty (960–1279) that its function of marking an ongoing action/ event started developing out of the construction [zài + zhèlǐ/nàlǐ ‘here/there’ + VP]. In this construction, zài in its function of a preposition came to be reanalysed as a progressive aspect marker, while the locative demonstratives zhèlǐ ‘here’ or nàlǐ ‘there’ lost their referential function and finally disappeared (Lei and Hu 2010). Before zài had fully developed into a progressive marker, there was a considerable period of ‘layering’. Even in the Qing dynasty (1644–1911) the [zài + zhèlǐ/nàlǐ + VP] and the [zài + VP] constructions were both used to express progressive meaning (Wang 2015: 198). This is illustrated in (32) from Xiyangji (1598) on the use of zài in the [zài + nàlǐ + VP] construction (32a) and in the [zài + VP] construction (32b): (32) a. 公 在那裏打 哩 (Xiyangji) gōngrán zài nàlǐ dǎzuò li. openly /at there meditate  ‘(The arhat) was meditating in open.’ b. 師父還在打 (Xiyangji) shīfù hái zài dǎzuò. master still  meditate ‘The master was still meditating.’

. Modality As in many other languages, mood and modality are expressed by a variety of items in Chinese (e.g., verbs, nouns, adjectives, adverbs, auxiliaries, particles). This section is concerned with items which are equivalent to what is called “modal verbs” or “modal auxiliaries” (cf. English may, can, etc.). In Chinese, expressions belonging to this category are called ‘can-wish verbs’ (能 动词 néng-yuàn dòngcí) since Wang (1947). Almost all of them originated from verbs and have retained their original meaning as well as the meanings they developed in the process of grammaticalization (cf. examples below). For more details about Chinese modals and their classification, the reader is referred to Tsang (1981), Tiee (1985), Huang (1999), Hsieh (2005) and Tsai (2015). The modal verb 要 yào is one of the most important members of the Chinese modal system. It occurs in front of the main verb and can have different modal functions, as illustrated in (33). In (33a), it expresses dynamic modality in terms of volition (e.g., desire, wish). Its deontic function of obligation is shown in (33b). Similar to English must, yào can also express epistemic modality if the speaker is confident in the truth of a proposition (33c).

(33) a. 他喜欢唱歌, 每天都要唱
tā xǐhuān chànggē, měitiān dōu yào chàng.
he like sing everyday even will sing
‘He likes singing and wants to sing even every day.’

b. 校长说, 学生们要八点以前到校
xiàozhǎng shuō, xuéshēng-men yào bādiǎn yǐqián dào xiào.
Principal say pupil- must 8 o’clock before reach school
‘The principal says that all pupils must reach the school before 8 o’clock.’

c. 不顾实际一味蛮干, 要失败的
bùgù shíjì yīwèi mán gàn, yào shībài de.
regardless of reality blindly recklessly do must fail 
‘(If one) acts blindly and recklessly, without considering the reality, (he) must fail.’ (from Lü 1999: 592)

At a further stage of grammaticalization, yào also serves as a future marker:

(34) 他要調職了
tā yào diàozhí le.
he  transfer 
‘He is going to transfer to another position.’ (You 1998: 165)

The different uses of yào as we find them today can be traced back to the Old Chinese word 要 yāo.14 At the time of Late Old Chinese, 要 yāo had three major functions:

(i) It occurred as a verb with various meanings such as ‘tie with a waistband’, ‘keep within bounds, constrain’, ‘intercept’, ‘desire, seek, pursue’ and ‘force, coerce’.
(ii) It was used as a noun pronounced today as yào, with the meaning ‘crucial point, key point, essential’.
(iii) It functioned as a stative verb or an adjective pronounced today as yào, with the meaning of ‘important, crucial, essential’.

Most linguists assume that yào as a preverbal modal auxiliary emerged during the Han dynasty at the earliest (e.g., Lu 1997; Ma 2002; Kuo 2015). This is evidenced by texts like the Shiji and the Book of Han (1st century AD), in which yào was used for denoting either deontic ‘must’ or epistemic ‘must’. You (1998) holds the view that modal yào originated from the verb yāo in its meaning of ‘desire, seek, pursue’ (cf. function [i] above) in Late Old Chinese, from which the deontic modal meaning of yào was derived first. Its epistemic use was the result of a later development. Ma (2002) points out that modal yào, especially when used in the sense of epistemic

14 Old Chinese 要 yāo was etymologically related to the meaning ‘waist (body part)’, which was later denoted by the character 腰 yāo, as the word is still written today.

‘must’, may alternatively be traced back to the adjective yào ‘important, crucial, essential’ (cf. function [iii] above). In contrast, Kuo (2015) starts out from the meaning of ‘summarize, sum up’, which is seen as a metaphorical extension of the above function (i) of yāo ‘tie with waistband’/‘keep within bounds, constrain’. Kuo (2015) argues specifically that modal yào was derived from the [yāo-zhī + VP] construction through the omission of zhī (object pronoun of 3rd person). When the resulting [yāo + VP] construction became more common in the later period of the Han dynasty (about the first two centuries AD), the verb yāo was reanalyzed as a preverbal modal verb with the meaning of ‘must’, pronounced today as yào. The function of yào as a future marker in (34) is widely seen as a further development from its dynamic modal meaning of volition (33a) by the time of the Song and Yuan Dynasties (960–1368) (Wang 1958; Ota 2003; Lu 1997; Peyraube and Li 2012; Kuo 2015). Example (35) from Zutangji (10th century) illustrates its volitional function, while example (36) from Zhuziyulei (1270) shows it in its function of a future marker. According to You (1998), that development took place in two stages. At the first stage, the volitional meaning of yào was associated with intentional meaning by pragmatic inference (volition > intention). At the second stage, the intentional interpretation was extended to future meaning in contexts with high confidence in the accomplishment of the intended action (intention > future). A further extension of yào from future marker to conditional subordinator will be discussed in section 3.7. (35) 出來!吾要識汝 (Zutangji) chū-lái wǔ yào shí rǔ. exit-come I will know you ‘Come out! I want to know you.’ (36) 到工 要斷絕處 (Zhuziyulei) dào gōngfū yào duànjué chù. 
reach work be.going.to break.off place ‘When the work is going to break off …’ (Peyraube and Li 2012: 164) Kuo (2015) suggests that both the dynamic modal use of yào and its use as a future marker was transferred from another modal auxiliary in Middle Chinese, i.e., yù (< ‘wish, hope’). This functional takeover was due to the phonological similarity of the two verbs yào and yù and to their semantic similarity when used to express deontic or epistemic necessity. As for modal yù, Peyraube and Li (2012: 157) observe that its lexical meaning was originally associated with a wish or a hope of the speaker. In Late Old Chinese, yù developed the meaning of ‘intend to’. By the time of the Tang dynasty (618–907), yù further developed several other functions, including the functions of deontic and epistemic modality and of future tense. In Modern Chinese, yù is no longer used as a free word. Its use is limited to compounds.

In general, the formation processes of different uses of modal 要 yào outlined above show two general tendencies of semantic change (cf. You 1998), both of which are typical of grammaticalization processes of many other modal auxiliaries in Chinese: (i) The first tendency concerns the development from non-modal to deontic or epistemic modal meaning. This tendency is also observed in grammaticalization processes of some other modal auxiliaries such as kě (‘agree, approve’ > ‘permit’ > deontic possibility > epistemic possibility), 能 néng (‘bear’ > ‘be capable of ’, ‘ability’ > deontic possibility or epistemic possibility), 會 huì (‘get together’ > ‘meet, see’ > ‘know’, ‘understand’, ‘be able to’ > epistemic necessity) (cf. Xing 2015), 該 gāi (‘discipline, requirements’ > deontic necessity or epistemic necessity) and 當 dāng (‘match, be equal to’ > deontic necessity) (for details, cf. Li [2004]). (ii) The second tendency pertains to the development from the dynamic modal meaning of volition or intention to future tense. This tendency is also observed in the development of the modals yù (‘wish, hope’ > intention > future) and 將 jiāng (‘wish, hope’ > intention > future) (cf. Peyraube and Li 2012).

3.3 The passive marker 被 bèi

The Chinese 被 bèi construction is most closely associated with passives. In its canonical form, the undergoer-NP (NPU) takes the subject position and the actor-NP (NPA) follows bèi in the preverbal position. Thus, the canonical passive has the following structure, as illustrated in (37a): NPU + bèi + NPA + V. If the actor-NP is not mentioned, the canonical passive with bèi occurs in its short form as NPU + bèi + V, as in (37b).

(37) a. 他被老师骂了
tā bèi lăoshī mà le.
he  teacher scold 
‘He was scolded by (his) teacher.’

b. 他被骂了
tā bèi mà le.
he  scold 
‘He was scolded.’

In colloquial speech, passive 被 bèi is often replaced by 给 gěi (< ‘give’) or 让 ràng (< ‘let, allow’) (e.g., tā gěi/ràng lăoshī mà le [he / teacher scold ] ‘He was scolded by his teacher’), as well as by the combination of causative ràng ‘let’ and gěi (e.g., tā ràng lăoshī gěi mà le [he let teacher  scold ] ‘He was scolded by the teacher’) (on the function of the verb gěi ‘give’ as a preposition, cf. section 3.6).

In addition to the structural characteristics outlined above, the bèi construction differs from English passives in at least two ways. Firstly, it is associated with adversative meaning in the sense that the event it marks is conceived as being negative for the noun in the subject position (Wang 1958: 432; Li and Thompson 1981: 493). Secondly, the marker bèi is also used in another non-canonical form, known as ‘indirect passive’, which also has adversative meaning. In this construction, the subject position is filled with an additional experiencer-NP (NPE), while NPU is moved to the postverbal position. Thus, the indirect passive as illustrated by (38), has the following structure: NPE + bèi NPA + V+ NP. (38) 张 被 匪打 了爸爸 Zhāngsān bèi tǔfěi dǎsǐ -le bàbà. Zhangsan  bandit kill - father ‘Zhangsan had his father killed by bandits.’ (Huang, Li, and Li 2009: 140) The grammatical status of bèi gave rise to various analyses in the literature. In Wang’s (1947, 1958) presentation, bèi is categorized as an auxiliary, while it is identified as a preposition by Wang (1970) and Li (1990). Li and Thompson (1981) treat bèi as a coverb with a mixed status of verb and preposition. The biclausal analysis with bèi in the function of a matrix verb goes back to Hashimoto (1969). According to Huang, Li, and Li (2009: 112–152), the matrix verb bèi takes an NP as its subject and an inflectional phrase (IP) as its complement. Diachronically, bèi as a passive marker did not become common until Early Middle Chinese (Wang 1958). In Old Chinese, the character 被 was used nominally with the meaning of ‘blanket, quilt, bedding’ (which is retained up to the present day).15 By the time of Late Old Chinese, 被 bèi became a verb with the meaning of either ‘wear, cover (with a cloth)’ or ‘receive, suffer, experience (something)’. The two verbal meanings are illustrated in (39a) and (39b), respectively, both from Zhanguoce (ca. 1st century BC). (39) a. 
被王衣以聽事 (Zhanguoce – Chu 1)
bèi wáng yī yǐ tīng shì.
wear king clothes  manage affairs
‘(Gongsun Hao) managed government affairs while wearing the king’s clothes.’

15 In Baxter and Sagart’s (2014) OC reconstruction, 被 in its nominal function was pronounced as bĕi (with the third tone) in Old Chinese.

b. 百姓無被兵之患 (Zhanguoce – Wei 3)
bǎixìng wú bèi bīng zhī huàn.
common people  suffer war  suffering
‘The people would not suffer the consequences of war.’

According to Wang (1958: 425), the earliest occurrence of bèi for expressing passivity is also found in Late Old Chinese. An example of this type is (40) from Hanfeizi (3rd century BC):

(40) 知友被辱 … (Hanfeizi – Wudu)
zhī yǒu bèi rǔ
intimate friend  insult:
‘(If one’s) intimate friends were insulted …’

In the above example, bèi occurs immediately before the verb without a noun phrase intervening. Considering the flexibility of parts of speech (Bisang 2008; Zádrapa 2011; Sun 2015, forthc.), we assume that bèi in its passive function of that time was probably the result of the reanalysis of an original VO-construction (bèi as a verb meaning ‘receive, suffer, experience’, followed by the undergoer-object rǔ ‘insult:N’) as a passive construction, with bèi as a passive voice marker, followed by a verb like rǔ ‘insult’.

In Middle Chinese, by the time of the Northern and Southern dynasties (420–589), bèi also marked actor-NPs of passive constructions (e.g., Wang 1958). This can be seen in the following example (41) from Shishuoxinyu (4th century), in which the undergoer-NP Liàng zǐ ‘Yuliang’s son’ takes the subject position, and the actor-NP Sūjùn marked by bèi occurs before the verb hài ‘kill’.

(41) 亮子被蘇峻害 (Shishuoxinyu – Fangzheng)
Liàng zǐ bèi Sūjùn hài.
Yuliang son  Sujun kill
‘Yuliang’s son was killed by Sujun.’

Thus, bèi evolved into a passive marker that allows both the long passive and the short passive form as it does in Modern Chinese. In the course of time, with the bèi construction becoming increasingly common, other alternative markers for expressing passive, for example, 见 jiàn (< ‘see’) or 為 wéi (< ‘act, serve as, become, be’)16 gradually declined. By the beginning of pre-Modern Chinese, bèi became the most commonly used prima facie passive marker (Tang 1988).

16 On the origin of the long passive based on wéi, cf. a recent publication by Li (2018).

The reason for the rise of bèi as a passive marker is ascribed by some researchers like Sun (1996) to the loss of morphological marking on Old Chinese verbs17 for determining the semantic relations between nominals in a proposition. It is claimed that Old Chinese deployed different types of VV sequences (‘V-kill’ vs. ‘V-die’) for distinguishing between actor-subjects (V-kill) and undergoer-subjects (V-die). This distinction in terms of word order became neutralized in Early Middle Chinese (Mei 1990, 1991). Consequently, the semantic relationships of the nominals in a proposition remained underspecified. In this situation, 被 bèi and 把 bǎ (cf. section 3.4 below) were grammaticalized into functional markers for the purpose of indicating whether the subject of a sentence was in fact the actor or the undergoer of the predicate (Sun 1996). In this context, bèi developed into a marker of a passive construction for signaling that what immediately follows it is an actor, while the syntactic subject – either overtly expressed or contextually available – must not be an actor.

17 On the role of the loss of verbal morphology in the emergence of the resultative construction, see also section 3.1.1.

3.4 The marker 把 bǎ

The general description of 把 bǎ as it is given by Li and Thompson (1981) starts out from SVO word order and the possibility of fronting definite nouns to the preverbal position marked by bǎ. This yields the structure ‘S bǎ OV’ as illustrated by the following example (42).

(42) 我把茶杯弄破了
wǒ bǎ chá-bēi nòng-pò le.
I  tea-cup make-broken /
‘I broke the teacup.’ (Li and Thompson 1981: 466)

The bǎ construction has long been of interest to specialists of Chinese linguistics. The semantic analysis of bǎ goes back to Wang (1947, 1958), who describes the bǎ construction in terms of a ‘disposal form’ (处置式 chǔzhì shì), which is used for describing how an object is purposefully handled, dealt with, or disposed of by an actor (cf. also Chappell 1991). Li and Thompson (1981) proposed two requirements for the appropriate use of bǎ. First, the object-NP marked by bǎ must be definite, specific or generic. Second, bǎ cannot be used with verbs which denote states or activities that do not affect their direct objects in the sense of disposal. Thus, it is incompatible with verbs of emotion (e.g., miss, like) or verbs of cognition (e.g., understand, know). In Sun’s (1996) account, bǎ is termed a marker of high transitivity that serves to indicate the complete affectedness of an entity. With this, he identifies the aspectual boundedness of an event as a third requirement for the use of bǎ.

At the level of syntax, bǎ is described as a coverb by Li and Thompson (1981). Since this classification disregards various functions of bǎ, among them prominently its function of causation (Bisang 2016: 367–368), more recent syntactic analyses focus on the verbal function of bǎ. Sybesma (1999) argues that bǎ is a causative verb and takes the position of little v⁰. Huang, Li, and Li (2009) claim that bǎ represents a separate functional category above vP with the subject taking the specifier position of the phrase headed by bǎ, i.e., [baP Subject [ba’ bǎ [vP NP [v’ v [VP XP]]]]]. Targeting the issues at the semantics-syntax interface of the bǎ construction, Lipenkova (2014) analyses bǎ as a head which is responsible for selecting both a causer argument at the sentence-initial position and an eventive VP complement after it.

In a few other studies, bǎ is taken as a verb meaning ‘hold, take’ (e.g., Ding 2001). In fact, this function traces back to the original meaning of bǎ in Old Chinese. In Modern Chinese, bǎ is generally not used as an independent verb (Tian 2006).18 It cannot be questioned by a yes/no-question and its use is limited mainly to compounds like 把住 bǎ-zhù [hold-stay.] ‘hold firmly’ and some idiomatic phrases like 你把他怎么样了 nĭ bǎ tā zénmeyàng le? [you treat he how /] ‘How did you treat him?’. Moreover, its meaning has become more specific. It either means ‘hold firmly/with force’ or ‘guard, keep’ as in 把门 bǎ-mén [guard gate] ‘guard a gate / keep the goal (in football)’. The semantic specification of its meaning in compounds clearly shows the diachronic distance between the full semantic meaning of bǎ in the past and its grammatical function today.
The earliest occurrence of the bǎ construction in the form of bǎ OV was found in Middle Chinese (during the Tang dynasty, 618–907; Mei [1990]). In its early stages, bǎ can often be interpreted either as an abstract function word or as a full verb with the meaning of ‘take, hold’ in a serial verb construction with sequential meaning (Wang 1958: 411–412). This can be illustrated with example (43), from a poem of the Tang dynasty. Here, bǎ can either be analysed as a transitive verb meaning ‘hold, take’ with the NP juàn ‘book, volume’ serving as the undergoer-object of both bǎ and the second verb kàn ‘see’ (Interpretation [a]), or alternatively, as a function word with no concrete meaning. Sun (1996: 74) observes that between the 10th and 18th centuries, both functions were used in parallel. Afterwards, bǎ became semantically generalized and shifted to the function of a grammatical marker.

(43) 但願春官把卷看 (by Du Xunhe; Tang dynasty, 618–907)
     dàn yuàn Chūnguān bǎ juàn kàn.
     only hope Chunguan (official) take/ volume see
     [a] ‘It is only hoped that Chunguan would take the volume and see (it).’
     [b] ‘It is only hoped that Chunguan would see the volume.’

¹⁸ For its use as a noun and as a numeral classifier, cf. section 2.1.


Linlin Sun and Walter Bisang

The rise of bǎ as a grammatical marker can also be understood in connection with the development of the passive marker bèi (Bennett 1981; Sun 1996). As discussed in the previous section, the emergence of the passive marker bèi is assumed to be due to the loss of verb morphology in Old Chinese for distinguishing actor-subjects from undergoer-subjects. If bǎ is assumed to have evolved into a preverbal nonactor marker in the bǎ-NPU-V construction, bèi can be seen as the marker of a preverbal actor in the passive bèi-NPA-V construction by analogy.

3.5 The copula 是 shì

The copula in Chinese is 是 shì, which is negated by the general negator bù ‘not’ (不是 bù shì ‘not be’). An instantiation of an equative construction with shì is given in (44). In contrast to the English copula, Chinese shì is sometimes optional in affirmative contexts. If shì is omitted, (44) would most likely be analyzed as a topic-comment structure.

(44) 魯迅是紹興人
     Lǔxùn shì Shàoxīng rén.
     Luxun  Shaoxing (place) person
     ‘Luxun is a Shaoxing person.’

In addition to its assertive function in (44), shì can also be used for emphasis or reiteration with respect to a statement mentioned earlier in conversation (Li and Thompson 1981). In that context, shì is often referred to as a focus marker and the construction in which it occurs is called the ‘shì construction’ (also known as the Chinese focus construction). In (45a) without shì, the information is provided in a neutral way, while the shì construction in (45b) is used against the presupposition that the information provided in (45a) is not true (Li and Thompson 1981: 151):

(45) a. 他没钱
        tā méi qián.
        3 not:exist money
        ‘He doesn’t have any money.’ (Li and Thompson 1981: 151)

     b. 他是没钱
        tā shì méi qián.
        3 be not:exist money
        ‘It’s true that he doesn’t have any money.’ (Li and Thompson 1981: 151)

It is generally observed that 是 shì as a copula came into use around the 1st century AD (Wang 1958: 347; Ota 2003). This is evidenced by texts from the Han dynasty (e.g., Lunheng 論衡, 80 AD), in which shì as a copula was facultative (Feng 1984). Before the Han dynasty, there was no copula, i.e., the equative construction was formed by the unmarked juxtaposition of a subject NP and a predicate NP (Feng 1993; Sun 2015). In (46) from Zhanguoce (about the 1st century BC), both NPs xián rén ‘virtuous person’ and héng shì ‘ordinary person’ are nominal predicates, the first one in an affirmative sentence, the second one in a negative clause formed by the particle 非 fēi ‘no, not’. In this context, it is relevant to note that the word fēi was routinely used in the classical period in two ways, either as a verb meaning ‘violate, disobey, blame, etc. [tr.]; be wrong, be incorrect [intr.]’ or as a particle for negating nominal predicates.

(46) 甘茂賢人，非恆士也 (Zhanguoce)
     Gānmào xián-rén, fēi héng-shì yě.
     Ganmao virtuous.person  ordinary.person 
     ‘Ganmao is a virtuous person, (he is) not an ordinary person.’

The word 是 shì already existed in Old Chinese, when it served either as a verb or as a demonstrative pronoun. As a verb, shì behaved much like an antonym to the above-mentioned verb 非 fēi, with the meaning of either ‘follow, agree, conform to [tr.]’ or ‘be right, be correct [intr.]’. Example (47) illustrates the verbal function of shì, (48) illustrates its demonstrative function. Notice that sentence (48) has an anaphoric structure in a topic-comment construction, in which shì ‘that’ simultaneously refers to the topic fù yǔ guì ‘be wealthy and be noble’ and serves as the syntactic subject of the comment rén zhī suǒ yù yě ‘be what people long for’.

(47) 不法先王，不是禮義 … (Xunzi – Feishierzi)
     bù fǎ xiān wáng, bù shì lǐ yí.
      follow emperors.in.ancient.times  follow rites moral.laws
     ‘(If one does) not follow the emperors in ancient times, (if one does) not follow the rites and moral laws …’

(48) 富與貴，是人之所欲也 (Lunyu – Liren)
     fù yǔ guì shì rén zhī suǒ yù yě.
     be.wealthy  be.noble that people   desire 
     ‘Be wealthy and be noble, that is what people long for.’

The development of shì into a copula has been the subject of many studies. There are basically three scenarios. The first one traces back the copular use of shì to its demonstrative function (Wang [1958: 353], followed by Li and Thompson [1977], Feng [1993] and many others). Its beginning is seen in topic-comment constructions of the type illustrated by (48). While shì initially was part of a topic construction (cf. above), it was reanalysed as the copula of a subject-predicate construction (Li and Thompson 1977: 420). In the course of time, the copular use of shì was conventionalized, while its demonstrative function gradually disappeared. In Middle Chinese, the presence of shì as a copula in the equative construction had become obligatory and by the time of the Tang dynasty (618–907), the copula shì was regularly negated by bù (bù shì [ ] ‘is not’) as it is still today (Wang 1958: 354–355).


The second scenario starts from the affirmative meaning of shì ‘follow, agree, conform to; be right, be correct’, paired with the negative fēi as its antonym (e.g., Hong 1964; Yen 1986). According to Yen (1986: 228), shì developed the function of an affirmative particle when language users started recognizing it as the counterpart of the nominal negator fēi. Since fēi as a negation particle always precedes the nominal predicate, affirmative shì analogically came to occur before a nominal predicate. Later, when the affirmative particle shì was firmly established as a copula verb, the negative particle fēi was replaced by bù shì ‘be not’. At that stage, copular shì was negated like every other verb by the negator bù ‘not’. Compared to the former scenario, the second account explains why other Old Chinese demonstratives like cǐ ‘that’ or sī ‘this’ failed to develop into a copula: they had no affirmative verbal meaning and thus were not able to pair with fēi.

By combining the two above scenarios, researchers such as Ao (1985), Sun (1992) and Chang (2006) suggest that the formation of copular shì resulted from both functions, its demonstrative and its verbal functions. In Sun’s (1992) view, the verbal function of shì facilitated the development of demonstrative shì into a copula. According to Chang (2006: 147), the development of shì from a demonstrative pronoun into a copula can be divided into two stages. At the first stage, when the demonstrative and the verbal properties of shì were unified, shì became an anaphoric verb as illustrated by the following example from Shiji (1st century AD):

(49) 得幸武帝，生子一人，昭帝是也 (Shiji)
     dé xìng Wǔ dì, shēng zǐ yī rén, Zhāo dì shì yě.
     obtain favor Wu emperor bear son one person Zhao emperor ana.verb 
     ‘(She) won the favor of the Emperor Wu, and gave birth to a son, who was the Emperor Zhao.’ (Chang 2006: 148)

At that stage, shì as an anaphoric pronoun was able to occur in the predicate position (V) of a VP, as demonstrated by the analysis below:

(50) Analysis of example (49) (Chang 2006: 148)
     [S′ [Topic Zhao di] [S [Subject] [VP shi ye]]]

At the second stage, the anaphoric verb shì began to lose its anaphoric function and gradually developed into a copula.


3.6 Prepositions (coverbs)

Chinese deploys both prepositional phrases and postpositional phrases in the function of adjuncts, occurring either to the left or to the right of the predicate in a sentence (on postpositions, cf. section 2.2). Most prepositions in Chinese originated from verbs. Since many of them kept at least some of their original verbal properties (e.g., compatibility with tense-aspect markers like -le ‘perfective’, or -zhe ‘durative’), they are called “coverbs” by some linguists (e.g., Chao 1968; Li and Thompson 1981), even though they clearly no longer function as verbs when they occur in the syntactic position of prepositions (Paul 2015). The most commonly used prepositions are 到 dào ‘to, till / arrive, reach’, 对 duì ‘to, toward / face, turn toward’, 给 gěi ‘for, to / give’, 跟 gēn ‘with, and / follow’, 离 lí ‘apart from, till / leave’, 替 tì ‘in place of / substitute for’, 沿 yán ‘along / follow along’, 靠 kào ‘by, by means of, depending on / get close to, lean on’, 按 àn ‘according to / put hand on, press’, 用 yòng ‘with, by use of / use’, 在 zài ‘at / exist, live, be at’, 比 bǐ ‘than, compared to / compare’, and 过 guò ‘than, compared to / cross, pass through, surpass’,¹⁹ etc. (for more examples, cf. Li and Thompson [1981: 368–369]; Paul [2015: 55–57]). In addition, there are also a few prepositions which can no longer take the function of full verbs in Mandarin. Some of these prepositions are 从 cóng ‘from’ (< ‘follow’), 为 wèi ‘for, for the sake of’ (< wèi ‘help’ or wéi ‘do, act, serve as, become, be’) or 和 hé ‘with, and’ (< ‘coordinate’). As shown in example (31), the verb zài ‘exist, live, be at’ is attested in its prepositional function since a very early date (example [31] is from the Shijing, 11th–7th centuries BC). However, it is highly questionable if there was any historical continuity from its function at that time to its prepositional function at later periods. In Classical Chinese, the locative function was much more frequently expressed by the prepositions 於 and 于, both transcribed as yú. The prepositional function of zài started becoming more frequent again in the 3rd century AD, when the preposition yú had developed into the most common preposition with a large number of different semantic functions (e.g., Peyraube 1980, 1988: 224). The following example (51) from Mandarin shows the use of zài in a postverbal PP.

(51) 他住在中山路
     tā zhù zài Zhōngshān lù.
     3 live at Zhongshan Road
     ‘He lives on Zhongshan Road.’ (Li and Thompson 1981: 393)

¹⁹ Some Sinitic languages use the construction with bǐ ‘compare’ for expressing comparatives, others use the surpass-construction with guò ‘cross/pass, surpass’ (cf. Ansaldo [2010] for a survey).

The development of 给 gěi from a verb ‘give’ to a benefactive preposition has been widely discussed (e.g., Peyraube 1988, 1996; Bisang 1991, 1996; Tsao 2012). In Peyraube’s (1988, 1996) analysis, the use of a ‘give’-verb in that function started out from two verbs of giving in a VV series in which the second verb was reanalyzed as a benefactive preposition for introducing the goal or recipient of the action named by the first verb. More specifically, Peyraube (1988) describes the development of the Chinese benefactive construction along several historical stages. At stage (i) in the Han dynasty (206 BC–220 AD), a construction with two verbs in a sequence emerged, generating the new structure V1-V2-IO-DO, in which the V2 position was primarily taken by ‘give’-verbs like yǔ ‘give, grant, offer’, yǔ ‘give, grant, bestow’ or 遺 wèi ‘offer, present’. At stage (ii), the direct object (DO) was fronted to the position between V1 and V2, while position V2 was exclusively filled by the verb yǔ ‘give, grant, offer’. Thus, the new construction developed at that stage had the structure V1-DO-V2:give-IO. At stage (iii), between the 6th and 12th centuries, the ‘give’-verb plus its indirect object (IO) started moving to the preverbal position, generating the structure Vgive-IO-V1-DO.²⁰ With this, the patterns as they still exist in Modern Chinese were basically established. Finally, at stage (iv) since the 13th century, the ‘give’-verb yǔ was replaced by 饋 kuì, a verb which had the highly specific meaning of ‘offer food to a superior’ in Classical Chinese. That verb was in turn replaced by 给 gěi ‘give’ during the Qing dynasty (1644–1911). According to Tsao (2012), the formation of gěi as a benefactive preposition was followed by a further development in which gěi evolved into a ditransitive marker. Tsao claims that with the reanalysis of the gěi-phrase as a PP denoting Goal or Recipient of an action, the gěi-phrase can, under certain conditions, be moved to the position attached immediately to the main verb, as illustrated by [dài]V [gěi Zhāngsān]PP in (52).
This direct attachment of gěi to the main verb then leads to the formation of a V-gěi complex through reanalysis ([dài]V [gěi]P > [dài-gěi]V), in which gěi enables the main verb to be used ditransitively (vs. dài *(gěi) Zhāngsān yī bāo táng).

(52) 他带给张三一包糖
     tā dài gěi Zhāngsān yī bāo táng.
     3 bring to Zhangsan one bag candy
     ‘He brought a bag of candy to Zhangsan.’ (Li and Thompson 1981: 374)

As mentioned earlier, 给 gěi can, along with 让 ràng (< ‘let, allow’), also mark passive (cf. section 3.3 on bèi). The gěi-passive is analysed by Chappell and Peyraube (2006) as passing through the stage of a permissive causative in the sense of ‘let’ (i.e., gěi ‘to give’ > permissive gěi > passive gěi). Szeto, Ansaldo, and Matthews

²⁰ It is observed that Old Chinese generally applied the postverbal word order to PPs, whereas PPs in Modern Chinese usually precede the main verb in a sentence (Li and Thompson 1974a, 1974b; cf. Sun 1996).


(2018: 250–251) observe that some Chinese varieties (especially those spoken in inland regions such as Xiang and Gan) also use gěi as a disposal marker, similar to 把 bǎ (cf. section 3.4). This is illustrated by (53) from the Sinitic language Xiang, spoken in Changsha.

(53) pa mən kuan˧ tɕʰi
     give door close up
     ‘Close the door.’ (Szeto, Ansaldo, and Matthews 2018: 251)

Since the postpositions can be traced back to nominal structures for expressing spatial or temporal relations, they are described in section 2.2 (cf. Djamouri and Paul [2012] on the development of deverbal postpositions).

3.7 Clause-combining elements

This section will present data on complementizers, adverbial subordinators and some anaphoric elements in the matrix clause referring back to the previous clause. Each of these elements will be presented individually first. More general aspects of their historical development will be discussed at the end of this section.

Standard Chinese does not use any subordinating conjunctions or complementizers in the sense of English that, either for introducing embedded declarative clauses or for expressing indirect speech. Similarly, embedded interrogative clauses are embedded without any overt marking and with the wh-word remaining in situ as in unembedded wh-questions. Given this situation, the development of ‘say’-verbs into quotative markers or complementizers, which has been reported for many languages around the world, is not found in Standard Chinese. In spite of this, it can be observed in various other Sinitic languages or dialects of Chinese, among them Southern Min, Beijing Mandarin, Taiwanese Mandarin, Cantonese and Hakka/Kejia (cf. Yang 1957; Meng 1982; Liu 1986; He 1989; Liu 1996; Huang 2003; Fang 2006). In Southern Min, the verb 講 kóng ‘say, talk, tell’ (Chappell 2008) has developed into a function word which not only serves as a complementizer (cf. [54] below) but is also used as a topic marker or as a component of the conditional marker nā kóng +  ‘if …’. In (54), the verb kóng occurs twice in a sequence. It is a full verb in the first position and a complementizer in the second position.

(54) 阿叔講講伊明年會轉來
     á-chek kóng kóng i mêⁿ-nîⁿ ē tńg-lâi.
     uncle say  3 next-year will return
     ‘Your uncle said that (kóng kóng) she would return next year.’ (Chappell 2008: 21)


According to Chappell (2008), the quotative marker developed out of serial verb constructions of the VV type in which the ‘say’-verb took the second position. In that position, it was reanalyzed as a quotative marker or a complementizer after verbs of speaking, thinking, perception or feeling. Thus, the marker kóng in Southern Min can be used as a complementizer with verbs such as siūn ‘think’, kámkak ‘feel’, chhiò ‘laugh’, liām ‘nag, insist’, or beh ‘want’.

Adverbial clauses are marked by a rich inventory of disyllabic adverbial elements (some of which can also be used as prepositions). The most important causal adverbial subordinators are 因为 yīnwèi ‘because’ and 由于 yóuyú ‘owing to, due to’. The first component of yīnwèi was a polysemous verb in Classical Chinese with the meanings of ‘rely on, depend on’ or ‘follow, conform to, accord with’, while its second component with its pronunciation in the falling tone (wèi) had the prepositional functions ‘for, for the sake of, on account of’ in addition to the verbal meaning ‘help’ (wéi with the rising tone was a verb meaning ‘do, act, serve as, become, be’, cf. section 3.6). The adverbial subordinator yóuyú also consists of an element whose meaning can be traced back to a lexical item, i.e., the first component yóu, which was a polysemous verb in Classical Chinese with the meanings of ‘start from, pass through, go by way of’ or ‘follow, conform to’. The second component yú was the most common preposition of Classical Chinese (cf. section 3.6).

Conditionals are expressed by various adverbial subordinators. The two most common ones are 要是 yàoshì ‘if’ and 如果 rúguǒ ‘if’. The first component of the former has the same origin as the modal verb yào ‘want’ (section 3.2) and can serve as a conditional marker meaning ‘if’ itself, which is seen as an extension of the future marker yào (cf. section 3.2). Its second component shì is the copula of Modern Chinese (for its origins, cf. section 3.5). The first component of the latter can be traced back to a verb in Classical Chinese with the meaning of ‘to be like’, while the second component guǒ is a noun meaning ‘fruit, result, outcome’. Some of these components also occur in other conditional markers, as in 只要 zhǐyào ‘as long as, if only’ (< zhǐ ‘only’ + yào ‘if’) (You 1998) or 若是 ruòshì (< ruò ‘be like’ + shì copula, demonstrative). Of particular interest are conditional markers like 假如 jiǎrú, which specifically express hypothetical conditionals. Here, the hypothetical meaning is reflected by the first component jiǎ, which had the verbal meaning of ‘be artificial, be a fake, be untrue’ in Classical Chinese (also cf. below on analogy and the development of several markers with that function through time).

The components of the most prominent concessive marker 虽然 suīrán ‘although, even though’ are suī, which was already a concessive adverbial subordinator in Classical Chinese, and rán, a verb with the meaning of ‘be so, be like that’ in Classical Chinese.

Chinese has a particularly impressive number of concessive conditional markers. Many of them are formed on the basis of a verb which expresses ways in which a given state of affairs can be assessed. The extent to which that assessment holds is expressed by an additional marker. In some cases, that marker is a negation as in 不论 bùlùn ‘no matter what’ (< bù ‘negation’ + lùn ‘consider, talk


about, comment’ in Classical Chinese), 无论 wúlùn ‘no matter what’ (< wú ‘not to have, not’ + lùn ‘talk about, comment’) and 不管 bùguǎn ‘no matter what’ (< bù ‘not’ + guǎn ‘be concerned about, care, bother about’). In the case of 尽管 jǐnguǎn ‘although’, the verb guǎn ‘be concerned about’ is preceded by the lexical component jǐn, whose meaning can actually be traced back either to Classical Chinese jǐn ‘confine, exist within (the framework of)’ or jìn (with the departing tone) ‘use up, exhaust, finish’. The verb 算 suàn ‘calculate, reckon’ in its combination with the preceding anaphoric element 就 jiù ‘then (under that condition)’ forms the adverbial subordinator 就算 jiùsuàn ‘even if, granted that’. Finally, the adverbial subordinator 哪怕 nǎpà ‘no matter what, even if’ is formed by the wh-word nǎ ‘which’ and the verb pà ‘be afraid of, fear’.

Temporal clauses are mostly formed by elements which are described in section 2.2 under the term of relational nouns. Clauses in a simultaneous temporal relation to the main clause are the most prominent exception. They are formed by the 的时间 de shíjiān construction, which is derived from a relative-clause construction with the relative marker de and the following noun shíjiān ‘time’ in its head position.

Adverbial clauses are often reflected by an anaphoric element in the main clause. The markers 所以 suǒyǐ ‘so, therefore’ and 因此 yīncǐ ‘so, therefore, for that reason’ occur with causal adverbial clauses. 所以 suǒyǐ already existed in Classical Chinese, where it had various functions. Its first component has the meaning of ‘place’ in Classical Chinese and is also used as a syntactic marker forming relative clauses and corresponding nominalizations whose heads are not subjects (i.e., they are either objects or adjuncts). The word yǐ is a verb meaning ‘take, use’ in Classical Chinese, where it also served as a conjunction and a preverbal or postverbal marker of instruments and objects. In its current use, suǒyǐ is a fossilized lexical item. The structure of the marker yīncǐ is more straightforward. It consists of the verbal component yīn ‘follow’ (cf. above) plus the Classical Chinese distal demonstrative cǐ ‘that’. Another conjunction, 则 zé, is based on the meaning of ‘norm, standard, rule’ and is mainly associated with conditionals.

Finally, there are some elements with a broader orientation which can occur with adverbial clauses of different semantic relations to the main clause, i.e., 就 jiù ‘then, only, right after, right away’, 都 dōu ‘even, all, already’, 也 yě ‘even’ and 才 cái ‘only, unless, not … until’. Hole (2004) provides a comprehensive description of these elements in the framework of focus-background marking and existential or universal quantification over alternatives. The source concepts of the first three of these markers are straightforward, while the origin of cái needs more research. The marker jiù can be traced back to a verb with the meaning of ‘move toward (highland)’ or ‘come close to’, dōu is an adverbial universal quantifier, and yě is also used as a focus particle with the meaning of ‘also’.

Most connectives described in this section are the result of a process of disyllabification which started in the early periods of the Han dynasty (also cf. section 3.1.1).


In Dong’s (2012) analysis of grammaticalization, there are two types of disyllabification processes in Chinese. Both of them lead to disyllabic functional markers. One type is the change from a functional phrase consisting of two adjacent items with a clear-cut syntactic relation between them to a disyllabic grammatical word. The phrases of this type are formed by a grammatical word plus a lexical item and develop into a coherent unit through reanalysis and fossilization. The other type is characterized by the fact that the source form cannot be regarded as a linguistic unit. Rather, it consisted of two independent constituents, which were in no direct syntactic relation even though they were in adjacent positions. Over time, through frequent collocation, the initial syntactic structure of the two components became opaque and they were reanalyzed as forming a single disyllabic marker. Some of the markers described above can be clearly assigned to these types. The first type of disyllabification is reflected in markers like 只要 zhǐyào ‘as long as’ (‘only’ + ‘if’), 不论 bùlùn ‘no matter’ / 无论 wúlùn ‘no matter’ ( + ‘consider’) and 不管 bùguǎn ‘no matter, despite’ ( + ‘be concerned’), etc. (Dong 2012: 247–250). The latter type can be illustrated beyond the cases mentioned in Dong (2012: 251–255) by 所以 suǒyǐ ‘so, therefore’ (cf. above).

Historical evidence suggests that the formation of disyllabic adverbial subordinators was also associated with analogy. This can be illustrated by looking at the formation of hypothetical conditional markers formed by the element 假 jiǎ. As described above, jiǎ is an independent lexical item with the meaning of ‘be artificial, be a fake, be untrue’ in Classical Chinese. By the Han dynasty, jiǎ acquired several other verbal and prepositional meanings, primarily the meanings of ‘borrow’, ‘depend on’, ‘get help (from)’, and ‘by means of’. The collocation of jiǎ with 之 zhī (i.e., 假之 jiǎ-zhī), with the conditional meaning ‘if’, was also attested by that time, as illustrated in (55). Throughout the history of Chinese, we can observe the emergence of several disyllabic connectives headed by jiǎ for expressing hypothetical conditional meaning, among them jiǎrú 假如 ‘if’ (rú ‘to be like’), jiǎruò 假若 ‘if’ (ruò ‘to be like’), jiǎshǐ 假使 ‘if’ (shǐ ‘make, cause’), and jiǎshè 假設 ‘if’ (shè ‘set up, establish’). These forms can be seen as the result of analogy: there is a fixed linguistic item in the first position of a disyllabic marker whose second position can be filled by various items.

(55) 假之得幸，庸必為我用乎？ (Zhanguoce – Wei)
     jiǎ-zhī dé xìng, yōng bì wéi wǒ yòng hū?
     if get favour would it certainly be (so) I use 
     ‘If (the woman I offered to the king) were favored, would (she/that) really be made use of by me?’

4 Conclusions

This paper provided an overview of major grammaticalization changes in Chinese, observed in both the nominal domain and the domain of categories at clause level.


The grammaticalized elements under discussion primarily include numeral classifiers, relational nouns expressing spatial and temporal orientation, aspectual markers, modal auxiliaries, coverbs and/or adpositions, complementizers and adverbial subordinators as well as three individual morphemes, i.e., the copula shì, the passive marker bèi and the marker bǎ. As a general observation, one can clearly see that all of the grammaticalized elements discussed have their diachronic source in lexical expressions such as verbs (action-denoting lexemes) or nouns (object-denoting lexemes).²¹ The mechanism of reanalysis plays an important role in their diachronic development from content words to function words. It is particularly relevant at earlier stages of grammaticalization and it operates on a large scale, facilitated by some significant structural properties which are characteristic of Chinese (high relevance of pragmatic inference [Bisang 2009], flexibility of parts of speech [Bisang 2008; Sun 2015, forthcoming]) and by some general restructuring the language underwent in the course of its development (disyllabification, loss of distinctive markedness for determining grammatical relationships). Moreover, most sources of grammaticalization still retain their original lexical meaning and many of them are multifunctional (e.g., zài ‘be at, live at, exist’ as a marker of progressive aspect in section 3.1.5 and as a preposition in section 3.6; yào as a modal verb expressing deontic modality, epistemic ‘must’ and future in section 3.2; the verb gěi ‘give’ as a preposition in section 3.6 and as a passive marker in section 3.3, etc.).
On the basis of the outline given in this paper, it can be concluded that Chinese grammaticalization, as compared to what has often been claimed for Standard Average European languages, generally lacks salient coevolution of meaning and form: the development from a concrete to an abstract meaning on the level of semantics does not necessarily coevolve with the development from a lower to a higher degree of integration on the level of morphosyntax (Bisang 2004, 2009). Moreover, the limited form-meaning coevolution of grammaticalized elements in Chinese seems to favor the increasing accumulation of new functions over time with limited loss of older functions, a development excellently described by Xing (2015), who formalizes this maintenance of different functions A, B, C etc. by a grammaticalization cline of the type A > A/B > A/B/C … (also cf. Bisang forthcoming). This cline deviates from clines of the type A > {B/A} > B as described by Heine, Claudi, and Hünnemeyer (1991) and Hopper (1991).

Abbreviations 3/3 = third person singular,  = adjective/adjectival,  = adverb/adverbial,  = anaphoric,  = attributive,  = auxiliary,  = classifier,  = completive,  = connection,

²¹ On flexible parts of speech in Classical Chinese, cf. Bisang (2008), Zádrapa (2011) and Sun (2015, forthcoming).


 = copula,  = currently relevant state,  = demonstrative,  = direct object,  = durative,  = experiential,  = future,  = genitive,  = indirect object,  = locative,  = noun/nominal,  = negation,  = nominalization,  = actor argument,  = undergoer argument, / = object,  = perfective,  = possessive,  = prepositional phrase,  = predicate, / = preposition,  = progressive,  = pronoun/pronominal,  = past, / = particle,  = question mark,  = subject, / = singular,  = subject-verb-object,  = tense-aspect-mood,  = verb/verbal,  = verb phrase

References

Allan, Keith. 1977. Classifiers. Language 53. 285–311.
Ansaldo, Umberto. 2010. Surpass comparatives in Sinitic and beyond: typology and grammaticalization. Linguistics 48(4). 919–950.
Ao, Jinghao. 1985. 论 词“是”的产 Lun xici “shi” de chansheng [On the emergence of copula shi]. Yuyan Jiaoxue yu Yanjiu 85(2). 29–41.
Baxter, William H. 1992. A handbook of Old Chinese phonology. Berlin: Mouton de Gruyter.
Baxter, William H. & Laurent Sagart. 2014. Old Chinese: A new reconstruction. Oxford: Oxford University Press.
Bennett, Paul. 1981. The evolution of passive and disposal sentences. Journal of Chinese Linguistics 9. 61–90.
Bisang, Walter. 1991. Verb serialization, grammaticalization and attractor positions in Chinese, Hmong, Vietnamese, Thai and Khmer. In Hansjakob Seiler & Waldfried Premper (eds.), Partizipation. Das sprachliche Erfassen von Sachverhalten, 509–562. Tübingen: Gunter Narr.
Bisang, Walter. 1996. Areal typology and grammaticalization: processes of grammaticalization based on nouns and verbs in East and mainland South East Asian languages. Studies in Language 20(3). 519–597.
Bisang, Walter. 1999. Classifiers in East and Southeast Asian languages: Counting and beyond. In Jadranka Gvozdanovic (ed.), Numeral types and changes worldwide, 113–185. Berlin: Mouton de Gruyter.
Bisang, Walter. 2004. Grammaticalization without coevolution of form and meaning: the case of tense-aspect-modality in East and mainland Southeast Asia. In Walter Bisang, Nikolaus P. Himmelmann & Björn Wiemer (eds.), What makes grammaticalization? A look from its fringes and its components, 109–138. Berlin: Mouton de Gruyter.
Bisang, Walter. 2008. Precategoriality and syntax-based parts of speech: The case of Late Archaic Chinese. Studies in Language 32. 568–589.
Bisang, Walter. 2009. Serial verb constructions. Language and Linguistics Compass 3. 792–814.
Bisang, Walter. 2010. Grammaticalization in Chinese – a construction based account. In Elisabeth C. Traugott & Graeme Trousdale (eds.), Gradience, gradualness, and grammaticalization, 245–277. Amsterdam & Philadelphia: Benjamins.
Bisang, Walter. 2016. Chinese syntax. In Sin-wai Chan (ed.), Routledge encyclopedia of the Chinese language, 354–377. Oxford: Routledge.
Bisang, Walter. Forthcoming. Grammaticalization in Chinese – a cross-linguistic perspective. In Janet Xing (ed.), A typological approach to grammaticalization & lexicalization: East meets West. Berlin: Mouton de Gruyter.
Branner, David Prager. 2000. Problems in comparative Chinese dialectology: The classification of Min and Hakka. Berlin: Mouton de Gruyter.
Cao, Guangshun. 1995. 近代汉语助词 Jindai Hanyu zhuci [Auxiliaries and particles in pre-Modern Chinese]. Beijing: Yuwen Chubanshe.

Grammaticalization changes in Chinese


Chang, Jung-hsing. 2006. The Chinese copula SHI and its origin: a cognitive-based approach. Taiwan Journal of Linguistics 4(1). 131–156.
Chao, Yuen-Ren. 1968. A grammar of spoken Chinese. Berkeley: University of California Press.
Chappell, Hilary. 1991. Causativity and the ba-construction in Chinese. In H. Seiler & W. Premper (eds.), Partizipation (Das sprachliche Erfassen von Sachverhalten), 563–584. Tübingen: Narr.
Chappell, Hilary. 2008. Variation in the grammaticalization of complementizers from verba dicendi in Sinitic languages. Linguistic Typology 12(1). 45–98.
Chappell, Hilary. 2015. Diversity in Sinitic languages. Oxford: Oxford University Press.
Chappell, Hilary & Alain Peyraube. 2006. The analytic causatives of Early Modern Southern Min in diachronic perspective. In Dah-an Ho, H. S. Cheung, W. Pan & F. Wu (eds.), Shan gao shui chang. Linguistic studies in Chinese and neighboring languages, 973–1011. Taipei: Academia Sinica.
Chen, Liang & Jian-sheng Guo. 2009. Motion events in Chinese novels: Evidence for an equipollently-framed language. Journal of Pragmatics 41(9). 1749–1766.
Cheng, Lai-Shen Lisa & Rint Sybesma. 1999. Bare and not-so-bare nouns and the structure of NP. Linguistic Inquiry 30. 509–542.
Cheng, Xiangqing (ed.). 1992. Lianghan Hanyu yanjiu [Studies in Chinese of the Western and Eastern Han dynasties]. Jinan: Shandong Jiaoyu Chubanshe.
Chierchia, Gennaro. 1998. Reference to kinds across languages. Natural Language Semantics 6. 339–405.
Cui, Guibo. 1984. 子語 所表 的 个白話語法 象 “Zhuzi yulei” suo biaoxian de jige yufa xiangxiang [Some grammatical features of vernacular Chinese as revealed in Zhuzi Yulei]. Taipei: National Taiwan University MA thesis.
Denny, Peter J. 1976. What are noun classifiers good for? Chicago Linguistic Society. 122–132.
Ding, Picus Sizhi. 2001. Semantic change versus categorical change: a study of the development of BA in Mandarin. Journal of Chinese Linguistics 29. 102–128.
Djamouri, Redouane. 1987. Études des formes syntaxiques dans les écrits oraculaires gravés sur os et carapaces de tortue. Paris: Thèse de l’EHESS.
Djamouri, Redouane & Waltraud Paul. 1997. Les syntagmes prépositionnels en yu et zai en chinois archaïque. Cahiers de Linguistique Asie Orientale 26(2). 221–248.
Djamouri, Redouane & Waltraud Paul. 2009. Verb-to-preposition reanalysis in Chinese. In P. Crisma & G. Longobardi (eds.), Historical syntax and linguistic theory, 194–402. Oxford: Oxford University Press.
Djamouri, Redouane & Waltraud Paul. 2012. Deverbal postpositions in Chinese from a diachronic perspective. Paper presented at the Twentieth Annual Meeting of the International Association of Chinese Linguistics, Hong Kong Polytechnic University, 29–31 July.
Dobson, W. A. C. H. 1962. Early Archaic Chinese. Toronto: University of Toronto Press.
Dong, Xiufang. 2012. Lexicalization in the history of the Chinese language. In Janet Zhiqun Xing (ed.), Newest trends in the study of grammaticalization and lexicalization in Chinese, 235–274. Berlin: Walter de Gruyter.
Erbaugh, Mary S. 1986. Taking stock: the development of Chinese noun classifiers historically and in young children. In Colette Craig (ed.), Noun classes and categorization, 399–436. Amsterdam: Benjamins.
Ernst, Thomas. 1988. Chinese postpositions – Again. Journal of Chinese Linguistics 16(2). 219–244.
Fang, Jingmin. 2007. 现代汉语 成分的语法 Xiandai Hanyu fangwei chengfen de yufa diwei [The grammatical position of spatial components in Modern Chinese]. In Xu Jie & Zhong Qi (eds.), 汉语词汇 法 语音的相 关联 Hanyu cihui, jufa, yuyin de xianghu guanlian [Interface in Chinese: morphology, syntax and phonetics], 140–162. Beijing: Beijing Language and Culture University Press.


Linlin Sun and Walter Bisang

Fang, Mei. 2006. 北京话里 说 的语法 – 从言説动词到从 标记 Beijinghua li “shuo” de yufahua – cong yanshuo dongci dao congju biaoji [Grammaticalization of shuo ‘say’ in Beijing Mandarin: from lexical verb to subordinator]. Zhongguo Fangyan Xuebao 1. 107–121.
Feng, Chuntian. 1984. 从 充 论衡 看关 词“是”的问题 Cong Wang Chong “Lunheng” kan guanxici “shi” de wenti [Copula shi: from the view of Wang Chong’s Lunheng]. In Xiangqing Cheng (ed.), Lianghan Hanyu Yanjiu [Studies in Chinese of the Western and Eastern Han dynasties]. Jinan: Shandong Jiaoyu Chubanshe.
Feng, Li. 1993. The copula in Classical Chinese declarative sentences. Journal of Chinese Linguistics 21(2). 277–311.
Feng, Shengli. 2002. 汉语动补结构来源的 法分析 Hanyu dongbu jiegou laiyuan de jufa fenxi [A syntactic analysis of the origin of resultative constructions in Chinese]. Yuyanxue Luncong 26. 178–208.
Greenberg, Joseph. 1972. Numeral classifiers and substantial number: problems in the genesis of a linguistic type. Paper presented at the Proceedings of the 11th International Congress of Linguistics, Bologna–Florence, August–September.
Handel, Zev. 2015. The classification of Chinese: Sinitic (the Chinese language family). In William S.-Y. Wang & Chaofen Sun (eds.), The Oxford handbook of Chinese linguistics, 34–44. Oxford: Oxford University Press.
Hashimoto, Mantaro. 1969. Observations on the passive construction. Unicorn 5. 59–71.
He, Leshi. 2005. 记 语法特点研 Shiji yufa tedian yanjiu [A study of the grammar in Shiji]. Beijing: Commercial Press.
He, Wei. 1989. 獲嘉 言研 Huojia Fangyan Yanjiu [A study of Huojia dialect]. Beijing: The Commercial Press.
Heine, Bernd, Ulrike Claudi & Friederike Hünnemeyer. 1991. Grammaticalization. A conceptual framework. Chicago & London: The University of Chicago Press.
Hole, Daniel. 2004. Focus and background marking in Mandarin Chinese: System and theory behind cai, jiu, dou and ye. London & New York: Routledge.
Hong, Xinheng. 1964. 孟子 里的“是”字研 Mengzi li de “shi” zi yanjiu [A study of the word shi in Mengzi]. Zhongguo Yuwen 1964(4).
Hopper, Paul J. 1991. On some principles of grammaticization. In Elizabeth Closs Traugott & Bernd Heine (eds.), Approaches to grammaticalization, vol. 1, 17–35. Amsterdam & Philadelphia: John Benjamins.
Hsieh, Chia-Ling. 2005. Modal verbs and modal adverbs in Chinese: An investigation into the semantic source. UST Working Papers in Linguistics, Graduate Institute of Linguistics (National Tsing Hua University) 1. 31–58.
Huang, James C.-T., Audrey Y.-H. Li & Yafei Li. 2009. The syntax of Chinese. Cambridge: Cambridge University Press.
Huang, Shuan-fan. 2003. Doubts about complementation: a functionalist analysis. Language and Linguistics 4(2). 429–455.
Huang, Yu-Chun. 1999. Hanyu Nengyuan Dongci zhi Yuyi Yanjiu [A semantic study on Chinese modal verbs]. Taipei: Institute of Chinese Language Teaching, National Taiwan Normal University MA thesis (unpublished).
Jiang, Shaoyu. 2000. 汉语动结 产 的时代 Hanyu dongjieshi chansheng de shidai [Emergence of Chinese resultative constructions]. Guoxue Yanjiu 6.
Krifka, Manfred. 1995. Common nouns: A contrastive analysis of Chinese and English. In Greg N. Carlson & Jeff F. Pelletier (eds.), The Generic Book, 398–411. Chicago: University of Chicago Press.
Kuo, Wei-ju. 2015. 多義情 詞 “要” 來源試探 Duoyi qingtaici ‘yao’ laiyuan shitan [A study of the origin of the polysemous modal yao]. Chinese Studies 33(3). 1–34.


Lamarre, Christine. 2008. The linguistic categorization of deictic direction in Chinese: With reference to Japanese. In Xu Dan (ed.), Space in languages of China: Cross-linguistic, synchronic and diachronic perspectives, 69–97. Dordrecht: Springer.
Lei, Dongping & Lizhen Hu. 2010. 時間副詞 “正 ” 的形成再探 Shijian fuci ‘zhengzai’ de xingcheng zai tan [On the formation of time adverb zhengzai]. Studies of the Chinese Language 2010(1). 67–73.
Li, Audrey Yen-hui. 1990. Order and constituency in Mandarin Chinese. Dordrecht: Kluwer Academic Publishers.
Li, Charles N. & Sandra A. Thompson. 1974a. An explanation of word order change SVO – SOV. Foundations of Language 12. 201–214.
Li, Charles N. & Sandra A. Thompson. 1974b. Historical change of word order, a case study in Chinese and its implications. In J. M. Anderson & C. Jones (eds.), Historical linguistics I, 199–217. Amsterdam: North Holland Publishing Company.
Li, Charles N. & Sandra A. Thompson. 1977. A mechanism for the development of copula morphemes. In Charles N. Li (ed.), Mechanisms of syntactic change, 419–444. Austin & London: University of Texas Press.
Li, Charles N. & Sandra A. Thompson. 1981. Mandarin Chinese. A functional reference grammar. Berkeley: University of California Press.
Li, Charles N., Sandra A. Thompson & R. McMillan Thompson. 1982. The discourse motivation for the Perfect Aspect: The Mandarin particle le. In Paul J. Hopper (ed.), Tense-Aspect: Between semantics and pragmatics, 19–44. Amsterdam & Philadelphia: Benjamins.
Li, Renzhi. 2004. Modality in English and Chinese: A typological perspective. Boca Raton, Florida, USA: Dissertation.com.
Li, Xuping. 2013. Numeral classifiers in Chinese. Berlin: Mouton de Gruyter.
Li, Yan. 2013. 记 的结果补语浅析 Shiji zhong de jieguo buyu qianxi [A brief analysis of resultative complements in Shiji]. Modern Communication 2013(12).
Li, Yin. 2018. The diachronic development of the Chinese passive: From the wei … suo passive to the long passive. Language 94(2). 74–98.
Lipenkova, Janna. 2014. The syntax-semantics interface in the Chinese ba-construction. Berlin, Germany: Free University of Berlin dissertation.
Liu, Cheng-hui. 2012. The grammaticalization of the directional verb ‘lái’: a construction grammar approach. In Janet Zhiqun Xing (ed.), Newest trends in the study of grammaticalization and lexicalization in Chinese, 87–113. Berlin: Walter de Gruyter.
Liu, Hsiu-ying. 1996. Minnanhua shuo-hua dongci kóng zhi gongneng yanbian ji yuyi tantao [Exploration of the functional development and semantics of the speech act verb kóng ‘say’ in Southern Min]. Manuscript. Hsinchu, Taiwan: Institute of Linguistics, National Tsing Hua University.
Liu, Shiru. 1965. 魏晋南北 量词研 Wei-Jin Nanbeichao liangci yanjiu [A study on classifiers in the Wei-Jin and the Northern and Southern dynasties]. Beijing: Zhonghua Shuju Chuban.
Liu, Yuehua. 1986. 对话 “说” “想” “看”的一种特殊 法 Duihua zhong “shuo”, “xiang”, “kan” de yizhong teshu yongfa [The special use of shuo ‘say’, xiang ‘think’ and kan ‘see’ in conversation]. Zhongguo Yuwen 3. 168–172.
Lu, Zhuoqun. 1997. 助动词“要”汉代起源说 Zhudongci ‘yao’ handai qiyuan shuo [On the origin of modal verb yao in the Han dynasty]. Research in Ancient Chinese Language 1997(3). 45–48.
Lü, Shuxiang. 1999. 现代汉语 百词 Xiandai Hanyu Babai Ci [800 words in Modern Chinese]. Beijing: The Commercial Press.
Lyons, John. 1977. Semantics. Cambridge: Cambridge University Press.
Ma, Beijia. 2002. “要” 的語法 “Yao” de yufahua [Grammaticalization of yao]. Studies in Language and Linguistics 49(4). 81–87.
Ma, Lixia. 2006. Acquisition of the perfective aspect marker LE of Mandarin Chinese in discourse by American college learners. Iowa City, USA: University of Iowa dissertation.


Mei, Tsu-Lin. 1981. 代漢語 成貌 和詞尾的來源 Xiandai hanyu wanchengmao jushi he ciweide laiyuan [The origin of the perfective aspect construction and the perfective suffix in Modern Chinese]. Yuyan Yanjiu 1. 65–77.
Mei, Tsu-Lin. 1990. 唐 处置 的来源 Tang Song chuzhi shi de laiyuan [On the origin of disposal constructions of the Tang-Song dynasties]. Zhongguo Yuwen 1990(3). 191–206.
Mei, Tsu-Lin. 1991. 從漢代的“動, ”,“動, ”來看動補結構的 展 – 兼論 時 起詞 的 關係的 立 Cong Handai de “dong, sha”, “dong, si” lai kan dongbu jiegou de fazhan – Jianlun zhonggu shiqi qici de shishou guanxi zhonglihua [A look at the development of the verb-resultative construction from “V-kill” and “V-die” in the Han dynasty – Concurrently on the neutralization of the pre-verbal actor/undergoer distinction in Middle Chinese]. Yuyanxue Luncong 16. 112–136.
Mei, Tsu-Lin. 2008. Shanggu Hanyu dongci zhuo qing bieyi de laiyuan [The original differences of the verbs with voiced and voiceless initials in Ancient Chinese]. Minzu Yuwen 2008(3). 3–20.
Meng, Cong. 1982. 语“说”字小集 Kouyu “shuo” zi xiaoji [A mini-collection of the uses of say in the spoken language]. Zhongguo Yuwen 5. 337–346.
Norman, Jerry. 1988. Chinese. Cambridge: Cambridge University Press.
Ota, Tatsuo. 2003 [1958]. 國語歷 法 Zhongguoyu Lishi Wenfa [History of the grammar of Chinese]. Beijing: Beijing University Press.
Paul, Waltraud. 2015. New perspectives on Chinese syntax. Berlin: Walter de Gruyter.
Peyraube, Alain. 1980. Les constructions locatives en chinois moderne. Paris: Editions Langages croisés.
Peyraube, Alain. 1988. Syntaxe diachronique du chinois. Évolution des constructions datives du XIVe siècle av. J.-C. au XVIIIe siècle. Paris: Collège de France, Institut des hautes études chinoises.
Peyraube, Alain. 1996. Recent issues in Chinese historical syntax. In C. T. James Huang & Y. H. Audrey Li (eds.), New horizon in Chinese linguistics, 161–214. Dordrecht: Kluwer Academic Publishers.
Peyraube, Alain. 1998. On the history of classifiers in Archaic and Medieval Chinese. In Benjamin K. T’sou (ed.), Studia Linguistica Serica, Proceedings of the 3rd International Conference on Chinese Linguistics, 131–143. Hongkong: Hongkong City University.
Peyraube, Alain. 2006. Motion events in Chinese: A diachronic study of directional complements. In Maya Hickmann & Stéphane Robert (eds.), Space in languages: Linguistic systems and cognitive categories, 121–138. Amsterdam: John Benjamins.
Peyraube, Alain & Ming Li. 2012. The semantic historical development of modal verbs of volition in Chinese. In Janet Zhiqun Xing (ed.), Newest trends in the study of grammaticalization and lexicalization in Chinese, 149–167. Berlin: Walter de Gruyter.
Shi, Wenlei & Yicheng Wu. 2014. Which way to move. The evolution of motion expressions in Chinese. Linguistics 52. 1237–1292.
Slobin, Dan. 2004. The many ways to search for a frog: Linguistic typology and the expression of motion events. In Sven Strömqvist & L. Verhoeven (eds.), Relating events in narrative, Vol. 2. Typological and contextual perspectives, 219–257. London: Lawrence Erlbaum.
Song, Yayun. 2007. 东汉训诂材料与汉语动结 研 Donghan xungu cailiao yu hanyu dongjie shi yanjiu [Philology of the Eastern Han dynasty and the resultative constructions]. Yuyan Kexue 2007(6). 84–96.
Sun, Chaofen. 1996. Word-order change and grammaticalization in the history of Chinese. Stanford: Stanford University Press.
Sun, Linlin. 2015. Flexibility in the parts-of-speech system of Classical Chinese. Mainz, Germany: University of Mainz dissertation.
Sun, Linlin. Forthcoming. Flexibility in the parts-of-speech system of Classical Chinese. (Trends in Linguistics). Berlin: Mouton de Gruyter.


Sun, Xixin. 1992. 汉语历 语法要略 Hanyu Lishi Yufa Yaolue [The essentials of Chinese grammar]. Shanghai: Fudan Daxue Chubanshe.
Sybesma, Rint. 1999. The Mandarin VP. Dordrecht: Kluwer Academic Publishers.
Szeto, Pui Yiu, Umberto Ansaldo & Stephen Matthews. 2018. Typological variation across Mandarin dialects: An areal perspective with a quantitative approach. Linguistic Typology 22(2). 233–275.
Tai, James & Lianqing Wang. 1990. A semantic study of the classifier tiao. Journal of the Chinese Language Teachers Association 25. 35–56.
Talmy, Leonard. 1985. Lexicalization patterns: Semantic structure in lexical forms. In Timothy Shopen (ed.), Language typology and syntactic description, Vol. 3: Grammatical categories and the lexicon, 57–149. Cambridge: Cambridge University Press.
Talmy, Leonard. 2000. Toward a cognitive semantics, Vol. 2. Cambridge, MA: MIT Press.
Tang, Ting-chi. 1992. Studies on Chinese morphology and syntax (Vol. 3). Taipei: Student.
Tang, Yuming. 1988. 唐 清的“被”字 Tang zhi Qing de “bei” zi ju [The sentences with the word “bei” from the Tang to the Qing dynasties]. Zhongguo Yuwen 6. 459–468.
Tian, Jun. 2006. The BA construction in Mandarin Chinese: a syntactic-semantic analysis. University of Victoria. https://www.uvic.ca/research/centres/capi/assets/docs/studentessays/Jun_BA_Construction.pdf (accessed 25 October 2019).
Tiee, Henry Hung-Yeh. 1985. Modality in Chinese. In Nam-Kil Kim & Henry Hung-Yeh Tiee (eds.), Studies in East Asian linguistics, 84–96. Los Angeles: Department of East Asian Languages and Cultures, University of Southern California.
Tsai, Wei-Tien Dylan. 2015. On the topography of Chinese modals. In Ur Shlonsky (ed.), Beyond functional sequence, 275–294. Oxford: Oxford University Press.
Tsang, Chui-Lim. 1981. A semantic study of modal auxiliary verbs in Chinese. California: Stanford University dissertation.
Tsao, Feng-fu. 2012. Argument structure change, reanalysis and lexicalization: Grammaticalization of transitive verbs into ditransitive verbs in Chinese, Japanese and English. In Janet Zhiqun Xing (ed.), Newest trends in the study of grammaticalization and lexicalization in Chinese, 275–302. Berlin: Walter de Gruyter.
Wang, Jin-hui. 2015. 時間副詞 “ ” “正 ” 的形成探 Shijian fuci “zai” yu “zhengzai” de xingcheng tanjiu. Language and Linguistics 16(2). 187–212.
Wang, Li. 1947. 國語法 論 Zhongguo Yufa Lilun [Theories of Chinese grammar]. Shanghai: Commercial Press.
Wang, Li. 1958. 漢語 稿 Hanyu Shigao [A draft history of Chinese grammar]. Beijing: Science Publishing House.
Wang, Peter C.-T. 1970. A transformational approach to Chinese ba and bei. Austin, USA: University of Texas at Austin dissertation.
Wang, Shaoxin. 1989. 量词个 唐代前 的 展 Liangci ge zai Tangdai qianhou de fazhan [The development of the classifier ge around the Tang dynasty]. Yuyan Jiaoxue yu Yanjiu 2. 98–119.
Wu, Fuxiang. 1996. 敦煌 语法研 Dunhuang Bianwen Yufa Yanjiu [Research on the grammar in Dunhuang Bianwen]. Changsha: Yuelu Shushe.
Wu, Hsiao-Ching. 2003. A case study on the grammaticalization of GUO in Mandarin Chinese – polysemy of the motion verb with respect to semantic change. Language and Linguistics 2003(4). 857–885.
Wu, Zoe. 2004. Grammaticalization and the development of functional categories in Mandarin Chinese. London: Routledge.
Xing, Janet Z. 2013. Semantic reanalysis in grammaticalization in Chinese. In Zhuo Jing-Schmidt (ed.), Increased empiricism: New advances in Chinese linguistics, 223–246. Amsterdam & Philadelphia: John Benjamins.
Xing, Janet Z. 2015. A comparative study of semantic change in grammaticalization and lexicalization in Chinese and Germanic languages. Studies in Language 39(3). 594–634.


Xu, Dan. 2006. Typological change in Chinese syntax. Oxford & New York: Oxford University Press.
Yang, Jun. 2003. The basic function of particle le in Modern Chinese. Journal of the Chinese Language Teachers Association 38(1). 77–96.
Yang, Shih-feng. 1957. 灣桃園 話 言 Taiwan Taoyuan Kejia hua fangyan [The Hakka dialect of Taoyuan, Taiwan] (Series A, No. 22). Taipei: The Institute of History and Philology Monographs.
Yang-Drocourt, Zhitang. 2004. Évolution syntaxique du classificateur en chinois (du XIIIe siècle av. J.-C. au XVIIIe siècle). Paris: Centre de recherches linguistiques sur l’Asie orientale.
Yao, Zhenwu. 2013. 上 汉语动结 的 展 相关研 法的检讨 Shanggu hanyu dongjieshi de fazhan ji xiangguan yanjiu fangfa de jiantao [Resultative constructions in Old Chinese and their development, as well as the research methods]. Guhanyu Yanjiu 2013(1). 63–74.
Yap, Foong Ha, Ying Yang & Tak-Sum Wong. 2014. On the development of sentence final particles (and utterance tags) in Chinese. In Kate Beeching & Ulrich Detges (eds.), Discourse functions at the left and right periphery: Crosslinguistic investigations of language use and language change (Studies in Pragmatics Series), 179–220. Bingley, United Kingdom: Emerald Publishers.
Yeh, Meng. 1993. Stative situations and the imperfective -zhe in Mandarin. Journal of Chinese Language Teachers’ Association 28. 69–98.
Yeh, Meng. 1996. An analysis of the experiential guoEXP in Mandarin: A temporal quantifier. Journal of East Asian Linguistics 5. 151–182.
Yen, Sian L. 1986. The origin of the copula shi in Chinese. Journal of Chinese Linguistics 14(2). 227–241.
You, Xueying. 1998. Some speculations on the semantic change of Chinese modal verb “yao”. Wenshan Review of Literature and Culture 1998(9). 161–175.
Zádrapa, Lukáš. 2011. Word-class flexibility in Classical Chinese. Leiden: Brill Publishers.
Zhang, Cheng. 2012a. 汉语通 量词的 展与汉语量词范畴的确立 [The relation between the development of general classifiers and the establishment of the category of numeral classifiers]. Journal of Chinese Linguistics 40(2). 307–321.
Zhang, Cheng. 2012b. 类型 视野的汉语 量词演 Leixingxue shiye de hanyu mingliangci yanbianshi [A typological study on the history of Chinese classifiers]. Beijing: Beijing University Press.
Zhang, Niina Ning. 2013. Classifier structures in Mandarin Chinese. Berlin: Mouton de Gruyter.
Zhang, Taiyuan. 1986. “了” 字 成 的語意演變研 “Le” zi wanchengshi de yuyi yanbian yanjiu [A study of the semantic evolution of the perfective “le”]. Taipei: National Taiwan University MA thesis.
Zhang, Yajun. 2002. 时间副词 “正” “正 ” “ ” 其虚 过程 Shijian fuci ‘zheng’, ‘zhengzai’, ‘zai’ ji qi xuhua guocheng kaocha [Research on time adverbs zheng, zhengzai and zai and their grammaticalization processes]. Journal of Shanghai Normal University (Philosophy & Social Science Edition) 2002(1). 46–55.
Zhuang, Huibin. 2014. The prosodic history of Chinese resultatives. Language and Linguistics 15(4). 575–595.