Susceptibility vs. Resistance: Case Studies on Different Structural Categories in Language-Contact Situations 9783110785517, 9783110785197

The topic of the volume is the contrast between borrowable categories and those which resist transfer. Resistance is


English Pages 492 Year 2022

Table of contents :
Contents
Preface
On the (almost im)possible emergence of grammatical gender in language-contact situations
Language contact and number inflection in Patagonian Welsh
VOY – PARA – SIEMPRE: Three Spanish-derived function words and the Chamorro irrealis
On the borrowing of the English adversative connector but
On loan conjunctions: A comparative study with special focus on the languages of the former Soviet Union
Parallel Romancization: Chamorro and Tetun Dili – two heavy borrowers compared
Index of Authors
Index of Languages
Index of Subjects



Author / Uploaded
Nataliya Levkovych (editor)


Susceptibility vs. Resistance

Koloniale und Postkoloniale Linguistik / Colonial and Postcolonial Linguistics
Edited by Stefan Engelberg, Peter Mühlhäusler, Doris Stolberg, Thomas Stolz and Ingo H. Warnke

Volume 19

Susceptibility vs. Resistance

Case Studies on Different Structural Categories in Language-Contact Situations
Edited by Nataliya Levkovych

ISBN 978-3-11-078519-7
e-ISBN (PDF) 978-3-11-078551-7
e-ISBN (EPUB) 978-3-11-078554-8
Library of Congress Control Number: 2022933753

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the internet at http://dnb.dnb.de.

© 2022 Walter de Gruyter GmbH, Berlin/Boston
Printing and binding: CPI books GmbH, Leck
www.degruyter.com

Contents

Preface | VII
Thomas Stolz and Nataliya Levkovych: On the (almost im)possible emergence of grammatical gender in language-contact situations | 1
Deborah Arbes: Language contact and number inflection in Patagonian Welsh | 51
Thomas Stolz: VOY – PARA – SIEMPRE: Three Spanish-derived function words and the Chamorro irrealis | 91
Nicole Hober: On the borrowing of the English adversative connector but | 183
Thomas Stolz and Nataliya Levkovych: On loan conjunctions: A comparative study with special focus on the languages of the former Soviet Union | 259
Thomas Stolz and Nataliya Levkovych: Parallel Romancization: Chamorro and Tetun Dili – two heavy borrowers compared | 393
Index of Authors | 467
Index of Languages | 474
Index of Subjects | 479

Preface

This volume opens a series of as yet undetermined (i.e. perhaps even infinite) length which comprises theory-oriented and/or methodologically and/or empirically innovative studies of language-contact phenomena in a wide reading of the term. Interdisciplinary and cooperative approaches are as welcome as projects which address the history of thought in the domain of language-contact studies. The focus is on topics which hitherto have not made it to the centre stage of the on-going debate or urgently call for a substantial extension of the database. The distant goal of this initiative is to widen the scope of language-contact studies by way of demonstrating that the extant canon of recurrent topics does not suffice to further the general understanding of what happens (and what does not) when languages are in contact with each other. In this way, it is hoped that the contributions to this volume and the volumes-to-be will initiate new strands of dedicated research to ultimately keep the wheels of the linguistics of language contacts turning. It is our declared policy to impose no overly strict limits on the size of the contributions. Some topics cannot be ticked off just like that.

The above untitled series fits perfectly into the program of Koloniale und Postkoloniale Linguistik/Colonial and Postcolonial Linguistics (KPL/CPL) because the processes to be scrutinized are identified and studied against the social, cultural, and historical backdrop of concrete colonial and postcolonial situations. This volume and its future successors relate directly to issues which are of utmost importance for the still new academic discipline of (Post-)Colonial Language Studies (Warnke et al. 2016). The effects of colonialism on languages in asymmetric situations of contact are meticulously taken into account. The impact of postcolonial processes on languages is investigated equally methodically.
At the same time, the widening of the scope alluded to in the foregoing paragraph requires that the erstwhile restriction to, say, prototypical situations of colonialism be relaxed so that other categories of sociolinguistic asymmetry can be covered as well. This is the point where the concept of Language Empire Building (Stolz 2015) comes into play. Language Empires come in widely different shapes under whose wraps different and diverse forms of non-prototypical colonialism may be hidden. It is a task of the envisaged series to uncover the parallels and differences between language-contact phenomena in the one and in the other kind of colonialism in order to prepare the ground for an overarching model and theory.

Language contact can be approached from different perspectives. In this first volume, case studies of contact-borne processes which affect language structure (or fail to do so) are presented. As the main title of the volume suggests, certain structural categories are prone to react to external influence whereas others remain largely immune to pressure from outside. The bulk of the extant structure-oriented investigations into language contact look at the changes the replica languages undergo. It is, however, an equally intriguing question why some components of the grammatical system seem to be exempt from giving way to foreign influence. Several of the contributions to this volume discuss cases of this kind. Besides these articles which address instances of resistance there are others which emphasize the susceptibility of certain categories to being transferred from donor to replica languages independently of their genetic affiliation and grammatical structure. It is also shown that important new insights can be gained from systematically describing the borrowing behavior of several genetically related replica languages in contact with genetically related donor languages. Furthermore, there may be long-term consequences of language contact which can no longer be attributed to the donor language but reflect internal processes of the replica language. It is argued that the delayed effects cannot sweepingly be lumped together with the immediate effects of language contact.

One of the cases of resistance is addressed in the first contribution to this volume. Thomas Stolz and Nataliya Levkovych pose the question why grammatical gender does not arise in language-contact situations. They provide a critical cross-linguistic appraisal of those instances which are assumed to constitute cases of contact gender to prove that hardly any of these putative instances of contact-induced genesis of grammatical gender passes the test. In the second study of this volume, Deborah Arbes looks at the number inflection of nouns borrowed from Spanish or English in the Welsh variety spoken in Patagonia.

https://doi.org/10.1515/9783110785517-001
Does either of the two prestigious donor languages exert influence on the inherited Welsh number system? The Patagonian Welsh case is still largely unfamiliar territory for language-contact studies.

Thomas Stolz provides a historically informed in-depth study of the Spanish-derived function words voy, para, and siempre and their role within the system of the irrealis in Chamorro. The Spanish origin of these free morphemes notwithstanding, they cannot be described by way of copying the rules of the donor language. In point of fact, since their original borrowing (perhaps in the early 19th century) the function words have undergone changes which are completely independent of the donor language.

Nicole Hober contributes the second study which is associated with resistance in the sense that the borrowability of the English adversative conjunction but can be shown to be unexpectedly low. The limited borrowability is surprising because in language-contact theory, it is commonly assumed that adversative connectors are not only easy to borrow but also usually the prime movers in a chain of processes which involves selected members of the class of connectors.

Connectors are also the focus of the second joint paper by Thomas Stolz and Nataliya Levkovych who collect and systematize the borrowing of adverbial subordinators, complementizers, and NP-conjunctions in the languages of the former Soviet Union. The authors present a rich stock of evidence for MAT-borrowing of conjunctions in almost a hundred replica languages – the donor languages being for the most part Russian, Arabic, and Persian. It is shown that patterns of borrowing emerge whose recurrence defies the idea that we are facing incidental parallels.

The final paper is again co-authored by Thomas Stolz and Nataliya Levkovych. The behavior of two Austronesian replica languages – Chamorro and Tetun Dili – under the influence of two Ibero-Romance donor languages – Spanish and Portuguese, respectively – is compared in lexicon, phonology, and grammar. It is argued that the replica languages have become more similar to each other exactly because of the almost identical borrowings from two closely related donor languages.

These short sketches of the contents of the individual papers gloss over many interesting aspects connected to the lines of argumentation in the articles. In each of the contributions, problems are tackled which (undeservedly) have been neglected in language-contact studies. More specifically, the authors take it upon themselves to try to crack tough nuts, in a manner of speaking. A plethora of new empirical data is offered to the interested public. Hitherto ignored connections are made between phenomena which, if at all, have previously been looked at only in isolation. The authors intend to continue their projects and thus are looking forward to discussing their ideas with the interested readership.
As editor I am grateful to the authors who agreed at short notice to publish their contributions in this volume. I am also indebted to the series editors of KPL/CPL for accepting this volume in their series. Cornelia Stroh deserves a word of thanks too for taking care of the technical side of the editorial work.

Nataliya Levkovych, Bremen, June 2021


References

Stolz, Christel (ed.). 2015. Language empires in comparative perspective. Berlin & Boston: De Gruyter Mouton.

Warnke, Ingo H., Thomas Stolz & Daniel Schmidt-Brücken. 2016. Perspektiven der Postcolonial Language Studies. In Thomas Stolz, Ingo H. Warnke & Daniel Schmidt-Brücken (eds.), Sprache und Kolonialismus. Eine interdisziplinäre Einführung zu Sprache und Kommunikation in kolonialen Kontexten, 1–25. Berlin & Boston: De Gruyter Mouton.

Thomas Stolz and Nataliya Levkovych

On the (almost im)possible emergence of grammatical gender in language-contact situations

Abstract: The paper critically evaluates statements on the contact-induced genesis of grammatical gender in languages which were previously devoid of this category. To this end, several language-contact situations between gendered donor languages and genderless replica languages from different macro-areas are examined. It is shown that the subject matter is largely understudied. The quantity and quality of the extant descriptions of cases of borrowed grammatical gender call for more in-depth studies dedicated to this topic. The low cross-linguistic frequency of the phenomenon notwithstanding, its further exploration promises important insights for the theory of grammatical gender and for language-contact studies in general. What hampers the evaluation of the available data is the often fragmentary and unsystematic documentation of the empirical facts. There is evidence of cross-linguistically recurrent patterns whose validity can only be tested on the basis of a dedicated typologically-inspired in-depth study of the behavior of grammatical gender in language-contact situations.

Keywords: borrowing; emergence; grammatical gender; language contact

1 Introduction

When Greenberg (1978) asked how languages acquire gender (= GG) markers he focused exclusively on language-internal developments most of which would nowadays be classified as instances of grammaticalization. In Greenberg’s seminal study, no mention is made of language contact as a potential factor in the acquisition of exponents of GG. Similarly, Dahl (2004: 198–199) speaks of the complexity of the emergence of GG-systems without accounting for the possibility that borrowing may play a role.

|| Thomas Stolz: University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany.
Nataliya Levkovych: University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany.

https://doi.org/10.1515/9783110785517-002


Accordingly, the evidence of contact-borne transfer of GG-exponents from a donor language (= DL) to a replica language (= RL) discussed in the literature is relatively scarce. Two references suffice to sketch the present state of our knowledge.1 Based on prior work by Heath (1978), Gardani (2008: 55–56) presents the GG-prefixes (r)a- (class III), wu- (class IV), and ma- (class V) – all three of them classifying non-human referents – as borrowings from Pre-Nunggubuyu into Warndarnang (spoken in Arnhem Land), i.e. from one prefixing into another prefixing Australian language (Aikhenvald 2000: 386–388). Warndarnang thus acquired some of its GG-markers via language contact. Another case in point is that of the Mba subgroup of Adamawa-Ubangi (Central Africa) whose members have probably borrowed GG-prefixes from neighboring Bantu languages (Kleinewillinghöfer 2017: 4).2 What is worth noting in both cases is that the RL had a GG-system of its own prior to contact with the DL. Warndarnang integrated the Pre-Nunggubuyu GG-prefixes in its already existing paradigm of GG-prefixes. The Mba languages normally employ autochthonous suffixes to mark GG. The borrowed Bantu prefixes co-exist and interact structurally with these suffixes. This means that the category of GG was already firmly established in the RL when the borrowing took place. The RL-system and the DL-system were organized differently. The original grammaticalization of GG in the RL predates the borrowing of certain GG-morphemes, i.e. language contact has not triggered GG into being but only contributed to shaping the means to express it. In the Australian as well as in the African example, the interpretation of the synchronic facts in contact-linguistic terms relies heavily on the reconstruction of hypothetical diachronic stages.
In the light of the (admittedly limited) empirical evidence of the borrowing of GG-exponents, the question arises whether GG as a category can be introduced into a previously GG-less RL by way of copying it from a DL which is already equipped with GG. The likelihood that a GG-less language comes into contact with a GG-language is not entirely negligible. Corbett (2005a–c) evaluates data from 256 languages world-wide, 144 (= 56%) of which lack GG, whereas the remaining 112 are GG-languages. With 44% of the sample the GG-languages constitute the minority albeit a very robust one.3 According to the maps dedicated to GG in the World Atlas of Language Structures (WALS), the geographic distribution of the GG-languages yields sizable clusters so that a GG-language tends to be surrounded by other members of the same type. However, on the outskirts of these GG-hotbeds there is ample opportunity for languages with and without GG to be involved jointly in the same language-contact situation. What happens to GG if only one of the partners in contact has this category? Is it possible that, given the most favorable social conditions, the RL copies GG from the DL and thus must be re-classified as a newcomer to the class of GG-languages? To provide satisfactory answers to these questions we put forward Hypothesis 1 (= H1) the validity of which has to be tested empirically – a task which we attempt to fulfill at least partly in this study.

Hypothesis 1: GG may emerge in situations of language contact between a DL with GG and an RL without GG via copying.

To prove H1 right, instances of both MAT-borrowing and PAT-borrowing (Sakel 2007) are admitted as pieces of evidence. However, as will transpire from our subsequent sketch of the current state-of-the-art, H1 is particularly difficult to corroborate on the basis of hard empirical facts. The problems we have to face in our search for uncontroversial proof of H1 are telling in themselves and bear implications for the theory of language contact in general. Before we disclose these interesting results it is necessary to go a long way which, in a manner of speaking, is paved with unexpected frustration. In the remainder of the paper, H1 will be complemented by an array of further hypotheses of ours whose validity will be assessed in the conclusions.

|| 1 The frequently discussed case of Ma’a, one of the prominent candidates for the status of Mixed Language, is controversial (Thomason and Kaufman 1988: 227). If we are dealing with Bantuization of an originally Cushitic language then we also have the spectacular wholesale transfer of the Bantu concord system with its entire set of GG-markers, etc. If, however, Ma’a is the outcome of the Cushiticization of an erstwhile Bantu language the original GG-system has simply survived relexification and no borrowing of GG-morphology has taken place at all.
2 The Mba-case has gained some attention since Pasch (1988) described it. Corbett (1991: 185–186) and Aikhenvald (2000: 387) feature the Mba data in their accounts of GG, too.
For the category of GG and its typology we rely on the numerous contributions by Corbett and associates (Corbett 1991, 2005a–c, 2006; Corbett and Fedden 2016). Our approach to language-contact phenomena is indebted to Matras and Sakel (2007) and Matras (2009). Concise reviews of the literature on GG and language contact are given by Duke (2009: 67–76), Rothe (2012: 61–74), and, with special focus on GG-genesis in language-contact situations, T. Stolz (2012: 97–104). In this paper, the phenomena are studied qualitatively. The languages scrutinized for GG-borrowing constitute a convenience sample. Unless otherwise stated, the examples are drawn from the pertinent linguistic literature. In the case of sentential examples morpheme boundaries, glosses, and English translations are ours unless a different source is identified by the letter Y in the first line of the example. Boldface highlights those parts of the examples which are important for the ensuing discussion.

The paper is structured as follows. In Section 2, we present the working definition of the notion of GG. Section 3 briefly reviews the most prominently discussed topics in connection with the behavior of GG under the conditions of language contact. Section 4 is the centerpiece of our study which contains a catalog of (sometimes only putative) cases of GG-borrowing from different parts of the world. The sketches vary in size on account of the different degree of intricacy and documentation of the cases involved. The conclusions are drawn in Section 5.

|| 3 Note that Corbett (1991: 2) gives an estimate of the number of GG-languages for Africa and New Guinea which is about ten times as big as that of the GG-languages in his WALS-sample.

2 Basics

When speaking about GG we take the same stance as Corbett (1991: 1) who defines GG on the basis of agreement. For a language to boast the category of GG the existence of agreement between different syntactic words is presupposed. Corbett’s own reference is to the first sentence in the following quote from Hockett (1958: 231) according to whom

[g]enders are classes of nouns reflected in the behavior of associated words. To qualify as a gender system, the classification must be exhaustive and must not involve extensive intersection: that is, every noun must belong to one of the classes, and very few can belong to more than one.

For the nouns themselves it is assumed that GG is an inherent category which may be covert whereas the “associated words” frequently express GG overtly. GG is the cover term for a wide range of classificatory strategies in the nominal domain including Bantu-like concord classes. As we will see further below, the continuation of the quote after the first sentence postulates conditions which are fulfilled, if at all, by only a very small number of the languages addressed in this paper. Since agreement alone already imposes a criterion which many of the candidates for GG-borrowing fail to meet we briefly recapitulate what is minimally expected of a language to justify its classification as a GG-language.

a) NP-internal agreement: modifiers such as adjectives, numerals, determiners, etc. co-vary formally according to the GG of the head noun.
b) NP-V agreement: person indexes (preferably those of the 3SG) on the verb agree with the subject-NP (or another NP) in GG.
c) Anaphor = pronominal GG: pronouns which are co-referential with nouns in a previous (or subsequent) clause are chosen according to the GG of the noun.

A language is said to have GG if at least one of the above three kinds of agreement applies. The RL either has to create means of its own on the model of the DL (= PAT-borrowing) or borrow phonologically realized forms directly from the DL (= MAT-borrowing). We briefly discuss the above options in connection with MAT-borrowing. On Thomason’s (2001: 70–71) borrowing scale, the borrowing of pronouns, which is a prerequisite for Option (c), is possible on Stage III (More intense contact) whereas adjectives (cf. Option (a)) – alongside nouns – belong to the word classes which are already involved in borrowing on the initial Stage I (Casual contact). Option (b) on the other hand is only possible on Stage IV (Intense contact) where “the addition or loss of morphological categories that do not match in source and borrowing languages” (Thomason 2001: 71) may occur. On account of these differences in the borrowability of the elements which are sensitive to GG we formulate H2.

Hypothesis 2: The contact-induced genesis of GG starts within the NP and involves the adjectival attributes of a given head noun.

Intense contact includes very extensive bilingualism on the part of the speakers of the RL. This is also the precondition for “the wholesale loss or addition of agreement patterns” (Thomason 2001: 71). Since agreement is not restricted to GG the RL may display other kinds of agreement (number, case, etc.) to which GG-agreement is added. If agreement was unknown in the RL the copy of GG epiphenomenally also introduces the principle of agreement. T. Stolz (2015: 270–272) summarizes the extant statements on the rise of agreement in language contact many of which do not allow us to formulate generalizations about the topic under review. Aikhenvald (2000: 388) considers the phenomenon to be “extremely rare”. Bakker et al. (2008: 176) assume that it is generally difficult to copy agreement marking because of lack of transparency. These and similar judgments put a damper on our initial enthusiasm as to the possibility of confirming H1–H2. Further discouraging factors come to the fore in Section 3.
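
The minimal criterion in (a)–(c) and the reasoning behind H2 can be restated as a small sketch. It is only an illustration of the logic of this section, not a tool used in the study: the names AgreementProfile, has_gg, EARLIEST_STAGE, and first_accessible_option are hypothetical, and the stage numbers simply encode the reading of Thomason's (2001) borrowing scale given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgreementProfile:
    """Hypothetical record of the agreement types attested in a language."""
    np_internal: bool  # (a) modifiers co-vary with the GG of the head noun
    np_verb: bool      # (b) person indexes on the verb agree with an NP in GG
    anaphoric: bool    # (c) pronouns are chosen according to the GG of the noun


def has_gg(profile: AgreementProfile) -> bool:
    """A language counts as a GG-language if at least one type applies."""
    return profile.np_internal or profile.np_verb or profile.anaphoric


# Earliest stage of the borrowing scale at which MAT-borrowing of the
# relevant elements becomes possible, as summarized in the text above
# (assumption: Stage I = 1, Stage III = 3, Stage IV = 4).
EARLIEST_STAGE = {"np_internal": 1, "anaphoric": 3, "np_verb": 4}


def first_accessible_option() -> str:
    """Return the agreement option that becomes borrowable first."""
    return min(EARLIEST_STAGE, key=EARLIEST_STAGE.get)


# Invented profiles for illustration; they describe no real language.
print(has_gg(AgreementProfile(False, False, False)))  # False
print(has_gg(AgreementProfile(False, False, True)))   # True
print(first_accessible_option())                      # np_internal
```

On this reading, NP-internal agreement is the option accessible at the lowest contact intensity, which is the motivation for locating the starting point of contact-induced GG within the NP in H2.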


3 What usually happens

3.1 Reorganization

There is no scarcity of hypotheses as to the behavior of GG under the conditions of language contact. In many cases, contact scenarios in which two GG-languages participate are investigated. Loanword integration provides the backdrop for discussions about the competition between the principles of the GG-systems of the DL and the RL. Corbett (1991: 81–82) defends the idea that borrowed nouns tend to be assigned to a GG according to the already established rules of the RL but concedes that “they can have considerable effects on the gender system of the language which receives them.” In Lower Sorbian, for instance, nouns borrowed from German are frequently classified as feminine (= F) although they are masculine (= M) in the DL (German Kragen [M] ‘collar’ > Lower Sorbian kraga [F]) or neuter (= NT) (German Frühstück [NT] ‘breakfast’ > Lower Sorbian fryštuka [F]) (Bartels 2009: 321). In contrast, C. Stolz (2005, 2008, 2009) doubts that the GG-assignment rules of the RL are generally stronger than those of the DL. She advocates the idea that especially in situations of minority languages with wide-spread bilingualism in the majority language direct GG-copy can become the preferred option in loanword integration, i.e. the GG the noun has in the DL is kept also in the RL although the gender assignment rules of the latter are violated in this way. By way of example, we mention a case of covert GG-borrowing from the endangered Germanic variety Cymbrian spoken in Northern Italy. To pick just one from a long list of examples, the Cymbrian noun minute ‘minute’ is phonologically and semantically identical with German Minute. However, German Minute is F whereas Cymbrian minute is M according to the GG of Italian minuto (C. Stolz 2008: 418). In cases of this kind, GG is affected by language contact but not created. No matter how interesting they are for the theory of GG and language contact, they are not directly relevant to the topic of this study and thus are not further looked into.

3.2 Reduction and loss

What has also gained attention among scholars of GG and/or language contact is the loss of GG in the course of language contact. Symptomatically, Johanson (2002: 104), with reference to the influence of GG-less Turkic languages on GG-languages, claims that GG belongs to the “less essential, semantically relatively empty distinctions” which may fall victim to language contact easily. The same author mentions GG-loss in Mongolian4, Persian5, and Anatolian Greek caused by Turkic influence. Boretzky and Igla (1994: 71–72) argue that the Livonian dialect of Latvian has lost its erstwhile binary GG-system because of intense contacts with GG-less Balto-Finnic languages. Thomason (2001: 86–87) assumes “shift-induced interference from Livonian, a Uralic language.”6 Beside the wholesale loss of entire GG-systems, there are also several reports on more limited processes of reduction. This is the case in several Northwest Aramaic varieties which, according to Arnold (2007: 189), have neutralized their former GG-distinctions in the PL under the influence of neighboring Arabic varieties. Similarly, Reershemius (2007: 249) assumes that Northeast Yiddish varieties lost their NT because the Baltic languages they were in contact with for an extended period of time only have M and F.

A striking case is that of the Pidgin varieties in the Bantu domain as described in detail by Heine (1973). The language-contact scenarios generally involve more or less closely related languages all of which boast differentiated GG-systems. In spite of the structural similarity between the partners in contact, the Pidgins which developed in situations of Bantu-Bantu contacts are characterized by drastically reduced numbers of nominal classes. Of the eleven GG-prefix pairs postulated for Standard Swahili, Kenya-Pidgin-Swahili retains only three (Heine 1973: 185). What is more, the typical Bantu concord system has collapsed almost completely (Heine 1973: 190–191) so that there is no longer agreement in Kenya-Pidgin-Swahili – and thus it becomes difficult to classify this pidgin as a GG-language. Examples (1)–(2) illustrate the lack of agreement on the verb in the Pidginized variety (Heine 1973: 191).

|| 4 According to Poppe (1964: 101) the GG-system (verb-based!) was already on the verge of extinction in the earliest Middle Mongolian documents. Since the Middle Mongolian lexicon gives evidence of extensive borrowing from Turkic languages it is possible that the loss of GG in Mongolian was at least precipitated by language contact with Turkic.
5 Duke (2009: 77) reviews the controversy between those scholars who consider language contact to be the main reason for the loss of GG in Persian (e.g. Szemerényi 1980) and their opponents who like Schmitt (1989: 98–99) emphasize the role of phonological changes (fixed penultimate accent caused the reduction of final syllables) in the process.
6 Putniņš (1985: 106) argues that in the NOM.PL one still hears the occasional F form (as e.g. dialectal beĩgs = Standard Latvian beigas ‘end’ – a F plurale tantum – which alternates with M.PL beĩgi in the Livonian variety of Latvian). These putative remnants of the F are typical of the speech style of those Livonian dialect speakers who attended Latvian schools in the inter-war period, meaning: they are probably innovations in the sense that patterns of the standard have entered the dialect grammar. Another issue to be discussed elsewhere is the role of (dialect-internal) phonological reduction in the GG-loss.

8 | Thomas Stolz and Nataliya Levkovych

(1) Standard Swahili
    wa-tu wa-na-kuja
    CL.2-human CL.2-PRS-come
    ‘The people are coming.’

(2) Kenya-Pidgin-Swahili7
    wa-tu na-kuja
    PL-human AOR-come
    ‘The people are coming/have come/will come.’

With certain modifications, the picture is largely the same for all pidgins investigated by Heine (1973: 234). Duke (2009: 209–224) dedicates a case study to the diachrony of GG in Afrikaans. To her mind, among the five Germanic languages she inquired into, Afrikaans is the “lone exception” in the sense that “imperfect learning through language contact clearly played a role in the loss of agreement” (Duke 2009: 265).8 In all other cases of GG-reduction in Germanic – Dutch, English9, Mainland Scandinavian – she is skeptical as to the importance the factor language contact should be given because purely internal language change seems to provide an equally valid alternative explanation. Enger (2011) and Dolberg (2019) look at Mainland Scandinavian and Anglo-Saxon again with a strong focus on processes of reduction and loss in language-contact situations. There is thus a plethora of studies for which language contact counts as a potential threat to GG even if both languages involved in the contact situation are GG-languages.

3.3 Resistance

Contact between languages which differ as to the presence, size, or composition of GG-systems does not necessarily imply that GG is doomed to reduction and loss. Pensalfini and Meakins (2019) report on the age-long contacts between two

|| 7 Heine (1973: 213) argues that the present tense marker na- of Standard Swahili is neutral as to tense in Kenya-Pidgin-Swahili and therefore analyzes it as aorist.
8 Siemund (2008: 185–186) discusses data from Afrikaans which are suggestive of a (perhaps severely limited) system of pronominal GG, i.e. a case which resembles our Option (c) in Section 2. Duke (2009: 210–211) employs practically the same examples.
9 As to pronominal GG in English, Duke (2009: 240–241) takes account only of the Early Middle English period. Siemund (2008: 174) on the other hand defends the idea that there is an intricately organized system of pronominal GG also in contemporary English.

On the (almost im)possible emergence of grammatical gender | 9

genetically unrelated Australian languages, viz. GG-less Mudburra and the GG-language Jingulu. There are literally hundreds of nouns borrowed from the one into the other and vice versa. Irrespective of the amount of loan nouns in both of the partners in contact, the authors could not identify any substantial changes in the domain of GG. Mudburra is still GG-less just as Jingulu remains a largely well-behaved GG-language. It is unclear whether the observed increase of supposed violations of the agreement patterns in Jingulu can be attributed to influence by the GG-less Mudburra. These violations consist of the employment of the macro-GGs (i.e. M as animate-default and NT as inanimate-default) in lieu of using the fourfold distinctions of M, F, vegetable, and NT (Pensalfini and Meakins 2019: 452). It is conceivable though that we are dealing with a language-internal process independent of language contact. Even under Creolization, GG of the European lexifier is not always completely abandoned. Holm (2008: 305–308) mentions several Portuguese-based Creoles in which GG-agreement according to the Portuguese pattern is possible at least optionally (Kabuverdianu, Macau Creole Portuguese). Maurer (2013: 154–155) adds a small number of further cases among them Spanish-based Chabacano (Zamboanga) and French-based Louisiana Creole. In most of the cases, GG-agreement is optional and very often possible only with a limited set of adjectives. Maurer’s sample also contains other kinds of contact languages such as Mixed Languages with Michif being one of the featured cases. According to Maurer’s classification, Michif like Louisiana Creole has many (but not all) adjectives which agree in GG with the noun. Rosen and Gillon (2018: 75–106) scrutinize this issue to conclude that GG-agreement in Michif is an intricate mixture of a Cree-like animacy-based component and a sex-based French component.
These and similar cases are indicative of the possibility of GG to survive even in situations of extreme pressure such as Creolization and language mixing. If GG is strong enough to overcome the vicissitudes of language contact it might also be fit for transferal according to H1.

4 Pieces of evidence for GG-borrowing

In his treatise on GG in the history of the Romance languages Loporcaro (2018: 291) takes issue with the widely diffused idea that language contact inevitably leads to reduction. The author adduces evidence from a variety of Romance languages whose GG-systems have undergone “complexification” in the sense that the number of GG-categories increased owing to language-contact with


another GG-language. New GG-distinctions arose in the context of contact between two GG-languages but GG as such was already there prior to contact. A case in point is mentioned in Gardani (2020: 275). The northern variety of Istro-Romanian is shown to borrow NT from Croatian so that the original binary GG-system becomes a ternary GG-system with the distinction of M≠F≠NT. According to Petrovici (1967: 1526) and Kovačec (1968: 87–88), the NT occurs exclusively in the speech of native speakers younger than forty years of age whereas the elder generation assimilates Croatian NT-nouns by way of reclassifying them as F. It is true that a new GG has arisen in the course of language contact. However, the RL already had a GG-system before Croatian influence became crucial. What is of more interest for this study are cases in which GG emerges from scratch. In Section 4.2.1, we review a selection of statements from the extant literature which, at least on the surface, seem to refer to cases of contact-borne GG-emergence. On closer inspection, it turns out that many of these cases give evidence of something else. But there are other cases which seem to be more promising for the topic of this paper. These cases are addressed in Section 4.1 and from Section 4.2.2 onwards.

4.1 Hup for a start

It is instructive to quote verbatim Matras’s (2007: 42–43) summary of the findings the contributors to Matras and Sakel (2007) have made in the domain of GG and language contact:

Developments affecting gender marking include a shift from feminine to masculine in Mosetén, the loss of neuter gender in NE Yiddish (as in the contact languages Lithuanian and Latvian), and the incipient system of nominal classifiers in Hup (classifying inanimates by shape and animates by gender), adopted from Tukano. Definitely the most extensive development in this domain is the borrowing of Chinese classifiers into Vietnamese. There are, in addition, some marginal phenomena such as the loss of gender in pronouns (in Rumungro as well as in North-eastern Neo-Aramaic). Our sample gives us the impression that gender in the narrow sense (of a two- or three-gender system) is more stable in contact situations than more differentiated systems, where influence might be more extensive.

Except for the case of Hup, the phenomenology which can be reconstructed from this quote is not supportive of H1. Since classifiers in Chinese and Vietnamese are usually not involved in agreement it is clear that Matras’s notion of GG is different from that propagated by Corbett (1991). If agreement is not crucial for Matras’s definition of GG it makes sense to have a look at the Hup data to


determine whether we are indeed facing an instance of GG arising in language contact. From Epps (2007a: 555) we learn that the contact-induced innovation in Hup is still in its early stages and affects only a few nouns (most of them neologisms). This restriction to a subset of the nominal lexicon of the language is in conflict with Hockett’s above criterion according to which each and every noun of a given language has to participate in the GG-system. Given that the situation described for Hup is recurrent across language contacts, we can formulate H3.

Hypothesis 3: The contact-induced NP-internal genesis of GG starts with a small subset of the nouns of the replica language.

Moreover, we are dealing with PAT-borrowing because the classifiers themselves are grammaticalized from Hup material. In Epps (2007b: 275) the information is conveyed that “classifiers and gender markers in Hup also occur with numerals, demonstratives, adjectives, relativized verbal forms, and on nouns as derivational markers.” This opens possibilities for agreement phenomena. In the reference grammar of Hup Epps (2008: 277–278) accordingly mentions two phenomena of interest to us, namely anaphoric reference (3) and agreement (4).

(3) Hup – anaphor [Epps 2008: 277]
    nup bóda=tat-ʔěʔ, núp [d’ɔh-yǽt-ǽp]=tat
    this ball=FRUIT-PERF this rot-lie_on_ground-DEP=FRUIT
    ‘This was a ball, this rotting round thing lying here.’

(4) Hup – agreement [Epps 2008: 278]
    nup=(g’æt) pɨhɨ́t=g’æt tɨh=pǒg=(g’æt)
    this=LEAF banana=LEAF 3SG=big=LEAF
    ‘this big banana leaf’

Example (3) does not fit the description of Option (c) above fully whereas example (4) seems to fulfill the requirements of Option (a). Superficially, this speaks in favor of H1. However, Epps (2008: 278) is cautious not to jump to conclusions when she states that

Hup classifying terms can arguably serve a marginal agreement-marking function by virtue of appearing, optionally, on multiple constituents of the clause […]. However, this agreement-like phenomenon is extremely rare in natural discourse in Hup (being confined mostly to elicitation contexts), and may be better characterized as apposition of distinct noun phrases, rather than marking concord within a single noun phrase.

The exact structural status of examples like (4) in Hup is difficult to determine on the basis of the numerically insufficient empirical evidence. What is important to note nevertheless is the optional character of the phenomenon. We seize the opportunity to refine H2–H3 by way of integrating optionality into H4.

Hypothesis 4: The contact-induced NP-internal genesis of GG may remain optional for an indefinite period of time.

What H2–H4 suggest is that we cannot expect to find a full-blown GG-system to be borrowed as such in language contact. It is far more realistic to assume that GG-borrowing is an extended process which starts locally and might eventually be generalized unless factors intervene which put a stop to the development.

4.2 Supporters

The genesis of GG is generally a rarum in the documented dynamics of human languages whether language contact is accounted for or not (Field 2002: 192). Following Wohlgemuth and Cysouw (2010), rara of this kind call for an extra explanation unless they can be counted out on logical grounds. Claudi (1985: 139–140) describes the rise of a GG-system in Zande and explicitly excludes language contact as a factor in this process. Similarly, Corbett (1991: 313) mentions several cases of GG-systems which have undergone further internal differentiation so that new sub-categories have come into being – although without any discernible external influence. Support for H1 comes, however, from Gardani (2012: 77) who claims that GG as inherent property of nouns can be borrowed much more easily and much more frequently than morphosyntactic categories such as case. Matras (2009: 174) goes a step further since he assumes that “[g]ender may also be introduced into a language along with borrowed forms,” meaning: an erstwhile GG-less RL may acquire GG by way of borrowing gendered items from a DL. These encouraging statements need to be looked into more deeply to determine to what extent they are supportive of H1 in the first place.

4.2.1 False friends

For a start, we have to take a stance as to the distinction of borrowing and codeswitching. The nature of our sources is such that we cannot guarantee that each of the cases we are about to discuss in what follows is a genuine instance of borrowing. It is exceptional that authors explicitly address this problematic issue. Janurik (2015: 210–214) investigates GG-phenomena in Erzya-Russian


bilingual discourse. The Uralic language Erzya is GG-less whereas the socially dominant Russian is a GG-language. Janurik (2015: 210) makes the following distinction: Russian borrowings in Erzya are insensitive to GG whereas in Erzya-Russian codeswitching GG-agreement of Russian controllers and targets can be observed as in (5).

(5) Erzya [Janurik 2015: 210]
    tet’a-ń jondo baba-m ul’ńe-š pek strog-aja
    dad-GEN side grandmother-1SG.POSS be-3SG.PST very strict-FEM
    ‘My grandmother from father’s side was very strict.’

In this example, the Russian F loan noun baba ‘grandmother’ and the predicative adjective strogaja [F] ‘strict’ (< Russian strogij [M]/strogaja [F] ‘strict; severe’) agree in GG. The usual Russian morphology appears on the adjective. Janurik (2015: 214) concludes that “the gender agreement rules of the Russian language have started to infiltrate the Erzya-Russian spoken bilingual variety.” She goes on to claim “that a category of gender is emerging in the Erzya language.” The author is cautious to add that this conclusion holds for a variety of Erzya which is characterized by abundant codeswitching. It remains unclear though whether GG-agreement itself is considered to be a manifestation of codeswitching. If the so-called bilingual variety is accepted as a variety in its own right the above case could easily be registered among the instances of contact-induced emergence of GG. Seifart’s (2020) World-Wide Survey of Affix Borrowing (AfBo) reports on six languages which have borrowed bound GG-morphology (altogether eleven affixes are registered). The DL-RL pairs of which AfBo takes account are (i) Bora/Resígaro, (ii) Greek/Cypriot Arabic (aka Kormatiki), (iii) Hindi/Kurux, (iv) Latin/Basque, (v) Russian/Yiddish, and (vi) Swedish/Finnish. Except (iv) and (vi), these contact scenarios involve two GG-languages and fall thus outside the scope of this study. Moreover, (iv) and (vi) too are not genuine cases of GG-borrowing since no agreement applies. We are dealing with simple sex-marking. The following paragraphs and Sections 4.2.1.1–4.2.1.2 are meant to show that both kinds of excluded cases are encountered time and again in the literature under the rubric of GG-borrowing. Matras (2009: 174) bases his hypothesis that GG may arise through language contact on data from Indonesian presented by Tadmor (2007).
In his paper on grammatical borrowing in Indonesian Tadmor (2007: 311–313) shows that borrowing of gendered nouns from Sanskrit and Arabic has added a considerable number of words to the lexicon which are specified for the sex of the referent. The loans from Sanskrit have a final -a for male sex and -i for female sex as in


putra ‘son’/putri ‘daughter’, siswa ‘male high school student’/siswi ‘female high school student’, dewa ‘god’/dewi ‘goddess’, etc.10 These pairings are numerous enough to have triggered cases of false analogy in the Austronesian part of the lexicon where the sex-neutral pemuda ‘young person’ was reinterpreted as ‘young male person’ because of the final vowel. Accordingly, a new word pemudi ‘young female person’ was created on the Sanskrit model. In addition, the Sanskrit gendered derivational suffixes of agent nouns -wan/-man (male) and -wati (female) entered the Indonesian lexicon as in karyawan ‘male worker’/karyawati ‘female worker’. Arabic loanwords in the sphere of Islam may also be sensitive to sex (and number) distinctions like in mukmin ‘male believer of Islam’/mukminah ‘female believer of Islam’/mukminin ‘male believers of Islam’/mukminat ‘female believers of Islam’ (< Arabic mu’min/mu’minah/mu’minīn/mu’mināt). However, Tadmor (2007: 312) states that

[i]n most loanwords with distinct male and female forms, the female form is not in common use, and the male form in fact serves as the unmarked member (which can refer to females as well, especially colloquially).

Superficially, this statement is in line with H3–H4. However, in contrast to the Hup case discussed in Section 4.1, Indonesian fails to meet the most crucial criterion of GG, namely the existence of agreement. The Indonesian data can be interpreted as derivational sex-marking without any further morphosyntactic effects. This is not GG. As we will understand in due course, the Indonesian case is relevant to our study nevertheless. Gardani (2012: 83) who by the way does not speak of the contact-borne emergence of GG because his focus is exclusively on the transfer of bound morphology, refers to the Indonesian example pemuda/pemudi ‘male/female young person’ from Tadmor (2007). These words are presented in isolation, i.e. no syntactic context is given. The same format is chosen for two further examples from Kharia (Munda) and Kurux (Dravidian) which “have borrowed from Hindi (Indo-Aryan) the suffix -i marking the feminine gender” (Gardani 2012: 83). The two cases have to be kept apart.

|| 10 For a parallel case in Kawi (Old Javanese) see Royen (1929: 566).

4.2.1.1 Mixing of GG-systems

In the case of Kurux-Hindi contact, two GG-languages are involved, i.e. the borrowing of the Hindi F suffix has not introduced GG as such into Kurux which boasts an original binary GG-system distinguishing M and non-M. Kurux has borrowed a number of adjectives from Hindi which come in two forms, M and F like leca: [M]/lici: [F] ‘bow-legged’ (with agreement being restricted to predicative adjectives and thus not conforming to H2). Note particularly that autochthonous Dravidian adjectives never agree in any category with the head noun. The erstwhile binary GG-system has been replaced with a ternary system since F has entered the scene via language contact (Kobayashi and Tirkey 2017: 105–106). Similar cases from the Indian subcontinent have been discussed by Royen (1929: 567–569) already. One of his examples is Santali (Munda) for which the author provides a pair of examples reproduced in (6).

(6) Santali [Royen 1929: 567; Neukom 2001: 56]
    lelh-a koṛ-a
    stupid-M young_person-M
    ‘stupid boy’

    lelh-i koṛ-i
    stupid-F young_person-F
    ‘stupid girl’

Santali has an autochthonous binary GG-system based on the distinction of animate and inanimate. Neukom (2001: 22) observes that owing to the influence by Indo-Aryan languages “some adjectives show natural gender agreement with their heads.” The pattern shown in (6) is restricted to loans from Indo-Aryan languages notably from Hindi and Bengali. According to Neukom (2001: 55), there is no distinct adjectival word-class in Santali except these borrowed adjectives. Seifart’s (2012) detailed description of the contact-induced changes in the GG-system of the Arawakan language Resígaro which has borrowed heavily from Bora yields a similar picture. In the GG-marking of animates for instance M is a newcomer to the system which formerly relied on the distinction of F vs. non-F (Seifart 2012: 489). Kurux, Santali, and Resígaro do not provide proof of the contact-induced genesis of GG. Together with Michif mentioned in Section 3.3 the two languages from India and the language from Amazonia are of general interest for the study of GG under the conditions of language contact because they show how mixed GG-systems may arise. In three of these cases (Michif, Resígaro, and Santali) the mixture involves a component which is based on animacy and a second component which is sex-based. According to Corbett (2005b), sex-based GG-systems are typologically unmarked in the sense that cross-linguistically they are attested three times as frequently as animacy-based GG-systems. In the languages addressed above the two systems co-exist and are thus candidates for the category of combined GG as discussed by Corbett (1991: 184–188) and Corbett et al. (2017). GG-system mixing can be considered a kind of parallel system borrowing as discussed by Kossmann (2010: 469) who refers inter alia to the integration of Arabic loans in Cypriot Greek and Greek loans in Coptic.


4.2.1.2 Sex-marking

As to Gardani’s Kharia case, the evidence is also not in support of H1 but for different reasons. Peterson (2011: 139) denies the existence of GG in Kharia. What is more, the many word pairs borrowed from Hindi along with their sex-specifying morphology such as buɖha ‘old man’/buɖhi ‘old woman’ are said to be unproductive and the pattern has “no further morphosyntactic ramifications, especially since the ‘pronominal’ system distinguishes neither natural gender nor size” (Peterson 2011: 139). The Kharia case is of the same kind as the Indonesian example above. The many sex-specified loans from Hindi notwithstanding, Kharia has not borrowed GG from Hindi. It is what it was prior to contact with Indo-Aryan, a GG-less language. The introduction of sex-specific word-forms and derivational devices into GG-less languages is reported for a variety of further languages (Aikhenvald 2000: 387–388, fn. 22). Among these we find the Basque variety of Lower Navarre (spoken in the Southwest of France) where the sex-specifying agent noun suffix -essa from Gascon has been adapted as -(t)sa to mark female agents as in errejent ‘(male) primary-school teacher’/errejent-sa ‘female primary-school teacher’. The loan-suffix is also used on Basque stems (Haase 1993: 49–50). Hualde and Ortiz de Urbina (2003: 117) mention that the use of this suffix is also attested in southerly varieties of Basque but not as frequently as to the north of the border between Spain and France. The integration of the Gascon sex-specifying suffix has not established GG as a category in the replica language. However, some varieties of Basque to be addressed below (Section 4.2.8) seem to develop GG under the influence of Spanish. Sierra Popoluca has borrowed the sex-differentiating agent noun suffixes -eeroj and -eeraj from Spanish -ero/-era (Gutiérrez Morales 2012: 223).
Comparable evidence of the contact-induced introduction of sex-distinctions on agent nouns comes from a variety of Turkic languages which have borrowed the morphological means from Slavic (Boretzky and Igla 1994: 79). A case in point is the Troki variety of Karaim for which Kowalski (1929: xxxiii–xxxiv) identifies the suffixes -ka/-ča/-č́a as loans from (unidentified) Slavic languages as in Karai̯ ‘(male) Karaim’/Karai̯-ka ‘female Karaim’. Kowalski (1929: xxxiii) considers this and similar examples to illustrate the distinction of GG-categories. Since there are no cases of agreement in his description of the Troki variety it can be safely assumed that the phenomenon is an instance of sex-marking. The issue of contact-induced GG in Karaim will be raised again below (Section 4.2.4). Sex-marking is not the same as GG-marking because of the absence of agreement phenomena. In spite of this fundamental difference between the two marking types, sex-marking can be considered a kind of (optional) preparatory


stage for the introduction of GG – a bridge according to Leiss (2005). This idea is captured by H5.

Hypothesis 5: Prior to the contact-induced NP-internal genesis of GG the RL may borrow sex-marking patterns from the DL.

4.2.1.3 Diminutives

Besides the sex-specific marking of agent nouns, diminutives pop up frequently in the domain of borrowed morphology. An especially successful DL is Spanish whose gendered diminutives in -(t)ito [M]/-(t)ita [F] (as in hijo ‘son’ → hij-ito ‘little/dear son’/hija ‘daughter’ → hij-ita ‘little/dear daughter’) have made it into the morphological inventories of many an RL. Fischer (2007: 391) finds evidence of these suffixes on native stems (including proper names) of the GG-less Polynesian language Rapanui. There is no agreement, and the use of the diminutives is infrequent and is considered individual codeswitching by the language expert. Chamoreau (2012) dedicates a study to the Spanish diminutive markers in Mesoamerican languages. Her revealing results are summarized in Table 1. The suffixes are given in the shape they have in Spanish.

Table 1: Distribution of Spanish diminutive suffixes

Replica language   First names   Borrowings   General     Beyond
Tepehua            -ito/-ita
Purepecha          -ito/-ita     -ito         -ito
Mexicanero         -ito/-ita     -ito/-ita    -ito
Yucatec            -ito/-ita     -ito/-ita    -ito/-ita   -ito/-ita

Grey shading is indicative of the employment of the Spanish F diminutive suffix -ita. Wherever -ita can be used, the M equivalent is also in use. Only in the domain of (mostly Spanish-derived) first names do native speakers consistently respect the sex of the name-bearing person (Chamoreau 2012: 82). In other contexts, the position of -ita is sometimes precarious because it can be replaced with -ito. The Yucatec case is especially interesting since this Mayan language is the only member of Chamoreau’s sample which seems to have developed GGagreement (as apostrophized in the rightmost column). It makes sense therefore to have a closer look at her evidence. Examples (7)–(10) respect Chamoreau’s (2012: 84) original glosses and translations although the glosses do not fully


correspond to our own practice. The examples have been provided originally via personal communication by the native-speaker linguist Briceño Chel.

(7) Yucatec – M
    polok-ito le boox-ito-o’
    fat-DIM DEM dark/Maya_man-DIM-DEM
    ‘The Maya man is fat/chubby.’

(8) Yucatec – F
    bek’ech-ita u y-íits’in
    thin-DIM A.3SG POS-younger_sister
    ‘His younger sister is slender.’

(9) Yucatec – M
    sak-ito in suku’un
    white-DIM POS.1SG elder_brother
    ‘My elder brother is white.’

(10) Yucatec – F
    xlo’obayan-ita in kiik
    young-DIM POS.1SG elder_sister
    ‘My elder sister is youthful.’

Several comments are called for. First of all, except the diminutive suffixes themselves, there are no Spanish elements in these sentences. This means that the affixes are attached to native stems. Secondly, four of five instances of diminutives are hosted by the “adjective”. What is more these adjectives are all used predicatively – a situation which we would not expect on the basis of H2. The sole instance of a diminutive occurring outside the predicates is boox-ito in (7) which is again based on an adjective boox ‘dark’. None of the kinship nouns in (8)–(10) combines with a diminutive. The natural sex of the referent is inherent. The predicate agrees in form with the natural sex: it takes the suffix -ita if the kinship term refers to a female human and -ito if the referent is male. This reeks of GG-agreement. However, the status of the above patterns is unclear. Chamoreau (2012: 84) claims that the F form has entered the language only recently and that its use is not compulsory.11 Some speakers may even reject its use. The author emphasizes that “[t]he forms are always used with biologically based nouns” (Chamoreau

|| 11 C. Stolz (p.c.) tells us that in her corpus of 60 hours natural discourse in Yucatec there is no evidence of -ita being used productively by native speakers.


2012: 84), meaning animacy is crucial (perhaps even human reference). However, examples (7)–(10) do not feature diminutives attached to “biologically based nouns”, in the first place. The entire predication is about a human being. Note that Yucatec (optionally) specifies sex by way of prefixing x- [F] and h- [M] to autochthonous noun stems (Chamoreau 2012: 78–79). The existence of this morphological pattern may have facilitated the introduction of the Spanish sex-marking devices (Gardani 2019: 114). Last but not least, Chamoreau (2012: 87) herself doubts that we are facing GG-agreement in Yucatec. The question arises what it is we are confronted with if GG-agreement must be ruled out. Muysken (2001: 69–72) makes interesting observations as to the use of the Spanish-derived diminutive suffixes -ito/-ita in Bolivian Quechua. There seem to be conflicting principles which compete when it comes to choosing the diminutive suffix. On the one hand, there are purely phonological (i.e. vowel-harmonic) factors which determine the choice to some extent. The quality of the stem-final vowel is crucial in this context with stems ending in -a taking -ita (like wallpa ‘chicken’ → wallp-ita). Those ending in -u opt for -itu (like chuku ‘cap, hat’ → chuk-itu) whereas all other stem-final segments allow for combinations with -situ (like rumi ‘stone’ → rumi-situ) (Muysken 2001: 70). The phonological principle is at odds with the Spanish rules. However, Muysken (2001: 70) also finds that “natural gender, and often the gender of the semantic equivalent in Spanish, determine[s] the forms encountered.” Quechua warmi ‘woman’ → warmi-(si)ta has a female referent and thus may take -(si)ta although the stem-final segment would call for the use of -situ. Quechua inanimate nouns may host the F -ita because the Spanish translation equivalent has F GG (as e.g. wasi ‘house’ → was-ita because Spanish casa ‘house’ is F).
Both vowel harmony and inanimate “gender-copy” from Spanish suggest that the case of Bolivian Quechua cannot be subsumed under sex-marking. At the same time, it is also impossible to classify the data as instances of GG because there is no agreement. We are still without tangible proof of what we are eagerly searching for, namely GG arising from language contact. However, as with sex-marking in the foregoing section, we consider diminution a facilitating factor which might pave the way for the borrowing of GG. Affective morphology is often used with reference to humans whose sex is relatively easy to determine. H6 reflects this possibility.

Hypothesis 6: Prior to the contact-induced NP-internal genesis of GG the RL may borrow sex-marked affective morphology for expressions referring to human beings.


Note that proper names (especially first names derived from the DL) are frequently mentioned as hosts of sex-specific affective morphology. Bolivian Quechua shows that the sex-specifying function of the diminutive suffixes may be overridden by phonological factors.
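The competition between the phonological principle and the gender-based principle reported for Bolivian Quechua can be made explicit in a small sketch. It is an illustration only: the function names and the two boolean flags are hypothetical, and the override is treated as categorical although Muysken (2001: 70) speaks of a tendency ("often"). The sketch returns the suffix predicted by the stem-final segment and reports whether natural sex or the GG of the Spanish equivalent pulls toward an a-final suffix instead.

```python
# Illustrative sketch only. Function names, flags and the categorical
# treatment of the override are assumptions made for this example.

def phonological_default(stem: str) -> str:
    """Suffix predicted by the stem-final segment alone."""
    if stem.endswith("a"):
        return "-ita"
    if stem.endswith("u"):
        return "-itu"
    return "-situ"


def gender_conflict(stem: str, female_referent: bool = False,
                    spanish_equivalent_feminine: bool = False) -> bool:
    """True if natural sex or Spanish 'gender-copy' favors an a-final
    suffix where the phonological default is u-final."""
    wants_a = female_referent or spanish_equivalent_feminine
    return wants_a and not phonological_default(stem).endswith("a")


# Flags are set only for the two cases the text singles out.
examples = [
    ("wallpa", "chicken", {}),
    ("chuku", "cap, hat", {}),
    ("rumi", "stone", {}),
    ("warmi", "woman", {"female_referent": True}),
    ("wasi", "house", {"spanish_equivalent_feminine": True}),
]

for stem, gloss, flags in examples:
    status = "overridden" if gender_conflict(stem, **flags) else "kept"
    print(f"{stem} '{gloss}': default {phonological_default(stem)} ({status})")
```

Only warmi and wasi come out as conflicts, which matches the two cases in which the attested forms (warmi-(si)ta, was-ita) depart from what the stem-final segment alone would predict.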

4.2.2 Glimpses from Aikhenvald’s case-book

The number of doubtful instances of GG-borrowing exceeds that of bona fide cases by far. There are at least two reasons for feeling uncomfortable with most of the alleged pieces of evidence. On the one hand, more often than not there are also counter-claims according to which GG-borrowing has not happened at all. Secondly, the empirical proof of the vast majority of the cases is insufficient either because the borrowing process is assumed for a historically undocumented (“pre-historic”) period or the phenomenon is described on the basis of less than a handful of syntactically isolated NPs. This does not mean that one has to delete these cases from the list of potential instances of GG-borrowing. However, none of the cases is uncontroversial either especially if looked at separately. Perhaps the sporadic examples become more convincing if studied from a comparative perspective. Aikhenvald (2000: 384) mentions three possible cases of “[i]ndirect diffusion of closed grammatical systems” such as the F/non-F distinction in Kakua (Makú) which may be attributable to areal influence from East Tucano. Similarly, she speculates whether the incipient GG-distinctions of the Oceanic languages in Southwest New Britain might result from contacts with the surrounding Papuan languages. The third scenario involves Niger-Congo languages as DL and Eastern Nilotic languages as RL. Since the author invokes indirect diffusion, we are dealing with PAT-borrowing. PAT-borrowing is much harder to prove than MAT-borrowing. Thus this short list of candidates for the status of GG-borrowing cannot fully satisfy our need for hard facts as proof. In the same source, reference is made also to (superficially) more tangible cases of direct diffusion, namely Ilocano, Tagalog and Ayacucho Quechua all of which are presented as examples of GG-borrowing from Spanish. It is worthwhile taking a closer look at what exactly Aikhenvald has to say about these languages. 
As to Ilocano, Aikhenvald (2000: 388) argues that

[s]imilar to many Austronesian languages, Ilocano does not have grammatical gender; however, a large number of loan adjectives from Spanish resulted in the creation of masculine and feminine distinctions in loan adjectives, e.g. tsismoso ‘gossipy’ (masc.), tsismosa ‘gossipy’ (fem.), from Spanish chismoso, chismosa ‘a gossip, gossipy’. Nouns borrowed from Spanish often have two gender forms to distinguish the sexes, e.g. kosinero ‘cook (masc.)’, kosinera ‘cook (fem.)’, sugalero ‘gambler (masc.)’, sugalera ‘gambler (fem.)’.

This quote stems from a paragraph which is dedicated to the borrowing of agreement systems. The (co-)existence of Hispanisms which come in pairs of word-forms is postulated for the lexicon of Ilocano. However, no example of the behavior of the Spanish-derived adjectives and nouns in syntax is given. It thus remains unclear whether there is GG-agreement only between nouns and adjectives of Spanish origin, if at all. Even Rubino’s (2000) Ilocano dictionary and grammar does not yield evidence of NP-internal GG-agreement – a topic that is not mentioned at all in this source. What we find instead are Spanish-derived predicate nouns and adjectives whose morphological form varies according to the sex of the subject-NP as in maestr-o ni David ‘David is a teacher’ and maestr-a ni Lisa ‘Lisa is a teacher’ (adapted from Espiritu 1984). For the time being it suffices to register Ilocano as yet another case where the little evidence we have does not conform to H2.

The situation described in the foregoing paragraph is much the same for Ilocano’s close relative Cebuano – a Philippine language which has undergone massive lexical Hispanization with many nouns and adjectives which are GG-sensitive in the DL (Quilis 1973: 49, 1992: 140). The picture changes slightly if we turn our attention to Tagalog. Aikhenvald (2000: 48, fn. 27) presents the Tagalog case only in a footnote where she tells the reader that

[c]ertain nouns referring to humans and adjectives used to modify them, most or probably all of which are loans from Spanish, distinguish two genders, e.g. loko-ng Pinoy (crazy:MASC-ATT Philippine) ‘a crazy Philippine man’, loka-ng Pinay (crazy:FEM-ATT Philippine) ‘a crazy Philippine woman’.

The evidence, which is based on personal communication by a fellow linguist, seems to meet our expectations. The examples are NPs which illustrate adjectival attribution of an ethnonym. The prenominal attribute and the head noun are linked by the cliticized linker particle -ng. The lexical elements in the construction are of Spanish origin, namely the adjective loc-o [M]/loc-a [F] ‘crazy’ > lok-o/lok-a and the noun (Spanish Filipin-o [M]/Filipin-a [F] >) Pin-oy ‘(male) Filipino’/Pin-ay ‘Filipina’. The Tagalog case has repeatedly attracted the attention of linguists in the past (Steinkrüger 2008: 210). López (1965: 503) notes the existence of GG-marked borrowings from Spanish. Bowen (1971: 946–947) goes further as he assumes that NP-internal GG-agreement is compulsory in Tagalog. In their reference grammar of Tagalog, Schachter and Otanes (1972: 96–97 and 196–198) provide ample evidence of predicative constructions which invite an interpretation as instances of GG-agreement such as (11)–(12).

(11) Tagalog [Schachter and Otanes 1972: 197]
     Komik-a   si         Linda
     funny-F   PROP.ART   Linda
     ‘Linda is funny.’

(12) Tagalog [Schachter and Otanes 1972: 197]
     Komik-o   si         Fred
     funny-M   PROP.ART   Fred
     ‘Fred is funny.’

The form of the predicative adjective co-varies with the sex of the referent of the subject-NP. The word-form ending in -o is used with all kinds of inanimate and animate subjects whereas the a-form is restricted to combinations with female proper names and human nouns with female referents. Accordingly, Tagalog is one of the very few Austronesian languages in Corbett’s (2005a–c) sample for which a sex-based GG-system is reported. The issue is addressed summarily in T. Stolz (2002, 2012: 98–101). What strikes the eye is the scarcity of examples of the phenomenon within the NP. In (13)–(14), we reproduce the two examples of attributive agreement provided by Baklanova (2016: 28–29) who found them in her journalistic corpus of modern Tagalog (the English translations are Baklanova’s).

(13) Tagalog [Baklanova 2016: 28–29]
     Kahit   konti   ay       nakahinga     nang   maluwag
     even    but     INVERS   ADJ:breathe   when   QUAL:loose
     ang   dalaga-ng   misteriyos-a
     FOC   girl-LINK   mysterious-F
     ‘At least a little, the mysterious girl got relieved.’

(14) Tagalog [Baklanova 2016: 28–29]
     ang   mga   glamoros-a    ’t    glamoros-o-ng
     FOC   PL    glamorous-F   and   glamorous-M-LINK
     Claudine   Barreto,   Piolo   Pascual
     Claudine   Barreto    Piolo   Pascual
     ‘glamorous Claudine Barreto an’ glamorous Piolo Pascual’

Baklanova’s (2016: 31) major claim is that it is not sufficient for a Tagalog word to be of Spanish origin to participate in agreement patterns as originally assumed by T. Stolz (2012: 100). To her mind there is an additional requirement, namely that only items hosting Spanish derivational morphology are able to reflect GG overtly. The idea behind this stricter version of the constraint is that a) borrowed English adjectives like dead are made to look Spanish by adding -o/-a to them, yielding pairs of word-forms such as ded-o [M]/ded-a [F] ‘dead’, and b) Spanish derivational affixes such as -ero/-era may attach to Tagalog stems to yield new sex-specified word-forms like Tagalog utang ‘debt’ → utang-ero ‘male debtor’/utang-era ‘female debtor’. Moreover, Baklanova (2016: 30) classifies the o-/a-suffixes as derivational. This classification is doubtful if the above phenomena are counted as full-blown instances of GG-agreement. In this case, we would expect -o and -a to belong to the class of inflectional means. On the other hand, if the Tagalog data are understood as mere instances of sex-marking, the derivational analysis might be upheld, albeit with reservations. In addition, Baklanova (2017: 28) assumes that GG is expanding in Tagalog via the productive use of borrowed Spanish derivational morphology and the Hispanization of English loans. It is conceivable that the existence of GG-agreement in Tagalog has also contributed to the partial survival of GG in the Spanish-based Creole Chabacano in the Philippines (Steinkrüger 2009: 177). At the same time, Steinkrüger (2008: 210) quotes a personal communication by John Wolff, renowned expert of Tagalog, who doubts that the rules of GG-agreement are observed by all speakers of the language. This statement squares with the remarks made in connection with Yucatec in Section 4.2.1.3. Put simply, the speech-community of the replica language is divided as to GG. H7 is meant to capture the aspect of sociolinguistic variation.

Hypothesis 7: The contact-induced NP-internal genesis of GG may remain the mark of a certain social group.

H7 probably also holds for the next case on our agenda.
Aikhenvald (2000: 48) discusses the problem of languages which restrict their GG-distinctions to only a subset of the nouns and adjectives in their lexicon. We are familiar with this problem from our discussion in connection with H3. Ayacucho Quechua is her paradigm case. With reference to this language she observes that

only a few nouns with human referents require agreement with a closed class of adjectives borrowed from Spanish, e.g. loko maqta ‘crazy:MASC boy’, loka sipas ‘crazy:FEM girl’. A few nouns with a human referent distinguish feminine and masculine forms, e.g. biyudo ‘widower’, biyuda ‘widow’ […]. There is no agreement elsewhere in the language. It is problematic whether such ‘exceptions’ should be considered separate noun classes at all.

Parker (1969: 26), Aikhenvald’s own reference, argues that “[a] substantive category of gender is marginal, appearing only in a very small class of borrowed adjectives.” The first two examples in the above quote are the only NPs in Parker’s grammar which reflect the phenomenon under inspection. The limited set of GG-sensitive adjectives allows us to complement H3 with H8.

Hypothesis 8: The contact-induced NP-internal genesis of GG starts with a small subset of loan adjectives in the RL.

Furthermore, the Quechua case – be it Ayacucho Quechua or Bolivian Quechua – has given rise to a controversy. Several scholars deny the existence of anything remotely similar to a GG-system. For Imbabura Quechua, Gómez Rendón and Adelaar (2009: 960) seriously doubt that the borrowing of Spanish stems together with their bound GG-markers means that the categories of the DL have been borrowed too. What superficially looks like a Spanish-derived GG-morpheme has become an unanalyzable part of the phonological chain of the stem. This is what the authors term “frozen borrowing.” Bakker and Hekking (2012) compare data from the Hispanization processes in Quechua, Guaraní, and Otomí. The authors find only sporadic reflexes of Spanish GG-sensitive morphology in the RL. For Quechua, they report the existence of the diminutive suffixes -ito/-ita on proper nouns (most of which are Spanish too) and, in agent-noun formation, of -ero/-era and -dor/-dora also on native stems (Bakker and Hekking 2012: 200–201). As to potential cases of agreement, the authors distinguish between productive use of GG-marked adjectives and the borrowing of chunks. In the former case, a Spanish adjective serves as attribute of a Quechua noun as in caprichos-a warmi ‘unaccountable woman’ and ric-a cashca ‘rich fiancée/lover’ where the a-forms of the Spanish-derived adjectives respect the natural sex of the referent of the head noun.
Chunks, on the other hand, are lexicalized Spanish NPs like Santa Tierra ‘Holy Land’ (Bakker and Hekking 2012: 210–211). The paper concludes for all three of the RLs jointly by way of stating that

[l]oan adjectives sometimes appear in the feminine form, both with Spanish nouns of this gender, or with female referents. This is particularly the case in Quechua, and to a lesser extent in Guaraní. It remains very speculative, though, to assume that at a later stage a gender system for adjectives will be copied by any of the languages outside this very restricted domain since they all lack grammatical gender in the first place. The alternative would be that the female endings on the borrowed adjectives would simply disappear, but that is unlikely given the expected increase in bilingualism (Bakker and Hekking 2012: 216).

Bakker and Hekking’s paper ends on a rather skeptical note as to the chances that an elaborate GG-system might ever develop from the humble beginnings described for the above RLs. Their skepticism notwithstanding, the authors concede that F marking in the domain of adjectival modification of human nouns may survive into the future. Between the lines of this and other papers on our topic, the role of sex-marking for female referents and GG-marking for F GG stands out as a particularly crucial factor in the emergence of GG-systems. This is what H9 tries to account for.

Hypothesis 9: The contact-induced NP-internal genesis of GG starts with the introduction of a formal marking for words (or their modifiers) referring to female referents.

The prominence of female sex and F GG in the context of our project also comes to the fore in the remaining seven sketches of potential instances of GG-borrowing.

4.2.3 Mednyj Aleut

A special case of a potential GG-borrower is Mednyj Aleut. This language has arisen from contact between the GG-language Russian and the GG-less Aleut (Eskimo-Aleut) in the Commander Islands. Thomason and Kaufman (1988: 235–236) claim that

Mednyj Aleut has borrowed the Russian past tense suffix -l and its paradigm (added vowels for feminine and neuter gender and for plural masculine and feminine); more strikingly, it uses borrowed Russian pronouns to indicate person distinctions in the past.

As to GG, the paradigm of Mednyj Aleut verb inflection provided by Thomason and Kaufman (1988: 234–235) – quoted from Menovščikov (1969: 132) – contains neither F, NT, or PL verb forms nor GG-sensitive and number-sensitive pronouns in the past tense. Comrie’s (1981: 253) rendering of the paradigm is equally devoid of the verb forms and pronouns we are searching for. This also holds for the Russian sources from which the information has been taken (Menovščikov 1964, 1968, 1969). Constructions like on saɣ̥a-l ‘he slept’ (Thomason and Kaufman 1988: 235) with the Russian pronoun of 3SG M on ‘he’ and the Russian past-tense marker -l (for M) illustrate the extent of the Russian impact on the RL. On account of the above quote, we would expect to find also instances of on-a saɣ̥a-l-a [F], on-o saɣ̥a-l-o [NT], and on-i saɣ̥a-l-i [PL] with the appropriate F, NT, and PL morphology according to the Russian model. Such combinations of GG-marked pronouns and GG-marked verbs would constitute good examples of contact-induced GG-agreement in a Mixed Language.

Golovko (1994: 116) makes a counter-claim when he states that “[t]he Russian feminine gender marker is not part of CIA [Copper Island Aleut = Mednyj Aleut], though it is used optionally.” With reference to prior work by Golovko and Vakhtin (1990: 109), Thomason (1997: 458) concedes that although “the Russian conjugational system is largely intact in Mednyj Aleut […] the Russian feminine suffix -a is used only sporadically in past-tense forms.” What is more, she reiterates her claim that GG is distinguished in Mednyj Aleut with the Russian pronouns that are employed in the past tense paradigm (Thomason 1997: 457). Elsewhere Russian pronouns are used in Mednyj Aleut without the GG-morphology of the DL as in (15)–(16).

(15) Russian [Thomason 1997: 458]
     eto    moj-a        doč’
     this   1SG.POSS-F   daughter
     ‘This is my daughter.’

(16) Mednyj Aleut [Thomason 1997: 458, adapted from Sekerina 1994: 24]
     eta    moj        asxinu-ŋ
     this   1SG.POSS   daughter-1SG.POSS
     ‘This is my daughter.’

In the Russian example (15), the F GG of the noun doč’ ‘daughter’ triggers the a-suffix on the possessive pronoun whereas the same pronoun appears in its basic form in the synonymous Mednyj Aleut example (16). This means that Mednyj Aleut does not indiscriminately copy all GG-phenomena from Russian. As to the existence of GG-sensitive Russian pronouns in Mednyj Aleut, we rely on Comrie’s (2008: 30) account which draws on Sekerina (1994: 23–24). The pronominal paradigm hosts different forms for the 3SG/PL, namely on and ani, respectively. There is no mention of a distinct F form. Comrie (2008: 30) remarks that there might be accidental gaps in the paradigm. However, we have not encountered Mednyj Aleut examples in which the Russian F pronoun ona is used. In Golovko (2009), the assumed PL pronoun ani is regularly translated as ona in Russian. Relevant to this problem is a hypothesis of Golovko’s (1997) according to which the a-form in the past is possible only if the speaker is female. This remark suggests that we are dealing with a phenomenon that is preferably encountered in the 1SG.PST since agreement in GG with lexical subjects is attested only twice in Golovko’s (2009) text collection. Golovko’s own example from his earlier paper is reproduced in (17).

(17) Mednyj Aleut [Golovko 1997: 122]
     ja    ‘usug’lii-l(-a)
     1SG   sneeze-PST(-F)
     ‘I sneezed.’

In this example, the verb stem is of Aleut origin whereas the pronoun ja ‘I’, the past-tense marker -l, and the F marker -a are elements taken from Russian. The speaker is a woman. It is up to her whether she uses the F marker or not, meaning: the sentence, when uttered by a woman, is acceptable also without the final -a. The optional status of F marking is corroborated by the following excerpt from a Russian blog dealing with Mixed Languages. In this blog, a Mednyj Aleut woman’s autobiographical sketch is quoted. The two sentences (18)–(19) are telling.

(18) Mednyj Aleut [https://rousseau.livejournal.com/230345.html]
     1924   goda       ja    aga-l         Mīdnam   ila.
     1924   year:GEN   1SG   be_born-PST   Mednyj   island
     ‘I was born in 1924 on the island of Mednyj.’

(19) Mednyj Aleut [https://rousseau.livejournal.com/230345.html]
     Aba    qalī-l-a      ja    kada   bujana-x   tin    ayugnи-l.
     work   begin-PST-F   1SG   when   war-SG     REFL   start-PST
     ‘I had begun to work when the war started.’

The same female speaker reports on herself in (18) and (19). In (18), the female sex of the speaker is not visible from the past-tense morphology. In contrast, sentence (19) gives evidence of the use of -a in the same tense category. Whether the presence of the marker is in any way triggered by the VS word order in (19) cannot be determined in this paper. In both sentences the pronoun of the 1SG ja functions as subject of a verb inflected for past tense. There is variation as to the marking of the speaker’s sex though. The speaker-orientation of the sex-marking comes also to the fore in an example presented and discussed by De la Fuente (2018: 119–120). The author argues that in Mednyj Aleut the verb (in the past tense) may display agreement with the possessor. Example (20) is produced by a female speaker.

(20) Mednyj Aleut [De la Fuente 2018: 120]
     čvetk-i-ning        hula-l-a
     flower-PL-3PLx1SG   bloom-PST-FEM
     ‘My flowers bloomed.’

The sex of the speaker is indicated by the F suffix -a on the verb whose subject is the possessed pluralized noun to the left. The possessive suffix is a portmanteau morph which encodes a 1SG possessor together with a 3PL possessee. From the Russian point of view, we would expect to find hula-l-i with number agreement between subject and verb. However, the category specifications of the subject-NP seem to be irrelevant for the choice of word-form for the verb. This is a phenomenon which cannot be explained in terms of Russian influence. Since there is no GG or sex-marking in other varieties of Aleut, the speaker-oriented sex-marking must be classified as a Mednyj Aleut innovation. Owing to the general shortage of reliable data, the final word on GG in Mednyj Aleut must await an empirically better informed follow-up study. As stated at the beginning of this section, there is no evidence of F marking in the older sources on the language. Thomason (1997: 449 and 461) ponders the idea that the striking differences between Mednyj Aleut as described in the 1960s and the same language as pictured in more recent descriptions starting in the 1990s might be the result of drastic changes Mednyj Aleut underwent in the gap between the two periods of intensive research on the language. It cannot be ruled out that these changes have also affected GG and related categories. Chances are therefore that F marking as such might be a relatively recent phenomenon in Mednyj Aleut. Independent of the exact chronology of events in the history of Mednyj Aleut’s F marking, one thing seems to be certain. After the publication of Thomason and Kaufman (1988) there has been no further mention of the NT GG in Mednyj Aleut. It is safe to assume that this GG-category never made it into the grammar of Mednyj Aleut. On the basis of the above discussion of the Mednyj Aleut situation, we put forward H10–H11.

Hypothesis 10: The contact-induced genesis of GG starts in a very limited segment of the grammatical system.

Hypothesis 11: If several “marked” GG-categories exist in the DL, the F is the most likely to be borrowed.

4.2.4 Karaim

In Section 4.2.1.2, the Troki variety of Karaim was shown to borrow sex-marking derivational morphology from its Slavic neighbors without developing a full-blown GG-system on this basis. Csató (2001: 18) argues, however, that “[g]ender agreement is sometimes marked, namely when the adjective is a copied item with adjectival morphology.” The sole example given for this phenomenon is (21).

(21) Karaim [Csató 2001: 18]
     Ol    e-d’i          inteligentn-a.
     3SG   COP-PAST.3SG   intelligent-F
     ‘She was intelligent.’

The predicative adjective is of Slavic origin as is its grammatical morphology. The subject pronoun ol ‘s/he’ is neutral as to the sex of the referent. Éva Csató (p.c.) explains that examples of the kind shown in (21) are hard to come by in her corpus. F marking occurs on the adjective only in the case of human referents, if at all. If the referent is non-human or inanimate the a-forms of the adjectives are blocked. As it seems, Karaim attests only very marginally to the sex-sensitive behavior of loan adjectives. The only evidence we have is at odds with H2 since the F forms are reported for predicative adjectives and not NP-internally.

4.2.5 Guaraní

In Paraguay and parts of Argentina, there have been very intensive long-term contacts between the GG-language Spanish and the GG-less Guaraní. A special mixed Spanish-Guaraní variety has emerged which goes by the name of Jopara (Kallfell 2011). It is often difficult to distinguish properly between Paraguayan Spanish, Guaraní-influenced Spanish, Jopara, Hispanized Guaraní, and Paraguayan Guaraní because we are facing a sociolinguistic continuum with blurred boundaries (Dietrich 2010: 49; Gómez Rendón 2019: 708). Therefore, some of the controversies in the debate about the impact of Spanish on the language structure of Guaraní might be attributable to the fact that the opposing views are based on different varieties of the Paraguayan diasystem. Cerno (2010) compares the uses to which the borrowed Spanish articles are put in two varieties of Guaraní, viz. that of Corrientes in Argentina and Paraguayan Guaraní. Since the definite and indefinite articles carry information on GG in Spanish, it is interesting to see to what extent the M/F distinction of the DL is also structurally relevant in the RL. In example (22) from Paraguayan Guaraní, the Spanish definite article in the F form la is used.

(22) (Paraguayan) Guaraní [Cerno 2010: 26]
     Ani       na       re-‘u   mamíta   la    so’o!
     NEG.IMP   ARTPAR   2-eat   mammy    ART   meat
     ‘Don’t eat the meat, mammy!’

Prior to contact with Spanish, there were no articles in Guaraní, be they definite or indefinite. The MAT-borrowing of articles from Spanish created a problem for the RL in the sense that these articles come in a package with GG – another category that was originally alien to Guaraní grammar. In point of fact, Correntinian Guaraní has borrowed much of the set of articles from Spanish whereas Paraguayan Guaraní gives evidence of only a subset of the inventory of Spanish articles as shown in Table 2 (adapted from Cerno 2010: 36 and 2013: 194).

Table 2: Spanish-derived articles in two varieties of Guaraní.

Spanish categories   Spanish form   Correntinian Guaraní   Paraguayan Guaraní
M.SG.DEF             el             el                     la
F.SG.DEF             la             la                     la
M.PL.DEF             los            lo                     lo ~ la … kuéra
F.PL.DEF             las            lo                     lo ~ la … kuéra
M.SG.INDEF           un             un                     –
F.SG.INDEF           una            una                    –
M.PL.INDEF           unos           uno                    –
F.PL.INDEF           unas           uno                    –

In Paraguayan Guaraní, the articles la and lo do not serve the purpose of marking GG. Their function is exclusively that of marking definiteness and/or the pragmatic theme irrespective of sex and animacy of the referents (Gómez Rendón 2019: 703–706). This is different for the Correntinian variety (Kallfell 2011: 40). According to Cerno (2010: 36), in Correntinian Guaraní, “el and la are used in agreement with the natural gender of the base.” Furthermore, el is also used for all non-sexed entities. Similarly, the Spanish indefinite article is employed in its F form una with nouns that refer to animates of the female sex whereas un is used with all other kinds of nouns. Accordingly, the noun so’o ‘meat’ which takes the article la in (22) from Paraguayan Guaraní requires the article el in Correntinian Guaraní since the referent of so’o is a non-sexed entity (Cerno 2010: 28). Table 3 shows how the definite articles el and la distribute over animate and inanimate nouns in Correntinian Guaraní. The table is based on the information given in Cerno (2013: 194). Note that Cerno (2010: 27) provides a slightly different picture with cases of oscillation between el and la in combination with inanimate nouns.

Table 3: Distribution of articles over nouns in Correntinian Guaraní.

Animate: la               Animate: el                    Inanimate: el
la kuña ‘the woman’       el karai tuja ‘the old man’    el mesa ‘the table’
la guáina ‘the girl’      el i-t-a’ýra ‘the son’         el t-embi’u ‘the meal’
la rygasu ‘the hen’       el kavaju ‘the steed’          el ka’aguy ‘the forest’
la vaka ‘the cow’         el pa’i ‘the priest’           el okẽ ‘the door’

The indefinite article una combines with the same nouns as la (e.g. una rygasu ‘a hen’) whereas un has the same distribution as el (e.g. un i-t-a’ýra ‘a son’, un mesa ‘a table’). Grey shading in Table 3 is indicative of Spanish loan nouns. As can be seen, the articles have been generalized over the entire nominal lexicon so that Spanish and autochthonous nouns combine with them. Moreover, examples like el mesa ‘the table’/un mesa ‘a table’ show that Correntinian Guaraní does not respect the GG-assignment of the DL in the domain of inanimate nouns since in Spanish mesa ‘table’ is F (= la mesa ‘the table’/una mesa ‘a table’). There is no evidence whatsoever of GG being structurally relevant in any other part of the grammar of Guaraní.

Provided that the above summary description of the distribution of Spanish-derived articles in the two varieties of Guaraní is correct, we can conclude that Correntinian Guaraní is a good candidate for the status of GG-borrower. The necessary agreement phenomena are located within the NP as predicted. However, the target is not an adjectival modifier but a determiner, which is not in conformity with H2. GG has been borrowed epiphenomenally and incidentally together with the Spanish strategy of definiteness marking. In contrast, Paraguayan Guaraní has eliminated the GG-component of the article system.
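The distribution of the singular articles summarized above can be restated as a small decision rule. The following Python sketch is merely an illustration of that summary and not part of Cerno’s analysis: the names Noun, correntinian_article, and paraguayan_definite_article as well as the feature female_animate are hypothetical labels introduced here, and plural articles are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Noun:
    form: str             # noun as cited in the text
    gloss: str            # English gloss
    female_animate: bool  # assumption: True only for animates of the female sex


def correntinian_article(noun: Noun, definite: bool = True) -> str:
    """Singular article in Correntinian Guaraní (hypothetical helper).

    la/una go with female animates; el/un go with everything else,
    including all non-sexed entities, whatever the gender in Spanish.
    """
    if definite:
        return "la" if noun.female_animate else "el"
    return "una" if noun.female_animate else "un"


def paraguayan_definite_article(noun: Noun) -> str:
    """Singular definite article in Paraguayan Guaraní (hypothetical helper).

    la marks definiteness irrespective of sex and animacy of the referent.
    """
    return "la"


NOUNS = [
    Noun("kuña", "woman", True),
    Noun("rygasu", "hen", True),
    Noun("karai tuja", "old man", False),
    Noun("mesa", "table", False),
    Noun("so’o", "meat", False),
]

for noun in NOUNS:
    print(
        f"{noun.gloss}: Correntinian {correntinian_article(noun)} {noun.form}, "
        f"Paraguayan {paraguayan_definite_article(noun)} {noun.form}"
    )
```

On this reading, mesa ‘table’ receives el in Correntinian Guaraní although it is F in Spanish, whereas so’o ‘meat’ receives la only in Paraguayan Guaraní, where the choice of article carries no information on sex.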

4.2.6 Chamorro

Our interest in the topic of this paper was first aroused when we studied the impact of Spanish on the grammar of the Austronesian language Chamorro in T. Stolz (2002). In T. Stolz (2012), the foundations for our project were laid by way of exploring the situation in Chamorro in some detail. Among other things, Stolz and Levkovych (this volume) compare the contact-induced GG-systems of Chamorro and Tetun Dili (to be addressed in Section 4.2.7). If one takes a look at the lexicon of contemporary Chamorro, it is striking how many words with sex-differentiating morphology are registered there. T. Stolz (2012: 136–140) identifies 300 word pairs of this kind – adjectives (e.g. rabioso [M]/rabiosa [F] ‘talkative’ < Spanish rabioso/rabiosa ‘raging, angry’) as well as nouns (e.g. kantineru [M]/kantinera [F] ‘canteen keeper’ < Spanish cantinero/cantinera ‘canteen keeper’). Pre-contact Chamorro was a GG-less language so that the question should be posed whether the Spanish contribution to the lexicon of the RL has brought GG into being, too. The conditions seem to be favorable for the contact-induced emergence of GG since Chamorro displays many signs of Spanish influence in its grammatical system (Pagel 2010: 50–133). Note that after the withdrawal of Spain from Micronesia in 1898/9, direct influence from Spanish and bilingualism with Spanish came quickly to an end in the Marianas. There are cases of agreement, albeit not in abundance. One of the few adjectives that is regularly sensitive to the sex of human referents is bunitu [M]/bunita [F] ‘nice’ (< Spanish bonito/bonita ‘nice’). In (23), the adjective modifies an inanimate noun whereas in (24) the head noun is [+human].

(23) Chamorro [T. Stolz 2012: 125]
     Desde   anti~tites   na     tiempo   esta      gof    bunit-u   na     siuda
     since   RED~before   LINK   time     already   very   nice-M    LINK   city
     iya       Hagåtña
     ART:TOP   Hagåtña
     ‘A very long time ago, Hagåtña was a very pretty town already.’

(24) Chamorro [T. Stolz 2012: 126]
     Gi   un          familia   guaha   un          bunita~t-a   na     sotterit-a
     in   INDEF.ART   family    EXI     INDEF.ART   RED~nice-F   LINK   adolescent-F
     na’an-ña       si         Elena
     name-POR.3SG   ART.PROP   Elena
     ‘In a family, there was a very pretty adolescent girl called Elena.’

Both head nouns are lexical Hispanisms, namely the two F nouns siuda ‘town’ < Spanish ciudad ‘city’ and sotterita ‘adolescent girl’ < Spanish soltera ‘unmarried woman’ + diminutive -ita. For the latter noun, there exists a M equivalent sotteritu ‘adolescent boy’ < Spanish soltero ‘bachelor’ + diminutive -ito. F is marked on the adjectival attribute only in (24) because the head is a human noun. If the head noun is nonhuman or inanimate, F marking is generally blocked. The rules for the choice of bunitu/bunita are obeyed also with head nouns of Austronesian origin referring to humans (Pagel 2008: 183–184). This holds for a handful of other Spanish-derived adjectives such as kariñosu/kariñosa ‘loving’ (< Spanish cariñoso/cariñosa ‘loving’), banidosu/banidosa ‘proud’ (< Spanish vanidoso/vanidosa ‘vain’), etc. Pagel (2010: 75) concedes that the Spanish-derived GG-agreement phenomena in Chamorro are important for the theory of language contacts. On the other hand, their importance for Chamorro should not be overestimated since GG-agreement applies only with a very limited set of adjectives beyond which GG-marking becomes optional and is observed only occasionally by individual speakers of the language. Thus, Chamorro conforms to H4, H8, and H9.

4.2.7 Tetun Dili

Together with its partner in contact, Portuguese, Tetun Dili (Austronesian) has the status of official language of Timor-Leste. Timorese languages have been in contact with Portuguese since the 16th century, and since full independence from Indonesia was achieved in 2002, the role of Portuguese in Timor-Leste has grown in importance. It is therefore no surprise to see that Tetun Dili gives ample evidence of Lusitanization. In pre-contact Tetun there was no GG. In contemporary Tetun Dili, however, GG seems to be entering the scene step by step via the massive integration of GG-marked loans from Portuguese. Hajek (2006: 171) mentions that “[a] small set of Portuguese nouns and adjectives are obligatorily marked for gender by all speakers.” Strikingly, the paradigm case of a GG-sensitive adjective in Tetun Dili is bonito [M]/bonita [F] ‘handsome; pretty’ (< Portuguese bonito/bonita ‘pretty; beautiful’) which is etymologically identical with bunitu/bunita ‘nice’ in Chamorro as presented in the previous section. The author emphasizes that, except for bonito and bonita which are also used in predicative function, GG-agreement is limited strictly to the NP. At this point, word order comes into play. Hajek (2006: 171) claims that GG-agreement is compulsory with preposed Portuguese adjectives whereas it is largely avoided if the adjective follows the head noun. Hajek and Williams-van Klinken (2019) take up this and related issues in an in-depth study dedicated to contact-induced GG in Tetun Dili. The exceptional status of bonito/bonita is stressed repeatedly (Hajek and Williams-van Klinken 2019: 70–71). No other adjective is reported to experience GG-agreement obligatorily. The authors claim that “the basilect has no gender agreement, and the acrolect has a tendency to use it” (Hajek and Williams-van Klinken 2019: 77) so that a social differentiation of registers has to be taken into account. Example (25) illustrates the current journalistic style which the authors characterize as a kind of compromise between basilectal and acrolectal properties.

(25) Tetun Dili [Hajek and Williams-van Klinken 2019: 76]
     Komisaun       nee    xefia-d-a     husi   Jenerál   Tan.
     commission.F   this   lead-PASS-F   from   general   Tan
     ‘This commission was headed by General Tan.’


The construction is predicative, i.e. we are not dealing with an example which is in accordance with H2. The noun komisaun ‘commission’ is assigned to the F GG as in the DL (Portuguese comissão [F] ‘commission’). The passive participle agrees in GG with the subject according to the Portuguese model. Passive participles are considered atypical even for journalistic prose so that it is likely that (25) is a direct translation from a Portuguese original (Hajek and Williams-van Klinken 2019: 76). On the other hand, the authors report on the incipient spread of GG-agreement to constructions which involve a head noun of Austronesian origin (Hajek and Williams-van Klinken 2019: 77–79).

In their conclusions Hajek and Williams-van Klinken (2019: 85–86) propose the following general typological observations, as they put it:

a) “Contact-induced gender phenomena are very sensitive to lectal type.” = H7
b) “Contact-induced gender phenomena occur almost exclusively in loans from the gendered language.”
c) “Gender agreement is expected in borrowed fixed phrases.”
d) “Gendered pairs of human-related lexemes follow the following scale: kin > common professions > other professions.” = H5
e) “Adjectives which semantically distinguish essential male and female human traits such as attractiveness are more likely to show general gender-marking than other human-related adjectives.” = H8
f) “Agreement follows a predictable scale in terms of word order and syntactic structure.” = H1

Several of their observations correspond to hypotheses formulated by us in the previous sections. On the basis of the Tetun Dili situation, we can add H12–H13 to the list.

Hypothesis 12: The contact-induced genesis of GG may exclusively involve loanwords (controllers and targets) in the RL.

Hypothesis 13: The contact-induced genesis of GG starts under particular syntactic conditions (e.g. word order).

For reasons of space, we skip the discussion of point (c) of the above list.

4.2.8 Basque

In Section 4.2.1.2, Basque has been mentioned already among those languages which borrow sex-specifying derivational morphology. Basque is generally considered a GG-less language. Hualde and Ortiz de Urbina (2003: 137) speak of “[a] number of adjectives, mostly borrowed from Spanish, [which] exceptionally exhibit sex-marking.” These adjectives reflect the Spanish-style distinction of forms ending in -a (on adjectives modifying nouns with female referents) and those in -o which are used when modifying nouns with male or inanimate referents. This is pictured as a typically western phenomenon whereas eastern Basque varieties “invariably borrow only the M form of a Romance adjective.” The topic of GG-borrowing in Basque has been addressed by several authors. Hurch (1989: 24–25) and (with identical examples) Oñederra and Hurch (1990) touch upon the subject only very briefly so that many open questions remain. Hualde et al. (1994: 108–109) describe the GG-sensitive morphology of Lekeitio Basque. They stress the fact that GG is structurally relevant only for a small set of nouns and adjectives. The members of these sets are almost all borrowings from Spanish and show GG-sensitivity only in the context of reference to human beings. The Spanish loan-adjective álto ‘tall’ (< Spanish alto ‘tall’) belongs to the class of GG-sensitive adjectives as shown in Table 4. Other adjectives like sendo ‘strong’ in Table 5 are morphologically invariable although they end in -o and thus resemble álto sufficiently to also invite GG-inflexion (Hualde et al. 1994: 109).

Table 4: GG-marking in the NP of Lekeitio Basque.

Referent    Noun    Adjective   Indefinite article   Translation
female      neska   ált-a       bat                  ‘a tall girl’
male        mutil   ált-o       bat                  ‘a tall boy’
inanimate   etxe    ált-o       bat                  ‘a tall house’

Table 5: Absence of GG-marking in the NP of Lekeitio Basque.

Referent    Noun    Adjective   Indefinite article   Translation
female      neska   sendo       bat                  ‘a strong girl’
male        mutil   sendo       bat                  ‘a strong boy’
inanimate   etxe    sendo       bat                  ‘a strong house’

Besides a small number of originally Basque nouns and adjectives to which the distinction of a-forms and o-forms has been transferred by analogy, the authors mention a fact which is very surprising from the point of view of the DL Spanish. The GG-morphology can be maintained in the derivation of de-adjectival verbs such as maj-o-tu ‘to become handsome (male)’/maj-a-tu ‘to become pretty (female)’ from majo/maja ‘handsome, pretty’ (< Spanish majo/maja ‘good-looking, nice’) (Hualde et al. 1994: 109). Eliasson (2012: 278–279 and 283) comments upon the above examples from the Lekeitio variety to conclude that a) the Spanish GG-distinction is transparent enough semantically and morphologically to be borrowable and b) Basque language structure poses no obstacle to the borrowing as such. These findings characterize the purely morphological level. The borrowing of agreement patterns is not directly addressed by Eliasson (2012). GG is a regional phenomenon in the Basque diasystem. The introduction of GG via contact with Spanish has affected only the westernmost varieties of the language. On these grounds, we postulate H14.

Hypothesis 14: The contact-induced genesis of GG may start in a regional variety of a RL whose standard remains immune to the introduction of contact-borne GG.

4.2.9 Ottoman Turkish

The final sketch is dedicated to what Johanson (2013) calls High Ottoman, i.e. the official language of the Ottoman Empire from the late 15th century until the early years of the Turkish Republic in the 20th century. Johanson (2013: 285–286) characterizes the written register of High Ottoman as “not usable for everyday communication” because of the high density of Arabic-Persian lexical and structural elements in High Ottoman texts. What is most important for our subject matter is the fact that “[g]ender distinctions, alien to Turkic, were expressed by [Arabic] feminine endings []. Adjective attributes could agree in gender with their head in the Arabic way” (Johanson 2013: 289–290). Royen (1929: 566–567) reports on the same facts and adds that (with reference to the 19th century) the high register of Persian integrated GG-bearing Arabic loans in the same way as Ottoman Turkish. His claim is that

[w]enn im Türkischen oder im Persischen zu einem entlehnten weiblichen Substantiv, gleichgültig ob es der Name eines sexualen Wesens oder eines Dinges ist, auch ein entlehntes arabisches Adjektiv tritt, wird auch das Adjektiv femininisiert, mit anderen Worten es bleibt die Kongruenz erhalten.12

The only two examples of this phenomenon given by Royen (1929: 566–567) are zenne ḳïsmï ‘female sex’ and zenne ṭaḳyasi ‘ladies’ haircut’ with zenn-e ‘female’ as Arabic F of Persian zen ‘woman’. The Arabic F markers -e and -a are said to be commonly used in High Ottoman. It is claimed further that in Ottoman Turkish the Arabic GG-morphology has occasionally been transferred also to non-Arabic nouns and adjectives. Weil (1917: 86–87), in contrast, argues that Arabic adjectives remain uninflected in Ottoman Turkish. The documentation in the sources we have consulted on this issue is insufficient to get a better grip on the phenomenon. It is beyond the scope of this study to delve into the philological details of the problem. What we learn from the Ottoman Turkish case nevertheless is that GG-borrowing may be the monopoly of a high style or high register. This is in line with H7 and the idea of Hajek and Williams-van Klinken (2019: 85) that GG-borrowing depends on certain lectal differences within the replica language. This is what H15 is about.

Hypothesis 15: The contact-induced genesis of GG may start in and be confined to the high style or high (written) register of the replica language.

Amiridze (2016, 2020) describes the parallel case of the written register of earlier stages of Georgian which give evidence of GG-agreement with loan-nouns and loan-adjectives from Russian. The agreement was never compulsory and can be characterized as the ephemeral fashion of a certain period. It has become obsolete by now.

12 Our translation: ‘In Turkish and Persian, if a borrowed Arabic adjective is added to a borrowed F noun, be it the name of a sexed being or that of a thing, then the adjective is also made F, in other words the agreement is preserved.’

5 Conclusions

Looking back on the empirical sections of this study, it is not surprising that GG is not mentioned in Grant’s (2019b: 23–24) selection of features which are borrowed. GG-borrowing is not common enough to qualify it for a list of this kind (Nichols 2003: 303). Our review of the data as they are presented in the literature corroborates Audring’s (2016: 19) nicely put statement according to which “[g]ender systems do not arise overnight.” To her mind, GG is a sign of linguistic maturity (Dahl 2004). In her opinion, so-called young languages – such as Creoles, for instance – may also display GG provided a broader definition of the category is applied. If maturity is paralleled metonymically to the advanced stages of long-term language contacts we recognize similarities in the emergence scenarios postulated for GG with and without external trigger. For GG to arise in language-contact situations, these contacts must be of long standing. The cases we have discussed in Sections 3–4 reflect language contacts which have lasted for several centuries. Since in most of the cases GG in the RL still seems to be in its infancy we conclude that establishing GG via language contact is possible only at a mature stage of the contact situation, in a manner of speaking.

What is the lesson we have learned in this paper? First of all, we have seen that despite the often fragmentary documentation of the matter at hand it is possible to detect recurrent themes across the RLs addressed above. These similarities are captured at least in approximation by our hypotheses. It is neither necessary that each of the hypotheses applies to all of the RLs, nor is it required that a given RL is in line with each of the hypotheses. We briefly tick off the hypotheses one by one.

– H1: The paper clearly shows that the emergence of GG in language-contact situations is a possible but rarely realized option.
– H2: In the majority of the cases reviewed in this paper H2 is borne out by the facts in the sense that the little evidence there is of GG-borrowing stems mostly from NP-internal agreement of adjectival attributes with their head nouns. Note, however, that in some cases the phenomenon is illustrated exclusively with predicative adjectives.
– H3: Again, for most of the case-studies the sources claim that only a limited number of nouns participate in GG-agreement patterns. (Cf. H8 below)
– H4: The optionality (and even marginal frequency) of GG-agreement is also stated by the bulk of our sources.
– H5: For many of the RLs, the borrowing of sex-specifying derivational means is mentioned in the sources.
– H6: For several of the RLs, the sources report the borrowing of affective morphology (diminutives) which, at the same time, also has sex-differentiating function.
– H7: As to the use of GG-agreement as social marker, we lack the necessary information for most of the languages scrutinized above. However, there are some explicit statements in the sources suggesting that GG-agreement is characteristic of the speech of the elite. (Cf. H15)
– H8: As stated for nouns with reference to H3, the majority of our sources assume that GG-agreement is restricted to a rather small set of adjectives.
– H9: This hypothesis is robust since wherever there is evidence of sex-marking or GG-agreement morphology, female sex and F GG are the first to be marked overtly. (Cf. H11)
– H10: Except Tetun Dili and Mednyj Aleut, none of the RLs is described in sufficient detail to test the validity of this hypothesis. (Cf. H13)
– H11: This is again a robust hypothesis because we have found no evidence of a GG other than the F to be borrowed. (Cf. H9)
– H12: The majority of the RLs restrict the GG-phenomena to the interaction of loan-adjectives and loan-nouns. The diffusion into the autochthonous part of the lexicon is not entirely unheard of but infrequent.
– H13: Tetun Dili is the only RL for which special syntactic conditions are invoked for the emergence of GG. It cannot be excluded that this is also the case for other RLs. However, the descriptions are not explicit about this issue. (Cf. H10)
– H14: Convincing proof of the validity of this hypothesis comes from Basque and to a certain extent also from Mednyj Aleut (because other varieties of Aleut are not affected). Regional varieties might function as prime mover also in other cases but we cannot decide this issue for lack of evidence.
– H15: In accordance with H7, Ottoman Turkish, Persian, and Georgian suggest that GG-borrowing can be the privilege of the stylistically elaborated educated written register.

The enumeration of the successes and failures of the hypotheses does not answer the question whether the RLs have GG or not. H1–H15 suggest that the answer is not straightforward. If compulsory marking and agreement is the yardstick then Correntinian Guaraní is probably the sole exemplar of a GG-language among the above RLs. Correntinian Guaraní also stands out because it conforms to Greenberg’s (1978) model according to which determiners such as demonstratives and definite articles are major sources for the grammaticalization of GG-markers. In none of the other RLs do determiners participate in the agreement patterns. In contrast to Correntinian Guaraní, the remainder of the languages under review fails to meet some of the criteria which are necessary to be admitted to the class of GG-languages. However, we doubt that it makes sense linguistically to declare them sweepingly GG-less languages. At least some of them are instances of languages equipped with marginal GG (T. Stolz 2012). Marginal GG applies if in a given language the marking and agreement of GG is subject to sociolinguistic, pragmatic, syntactic, lexical, and/or semantic restrictions so that the type and token frequency of overt GG-patterns is limited and variation applies. Marginal GG must be distinguished from proto-GG which involves cases of overt sex-marking without any further ramifications in morphosyntax (T. Stolz 2012). The false friends discussed in Section 4.2.1 instantiate cases of proto-GG. Most of the other RLs display properties of marginal GG albeit to different degrees. Whether there is a dynamic continuum connecting proto-GG with proper GG via marginal GG as the intermediate stage is a topic that needs to be looked into in future studies dedicated to the contact-induced emergence of GG.

We start from the bottom, i.e. with a language whose status as language with marginal GG is the most doubtful. Superficially, Option (b) – NP-V agreement (Section 2) – is represented only once, namely by Mednyj Aleut. But Mednyj Aleut is exceptional since no examples of full-blown agreement could be identified. The data suggest that we are dealing with sex-marking in reference to the speaker. Therefore, Mednyj Aleut is provisionally counted out as language with marginal GG.

In most of the remaining cases, Hockett’s criterion of exhaustiveness seems to be violated, meaning that not all of the language’s nouns are assigned to a GG. On closer inspection the situation is not as bad as that. The usual scenario is as follows. A very small segment of the nominal lexicon is singled out for the purpose of GG-marking and GG-agreement. The members of this privileged group are human nouns which refer to female beings. Those nouns which lack at least one of the features [+human] or [+female] together constitute the second but numerically much stronger class of the marginal GG-system. This system has a basic binary structure along the lines of the opposition FHUMAN vs. NON-F.
In her formal account of the morphosyntax of GG, Kramer (2015: 247) proposes the feature [+/– FEM] as criterion for the generation of GG-systems (which, however, cannot account for GG-systems of the Algonquian or Bantu kind). It is tempting to relate this feature to the prominent role the F category plays in the borrowing of GG-systems illustrated in the empirical part of this study. Animacy and sex interact when it comes to creating a GG-system by way of borrowing. This is in line with the cross-linguistic dominance of sex-based GG-systems (Corbett 2005b). Of the three Options (a)–(c) presented in Section 2 it is Option (a) which applies almost exclusively – and that with a clear preference for NP-internal agreement with adjectives. In contrast, there is no convincing example of Options (b)–(c). NP-V GG-agreement and pronominal GG are thus unlikely candidates for the emergence of GG in language-contact situations. The small number of targets which agree in GG with an equally restricted number of controllers adds to the marginal status of GG in the RLs. If the GG-system is understood as fundamentally binary with a highly specialized F GG on
the one hand and a non-F GG on the other, one might dare to claim that agreement also applies in the case of the non-F GG albeit without any dedicated morphology. Admittedly this is a very strong claim which cannot be substantiated satisfactorily on the basis of the empirical data reviewed in this paper. It is promising for the theory of GG and that of agreement to investigate this issue further in order to determine how exactly cases of marginal gender can be integrated into a general framework. Tangible evidence of GG-borrowing stems predominantly from contact scenarios which involve a donor language whose GG-markers are easy to identify (to take up an argument developed by Eliasson 2012). Spanish, Russian, and Arabic too (incidentally) display identical marking for the F GG by way of suffixing -a on controllers and targets. We assume that this piece of concatenative morphology ranks relatively high on the borrowability scale. Hindi -i is a similar case. In the absence of examples of GG-borrowing on the basis of opaque morphological strategies, we hypothesize that the globally small turnout of bona fide instances of GG-borrowing can be explained in part as the effect of the different degrees of transparency of the morphological means the donor languages employ for the coding of GG. Again, this is a daring assumption which calls for being thoroughly tested in cross-linguistic perspective. The grammar-oriented theory of language contact might gain important insights into the mechanisms of borrowing by way of inquiring further into this issue. This study has demonstrated that it is worthwhile researching the emergence of GG in language-contact situations. It has also shown that the empirical support is still largely insufficient. There is no guarantee that what some of our sources present as instances of GG-borrowing is constitutive of a system (in the making). 
We are in dire need of more dedicated research on the matter at hand on a much larger and more reliable cross-linguistic empirical basis. Moreover, it has become clear that the data need to be re-analyzed on the basis of a unitary model and theory. In a way, we have presented a kind of problem inventory that invites follow-up studies. These follow-up studies should address the many sociolinguistic aspects which we have only touched upon in this paper. The occasional survival of (marginal) GG under Creolization and language mixing resembles the early stages of GG-emergence in language contact. Does this resemblance mean that there is some kind of mirror effect in the dynamics of GG-systems in general? If we want to understand better how GG behaves under the conditions of language contact it is crucial to know whether and how language contact can trigger the genesis of GG.


Acknowledgments: Test-runs of this study were presented at the 52nd Annual Meeting of the Societas Linguistica Europaea (Leipzig, 22 August, 2019), at the 12th Nordwestdeutsches Linguistisches Kolloquium (Bremen, 24 November, 2019) and at the 19th International Morphology Meeting (Vienna, 8 February, 2020). We are grateful to our discussants for their thought-provoking comments on our talks. We say thank you to Éva Csató, Francesco Gardani, Martin Haspelmath, John Hajek, Nicole Hober, Susanne Michaelis, Benjamin Saade, Frank Seifart, Christel Stolz, and Catharina Williams-van Klinken for their expert advice. We are indebted especially to Greville Corbett who commented on the draft version of this paper. The entire responsibility for what is said and how it is said in this study remains ours.

Abbreviations

1/2/3    1st/2nd/3rd person
A        a-set/ergative
ADJ      adjectivizer
AOR      aorist
ART      article
ATT      attributive (in quotes from other sources)
CL       class
COP      copula
DEF      definite
DEM      demonstrative
DEP      dependent marker
DIM      diminutive
DL       donor language
EXI      existential
F(EM)    feminine
FOC      focus
FRUIT    fruit class
GEN      genitive
GG       grammatical gender
H        hypothesis
IMP      imperative
INDEF    indefinite
INVERS   sentence inversion marker
LEAF     leaf class
LINK     linker particle
M(ASC)   masculine
MAT      matter
NEG      negation/negative
NOM      nominative
NP       noun phrase
NT       neuter
PAR      particle
PASS     passive
PAT      pattern
PERF     perfective
PL       plural
POR      possessor
POS(S)   possessive
PROP     proprial
PRS      present tense
PST      past tense
QUAL     (possessor of a) quality
RED      reduplication
REFL     reflexive
RL       replica language
S        subject
SG       singular
TOP      toponym
V        verb
X        cross-reference
Y        original hyphenization, morpheme glosses, and English translation

References

Aikhenvald, Alexandra Y. 2000. Classifiers. A typology of noun categorization devices. Oxford: Oxford University Press.
Amiridze, Nino. 2016. On gender-copy in Georgian. Presentation at the workshop Language contact in the territory of the former Soviet Union. The 49th Annual Meeting of the Societas Linguistica Europaea. 31 August, 2016, Naples, Italy.
Amiridze, Nino. 2020. Borrowing feminine marking in Middle vs. Modern Georgian. Poster presentation at the 19th International Morphology Meeting, Vienna University of Economics and Business, February 6–8, 2020, Vienna, Austria.
Arnold, Werner. 2007. Arabic grammatical borrowing in Western Neo-Aramaic. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 185–196. Berlin & New York: Mouton de Gruyter.
Audring, Jenny. 2016. Gender. Oxford research encyclopedias: Linguistics. 42 pp. [oxordre.com/linguistics] (accessed May 22, 2020).
Bakker, Dik, Jorge Gómez Rendón & Ewald Hekking. 2008. Spanish meets Guaraní, Otomí and Quichua: A multilingual confrontation. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 165–238. Berlin & New York: De Gruyter.


Bakker, Dik & Ewald Hekking. 2012. Constraints on morphological borrowing: Evidence from Latin America. In Lars Johanson & Martine Robbeets (eds.), Copies versus cognates in bound morphology, 187–220. Leiden & Boston: Brill.
Baklanova, Ekaterina. 2016. On marginal gender in Tagalog: A case study. In B. W. Kasevič, A. Ju. Vixrovoj & I.M. Rumjanceva (eds.), Materialy XII Meždunarodnoj naučnoj konferencii “Jazyki Dal’nego Vostoka, Jugo-Vostočnoj Azii i Zapadnoj Afriki” LESEWA, Moskva, 16–17 nojabrja 2016 goda [Proceedings of the XII International Scientific Conference “Languages of Far East, South-Eastern Asia, and Western Africa”, LESEWA, Moscow, 16–17 November 2016], 25–33. Moskva: Jazyki narodov mira.
Baklanova, Ekaterina. 2017. Types of borrowing in Tagalog/Filipino (with special remarks on Ortograpiyang Pambansa, 2013). Kritika Kultura (Manila) 28. 35–54.
Bartels, Hauke. 2009. Loanwords in Lower Sorbian, a Slavic language of Germany. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 304–329. Berlin & New York: De Gruyter.
Boretzky, Norbert & Birgit Igla. 1994. Interferenz und Sprachwandel. In Benedikt Jeßing (ed.), Sprachdynamik. Auf dem Weg zu einer Typologie sprachlichen Wandels. Band III: Interferenzlinguistik, 7–138. Bochum: Brockmeyer.
Bowen, J. Donald. 1971. Hispanic languages and influence on Oceania. Current Trends in Linguistics 8. 938–952.
Cerno, Leonardo. 2010. Spanish articles in Correntinean Guaraní. A comparison with Paraguayan Guaraní. STUF/Language Typology and Universals 63(1). 20–38.
Cerno, Leonardo. 2013. El guaraní correntino: Fonología, gramática, textos. Frankfurt a.M.: Lang.
Chamoreau, Claudine. 2012. Spanish diminutive markers -ito/-ita in Mesoamerican languages. A challenge for acceptance of gender distinction. In Martine Vanhove, Thomas Stolz, Aina Urdze & Hitomi Otsuka (eds.), Morphologies in contact, 71–90. Berlin: Akademie Verlag.
Claudi, Ulrike. 1985. Zur Entstehung von Genussystemen: Überlegungen zu einigen theoretischen Aspekten verbunden mit einer Fallstudie des Zande. Hamburg: Buske.
Comrie, Bernard. 1981. The languages of the Soviet Union. Cambridge: Cambridge University Press.
Comrie, Bernard. 2008. Inflectional morphology and language contact, with special reference to mixed languages. In Peter Siemund & Noemi Kintana (eds.), Language contact and contact languages, 15–32. Amsterdam & Philadelphia: John Benjamins.
Corbett, Greville G. 1991. Gender. Cambridge & New York: Cambridge University Press.
Corbett, Greville G. 2005a. Number of genders. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), World atlas of language structures, 126–129. Oxford: Oxford University Press.
Corbett, Greville G. 2005b. Sex-based and non-sex-based gender systems. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), World atlas of language structures, 130–133. Oxford: Oxford University Press.
Corbett, Greville G. 2005c. Systems of gender assignment. In Martin Haspelmath, Matthew S. Dryer, David Gil & Bernard Comrie (eds.), World atlas of language structures, 134–137. Oxford: Oxford University Press.
Corbett, Greville G. 2006. Agreement. Cambridge: Cambridge University Press.
Corbett, Greville G. & Sebastian Fedden. 2016. Canonical gender. Journal of Linguistics 52. 495–531.


Corbett, Greville G., Sebastian Fedden & Raphael Finkel. 2017. Single versus concurrent systems: Nominal classification in Mian. Linguistic Typology 21(2). 209–260.
Csató, Éva. 2001. Karaim. In Thomas Stolz (ed.), Minor languages of Europe. A series of lectures at the University of Bremen, April–July 2000, 1–24. Bochum: Brockmeyer.
Dahl, Östen. 2004. The growth and maintenance of linguistic complexity. Amsterdam & Philadelphia: John Benjamins.
De la Fuente, José Andrés Alonso. 2018. The influence of Russian on the Eskaleut languages. Przegląd Rusycystyczny 2(162). 99–125.
Dietrich, Wolf. 2010. Lexical evidence for a redefinition of Paraguayan ‘Jopara’. STUF/Language Typology and Universals 63(1). 39–51.
Dolberg, Florian. 2019. Agreement in language contact: Gender development in the Anglo-Saxon Chronicle. Amsterdam & Philadelphia: John Benjamins.
Duke, Janet. 2009. The development of gender as a grammatical category. Five case studies from the Germanic languages. Heidelberg: Winter.
Eliasson, Stig. 2012. On the degree of copiability of derivational and inflectional morphology: Evidence from Basque. In Lars Johanson & Martine Robbeets (eds.), Copies versus cognates in bound morphology, 259–297. Leiden & Boston: Brill.
Enger, Hans-Olav. 2011. Gender and contact: A natural morphology perspective on Scandinavian examples. In Peter Siemund (ed.), Linguistic universals and language variation, 171–203. Berlin & Boston: De Gruyter Mouton.
Epps, Patience. 2007a. Grammatical borrowing in Hup. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 551–580. Berlin & New York: De Gruyter.
Epps, Patience. 2007b. The Vaupés melting pot: Tucanoan influence on Hup. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Grammars in contact. A cross-linguistic typology, 267–289. Oxford: Oxford University Press.
Epps, Patience. 2008. A grammar of Hup. Berlin & New York: De Gruyter.
Espiritu, Percy. 1984. Let’s speak Ilokano. Honolulu: University of Hawaii Press.
Field, Fredric W. 2002. Linguistic borrowing in bilingual contexts. Amsterdam & Philadelphia: John Benjamins.
Fischer, Steven Roger. 2007. Grammatical borrowing in Rapanui. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 387–402. Berlin & New York: De Gruyter.
Gardani, Francesco. 2008. Borrowing of inflectional morphemes in language contact. Frankfurt a.M.: Lang.
Gardani, Francesco. 2012. Plural across inflection and derivation, fusion and agglutination. In Lars Johanson & Martine Robbeets (eds.), Copies versus cognates in bound morphology, 71–98. Leiden & Boston: Brill.
Gardani, Francesco. 2019. Morphology and contact-induced language change. In Anthony P. Grant (ed.), The Oxford handbook of language contact, 96–122. Oxford: Oxford University Press.
Gardani, Francesco. 2020. Borrowing matter and pattern in morphology. An overview. Morphology 30. 263–282.
Golovko, Evgenij V. 1994. Mednyj Aleut or Copper Island Aleut: An Aleut-Russian mixed language. In Peter Bakker & Maarten Mous (eds.), Mixed languages: 15 case studies in language intertwining, 113–121. Amsterdam: IFOTT.


Golovko, Evgenij V. 1997. Mednovskix aleutov jazyk [Language of Mednyj Aleut]. In A. P. Volodin, Nikolaj Vaxtin & A. A. Kibrik (eds.), Jazyki mira. Paleoaziatskie jazyki [Languages of the world. Paleo-Asiatic languages], 117–125. Moskva: Indrik.
Golovko, Evgenij V. 2009. Aleutskij jazyk v Rossijskoj Federacii (struktura, funkcionirovanie, kontaktnye javlenija) [Aleut language in the Russian Federation (structure, functioning, contact phenomena)]. Sankt-Peterburg: Institut Lingvističeskix Issledovanij Rossijskoj Akademii Nauk PhD-thesis.
Golovko, Evgenij V. & Nikolai B. Vakhtin. 1990. Aleut in contact: The Copper Island Aleut Enigma. Acta Linguistica Hafniensia 22. 97–125.
Gómez Rendón, Jorge. 2019. Language contact in Paraguayan Guaraní. In Anthony P. Grant (ed.), The Oxford handbook of language contact, 694–712. Oxford: Oxford University Press.
Gómez Rendón, Jorge & Willem Adelaar. 2009. Loanwords in Imbabura Quechua. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 944–967. Berlin & New York: De Gruyter.
Grant, Anthony P. (ed.). 2019a. The Oxford handbook of language contact. Oxford: Oxford University Press.
Grant, Anthony P. 2019b. Contact-induced linguistic change. In Anthony P. Grant (ed.), The Oxford handbook of language contact, 1–48. Oxford: Oxford University Press.
Greenberg, Joseph H. 1978. How does a language acquire gender markers? In Joseph H. Greenberg, Charles A. Ferguson & Edith A. Moravcsik (eds.), Universals of human language III: Word structure, 47–82. Stanford: Stanford University Press.
Gutiérrez Morales, Salomé. 2012. Morphological borrowing in Sierra Popoluca. In Lars Johanson & Martine Robbeets (eds.), Copies versus cognates in bound morphology, 221–232. Leiden & Boston: Brill.
Haase, Martin. 1993. Sprachkontakt und Sprachwandel im Baskenland. Die Einflüsse des Gaskognischen und Französischen auf das Baskische. Hamburg: Buske.
Hajek, John. 2006. Language contact and convergence in East Timor: The case of Tetun Dili. In Alexandra Y. Aikhenvald & R. M. W. Dixon (eds.), Grammars in contact, 163–178. Oxford: Oxford University Press.
Hajek, John & Catharina Williams-van Klinken. 2019. Language contact and gender in Tetun Dili: What happens when Austronesian meets Romance? Oceanic Linguistics 58(1). 59–91.
Heath, Jeffrey. 1978. Linguistic diffusion in Arnhem Land. Canberra: Australian Institute of Aboriginal Studies.
Heine, Bernd. 1973. Pidgin-Sprachen im Bantu-Bereich. Berlin: Dietrich Reimer Verlag.
Hockett, Charles F. 1958. A course in modern linguistics. New York: Macmillan.
Holm, John. 2008. Creolization and the fate of inflections. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 299–324. Berlin & New York: De Gruyter.
Hualde, José Ignacio, Gorka Elordieta & Arantzazu Elordieta. 1994. The Basque dialect of Lekeitio. Bilbo: Euskal Herriko Unibertsitatea.
Hualde, José Ignacio & Jon Ortiz de Urbina. 2003. A grammar of Basque. Berlin & New York: De Gruyter.
Hurch, Bernhard. 1989. Hispanisierung im Baskischen. In Norbert Boretzky, Werner Enninger & Thomas Stolz (eds.), Vielfalt der Kontakte. Band 1, 11–36. Bochum: Brockmeyer.


Janurik, Boglárka. 2015. The emergence of gender agreement in code-switching verbal constructions in Erzya-Russian bilingual discourse. In Christel Stolz (ed.), Language empires in comparative perspective, 199–218. Berlin & Boston: De Gruyter Mouton. Johanson, Lars. 2002. Structural factors in Turkic language contact. Richmond: Curzon. Johanson, Lars. 2013. Written language intertwining. In Peter Bakker & Yaron Matras (eds.), Contact languages. A comprehensive guide, 273–332. Berlin & Boston: De Gruyter Mouton. Kallfell, Guido. 2011. Grammatik des Jopara. Gesprochenes Guaraní und Spanisch in Paraguay. Frankfurt a.M.: Lang. Kleinewillinghöfer, Ulrich. 2017. Adamawa-Ubangi. In Jacob E. Mabe (ed.), Das Afrika-Lexikon: Ein Kontinent in 1000 Stichworten, 3–4. Stuttgart: Metzler. Kobayashi, Masato & Bablu Tirkey. 2017. The Kurux language. Leiden & Boston: Brill. Kossmann, Maarten. 2010. Parallel system borrowing. Parallel morphological systems due to paradigm borrowing. Diachronica 27(3). 459–487. Kovačec, August. 1968. Observations sur les influences croates dans la grammaire istroroumaine. La Linguistique 4(1). 79–115. Kowalski, Tadeusz. 1929. Karaimische Texte im Dialekt von Troki. Warszawa: Nakładem Polskiej Akademji Umiejętności. Kramer, Ruth. 2015. The morphosyntax of gender. Oxford: Oxford University Press. Leiss, Elisabeth. 2005. Derivation als Grammatikalisierungsbrücke für den Aufbau von Genusdifferenzierungen im Deutschen. In Torsten Leuschner, Tanja Mortelmans & Sarah De Groodt (eds.), Grammatikalisierung im Deutschen, 11–30. Berlin & New York: De Gruyter. López, Cecilio. 1965. The Spanish overlay in Tagalog. Lingua 14. 467–504. Loporcaro, Michele. 2018. Gender from Latin to Romance. History, geography, typology. Oxford: Oxford University Press. Matras, Yaron. 2007. The borrowability of structural categories. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 31–74. Berlin & New York: De Gruyter. Matras, Yaron. 
2009. Language contact. Cambridge: Cambridge University Press. Matras, Yaron & Jeanette Sakel. 2007. Grammatical borrowing in cross-linguistic perspective. Berlin & New York: De Gruyter. Maurer, Philippe. 2013. Gender agreement of adnominal adjectives. In Susanne Maria Michaelis, Martin Haspelmath, Magnus Huber, Philippe Maurer, Bradley Taylor & Robert Forkel (eds.), The atlas of Pidgin and Creole language structures, 154–157. Oxford: Oxford University Press. Menovščikov, G. A. 1964. K voprosu o pronicaemosti grammatičeskogo stroja jazyka [Towards the question of the grammatical permeability in the grammatical system of the language]. Voprosy Jazykoznanija 5. 100–106. Menovščikov, G. A. 1968. Aleutskij jazyk. Jazyki narodov SSSR, 5: mongol’skie, tungusoman’čžurskie i paleoaziatskie jazyki [Languages of the peoples of the USSR, 5: Mongolian, Tungus-Manchzhur, and Paleo-Asiatic languages], 386–406. Leningrad: Nauka. Menovščikov, G. A. 1969. O nekotoryx social’nyx aspektax ėvoljucii jazyka [On some social aspects of the language evolution]. In A. V. Desnickaja, V. M. Žirmunskij & L. S. Kovtun (eds.), Voprosy social’noj lingvistiki, 110–134. Leningrad: Nauka. Muysken, Pieter. 2001. Spanish grammatical elements in Bolivian Quechua: The Transcripciones. In Klaus Zimmermann & Thomas Stolz (eds.), Lo propio y lo ajeno en las


lenguas austronésicas y amerindias. Procesos interculturales en el contacto de lenguas indígenas con el español en el Pacífico e Hispanoamérica, 59–82. Frankfurt am Main: Vervuert. Neukom, Lukas. 2001. Santali. München & Newcastle: Lincom Europa. Nichols, Johanna. 2003. Diversity and stability in language. In Brian D. Joseph & Richard D. Janda (eds.), The handbook of historical linguistics, 283–310. Malden/MA: Blackwell. Oñederra, Miren Lourdes & Bernhard Hurch. 1990. Borrowing in Basque. In Werner Bahner, Joachim Schildt & Dieter Viehweger (eds.), Proceedings of the 14th International Congress of Linguists. Berlin/GDR, August 10–15, 1987, 1732–1735. Vol. II. Berlin: Akademie Verlag. Pagel, Steve. 2008. The old, the new and the in-between: comparative aspects of Hispanisation on the Marianas and Easter Island (Rapa Nui). In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Hispanisation. The impact of Spanish on the lexicon and grammar of the indigenous languages of Austronesia and the Americas, 167–202. Berlin & New York: De Gruyter. Pagel, Steve. 2010. Spanisch in Ozeanien. Frankfurt a.M.: Lang. Parker, Gary John. 1969. Ayacucho Quechua grammar and dictionary. The Hague: Mouton. Pasch, Helma. 1988. Entlehnung von Bantu-Präfixen in eine Nicht-Bantusprache. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 41(1). 48–63. Pensalfini, Rob & Felicity Meakins. 2019. Gender lender: Noun borrowings between Jingulu and Mudburra in Northern Australia. Journal of Language Contact 12. 444–482. Peterson, John. 2011. A grammar of Kharia. A South Munda language. Leiden & Boston: Brill. Petrovici, Émile. 1967. Le neutre en istro-roumain. In N.N. (eds.), To honor Roman Jakobson: Essays on the occasion of his seventieth birthday, 1523–1526. The Hague & Paris: Mouton. Poppe, Nikolaus. 1964. Das Mittelmongolische. In Nikolaus Poppe (ed.), Mongolistik, 96–103. Leiden & Köln: Brill. Putniņš, Eduārds. 1985. 
Svētciema izloksnes apraksts [Outline of the Svetciema dialect]. Rīga: Zinātne. Quilis, Antonio. 1973. Hispanismos en Cebuano. Madrid: Ediciones Alcalá. Quilis, Antonio. 1992. La lengua española en cuatro mundos. Madrid: Mapfre. Reershemius, Gertrud. 2007. Grammatical borrowing in Yiddish. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 245–260. Berlin & New York: De Gruyter. Rosen, Nicole & Carrie Gillon. 2018. Nominal contact in Michif. Oxford: Oxford University Press. Rothe, Astrid. 2012. Genus und Mehrsprachigkeit. Zu Code-Switching und Entlehnung in der Nominalphrase. Heidelberg: Winter. Royen, Gerlach. 1929. Die nominalen Klassifikations-Systeme in den Sprachen der Erde. Historisch-kritische Studie, mit besonderer Berücksichtigung des Indogermanischen. Mödling: Anthropos. Rubino, Carl. 2000. Ilocano dictionary and grammar. Honolulu: University of Hawaii Press. Sakel, Jeanette. 2007. Types of loan: Matter and pattern. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 15–30. Berlin & New York: De Gruyter. Schachter, Paul & Fe T. Otanes. 1972. Tagalog reference grammar. Berkeley: University of California Press. Schmitt, Rüdiger. 1989. Mitteliranische Periode. In Rüdiger Schmitt (ed.), Compendium Linguarum Iranicum, 95–105. Wiesbaden: Harrassowitz.


Seifart, Frank. 2012. The principle of morphosyntactic subsystem integrity in language contact. Evidence from morphological borrowing in Resígaro (Arawakan). Diachronica 29(4). 471–504. Seifart, Frank. 2020. AfBo: A world-wide survey of affix borrowing. Leipzig: Max Planck Institute for Evolutionary Anthropology. DOI: 10.5281/zenodo.3610155 [accessed on 23 May, 2020]. Sekerina, Irina A. 1994. Copper Island (Mednyj) Aleut (CIA): A mixed language. Languages of the World 8. 14–31. Siemund, Peter. 2008. Pronominal gender in English. A study of English varieties from a crosslinguistic perspective. London & New York: Routledge. Steinkrüger, Patrick. 2008. Hispanisation processes in the Philippines. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Hispanisation. The impact of Spanish on the lexicon and grammar of the indigenous languages of Austronesia and the Americas, 203–236. Berlin & New York: De Gruyter. Steinkrüger, Patrick. 2009. The morphology of Chabacano: Its complexity in comparative perspective. In Nicholas Faraclas & Thomas B. Klein (eds.), Simplicity and complexity in Creoles and Pidgins, 175–183. London & Colombo: Battlebridge. Stolz, Christel. 2005. Zur Typologie der Genuszuweisung im Standarddeutschen und Zimbrischen. In Ermenegildo Bidese, James R. Dow & Thomas Stolz (eds.), Das Zimbrische zwischen Germanisch und Romanisch, 131–163. Bochum: Brockmeyer. Stolz, Christel. 2008. Loan word gender. A case of Romanicisation in Standard German and related enclave varieties. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 399–440. Berlin & New York: De Gruyter. Stolz, Christel. 2009. A different kind of gender problem. Maltese loan-word gender from a typological perspective. In Bernard Comrie, Ray Fabri, Elizabeth Hume, Manwel Mifsud, Thomas Stolz & Martine Vanhove (eds.), Introducing Maltese linguistics. 
Selected papers from the 1st International conference on Maltese Linguistics, Bremen, 18–20 October, 2007, 321–353. Amsterdam & Philadelphia: John Benjamins. Stolz, Thomas. 2002. General linguistic aspects of Spanish-indigenous language contacts with special focus on Austronesia. Bulletin of Hispanic Studies 79(2). 133–158. Stolz, Thomas. 2012. Survival in a niche. On gender-copy in Chamorro (and sundry languages). In Martine Vanhove, Thomas Stolz, Aina Urdze & Hitomi Otsuka (eds.), Morphologies in contact, 93–140. Berlin: Akademie Verlag. Stolz, Thomas. 2015. Adjective-noun agreement in language contact: Loss, realignment and innovation. In Francesco Gardani, Peter Arkadiev & Nino Amiridze (eds.), Borrowed morphology, 269–301. Berlin & Boston: De Gruyter Mouton. Stolz, Thomas & Nataliya Levkovych. this volume. Parallel Romancisation: Chamorro and Tetun Dili – two heavy borrowers compared. Szemerényi, Oswald. 1980. Language decay, the result of imperial agrandissement? In Jean Bingen, André Coupez & Francine Mawet (eds.), Hommages à Maurice Leroy. Recherches de Linguistique, 206–214. Bruxelles: Éditions de l’Université de Bruxelles. Tadmor, Uri. 2007. Grammatical borrowing in Indonesian. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 301–328. Berlin & New York: De Gruyter. Thomason, Sarah G. 1997. Mednyj Aleut. In Sarah G. Thomason (ed.), Contact languages: A wider perspective, 449–468. Amsterdam & Philadelphia: John Benjamins.


Thomason, Sarah G. 2001. Language contact. An introduction. Washington/DC: Georgetown University Press. Thomason, Sarah G. & Terrence Kaufman. 1988. Language contact, creolization, and genetic linguistics. Berkeley, London & Los Angeles: University of California Press. Weil, Gotthold. 1917. Grammatik der osmanisch-türkischen Sprache. Berlin: Reimer. Wohlgemuth, Jan & Michael Cysouw (eds.). 2010. Rara & rarissima. Documenting the fringes of linguistic diversity. Berlin & Boston: De Gruyter Mouton.

Deborah Arbes

Language contact and number inflection in Patagonian Welsh

Abstract: Patagonian Welsh as a heritage language offers a valuable opportunity to investigate language contact between Welsh and Spanish. In this study, Spanish and English loan nouns from the Patagonia Corpus are analyzed regarding their inflection for number in order to measure the extent to which they are integrated into Welsh, and a comparison is drawn to data from the Siarad Corpus recorded in Wales. The findings from the corpus study reveal that former language contact with English is still a key element in Patagonian Welsh. Spanish nouns occur mostly as single word codeswitches and are very rarely integrated into Welsh morphologically.

Keywords: borrowing; codeswitching; heritage language; plural; Y Wladfa

1 Introduction

When the community of Welsh settlers became an integrated part of Argentina, a predominantly Spanish-speaking country, at the turn of the 20th century, it was foreseeable that Spanish would have an effect on the Welsh variety spoken there. Many years after the establishment of the settlement, the question arises: to what extent is the Welsh language in Patagonia in contact with Spanish, and how do this contact situation and the previous language contact with English shape the language spoken today? Since speakers of Patagonian Welsh are now bilingual or multilingual and Spanish is most often their dominant language, language contact is about as intense as it can be. This is confirmed by R. O. Jones (1984), who demonstrated that certain phonological differences from the language spoken in Wales were present in the speech of large parts of the community in the 1970s. According to the borrowing scale established by Thomason (2001), borrowing on both a lexical

|| Deborah Arbes: University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany. https://doi.org/10.1515/9783110785517-003


and on a morphological level can be expected at this stage, especially in typologically similar languages (see also Thomason 2015). According to several borrowing hierarchies, nouns are more likely to occur as loanwords and codeswitches than other parts of speech (see Matras 2009).

Welsh nouns mark grammatical number and gender. The latter is covert in that it only shows in agreement with some adjectives and by mutation (see Section 1.2). Singular nouns are often unmarked as well; therefore, multiplex¹ nouns are chosen as the subject of this study. By examining the suffixes and sound changes involved in the number inflection of loanwords, the extent of their integration into Welsh can be assessed. The Patagonia Corpus provides a valuable insight into the Welsh language spoken in the Chubut province of Argentina and is chosen as a resource for this study. The Siarad Corpus will also be examined, as it provides a benchmark for comparisons with the language spoken in Wales.

The following research questions regarding borrowing and codeswitching of lexical items will be answered in this study:

1a) To what extent are English and Spanish nouns integrated by Welsh number inflection in the Patagonia Corpus?
1b) Is there a difference between the Siarad and Patagonia Corpora in terms of the integration of English loanwords?
2a) How frequent are insertions of Spanish and English nouns pluralized by -(e)s in a Welsh context (i.e. single word codeswitches)?
2b) Are these insertions integrated into Welsh by soft mutations?

Plurality markers in general have been found to be “more prone to borrowing than, for instance, case or person morphology” (Matras 2015: 61). It is therefore conceivable that English as well as Spanish morphology have an effect on Welsh nouns. Both possible donor languages employ a form of the plural suffix -s (sometimes realized as -es or -ies, represented as -s in the following), and it has been shown that Welsh-origin nouns may adopt this plural suffix, e.g. pregethwrs/pregethwyrs (instead of pregethwyr) ‘preachers’ (P. W. Thomas 1996: 175). These considerations lead me to the hypothesis that, as a result of language contact with two languages which employ the plural suffix -s, Welsh-origin nouns in Patagonia are increasingly pluralized by this suffix. Research question 3) is therefore concerned with morphological rather than lexical borrowing:

3) Is the suffix -s borrowed (from either Spanish or English) and applied to Welsh-origin nouns in the Patagonia Corpus?

|| 1 This includes plural as well as collective nouns (cf. Haspelmath and Karjus 2017).


The remainder of this paper is structured as follows. First, more background information about the history of the Welsh settlement in Chubut is given (Section 1.1), before previous studies on Patagonian Welsh, other relevant studies and key concepts for this paper are introduced (Section 1.2). The method used for this study is described in Section 2, alongside an insight into the creation of the Patagonia Corpus. Section 3 is centered around nouns with Welsh number inflection: Spanish and English loanwords are discussed separately in Sections 3.1 and 3.2, and a conclusion is reached in Section 3.3. English loanwords employing the suffix -ys are placed on the codeswitching-borrowing continuum in Section 4, before the subject of codeswitching is discussed in Section 5. There, Spanish and English single word codeswitches are treated in Sections 5.1 and 5.2 respectively. Finally, morphological borrowing of the suffix -s is addressed in Section 6, and the conclusions are summarized in Section 7.

1.1 Historical background

When a Welsh colony overseas was planned in the 19th century, the Argentine government ultimately granted a request by Michael D. Jones for a group of Welsh people to settle in a part of Chubut. The settlers wished to establish a community in which they were removed from the influence of the British Empire, the Anglican Church and the English language, as living conditions in their own country had become difficult, economically as well as culturally (see Williams 1975). For Argentina’s politicians, the objective was to increase the population via immigration from Europe and thus achieve economic growth. On the 28th of July 1865, the first 153 Welsh emigrants arrived in Patagonia, and many more were to follow: between 1865 and 1911, around 3,000 settlers left Wales to begin a new life in Argentina.

In the decades following its founding, the primary language of the settlement was Welsh. Its domains included companies, political and legal institutions as well as schools, churches and homes, as all these institutions were independent and Y Wladfa (Welsh for ‘the colony’) was largely self-sustaining (see R. O. Jones 1998). In fact, the settlers were now in the paradoxical situation of being both colonizer and (formerly) colonized:

Argentine, Welsh, and indigenous leaders thus all understood Y Wladfa to be a settler colony, but the Welsh ‘colonizers’ were not all-commanding. The first years were characterized by vulnerability and dependency, rather than power. Like many pioneer settlers, their relationship with indigenous inhabitants was vital for the Welsh, and the archives express the ambivalence that their subject position dictated. (Taylor 2018: 462)


Not many years after their first arrival, the settlers witnessed the “Conquest of the Desert”²: the indigenous groups living in the regions around the Chubut River suffered military aggression, which resulted in many killings and in confinement to reserves set up by the national state (see Williams 2017: 55). While the settlers were privileged in that the Argentine military was not attempting to kill them, they were not powerful enough to stop the army’s attacks on their neighbors. By the end of the nineteenth century, the Argentine government enforced stricter rules on the settlement, especially concerning the use of Spanish. One measure taken to assimilate the community was the creation of laws which made Spanish-medium education obligatory in all public schools. Bilingualism further became the norm when more European immigrants moved into the region in the following decades (see Rees 2021: 246–247).

The use of Welsh was decreasing in the area until the Welsh Language Project (WLP) was founded in 1997 with the aim of promoting and developing the Welsh language in the Chubut region of Patagonia. Since the 1990s, teachers from Wales have travelled to Argentina every year and supported the Welsh community and the newly founded bilingual schools. According to the annual report on the Welsh Language Project, the number of Welsh speakers in Argentina is unknown. The numbers of classes and of people who learn Welsh through the WLP, however, are recorded exactly: in 2019, 114 classes were offered (about 70% of them for children and young people) and the number of Welsh learners in Chubut rose to 1,411. It is estimated that about 50,000 people of Welsh descent were living in Patagonia at the beginning of the 21st century (Arwel 2019).

|| 2 See Masés (2002) for more information on this topic.

1.2 Previous work and theoretical concepts

Nowadays, Welsh remains a heritage language spoken by people who acquired the language in their homes and were then sent to Spanish-medium schools, as well as by people who started learning Welsh in classes as children or adults. The term “heritage language” is defined as “the home/minority language of a bilingual who is dominant in the main societal language” (Polinsky 2018: 10). The unique language biographies, which are intertwined with the decline and revival of the language, are addressed by Rees (2021) and R. O. Jones (1984), who provide detailed sociolinguistic analyses of the varieties of Patagonian Welsh. Berg reports that “Welsh national tourists in Patagonia often consider the dialect of the Welsh language there to be an older, more traditional form of Welsh, and speak highly of this form of the Welsh language upon their return to the homeland” (Berg 2019)³. This impression is also attributable to the fact that most Patagonian Welsh speakers do not speak English; therefore, codeswitching and spontaneous borrowing of English forms are not the norm (but cf. Sections 3–5).

A study which has provided valuable insights into codeswitching in Patagonian Welsh was carried out by Carter et al. (2011). Employing Myers-Scotton’s (2002) Matrix-Language Framework (MLF), their study focuses on bilingual sentences and their underlying structures. The Matrix Language (ML) is defined as the variety which “supplies essential morphosyntactic structure for mixed constituents, while the Embedded Language may supply content morphemes to be inserted into this frame. The Matrix Language also controls, in various ways, any monolingual Embedded Language constituents (Embedded Language islands) within the larger bilingual constituent” (Myers-Scotton 2002: 25). Although my research has been inspired by the work of Myers-Scotton and others who have employed the MLF, I do not regard the framework as a theoretical base for this study, as the present study is mostly concerned with the morphology of single words rather than clauses.

When researching codeswitching in Patagonian Welsh, Carter et al. (2011) found that in the bilingual (Welsh and Spanish) Patagonia Corpus, 97% of simple clauses are monolingual, i.e. no intra-clausal codeswitching occurs. The Matrix Language in 81% of these monolingual clauses is Welsh, and in 19% of clauses it is Spanish (see (4) for a monolingual Spanish clause). Example (1) illustrates one of the 3% of bilingual clauses⁴.

(1)

PC: PATAGONIA-11, (1020) HER⁵
mae          digon   o   o   medios    de  comunicación   rŵan.
be.3SG.PRES  enough  of  of  means.PL  of  communication  now
‘there are enough means to communicate these days.’

|| 3 Anecdotal evidence for this is found e.g. in two newspaper articles: “’I’d like to visit Wales before I die,’ another man says in the pure and fluent Welsh he has spoken since a child.” (BBC News article 27 July 2017) https://www.bbc.com/news/uk-wales-33666169 (accessed 05.05.2021). “He speaks better Welsh than we do,” said Mrs Tardioli. “[…] The Welsh he speaks is no different, except it’s a bit more pure.” (WalesOnline article about a Patagonian Welsh speaking waiter in Swansea, 25 August 2004) https://www.walesonline.co.uk/news/wales-news/waiterwaiter-theres-patagonian-pub-2427260 (accessed 05.05.2021).
|| 4 Here and in the following, the codeswitching elements and/or the relevant plural nouns are highlighted in bold.
|| 5 Here and henceforth, data taken from a corpus is presented as follows: Corpus (PC = Patagonia Corpus; SC = Siarad Corpus): File in the corpus, (Utterance no.) Abbreviation of speaker’s pseudonym.


As shown in (1), switching to Spanish does happen in Patagonian Welsh, but, as this data suggests, not usually on an intra-clausal level. In comparison, Carter et al. (2011) report a rate of 81% bilingual and 19% monolingual clauses in the Siarad Corpus, suggesting that intra-clausal codeswitching, as shown in (2), occurs significantly more often in Wales than in Patagonia. (2)

SC: Davies-15, (276) TEG
dw           i    meddwl    bod    fi   mynd   am   few  days
be.1SG.PRES  1SG  think.VN  be.VN  1SG  go.VN  for  few  day.PL
‘I think I’m going for a few days’

Perhaps the virtual absence of codeswitching is what led the aforementioned people to call Patagonian Welsh “more traditional”. However, as Rees (2017, 2021) observes, Patagonian Welsh is by no means untouched by influences from other languages: besides words and phrases which originate in different dialects of Welsh, he finds English and Spanish borrowings. Welsh speakers living in Chubut have reported that the words they used for fruits and vegetables are often English terms, which apparently had been present in the Welsh language when the community first settled in Patagonia:

Since instances of English loanwords occur (as well as Spanish ones), it is important to emphasize that these are also common in Wales (especially among older generations): it may therefore be assumed that loanwords from English formed part of the initial mixture of dialect forms. Interestingly, several heritage speakers informed me that they were unaware of the English origins of some loanwords until the arrival of teachers from Wales. (Rees 2021: 256)

Examples mentioned in interviews carried out by Rees (2017) include the nouns caretsh/caritsh ‘carrots’, grêps ‘grapes’ and plyms ‘plums’. In Wales, the forms moron ‘carrots’, grawnwin ‘grapes’ and eirin ‘plums’ appear to be more common. Furthermore, Rees (2017) lists several Spanish elements which occur in Patagonian Welsh discourse: among others, the nouns estancia ‘farm’, baño ‘bathroom’ and corral ‘corral/farmyard’ are mentioned.

Since loanwords, borrowing and codeswitching have now been mentioned, these terms need to be defined before they are employed in the research questions and the analysis. Codeswitching and borrowing have been described as lying at opposite ends of a continuum (e.g. Matras 2009), while others draw a strict line between the two (e.g. Poplack 2018). Codeswitching usually occurs between bilingual speakers, whereas loanwords may also be employed by monolinguals (Matras 2009: 111); nevertheless, the linguistic proficiency of the individual speakers is not employed as a criterion to distinguish the two in this study.


In the following analysis, the term codeswitching⁶ applies to nouns which

a) are of Spanish or English origin; AND
b) are inflected by the Spanish or English plural suffix -s (or one of its allomorphs).

By this definition, the criteria Poplack (2018) established to distinguish codeswitches from borrowed nouns are met:

– Codeswitches are not morphologically integrated into the recipient language.
– Codeswitches may be phonologically integrated into the recipient language.

An additional criterion for codeswitches, namely that the grammar of the donor language applies, cannot be met by single word codeswitches. Multiword codeswitches, for which an internal syntax would apply, are intentionally left out of this study. Poplack (2018) claims that single word codeswitches are extremely rare; whether this applies to the Patagonia Corpus as well is open to question.

Integrating a word into Welsh is possible by various means, but the primary focus of this study is number inflection (i.e. plural or singulative suffixes or vowel change). A noun employing Welsh number inflection in its multiplex or uniplex form is regarded as morphologically integrated into Welsh and therefore labeled as a loanword. Borrowing can refer to the adaptation of lexical or structural items (e.g. morphemes). “Lexical borrowings” and “loanwords” are used as synonyms here, and I follow Haspelmath (2009: 43) in using the terms “integration” and “adaptation” synonymously in this paper.

Another means of measuring integration is the variable “mutation when expected”, which has been established by Stammers and Deuchar (2012). Welsh, like all Celtic languages, has a morpho-phonological feature called initial mutation, which makes some words change their first consonant (see Table 1). These mutations can be triggered lexically or syntactically. For this study, it is mostly lexically triggered mutations that will play a role. There are three types of mutations: soft, nasal and aspirate. Only soft mutations will be taken into account here, as they are the most frequently occurring ones (see Stammers and Deuchar 2012: 638).
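The working definitions of codeswitch and loanword given above can be restated as a small decision rule. The following Python sketch is only an illustration under stated assumptions: the class `NounToken`, the field values `"donor_s"` and `"welsh"`, and the label strings are hypothetical names introduced here, not part of the study's actual annotation scheme, and the example noun siopau ‘shops’ is my own illustration rather than a token from the corpora.

```python
from dataclasses import dataclass

# Hypothetical labels; the paper defines the two concepts only in prose.
CODESWITCH = "codeswitch"
LOANWORD = "loanword"
UNCLASSIFIED = "unclassified"


@dataclass(frozen=True)
class NounToken:
    form: str        # surface form as transcribed
    origin: str      # "welsh", "spanish" or "english" (from dictionary lookup)
    inflection: str  # "donor_s": Spanish/English -s or an allomorph;
                     # "welsh": Welsh number inflection


def label(token: NounToken) -> str:
    """Apply the morphological definitions: origin AND type of number marking."""
    if token.origin in {"spanish", "english"}:
        if token.inflection == "donor_s":
            return CODESWITCH   # criteria a) and b) are both met
        if token.inflection == "welsh":
            return LOANWORD     # morphologically integrated into Welsh
    return UNCLASSIFIED         # e.g. Welsh-origin nouns


print(label(NounToken("documentos", "spanish", "donor_s")))
print(label(NounToken("siopau", "english", "welsh")))
```

Because the rule looks only at the surface morphology of the noun, it deliberately ignores speaker proficiency, in line with the purely morphological use of the labels described in the text.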

|| 6 The terms used here are chosen for the sake of labelling two concepts, whose differences are visible on the surface of the words. I do not intend to make theoretical implications by the labels. It may be the case that English forms uttered by Spanish-Welsh bilinguals with little or no knowledge of English are labeled as codeswitches, which is in line with a purely morphological interpretation of the concept but disregards the sociolinguistic implication of the term.


Table 1: Soft mutation in Welsh (cf. King 2003: 14).

Original consonant   Original consonant   Soft mutation     Soft mutation   Examples
(orthographic)       (phonetic)           (orthographic)    (phonetic)
c                    [k]                  g                 [g]             cegin → gegin ‘kitchen’
p                    [p]                  b                 [b]             plant → blant ‘children’
t                    [t]                  d                 [d]             tegell → degell ‘kettle’
g                    [g]                  (disappears)      –               gardd → ardd ‘garden’
b                    [b]                  f                 [v]             bara → fara ‘bread’
d                    [d]                  dd                [ð]             defaid → ddefaid ‘sheep’
ll                   [ɬ]                  l                 [l]             lloeau → loeau ‘calves’
m                    [m]                  f                 [v]             merch → ferch ‘girl’
rh                   [r̥]                  r                 [r]             rhosyn → rosyn ‘rose’
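The orthographic correspondences in Table 1 can be written down as a lookup, which also makes the “mutation when expected” variable easy to state. This is a minimal sketch, assuming lowercase orthographic input; the function names `soft_mutate` and `shows_expected_mutation` are hypothetical, and the sketch does not decide whether a mutation is triggered in a given context, which depends on the preceding word or the syntax.

```python
# Orthographic soft-mutation pairs taken from Table 1.
SOFT_MUTATION = {
    "ll": "l", "rh": "r",
    "c": "g", "p": "b", "t": "d",
    "g": "", "b": "f", "d": "dd", "m": "f",
}

# Digraphs that begin with a mutable letter but are separate consonants
# of the Welsh alphabet and do not undergo soft mutation.
NON_MUTATING_DIGRAPHS = ("ch", "dd", "ph", "th")


def soft_mutate(word: str) -> str:
    """Return the soft-mutated form of a lowercase word, or the word unchanged."""
    if word.startswith(NON_MUTATING_DIGRAPHS):
        return word
    # Try the digraphs "ll" and "rh" before single letters.
    for initial in sorted(SOFT_MUTATION, key=len, reverse=True):
        if word.startswith(initial):
            return SOFT_MUTATION[initial] + word[len(initial):]
    return word


def shows_expected_mutation(radical: str, observed: str) -> bool:
    """True if the observed form is the soft-mutated form of the radical."""
    mutated = soft_mutate(radical)
    return mutated != radical and observed == mutated


for radical in ("cegin", "gardd", "lloeau", "rhosyn"):
    print(radical, "->", soft_mutate(radical))
```

Applied to an inserted Spanish noun in a context where soft mutation is expected, `shows_expected_mutation("documentos", "documentos")` returns `False`, which is the kind of non-integration that research question 2b) is concerned with.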

2 Method

The Patagonia Corpus consists of 43 conversations between 92 participants, which were recorded in 2009. The recordings amount to almost 21 hours of conversation containing 195,190 word tokens. The Siarad Corpus, which is used for comparison in this study, is considerably larger: it consists of 69 transcripts of conversations from 151 speakers who were recorded between 2005 and 2007, totaling 40 hours and containing 460,000 word tokens. It is important to keep in mind that the Siarad Corpus does not represent all regiolects of Wales equally: 74% of participants grew up in the North West and 7% in the North East of Wales, so there is a bias towards Northern Welsh in the data⁷ (Deuchar et al. 2018: 22).

In order to minimize any influence the researchers might have on the conversations, several measures were taken. Firstly, no thematic suggestions were given, and the participants were free to choose any subject they wished to talk about. Secondly, participants talked to a familiar person they chose as their bilingual recording partner. Thirdly, the researcher was not present when the conversation was recorded (Carter et al. 2011: 167–168).

|| 7 For more information on the corpus and the participants see http://bangortalk.org.uk/speakers.php?c=siarad (accessed 05.05.2021).


The transcribed text was glossed automatically by the Bangor Autoglosser⁸. A complete wordlist is provided by the ESRC Centre for Research on Bilingualism and can be downloaded at http://bangortalk.org.uk/. Note that the glosses I added in this paper are not identical to the output of the Autoglosser, as I used the Leipzig Glossing Rules.

From this wordlist, a list of all multiplex nouns was extracted. Those nouns which occur in monolingual Spanish clauses were deleted from the list, as were Spanish-origin nouns which are directly preceded or followed by another Spanish noun. Thus, the only Spanish nouns left in the list for this analysis are single word codeswitches⁹. To illustrate this, the plural nouns in (3) and (4) were excluded from the list, while the noun in (5) was included:

(3)

PC: PATAGONIA-11, (1904) HER dach chi (y)n mynd yn um setenta kilómetros go.VN PRT um seventy kilometre.PL be.2PL.PRES 2PL PRT ‘You are doing 70 kilometres’

(4)

PC: PATAGONIA-18, (813) FRO por eso hay como sectores. for that EXI like sector.PL ‘that’s why there’s something like sections.’

(5)

PC: PATAGONIA-07, (45) ESM wnaeth o golli ei documentos i L do.3SG.PRET 3M.SG Llose.VN 3.M.SG.POSS document.PL to ‘he lost all his documents.’

gyd. union

L
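The exclusion criteria described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for the corpus: the `Token` class, its language tags ("cy", "es") and the function names are hypothetical. Adjacency is checked against any neighbouring Spanish-tagged word rather than against nouns only; this broader reading is an assumption made here so that the noun in (3), which follows the numeral setenta, is excluded.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Token:
    """One transcribed word; the fields are hypothetical annotations."""
    form: str
    lang: str                     # assumed tags: "cy" (Welsh), "es" (Spanish)
    is_plural_noun: bool = False


def single_word_codeswitches(clause: List[Token]) -> List[str]:
    """Return the Spanish plural nouns in a clause that survive the filter."""
    # Criterion 1: drop nouns from monolingual Spanish clauses, as in (4).
    if all(tok.lang == "es" for tok in clause):
        return []
    kept = []
    for i, tok in enumerate(clause):
        if not (tok.lang == "es" and tok.is_plural_noun):
            continue
        # Criterion 2 (broadened, see above): drop nouns with a Spanish neighbour.
        neighbours = clause[max(i - 1, 0):i] + clause[i + 1:i + 2]
        if any(n.lang == "es" for n in neighbours):
            continue
        kept.append(tok.form)
    return kept


def tag(words: str, spanish: Set[str], plural_nouns: Set[str]) -> List[Token]:
    """Helper for the examples: tag a whitespace-separated clause by hand."""
    return [Token(w, "es" if w in spanish else "cy", w in plural_nouns)
            for w in words.split()]


ex3 = tag("dach chi yn mynd yn um setenta kilómetros",
          {"setenta", "kilómetros"}, {"kilómetros"})
ex4 = tag("por eso hay como sectores",
          {"por", "eso", "hay", "como", "sectores"}, {"sectores"})
ex5 = tag("wnaeth o golli ei documentos i gyd",
          {"documentos"}, {"documentos"})

for label, clause in [("(3)", ex3), ("(4)", ex4), ("(5)", ex5)]:
    print(label, single_word_codeswitches(clause))
```

Only documentos from (5) is retained; the nouns in (3) and (4) are filtered out, in line with the description above.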

Each multiplex noun was looked up in Geiriadur Prifysgol Cymru (GPC) or in a Spanish dictionary to determine its etymology. Moreover, each noun was assigned to one of the eight categories of the Welsh number system (see Table 2). A detailed description of how grammatical number manifests in nouns morphologically is provided by Thomas et al. (2014). The eight categories are based on P. W. Thomas (1996), who states that nouns can be pluralized by adding a suffix, removing a suffix or alternating two suffixes. These three options can be combined with an additional sound change. Adding “sound change only” and a “suppletion” category, the model is complete with eight categories, which are labeled by Thomas et al. (2014) as described in the second column of Table 2.

|| 8 http://bangortalk.org.uk/autoglosser.php (accessed 05.05.2021).
9 Depending on preferred theory and terminology, other terms may apply for this concept, e.g. “lone LD items” (see Poplack 2018 and Deuchar 2006).

60 | Deborah Arbes

Categories 2, 4, 6 and 7 employ what Thomas et al. (2014) label “V”, signifying “vowel change”. However, they specify that other types of sound change are included in this, e.g. cyngerdd → cyngherddau ‘concert(s)’. Therefore, the term “sound change” is adopted for this study. Additionally, Thomas et al. (2014) do not specify whether the vowel change occurs on the singular or the plural form. This information is added in columns 3 and 4. As suggested by Nurmio (2017), the collective form is the morphological base in the singulative-collective paradigm. Therefore, category 6 is better described as adding a suffix and a vowel change to the uniplex form rather than subtracting a suffix and adding a vowel change to the multiplex form. Consequently, in this study the term “collective” is applied instead of “-suffix”.

Table 2: Categories of Welsh pluralization.

No. of category | Label according to Thomas et al. (2014) | Uniplex form (morphological explanation) | Multiplex form (morphological explanation) | Uniplex example | Multiplex example | Translation
1 | + suffix | base | base + suffix | peth | peth-au | ‘thing(s)’
2 | + suffix + V | base | base + suffix + sound change | gair | geir-iau | ‘word(s)’
3 | ~suffix | base + suffix 1 | base + suffix 2 | hog-yn | hog-iau | ‘boy(s)’
4 | ~suffix + V | base + sound change + suffix 1 | base + suffix 2 | deigr-yn | dagr-au | ‘tear(s)’
5 | –suffix | base + suffix | base | coed-yn | coed | ‘tree(s)’
6 | –suffix + V | base + sound change + suffix | base | plent-yn | plant | ‘child(ren)’
7 | V | base | base + sound change | car | ceir | ‘car(s)’
8 | suppletive | base 1 | base 2 | person | pobl | ‘person/people’


Initially, the Welsh-origin nouns which occur in the Patagonia Corpus are presented alongside their assigned categories. Differences between the categories’ frequencies will be addressed further. Subsequently, the nouns not originating in Welsh are sorted into the categories, counted and analyzed in Sections 3.1 and 3.2 in order to answer research questions 1a) and 1b), before taking into account those pluralized by -s in Sections 5.1 and 5.2.

3 Nouns with Welsh number inflection

As Figure 1 shows, the most dominant source languages among those nouns employing Welsh number inflection are Welsh and Latin. A list of the most frequent Latin-origin nouns in the Siarad Corpus reveals that most Latin borrowings are first attested between the 9th and 13th centuries, e.g. pobl ‘people’, plant ‘children’ and ysgol ‘school’ (Parina 2010). Some appeared in the language in the 17th century, e.g. coleg ‘college’ (from lat. collegium) and there are also examples of Latin borrowings entering Welsh through English (e.g. stori/(y)storïau ‘story/stories’). In the case of several possible etymologies, this study adopts GPC’s interpretations10 (e.g. stori is counted as an English loanword).

Figure 1: Source languages of nouns with Welsh number inflection. [Bar chart: types (n=318) and tokens (n=2457) by source language; legend: Welsh, Latin, English, Spanish, Other.]

|| 10 The same approach is taken by Parina (2010: 189).


The type-token ratio shows that there are few types of Latin origin; however, they occur very frequently in the corpus. For English nouns the opposite applies: the percentage of types is higher than that of tokens, indicating that the individual English-origin types occur less often in the corpus (e.g. ticedi ‘tickets’ and tiwtoriaid ‘tutors’ occur once each). The percentages for integrated loanwords in the Siarad Corpus are very similar (see Section 3.3).

3.1 Spanish nouns and Welsh number inflection

There is one single occurrence of a Spanish noun pluralized by a Welsh suffix, as attested in (6):

(6)  PC: PATAGONIA-06, (418) SAR
     ac   oedd           gyda  nhw  ardd     fawr  a    llysiau
     and  be.3SG.IMPERF  with  3PL  ᴸgarden  ᴸbig  and  vegetable.PL
     a    codi     gwair  anifeiliaid  efo   nhw  a    corral_iau
     and  lift.VN  hay    animal.PL    with  3PL  and  farmyard.PL
     da    cwt  ieir    da.
     good  hut  hen.PL  good
     ‘and they had a big garden and vegetables and a harvester and animals and good cattle enclosures and a good chicken coop.’

The other Spanish nouns are less integrated into Welsh in that 100 types of Spanish nouns retain their original plural suffix -s. These are discussed in Section 5.1. In conclusion, it is not entirely impossible to pluralize a Spanish noun with a Welsh suffix, but despite many years of language contact between Welsh and Spanish, this example seems to be a rare exception.

3.2 English nouns and Welsh number inflection

24% of types and 11% of tokens applying Welsh number inflection are of English origin, as Figure 1 shows. An overview of the pluralization categories in which loanwords mostly appear is provided in Figures 3–4. Figure 2 shows only Welsh-origin nouns and serves as a benchmark against which one can measure the frequency of loanwords in the individual categories.

Figure 2: Categories of Welsh-origin nouns. [Bar chart: types (n=189) and tokens (n=1311) by pluralization category; legend: categories 1–8.]

Comprising 57% of tokens and 53% of types, category 1 (+suffix) is by far the most frequent one. Two more categories contain more than 10% of types and tokens, namely 7 (sound change) and 2 (+suffix +sound change). The Latin loanwords operate similarly to the Welsh nouns regarding the proportions of types (see Figure 3). Considering tokens, however, there are a few nouns (belonging to categories 8 and 6) which occur disproportionately often. The relevant nouns are pobl ‘people’ (SG: person, category 8 “suppletion”) and plant ‘children’ (SG: plentyn, category 6 “collective +sound change”).

Figure 3: Categories of Latin-origin nouns. [Bar chart: types (n=49) and tokens (n=879) by pluralization category; legend: categories 1–8.]

Among the English-origin nouns, 62% of types and 58% of tokens are pluralized by only adding a suffix (category 1).

Figure 4: Categories of English-origin nouns. [Bar chart: types (n=75) and tokens (n=262) by pluralization category; legend: categories 1–8.]

This was expected considering the English pluralization system and the large share of category 1 nouns with Welsh etymologies. Category 1 contains nouns borrowed from Old, Middle and Modern English. Examples include:

(7)  cwpanau (SG: cwpan) ‘cup(s)’ (from OE cuppe ‘cup’)11

(8)  papurau (SG: papur) ‘paper(s)’ (from ME papur(e)/papir(e) ‘papers’)

(9)  ffrindiau (SG: ffrind) ‘friend(s)’ (from Modern English)

Three more categories include 10% of types or more: category 2 (+suffix + sound change), category 3 (~suffix) and category 5 (collective nouns). In the following, the English-origin nouns in these categories are listed and analyzed regarding their integration into Welsh. Comparisons with the Siarad Corpus are drawn in order to uncover possible differences in the use of English loanwords.

|| 11 All translations and etymologies by GPC (Geiriadur Prifysgol Cymru)


3.2.1 Category 5 (Collective nouns)

The individual nouns in category 5 and their frequencies in the Patagonia Corpus are listed in Table 3. English-origin nouns are highlighted in grey.

Table 3: Nouns in category 5.

Collective* noun | Singulative form (GPC) | Translation | Tokens in Patagonia Corpus | Origin language
coed | coeden | ‘trees’ | 37 | cym.
cyrens | cyrensen | ‘currants’ | 9 | eng.
moron | moronyn, moronen | ‘carrots’ | 5 | ME (from moren ‘roots’)
cenllysg | cynllysgyn, cenllysgen | ‘hail, hailstones’ | 4 | cym.
moch | mochyn | ‘pigs’ | 4 | cym.
ffa | ffäen | ‘beans’ | 3 | lat.
brics | bricsen | ‘bricks’ | 3 | eng.
losin | losinen, losen | ‘sweets’ | 3 | eng. (from lozenge)
pys | pysen | ‘peas’ | 3 | lat.
gwenyn | gwenynen | ‘bees’ | 2 | cym.
caretsh | caretshen | ‘carrots’ | 2 | eng.
plyms | n.a. (Geiriadur yr Academi: plymsen) | ‘plums’ | 2 | eng.
sêr | seren | ‘stars’ | 2 | cym.
mefus | mefusen | ‘strawberries’ | 2 | cym.
swîts | switsen | ‘sweets’ | 2 | eng.
carets | caretsen | ‘carrots’ | 1 | eng.
pysgod | pysgodyn | ‘fish’ | 1 | lat.
dodrefn | dodrefnyn | ‘furniture’ | 1 | cym.
pìls | pilsen | ‘pills’ | 1 | eng.
eirin | eirinen | ‘plums’ | 1 | cym.

* There is another category composed of collective nouns, namely category 6, which involves a vowel change. However, only one English loanword is found in that category (racs, SGV: recsyn ‘rag(s)’). This indicates that in terms of integrating loanwords, category 5 has been more productive.

The large proportion of English nouns in category 5 is especially striking: in fact, this category contains more English than Welsh types. Most English nouns in this category end with the letter <s>. While this represents the plural morpheme in English, in Welsh the noun has been reinterpreted as a collective noun. This interpretation is also put forth by Stolz (2001), who provides a list of similar loanwords listed in Welsh dictionaries and concludes that “massive language contact did not accelerate the expected disintegration of marked singulative-collective distinctions. On the contrary, the integration of English loan-words has even contributed to strengthening the system-internal role of singulative-collective distinctions” (Stolz 2001: 69). The singulative counterparts (e.g. switsen ‘sweet’, cyrensen ‘currant’ or moronen ‘carrot’) do not occur in the Patagonia Corpus; however, they are attested in GPC and Geiriadur yr Academi. Among the singulative forms of Welsh-origin nouns, several are attested in the Patagonia Corpus, e.g. blewyn ‘hair’, bluen (mutated form of pluen ‘feather’), coeden ‘tree’ and mochyn ‘pig’. A comparison with the Siarad Corpus reveals that only three English nouns in this category were mentioned in the larger corpus created in Wales: brics ‘bricks’ (SGV: bricsen), pills ‘pills’ (SGV: pilsen) and tools ‘tools’ (SGV: twlsyn). A question arises about whether this is a coincidence. Did the participants in Patagonia simply talk more about these specific items than the participants in Wales did? Or is the integration of English nouns into the collective category characteristic of Patagonian Welsh? The findings presented in Table 3 raise more questions than they provide answers. It has been reported that some English loanwords were present in the language when the Welsh community in Patagonia was established (Rees 2017, 2021) and therefore those collective nouns could be an example of old English loanwords which have fallen out of use to some extent in Wales, but continue to occur in Patagonian Welsh conversations. However, in order to determine the exact differences between the use of category 5 nouns in Y Wladfa and Wales, a sample including more examples of these collective nouns and their singulative counterparts is needed.

3.2.2 Category 3 (~suffix)

Category 3 nouns are pluralized by exchanging a suffix (usually -yn/-en) for a plural suffix. The most common Welsh-origin noun in this category is blodyn → blodau ‘flower(s)’. As illustrated in Figure 5, category 3 contains as many English- as Welsh-origin types (and almost as many tokens). All relevant English-origin nouns are listed in Table 4.

Figure 5: Category 3 (~suffix). [Bar chart by source language. Types (n=11): Welsh 4, Latin 3, English 4. Tokens (n=69): Welsh 31, Latin 12, English 26.]

Table 4: English-origin nouns in category 3.

Plural form | Singular form (GPC) | Translation | First attested (see GPC) | Tokens | Form attested in the Siarad Corpus
tatws | taten | ‘potato(es)’ | 1562 (tatw); 1796 (tatws) | 14 | tatws
cwinsys | cwinsen | ‘quince(s)’ (from ME qwince) | 1400 (queyns); cwinsys: unattested | 7 | –
hogiau | hogyn | ‘boy(s)’ (from hogg ‘young animal’) | 1688 (hoccŷn, hogŷn); 18th century (hogieu) | 4 | hogiau (SG: hogyn)
peips | peipen | ‘pipe(s)’ | 1931 (beipen); peips: listed without example | 1 | pipes (SG: peipen)

Most of these nouns have been in use for a long time. As a result, some may not be perceived as borrowings anymore. The nouns tatws, hogiau and peips (spelled <pipes>) occur in the Siarad Corpus as well12. It is noteworthy that, as in category 5, the suffix -(y)s prevails here. The form cwinsys can be interpreted as a double plural. Cwinsys is not attested in GPC13; however, it is employed by three different speakers in two interviews14 in the Patagonia Corpus.

3.2.3 Category 2 (+suffix +sound change)

Although Welsh-origin nouns are in the majority, English nouns have also integrated into category 2 by undergoing a sound change, as Figure 6 shows.

Figure 6: Category 2 (+suffix +sound change). [Bar chart by source language. Types (n=55): Welsh 34, Latin 11, English 10. Tokens (n=273): Welsh 144, Latin 89, English 40.]

All English-origin nouns from category 2 are listed in Table 5. It is clear that this category, like the previous ones, does not contain spontaneous borrowings, as those plural nouns have been established for a long time. The most widespread suffixes among the English loanwords in category 2 are -ys and -au. All others occur only once. This is different for the Welsh-origin nouns, where -(y)s does not occur at all and -iau and -au are the most widespread suffixes (e.g. geiriau ‘words’, ffyniau ‘sticks’).

|| 12 Additional findings from the Siarad Corpus in this category are: cardiau ‘cards’, teilsau ‘tiles’, indiaid ‘Indians’ and briciau ‘bricks’.
13 GPC lists cwins, cweins, and cwinsiaid as multiplex forms for the uniplex cwinsen. Geiriadur yr Academi lists cwinsen/cwinsyn – cwins.
14 PC: PATAGONIA-09 (BEL); PATAGONIA-21 (ISA, LIN)

Table 5: English-origin nouns in category 2.

Plural noun | Singular noun | Translation | Attested since (see GPC) | Tokens | Form attested in the Siarad Corpus
ffermydd | ffarm* | ‘farm’ (from ME ferm(e)) | 15th century (fferm); 1722 (ffermydd) | 19 | ffermydd (SG: fferm)
bysys/bysus | bws | ‘bus’ | n.a./unattested | 5 | buses (SG: bus, bws)
cesys | cês | ‘case’ | 14th century (kaes); 1706 (chasus) (cesys unattested in GPC) | 3 | cases (SG: case, cesyn)
straeon | stori | ‘story’ | 1786 (ystraie); 1862 (ystreuon) | 3 | straeon (SG: story, stori)
byrddau | bwrdd | ‘table’ (from OE bord) | 14th century (bwrd); ca. 1400 (byrdeu) | 3 | byrddau (SG: bwrdd)
botymau | botwm | ‘button’ (from ME botoun) | 1346 (botymev) | 2 | botymau (SG: botwm)
heffrod | heffer | ‘heifer’ (from EME heffre, heffour, effer) | 1547 (heffyr); 1777 (heffrod) | 2 | –
clybiau | clwb | ‘club’ | 1766 (clwb); 1775 (glybion) | 1 | clybiau (SG: club)
gletsys (cletsys) | clats/clatsh | ‘smack, clatch’ | n.a. (clatsys, clatsiau) | 1 | –
bunnoedd (punnoedd) | punt | ‘pound’ (from OE pund) | 13th century (punhoed) | 1 | bunnoedd (punnoedd), bunnau (punnau) (SG: punt)

* The singular form fferm is also present in the Patagonia Corpus, but it occurs less often than ffarm. Seven participants employed the pair ffarm and ffermydd, while only one participant uttered both fferm and ffermydd. Therefore, ffermydd is listed again in category 1.

A glance at the Siarad Corpus reveals that -iau and -au are also the most common suffixes in category 2 among both Welsh-origin and English-origin nouns in Wales, but here the suffix -ys does not occur. However, the forms bysys and cesys are attested in GPC and therefore not unknown in Wales. Moreover, the transcribers’ spelling conventions need to be taken into account: the nouns bysys and cesys occur in the Siarad Corpus as well but are spelled <buses> and <cases> and therefore appear in a different category.

3.2.4 Category 1 (+suffix)

About 26% of nouns in category 1 are English loanwords, as illustrated in Figure 7.

Figure 7: Category 1 (+suffix). [Bar chart by source language. Types (n=180): Welsh 101, Latin 29, English 47, Irish 2, Spanish 1. Tokens (n=1022): Welsh 741, Latin 126, English 152, Irish 2, Spanish 1.]

It is worthwhile examining how frequently the individual suffixes in this category are employed in comparison to the Welsh-origin nouns (see Figure 8).

[Bar chart: shares of the suffixes -au, -iau, -i, -ion, -oedd, -od, -ydd/-edd, -on, -(i)aid and -ed among Welsh types, Welsh tokens, English types and English tokens; the underlying counts are given below.]

Suffix | English tokens (n=152) | English types (n=47) | Welsh tokens (n=741) | Welsh types (n=101)
-au | 51 (33%) | 18 (38%) | 436 (59%) | 42 (42%)
-iau | 75 (49%) | 18 (38%) | 92 (12%) | 10 (10%)
-i | 13 | 5 | 23 | 8
-ion | 0 | 0 | 65 | 17
-oedd | 2 | 1 | 22 | 6
-od | 0 | 0 | 15 | 3
-ydd/-edd | 2 | 2 | 34 | 7
-on | 0 | 0 | 11 | 5
-(i)aid | 9 | 3 | 1 | 1
-ed | 0 | 0 | 42 | 2
Total | 152 | 47 | 741 | 101

Figure 8: Suffixes of Welsh- and English-origin nouns in category 1.

Note that overall, about twice as many Welsh-origin types occur in this category compared to English-origin types. The Welsh nouns also show a much higher token count. What is visible at first sight in Figure 8 is the difference between the frequencies of the suffixes -au and -iau occurring in English and Welsh nouns respectively. The suffix -au is sometimes described as a “one size fits all” plural suffix (cf. Stolz 2008), yet this is more visible in the token count of the Welsh nouns (e.g. peth-au ‘things’) than in the list of individual types. Still, it is the most widespread plural suffix among Welsh-origin nouns in that it accounts for 42% of Welsh-origin types, whereas -iau plays a smaller role among those types and pluralizes 10% of nouns in this category (e.g. llun-iau ‘pictures’). For English-origin nouns, the situation is different, as -iau and -au each pluralize 38% of types in this category. The token count for nouns pluralized by -iau is higher than that for nouns pluralized by -au (e.g. bag-iau ‘bags’, siop-au ‘shops’), while the opposite is the case in Welsh-origin nouns. The remaining 48% of Welsh-origin types is divided between another eight suffixes. Among the English-origin nouns, four different suffixes are included in the remaining 24% of types. The suffixes -ion, -od, -on and -ed do not occur on English-origin nouns of category 1 in this corpus. The Siarad Corpus presents a similar picture concerning the suffixes in category 1, indicating that the tendency to pluralize English nouns by means of the suffix -iau exists in both countries.
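The shares discussed above can be recomputed from the counts given with Figure 8. The following is a small sketch for checking the arithmetic; the names `COUNTS`, `COLUMNS` and `shares` are introduced here for illustration only. Recomputed values may differ by a percentage point from those printed in the figure, depending on how the percentages were rounded.

```python
# Counts per suffix, copied from the data given with Figure 8.
# Tuple order: English tokens, English types, Welsh tokens, Welsh types.
COLUMNS = ("English tokens", "English types", "Welsh tokens", "Welsh types")
COUNTS = {
    "-au":       (51, 18, 436, 42),
    "-iau":      (75, 18, 92, 10),
    "-i":        (13, 5, 23, 8),
    "-ion":      (0, 0, 65, 17),
    "-oedd":     (2, 1, 22, 6),
    "-od":       (0, 0, 15, 3),
    "-ydd/-edd": (2, 2, 34, 7),
    "-on":       (0, 0, 11, 5),
    "-(i)aid":   (9, 3, 1, 1),
    "-ed":       (0, 0, 42, 2),
}


def total(column: str) -> int:
    """Sum of one column over all suffixes."""
    idx = COLUMNS.index(column)
    return sum(row[idx] for row in COUNTS.values())


def shares(column: str) -> dict:
    """Percentage share of each suffix within one column."""
    idx = COLUMNS.index(column)
    n = total(column)
    return {suffix: 100 * row[idx] / n for suffix, row in COUNTS.items()}


for column in COLUMNS:
    pct = shares(column)
    top_two = sorted(pct, key=pct.get, reverse=True)[:2]
    print(column, total(column), {s: round(pct[s], 1) for s in top_two})
```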


3.3 Quantitative thoughts

Having pointed out pluralization patterns and individual nouns in both corpora, the question remains whether different strategies for inflecting borrowed nouns are visible on a quantitative level. It is important to keep in mind the different sizes of the corpora: with about 460,000 word tokens, the Siarad Corpus is more than twice the size of the Patagonia Corpus. One possible difference between the corpora could be the number of integrated loanwords. For inflected loanwords it could be the case that since the establishment of Y Wladfa, more English loanwords have been integrated in Wales. On the other hand, we already know about specific English loanwords which seem to be common in Patagonia (see Sections 1.2 and 3.2). The percentage of English loanwords is ca. 24% of types and 11% of tokens in the Patagonia Corpus (see Figure 1). This compares to 27% of types and 15% of tokens in the Siarad Corpus. A chi-square test found no significant difference between the number of inflected multiplex loanwords in the corpora. Another possibility which emerges from the qualitative analysis in Section 3.2 is a difference between the number of nouns employing the suffix -(y)s in addition to another suffix or sound change. Table 6 summarizes the occurrences of nouns employing -(y)s in the Patagonia Corpus and Table 7 provides a list of similar forms which are attested in the Siarad Corpus.

Table 6: Suffix -(y)s in the Patagonia Corpus.

Category | Example | Translation | Types | Tokens
6 | racs (SGV: recsyn) | ‘rags’ | 1 | 4
5 | cyrens (SGV: cyrensen) | ‘currants’ | 6 | 18
4 | gwsbris (SG: gwsberen) | ‘gooseberries’ | 1 | 4
3 | tatws (SG: taten) | ‘potatoes’ | 3 | 22
2 | bysys (SG: bws) | ‘buses’ | 3 | 9
Total | | | 14 | 57

Table 7: Suffix -(y)s in the Siarad Corpus.

Category | Example | Translation | Types | Tokens
6 | racs (SGV: recsyn) | ‘rags’ | 2 | 3
5 | tools (SGV: twlsyn) | ‘tools’ | 3 | 6
4 | cases (SG: cesyn) | ‘cases’ | 1 | 3
3 | pipes (SG: peipen) | ‘pipes’ | 2 | 16
2 | – | – | – | –
Total | | | 8 | 28

As mentioned for the data in Table 5, some English nouns which were borrowed in Wales are represented with their English spelling in the Siarad Corpus. This leads to some difficulties when comparing the corpora, because nouns which are sorted into category 1 in the Siarad Corpus are found in category 2 in the Patagonia Corpus, even if, as in some cases, a difference in pronunciation is not detectable (e.g. bysys/buses and cases/cesys15). Furthermore, most nouns in category 5 are only attested in their collective form and the corresponding singulative form was not mentioned. It therefore cannot be determined with certainty that all nouns from category 5 do in fact belong in this category. If, for instance, the singular form plym ‘plum’ were in use instead of the singulative plymsen, this would place the noun in category 1 (+suffix). Because of these limitations and the small number of nouns in each category, it is not possible to conduct meaningful statistical tests. At this point, it cannot be claimed that there are differences between the two corpora with regard to the use of the plural suffix -(y)s in morphologically integrated nouns.
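For readers who wish to retrace the kind of comparison reported at the beginning of this section, a chi-square statistic for a 2×2 table can be computed as sketched below. This is an illustration only: the section does not state which counts or which variant of the test were used, so the function names are hypothetical and the plain Pearson statistic without continuity correction is an assumption.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table
        [[a, b],    e.g. corpus 1: loanwords, other nouns
         [c, d]]         corpus 2: loanwords, other nouns
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Critical value of the chi-square distribution, 1 degree of freedom, alpha = 0.05.
CRITICAL_VALUE_05 = 3.841


def differs_significantly(a: int, b: int, c: int, d: int) -> bool:
    """True if the statistic exceeds the 5% critical value."""
    return chi_square_2x2(a, b, c, d) > CRITICAL_VALUE_05
```

For the Patagonia Corpus, the first row could hold the 75 English-origin types and the remaining 243 of the 318 types with Welsh number inflection (Figures 1 and 4). The second row requires the corresponding counts from the Siarad Corpus, which are reported here only as percentages.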

|| 15 Both singular forms case and cesyn are uttered in the Siarad Corpus, therefore case appears in categories 1 and 4 (see Table 7).

4 The suffix -ys: codeswitching or borrowing?

In the case of nouns pluralized by the suffix -ys, the question arises of whether they should be treated as loanwords or as single word codeswitches. This is an issue concerning nouns in category 1 (+suffix) only, as the nouns in categories 2–8 are integrated into Welsh by additional means, e.g. a vowel change or a singulative suffix. Parry-Williams (1923) mentions that forms ending in -ys, -is, and -us have been documented in Welsh texts since the fifteenth century. Therefore, the phenomenon is not at all new. Looking at the criteria for codeswitches listed in Section 1.2, nouns such as clasys ‘classes’ or sandwitshys ‘sandwiches’ fulfil only one out of two: they are of Spanish or English origin, but they are not pluralized by a Spanish or English suffix, since -ys does not occur in Spanish or English nominal morphology. However, phonologically, -ys closely resembles the English allomorph -es and is usually employed parallel to it, as the same conditions apply for the distribution of -es and -ys: they appear only when the preceding phoneme (i.e. the final phoneme of the singular form) is a sibilant16. When comparing with the Siarad Corpus, it is important to keep in mind that different spelling conventions were applied by the transcribers. Since English codeswitching was not expected in Patagonian Welsh, English-origin nouns were assimilated to Welsh orthographically17. This was done, in some cases, by applying the suffix -ys instead of -es, while in the Siarad Corpus the English form remains (see (10)–(13)).

(10) PC: PATAGONIA-31, (567) SOF
     oedden         nhw  yn   agor     y    bocsys  a    bopeth
     be.3PL.IMPERF  3PL  PRT  open.VN  DET  box.PL  and  ᴸeverything
     ‘they opened the boxes and everything.’

(11) SC: FUSSER-05, (588) DYF
     mae          nhw  yn   cyrraedd   mewn  boxes   o   dau  ddeg  pump  fel   arfer
     be.3SG.PRES  3PL  PRT  arrive.VN  in    box.PL  of  two  ᴸten  five  like  habit
     ‘they arrive in boxes of 25, usually’

(12) PC: PATAGONIA-07, (417) VLM
     un   o   yr   bòsys18  yr   criw
     one  of  DET  boss.PL  DET  crew
     ‘one of the crew’s bosses’

(13) SC: ROBERTS-02, (392) ION
     ydy          yr   bosses19  yna    yn   chwythu  lawr  dy        gefn   di        hefyd
     be.3SG.PRES  DET  boss.PL   there  PRT  blow.VN  down  2SG.POSS  ᴸback  2SG.POSS  also
     ‘are those bosses on your back too?’

While this need not always be the case, the orthographically different form may entail a change in pronunciation. In bòsys/bosses there is an audible difference between the two nouns in that the /ɪ/ in bosses is more centralized in Wales, whereas in Patagonian Welsh the <y> in bòsys resembles the Welsh long vowel [i]. In Spanish, there is no distinction between long and short vowels and there is no centralized /i/ available. Therefore, a Spanish accent may manifest in the pronunciation of a less centralized /i/. Table 8 shows nine different nouns which are pluralized by -ys. Three of them are mentioned twice while six are mentioned once each.

|| 16 This rule is not without exceptions. P. W. Thomas (1996: 175) lists marblys ‘marbles’; and several borrowings from Middle English, e.g. hopys ‘hops’, mintys ‘mints’ are found in GPC.
17 Marika Fusser, p.c., March 13, 2021.
18 [bɔsiz]
19 [bɔsɪz]

Table 8: Nouns pluralized by -ys in category 1 (+suffix).

Noun | Translation | Difference from English form only orthographical | Plural form attested in GPC | Tokens | Form attested in the Siarad Corpus
bòsys | ‘bosses’ | yes | yes (bosys) | 2 | bosses
clasys | ‘classes’ | yes | yes (clàsys) | 2 | classes
sandwitshys | ‘sandwiches’ | yes | yes | 2 | sandwiches
batsys (mutated) | ‘patches’ | no (/tʃ/ → /ts/) | yes | 1 | patches
bocsys | ‘boxes’ | yes | yes | 1 | boxes
ffensys | ‘fences’ | yes | yes | 1 | –
rwmsys | ‘rooms’ | no (double plural) | yes | 1 | rooms
stofsys | ‘stoves’ | no (double plural) | no (stof(i)au, stofs) | 1 | –
watsys | ‘watches’ | no (/tʃ/ → /ts/) | yes (watshys) | 1 | –
Total | | | | 12 |

All nouns pluralized by -ys in this category are adaptations from English. In all cases, the suffix -ys instead of -s or -es is not the only orthographical change compared to the English noun. The graphemes <x> and <f> have been replaced by the digraphs <cs> and <ff> to match the Welsh alphabet. The change from the trigraph <tch> to the digraph <ts> (in batsys and watsys), on the other hand, indicates a change in pronunciation. For some nouns, the variant presented in Table 8 is not the only option. Sandwitshys is also found as sándwiches20 in another interview and clasys occurs in the Spanish form clases21 as well. The nouns pluralized by -ys in Patagonian Welsh are for the most part attested in GPC; therefore they are most likely known in Wales. Although most English-origin nouns pluralized by -s retain their original spelling in the Siarad Corpus, there are some exceptions (see Table 9).

|| 20 PC: PATAGONIA-22, 764, SAN.
21 PC: PATAGONIA-16, 457, MOL.

Table 9: Nouns from the Siarad Corpus employing -ys.

Noun | Translation | Tokens
jobsys | ‘jobs’ | 3
seisys | ‘sizes’ | 2
cwrsys | ‘courses’ | 1
rasys | ‘races’ | 1
rhosys (latin) | ‘roses’ | 1

The nouns pluralized by -ys may not be prototypical examples of codeswitching, as some have been integrated into Welsh by means such as double plural marking and phonological changes. However, because they do not apply native Welsh number inflection, they will be included in the analysis of single word codeswitches.

5 Single word codeswitching

An overview of how often codeswitching occurs overall in the Patagonia Corpus is given in Figure 9. The variables “Welsh number inflection” and “English suffix -ys” have been discussed in Sections 3 and 4. In the following, the nouns employing all possible allomorphs of the suffix -s (i.e. single word codeswitches) are introduced. As illustrated in Figure 9, nouns employing any allomorph of -s make up 35% of types and 10% of tokens in this corpus. Altogether, single word codeswitching of plural nouns occurs 274 times in the Patagonia Corpus. Within these 274 occurrences, 167 individual nouns are involved. These percentages are quite small compared to the Siarad Corpus, in which 47% of noun types and 18% of tokens are pluralized by -(y)s. Although both communities are bilingual (including some who are multilingual), codeswitching is employed significantly less often in Patagonia.

Figure 9: Number inflection in the Patagonia Corpus. [Bar chart: types (n=488) and tokens (n=2735); legend: Welsh number inflection, English -ys, English -s, Spanish -s, English or Spanish -s, Welsh -s.]

Figure 10 zooms in on the right part of Figure 9 (focusing only on single word codeswitches) and shows the proportions of Spanish and English nouns employing allomorphs of -s. There are three Welsh-origin types pluralized by -s, which are discussed in Section 6.

Figure 10: Single word codeswitches in the Patagonia Corpus. [Bar chart. Types (n=167): English 61, Spanish 100, English or Spanish 6. Tokens (n=274): English 98, Spanish 170, English or Spanish 6.]

It is striking that the percentages of types and tokens are almost identical. The majority (106 types) of these plural nouns occur only once22. For six nouns it could not be determined whether they should be counted as English or Spanish, as they occur in both languages. They are therefore left out of the following analysis. In order to answer research questions 2a) and 2b), the frequencies of Spanish and English codeswitches and their ability to undergo a soft mutation will be analyzed separately in the following sections.

5.1 Spanish single word codeswitching

The Spanish nouns which occur most often (six times each) in Welsh conversations are empanadas ‘empanadas/pasties’ and pesos ‘pesos (currency)’.

(14) PC: PATAGONIA-29, (29) MER
     a    rhaid      i   ni   prynu   empanadas    hefyd.
     and  necessity  to  1PL  buy.VN  empanada.PL  also
     ‘and we have to buy some empanadas [pastries] as well.’

(15) PC: PATAGONIA-38, (505) ESF
     espera        un   deg  pump  pesos    y    dosbarth
     wait.2SG.IMP  one  ten  five  peso.PL  DET  class
     ‘wait, fifteen pesos a class’

Note that according to Welsh grammar, a number is followed by a singular noun. This plural noun following a number in (15) could be regarded as a case of grammatical borrowing. Another factor may also be the frequency of the Spanish plural form pesos. Single word codeswitching is virtually the only way Spanish plural nouns occur within Welsh discourse, as only one Spanish noun is integrated into Welsh through its plural suffix (see Section 3.1). Contrarily to the English forms (see e.g. Table 8), the spelling was not changed in any of the 100 Spanish types. Thus, there is no indication in the corpus that the Spanish nouns are pronounced differently than they would be in a Spanish conversation. However,

|| 22 This result matches the theory about codeswitches of Myers-Scotton (2002), who states that “the codeswitching form may or may not reoccur; it has no predictive value” (Myers-Scotton 2002: 41). This is contrary to loanwords, which will reoccur because they have “a status in the recipient language” (Myers-Scotton 2002: 41).

Language contact and number inflection in Patagonian Welsh | 79

subtle phonological differences which are not represented orthographically could still be present. Apart from phonetic changes, the question remains: do these nouns assimilate into Welsh through the use of mutations? This would suggest an adaptation on the morphosyntactic level (see Stammers and Deuchar 2012). Any kind of morphological change raises the question of whether these nouns should still be defined as codeswitches or whether they are on their way to becoming loanwords of the Welsh language. For the sake of simplicity, the nouns pluralized by -s are still referred to as codeswitches even if a soft mutation applies. In fact, there are two mutated Spanish plural nouns in the Patagonia Corpus, as shown in (16) and (17).

(16) PC: PATAGONIA-43, (21) SAN
     dau gant a hanner o besos yr oen
     two hundred and half of Lpeso.PL DET lamb
     ‘250 pesos per lamb’

(17) PC: PATAGONIA-31, (1416) CZA
     a wedyn mae gen i domates
     and afterwards be.3SG.PRES with 1SG Ltomato.PL
     ‘and afterwards I have tomatoes’

This confirms that a soft mutation in Spanish plural nouns is possible within Welsh discourse. However, these two examples alone do not show whether the mutation of a Spanish noun is rather the rule or the exception. Initial mutations in Welsh can only occur in words which start with specific consonants (see Section 1.2 and King 2003: 13). Even if they start with one of these consonants, most plural nouns occur in a context where they are not expected to mutate (e.g. as the first word of a sentence or after the definite article y, which only causes a soft mutation for feminine singular nouns). A soft mutation of a plural noun is expected after certain prepositions (such as i ‘to’ or o ‘of’ as in (16)) or in the case that it is the object of an inflected verb (see King 2003: 18). In order to determine whether the two mutated nouns are a small or a large percentage of the Spanish nouns which are expected to undergo a soft mutation, it is necessary to check each Spanish token for whether it is expected to mutate. Altogether, 170 Spanish plural nouns employing -s have been discovered as single word codeswitches in the Patagonia Corpus (see Figure 11). As outlined in Figure 11, 85 out of these 170 tokens start with a vowel or with a consonant impossible to mutate. Another 75 tokens could theoretically be mutated, but the environment in which they occur does not demand a mutation.
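The four-way classification used in this section (mutation not possible, mutation not expected, not mutated when expected, mutated when expected) can be illustrated with a short Python sketch. It is not the procedure used in this study: the names `soft_mutate` and `classify` are hypothetical, the consonant pairs are the standard soft-mutation pairs of Welsh, and the information on whether a context triggers a mutation is supplied by hand as a Boolean, since deciding this automatically would require a syntactic analysis.

```python
# Standard Welsh soft-mutation pairs (radical initial -> mutated initial).
# The digraphs "ll" and "rh" are listed first so that they are matched
# before any single letter.
SOFT_MUTATION = [
    ("ll", "l"), ("rh", "r"),
    ("p", "b"), ("t", "d"), ("c", "g"),
    ("b", "f"), ("d", "dd"), ("g", ""), ("m", "f"),
]

# Initial digraphs that are separate letters and do not undergo soft mutation.
IMMUTABLE_DIGRAPHS = ("ch", "th", "ph", "dd")


def soft_mutate(word):
    """Return the soft-mutated form of `word`, or None if its initial cannot mutate."""
    lower = word.lower()
    if lower.startswith(IMMUTABLE_DIGRAPHS):
        return None
    for radical, mutated in SOFT_MUTATION:
        if lower.startswith(radical):
            return mutated + word[len(radical):]
    return None  # vowel or non-mutable consonant


def classify(radical, attested, context_triggers_mutation):
    """Assign a token to one of the four categories used in Figures 11, 12 and 16.

    `context_triggers_mutation` is a hand-annotated flag (an assumption of this sketch).
    """
    target = soft_mutate(radical)
    if target is None:
        return "mutation not possible"
    if not context_triggers_mutation:
        return "mutation not expected"
    if attested == target:
        return "mutated when expected"
    return "not mutated when expected"


# Tokens taken from examples (14), (16), (17) and (19)
print(classify("empanadas", "empanadas", False))  # mutation not possible
print(classify("pesos", "besos", True))           # mutated when expected
print(classify("tomates", "domates", True))       # mutated when expected
print(classify("materias", "materias", True))     # not mutated when expected
```

The sketch covers only the orthographic side of the classification; the decisive step, namely judging whether a given syntactic environment demands a mutation, remains a manual annotation.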


[Pie chart: mutation not possible 85; mutation not expected 75; not mutated when expected 8; mutated when expected 2.]

Figure 11: Spanish-origin plural nouns ending in -(e)s (n=170).

There are 10 tokens left which are expected to mutate, and two of them (see (16) and (17)) are in fact mutated. Examples of sentences in which a soft mutation is expected but does not occur are given in (18) and (19):

(18) PC: PATAGONIA-14, (594) ROC
     mae rhyw piojillos a ryw bethau (y)n lladd.
     be.3SG.PRES some greenfly.PL and Lsome Lthing.PL PRT kill.VN
     ‘some little lice and some things are killing.’

(19) PC: PATAGONIA-28, (444) ZER
     dw i ddim yn hoffi lot o materias
     be.1SG.PRES 1SG LNEG PRT like.VN lot of subject.PL
     ‘I don’t like a lot of subjects’

Two out of ten Spanish origin plural nouns in this corpus undergo a soft mutation when expected (see Figure 11). In order to determine whether this is a high or a low number compared to the percentage of mutated nouns with Welsh number inflection, a random sample of 170 plural nouns from the Patagonia Corpus employing a Welsh suffix is consulted and serves as a benchmark against which the frequency of soft mutations in Spanish plural nouns can be measured (see Figure 12).


[Pie chart; segment values 102, 40, 20 and 8. Legend: mutation not possible; mutation not expected; not mutated when expected (8); mutated when expected (20).]

Figure 12: Soft mutation in nouns with Welsh number inflection (n=170).

Among the 28 Welsh nouns expected to mutate, 20 did and eight did not undergo a soft mutation (see Figure 13). This is a significantly higher rate of soft mutation when expected than the Spanish nouns show.

[Stacked bar chart, 0% to 100%. Welsh plural nouns (n=28): soft mutation when expected 20, no soft mutation when expected 8. Spanish plural nouns (n=10): soft mutation when expected 2, no soft mutation when expected 8.]

Figure 13: Soft mutation in Spanish vs. Welsh plural nouns.

A Fisher exact test yielded a two-sided p-value of 0.0082. The result is significant at p < .05. These findings indicate that while some Spanish single word codeswitches may be integrated into Welsh morpho-phonology through soft mutation, they are less likely to undergo a mutation than nouns with Welsh number inflection.
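The reported value of 0.0082 can be reproduced from the counts in Figure 13 (Welsh: 20 mutated, 8 not mutated; Spanish: 2 mutated, 8 not mutated). The following Python sketch is one possible way of doing so and not necessarily the tool used for this study. It assumes a two-sided test in which all tables with the same margins that are at most as probable as the observed one are summed; the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def weight(k):
        # Number of ways to obtain k in the top-left cell with fixed margins
        return comb(row1, k) * comb(row2, col1 - k)

    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    observed = weight(a)
    # Integer comparison avoids floating-point ties
    extreme = sum(w for w in map(weight, range(k_min, k_max + 1)) if w <= observed)
    return extreme / comb(n, col1)


# Rows: Welsh vs. Spanish plural nouns; columns: mutated vs. not mutated when expected
p = fisher_exact_two_sided(20, 8, 2, 8)
print(f"{p:.4f}")  # prints 0.0082
```

With only ten Spanish tokens in a mutating environment, an exact test of this kind is preferable to an approximation such as the chi-squared test, whose expected cell counts would be too small here.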


5.2 English single word codeswitching

Figure 14 presents an overview of the integration of English loanwords. The nouns integrated by Welsh number inflection have been discussed in Section 3.2 and the nouns pluralized by -ys are introduced in Section 4. What remains are the nouns pluralized by the suffix -s. In the following, the suffixes -ys and -s are analyzed together.

[Stacked bar chart, 0% to 100%. Types (n=136): Welsh inflection 75, suffix -ys 9, suffix -s 52. Tokens (n=360): Welsh inflection 262, suffix -ys 12, suffix -s 86.]

Figure 14: Integration of English-origin nouns in the Patagonia Corpus.

As Figure 14 shows, 52 individual English-origin nouns are pluralized by -s. The majority have been assimilated to Welsh orthography, as examples (20) and (21) illustrate.

(20) PC: PATAGONIA-31, (330) SOF
     Yli mae nymbars mawr arno a bopeth
     you_know be.3SG.PRES number.PL big on.3SG and Leverything
     ‘look, there are big numbers on it and everything’

(21) PC: PATAGONIA-08, (202) ELE
     a wedi cael oparesions
     and after get.VN operation.PL
     ‘and has had operations’

A comparison of the proportions of English loanwords and codeswitches in the two corpora reveals an obvious difference: while in the Patagonia Corpus about 55% of English-origin noun types (and ca. 73% of tokens) are integrated by Welsh number inflection, this is only the case for 24% of English-origin types (and 41% of tokens) in the Siarad Corpus (see Figure 15). There, 76% of types and 59% of tokens are inserted into Welsh discourse without any morphological changes.
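The percentages quoted for the Patagonia Corpus follow directly from the absolute counts in Figure 14. As a simple check, the shares can be recomputed as in the Python sketch below; the dictionary layout and the function name `shares` are merely illustrative.

```python
# Absolute counts from Figure 14 (Patagonia Corpus, English-origin plural nouns)
counts = {
    "types": {"Welsh inflection": 75, "suffix -ys": 9, "suffix -s": 52},
    "tokens": {"Welsh inflection": 262, "suffix -ys": 12, "suffix -s": 86},
}


def shares(category_counts):
    """Return each category's share of the total in percent, rounded to one decimal."""
    total = sum(category_counts.values())
    return {label: round(100 * n / total, 1) for label, n in category_counts.items()}


for level, category_counts in counts.items():
    print(level, sum(category_counts.values()), shares(category_counts))
```

The shares for Welsh inflection come out at 55.1% of types and 72.8% of tokens, in line with the rounded figures of about 55% and 73% given in the text.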

[Stacked bar chart, 0% to 100%. Types (n=825): Welsh inflection 24%, suffix -(y)s 76%. Tokens (n=2188): Welsh inflection 41%, suffix -(y)s 59%.]

Figure 15: Integration of English-origin nouns in the Siarad Corpus.

In the following, we will find out which of the English nouns undergo a soft mutation when expected. Figure 16 illustrates the result. However, the number of nouns expected to mutate is too small in this case to make a statement about the quantity of soft mutations in English single word codeswitches. In the Patagonia Corpus only six of these nouns were expected to mutate and three of them did (see (22)–(24)).

(22) PC: PATAGONIA-35, (100) AMA
     a wedyn amser te o yr newydd yn mynd
     and afterwards time tea of DET new PRT go.VN.PRES
     â te, te i bawb a teisen gyrens neu
     with tea tea to Leveryone and cake Lcurrant:COLL or
     teisen bitshis neu unrhyw beth fysai efo hi
     cake Lpeach.PL or anything what Lbe.3SG.COND with 3SG.F
     ‘then at tea time she’d bring tea for everybody and a currant cake or peach cake or anything that she had’

[Pie chart; segment values 48, 44, 3 and 3. Legend: mutation not possible; mutation possible but not expected; not mutated when expected (3); mutated when expected (3).]

Figure 16: Soft mutation of English-origin nouns pluralized by -(y)s (n=98).

(23) PC: PATAGONIA-22, (315) CON
     ryw faint o fetrs?
     Lsome Lamount of Lmetre.PL
     ‘some... how many metres?’

(24) PC: PATAGONIA-07, (765) VLM
     glas â ryw batsys yno fo
     blue with Lsome Lpatch.PL in.3SG 3SG
     ‘blue with some patches in it’

In order to determine to what extent single word codeswitches are integrated into Welsh by soft mutation, it would be necessary to consider (feminine) singular nouns and possibly other word forms or a larger corpus in order to have a larger sample for analysis. Nevertheless, this section contains some valuable findings. It has shown that the Patagonian Welsh strategies for adapting English and Spanish nouns differ substantially. While the majority of English nouns are integrated through Welsh plural morphology, this is almost never the case for Spanish nouns. Instead, Spanish nouns occur more often as single word codeswitches than English nouns. It is debatable whether the term ‘single word codeswitching’ is an accurate label for English nouns pluralized by -s in a Welsh sentence by Spanish-Welsh bilinguals who are not fluent in English (cf. Section 1.2). However, this section has shown that, although the nouns are not morphologically integrated (except for some instances of soft mutation), they are part of Patagonian Welsh conversations. It is possible that these lexical items were part of the ‘initial mixture’ (cf. Rees 2021: 256) or that they have made their way into Patagonian Welsh through contact with English-Welsh bilingual speakers.

6 Borrowing of -s from English or Spanish

While the previous sections explained what influence Welsh morphology has on English and Spanish nouns uttered in a Welsh context, this section is concerned with a possible assimilation of Welsh to both donor languages. It has been shown for a number of languages including Welsh that “the category of nominal plural has a higher-than-average borrowing rate” (Gardani 2012: 71). This is one of the reasons to assume that in the long-lasting language contact situation in Y Wladfa, borrowing of the plural suffix -s from the donor language Spanish to the recipient language Welsh has taken place. As far as the borrowing of number inflection is concerned, a Spanish plural suffix -s is not distinguishable from its English equivalent; both scenarios are therefore possible. However, the Patagonia Corpus contains only four examples of the suffix -s in connection with a Welsh-origin noun. The nouns are baswrs ‘bass-players’, pregethwrs ‘preachers’ (two occurrences) and pilipalas ‘butterflies’ (see (25)).

(25) PC: PATAGONIA-11, (414) HER
     w i (y)n cofio rhedeg ar ôl y pilipalas
     oh 1SG PRT remember.VN run.VN on back DET butterfly.PL
     a trio dal nhw
     and try.VN catch.VN 3PL
     ‘Oh I remember running after the butterflies and trying to catch them’

In fact, pilipalas ‘butterflies’ is the only purely Welsh-origin noun, since baswr ‘bass-player’ contains an English element (bas ‘bass’) and pregethwr ‘preacher’ contains a Latin element (pregeth ‘sermon’). The forms pregethwrs and baswrs are not unexpected as they are both mentioned in GPC. Instead of pilipalas, GPC and other dictionaries suggest the form pilipalod. The Siarad Corpus includes a few more Welsh plural nouns employing this pattern, e.g. blaidds ‘wolves’ (instead of bleiddiaid) or taids ‘grandfathers’. However, their share among Welsh-origin nouns accounts for less than 3% of types and less than 1% of tokens. Judging by this low number of plural suffixes -s on Welsh nouns, there is no indication that borrowing of this suffix from either Spanish or English is taking place among the Welsh speakers in Patagonia.


7 Conclusion

The results of this study shed light on a unique contact situation of a heritage language: Welsh had already been influenced by English in the 19th century, and some effects of this are still present in the language. Additionally, Spanish continues to influence Patagonian Welsh discourse, although Spanish nouns remain mostly isolated from Welsh morphosyntax. Signs of past language contact with English are detectable through English loanwords from various stages, which are integrated into the Welsh number inflection system. These loanwords are present in today’s spoken Welsh on both continents, as the corpora reveal. Among the integrated English loanwords, no significant differences between the two corpora could be found. The number of English nouns retaining their original form is smaller than the number of nouns integrated by Welsh suffixes, indicating that most English nouns are not spontaneously inserted but have become part of the Welsh lexicon. This is significantly different from the Siarad Corpus, where among English-origin nouns, codeswitches retaining the suffix -(y)s are in the majority. Concerning Spanish-origin nouns, the results of this study indicate that codeswitching and borrowing are taking place on a smaller scale than could be expected given the extended period of language contact between Spanish and Welsh in Y Wladfa. Despite many years of language contact, Welsh number inflection remains mostly untouched by the influence of Spanish plural morphology, since the plural suffix -(e)s is not usually transferred onto Welsh-origin nouns, as the Patagonia Corpus shows. The examples which are present in the corpus substantiate the statement that “[b]ilingual speakers readily transfer morphological patterns and morphemes from one language into the other at structure points where the two languages are typologically congruent” (Thomason 2015: 43). 
However, their scarcity also supports Matras’ (2015) claim that “the borrowing of individual inflectional morphemes is […] dis-preferred.” (Matras 2015: 77). As this study takes only multiplex nouns into consideration, the questions arise of whether Spanish-origin uniplex nouns, verbs or other word forms might be more integrated into Welsh or whether Spanish influence may be more visible on the syntactic level. Among many others, these are some of the open questions which remain for future research on Patagonian Welsh.


Acknowledgments: I would like to thank Margaret Deuchar and Kevin Donnelly not only for publishing the corpora which are the foundation for this study, but also for answering my questions and helping me access the corpus files and data. Too numerous to name all of them are the Welsh speakers who have taught me their language and answered my questions (especially “Beth wyt ti’n galw mwy nag un?”). Diolch o galon i chi gyd. Furthermore, I am grateful to Thomas Stolz, Natascha Levkovych and Jago Williams for their comments and corrections. Despite having received lots of support, the sole responsibility for form and content of this article lies with me.

Abbreviations

1/2/3    first/second/third person
COLL     collective
COND     conditional
cym.     Cymraeg/Welsh
DET      determiner
EME      Early Modern English
eng.     English
EXI      existential
F        feminine
GPC      Geiriadur Prifysgol Cymru
IMPERF   imperfect
L        lenition
lat.     Latin
M        masculine
ME       Middle English
MLF      Matrix Language Frame
NEG      negation
OE       Old English
PL       plural
POSS     possessive
PRES     present
PRET     preterite
PRT      particle
SG       singular
SGV      singulative
V        vowel change
VN       verbal noun
WLP      Welsh Language Project


Primary Sources

Siarad Corpus (SP) = Deuchar, Margaret & Peredur Davies. 2014. Bangor Siarad Corpus. http://bangortalk.org.uk (accessed 06.05.2021).
Patagonia Corpus (PC) = Deuchar, Margaret et al. 2011. Bangor Patagonia Corpus. http://bangortalk.org.uk (accessed 06.05.2021).
Geiriadur yr Academi = Griffiths, Bruce & Dafydd Glyn Jones. Geiriadur yr Academi | The Welsh Academy English Welsh Dictionary Online. https://geiriaduracademi.org/ (accessed 06.05.2021).
GPC = Thomas, R. J. et al. (eds.). 1950–2002. Geiriadur Prifysgol Cymru: A dictionary of the Welsh language. Cardiff: University of Wales Press. https://www.geiriadur.ac.uk/ (accessed 06.05.2021).

References

Arwel, Rhisiart. 2019. The Welsh Language Project in Chubut: Annual Report 2019. https://wales.britishcouncil.org/en/programmes/education/welsh-language-project.
Berg, Kimberly. 2019. Creating a destination through language: Welsh linguistic heritage in Patagonia. In Catherine Palmer (ed.), Creating heritage for tourism (Current developments in the geographies of leisure and tourism). London: Routledge.
Carter, Diana, Margaret Deuchar, Peredur Davies & María del Carmen Parafita Couto. 2011. A systematic comparison of factors affecting the choice of matrix language in three bilingual communities. Journal of Language Contact 4. 1–31.
Deuchar, Margaret. 2006. Welsh-English code-switching and the Matrix Language Frame model. Lingua 116(11). 1986–2011.
Deuchar, Margaret, Kevin Donnelly & Peredur Webb-Davies. 2018. Building and using the Siarad Corpus: Bilingual conversations in Welsh and English. Amsterdam & Philadelphia: John Benjamins.
Gardani, Francesco. 2012. Plural across inflection and derivation, fusion and agglutination. In Martine Robbeets & Lars Johanson (eds.), Copies versus cognates in bound morphology, 71–97. Leiden: Brill.
Haspelmath, Martin. 2009. Lexical borrowing: Concepts and issues. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world's languages, 35–54. Berlin & New York: De Gruyter.
Haspelmath, Martin & Andres Karjus. 2017. Explaining asymmetries in number marking: Singulatives, pluratives, and usage frequency. Linguistics 55(6). 1213–1235.
Jones, Robert O. 1984. Change and variation in the Welsh of Gaiman, Chubut. In Martin J. Ball & Glyn E. Jones (eds.), Welsh phonology: Selected readings, 237–261. Cardiff: University of Wales Press.
Jones, Robert O. 1998. The Welsh language in Patagonia. In Geraint H. Jenkins (ed.), Language and community in the nineteenth century (Social history of the Welsh language), 287–316. Cardiff: University of Wales Press.
King, Gareth. 2003. Modern Welsh: A comprehensive grammar. London: Routledge.
Masés, Enrique. 2002. Estado y cuestión indígena: el destino final de los indios sometidos en el sur del territorio (1878–1910). Buenos Aires: Prometeo Libros.
Matras, Yaron. 2009. Language contact. Cambridge, UK & New York: Cambridge University Press.
Matras, Yaron. 2015. Why is the borrowing of inflectional morphology dispreferred? In Francesco Gardani, Peter Arkadiev & Nino Amiridze (eds.), Borrowed morphology, 47–80. Berlin & Boston: De Gruyter Mouton.
Myers-Scotton, Carol. 2002. Contact linguistics: Bilingual encounters and grammatical outcomes. Oxford: Oxford University Press.
Nurmio, Silva. 2017. Collective nouns in Welsh: A noun category or a plural allomorph? Transactions of the Philological Society 115(1). 58–78.
Parina, Elena. 2010. Loanwords in Welsh: Frequency analysis on the basis of Cronfa Electroneg o Gymraeg. In Dunja Brozović Rončević, Maxim Fomin & Ranko Matasović (eds.), Celts and Slavs in Central and Southeastern Europe. Studia Celto-Slavica III: Proceedings of the IIIrd International Colloquium of the Societas Celto-Slavica, 183–194.
Parry-Williams, Thomas H. 1923. The English element in Welsh: A study of English loan-words in Welsh. London: The Honourable Society of Cymmrodorion.
Polinsky, Maria. 2018. Heritage languages and their speakers. Cambridge: Cambridge University Press.
Poplack, Shana. 2018. Borrowing: Loanwords in the speech community and in the grammar. New York: Oxford University Press.
Rees, Iwan. 2017. Cyflwyno Tafodieithoedd Cymraeg y Wladfa [Introducing the Welsh dialects of the settlement]. https://llyfrgell.porth.ac.uk/View.aspx?id=3685~4w~xXCsoNiy (27 January, 2021).
Rees, Iwan. 2021. Hispanicization in the Welsh settlement of Chubut Province, Argentina: Some current linguistic developments. In Danae Perez & Eeva Sippola (eds.), Postcolonial language varieties in the Americas, 239–269. Berlin & Boston: De Gruyter Mouton.
Stammers, Jonathan & Margaret Deuchar. 2012. Testing the nonce borrowing hypothesis: Counter-evidence from English-origin verbs in Welsh. Bilingualism: Language and Cognition 15(3). 630–643.
Stolz, Thomas. 2001. Singulative-collective: Natural morphology and stable classes in Welsh number inflexion on nouns. STUF/Language Typology and Universals 54(1). 52–76.
Stolz, Thomas. 2008. Kymrische Ausnahmen oder walisische Regeln? Was die substantivische Pluralvariation uns lehrt. In Cornelia Stroh & Aina Urdze (eds.), Morphologische Irregularität: Neue Ansätze, Sichtweisen und Daten, 111–150. Bochum: Brockmeyer.
Taylor, Lucy. 2018. Global perspectives on Welsh Patagonia: The complexities of being both colonizer and colonized. Journal of Global History 13(3). 446–468.
Thomas, Enlli Môn, Nia Williams, Llinos Angharad Jones, Susi Davies & Hanna Binks. 2014. Acquiring complex structures under minority language conditions: Bilingual acquisition of plural morphology in Welsh. Bilingualism: Language and Cognition 17(3). 478–494.
Thomas, Peter W. 1996. Gramadeg y Gymraeg [The grammar of Welsh]. Caerdydd: Gwasg Prifysgol Cymru.
Thomason, Sarah G. 2001. Language contact: An introduction. Edinburgh: Edinburgh University Press.
Thomason, Sarah G. 2015. When is the diffusion of inflectional morphology not dispreferred? In Francesco Gardani, Peter Arkadiev & Nino Amiridze (eds.), Borrowed morphology, 27–46. Berlin & Boston: De Gruyter Mouton.
Williams, Glyn. 1975. The desert and the dream: A study of Welsh colonization in Chubut, 1865–1915. Cardiff: University of Wales Press.
Williams, Guillermo. 2017. La relación entre indígenas y galeses en Chubut: representaciones y reproducciones de una memoria histórica “feliz”. Revista del Instituto de Cultura, Identidad y Comunicación 2. 48–79.

Thomas Stolz

VOY – PARA – SIEMPRE: Three Spanish-derived function words and the Chamorro irrealis

Abstract: The study investigates the system of the irrealis mood in Chamorro by addressing, synchronically as well as diachronically, the role played by the Spanish-derived function words bai, para, and siempre in the shaping of this system. The contemporary data stem from written Chamorro of the Guam variety. For the diachronic part, a particular printed text from the 19th century is evaluated. It is shown that the accounts given of the Chamorro irrealis in part of the extant descriptive-linguistic literature need to be revised thoroughly. Moreover, the borrowed Spanish function words are classified according to the typology of MAT/PAT borrowing.

Keywords: borrowing; Chamorro function words; Hispanization; irrealis; MAT/PAT

1 Introduction

The lexicon and grammar of Chamorro¹ provide ample evidence of Hispanization insofar as numerous meaningful sound chains used to express certain concepts can be easily traced back to their Spanish origins (Rodríguez-Ponga 1995). In some domains of the grammatical system of Chamorro, there is a striking density of Spanish-derived function words which contribute to shaping the structure of these sub-systems and adding a certain Hispanic flair to Chamorro discourse (Stolz 1998). A particularly intriguing case is the irrealis mood because three of the most salient and most frequent free morphemes employed to mark categories subsumed under the heading of this category have a Spanish

|| 1 Chamorro (aka CHamoru) is an internal isolate within the Western Malayo-Polynesian branch of Austronesian. It is natively spoken by some 65,000 people residing in the Commonwealth of the Northern Mariana Islands and the unincorporated US territory of Guam on the western rim of the Pacific in Micronesia. The language is co-official with English (and Carolinian). It is nevertheless endangered because of the dominance of English and widespread bilingualism with English. From 1565/1665 until 1898/9, the Marianas were a Spanish colony and Chamorro underwent intensive Hispanization.
|| Thomas Stolz, University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany. E-Mail: [email protected]
https://doi.org/10.1515/9783110785517-004


etymology (Pagel 2010: 92), namely (Spanish voy >) bai, para, and siempre², whose interaction and position within Chamorro grammar are the focus of this study. Further Spanish-derived function words like debi(di) ‘must’ (< Spanish debe de), which are also prominently mentioned in connection with the irrealis mood, are not specifically addressed because their functional domain is relatively clearly delimited – in this case to the modality of obligation (Topping and Dungca 1973: 263 and 265; Chung 2020: 77–78) – and thus do not call for an in-depth investigation. The Chamorro case deserves to be studied thoroughly because it ties in with generalizations put forward within the wider framework of the theory of language contact. In his summary of the cross-linguistic project on grammatical borrowing, Matras (2007: 46) states that

    as to TMA and modality, we have seen the high density of (matter) borrowing in the domain of modality, in some cases also in mood, frequent matter and pattern replications in the area of aspect and aktionsart, and few cases of pattern replication in tense, all involving the future.

The Chamorro case seems to corroborate Matras’s hierarchy of contact-sensitivity which is presented in the format of the following chain: MODALITY > ASPECT/AKTIONSART > FUTURE TENSE > (OTHER TENSES) (Matras 2009: 162). Under a chronological reading, the chain assumes that borrowings in the domain of modality precede those in the other domains with the future tense representing some kind of latecomer. If we read the same chain as an implication, borrowings in the domain of the future tense presuppose that borrowings in the domain of modality are already in place.³ On the basis of the above quote, it can be assumed that modality serves as an umbrella under which mood is subsumed as well. For the Chamorro case, it is particularly interesting to see that mood and future tense are crucial concepts if one wants to account for the structural and functional properties of the above three function words. The ultimate goal of the subsequent paragraphs is to determine the extent to which these instances of function-word borrowing from Spanish contribute to the internal organization of the irrealis mood of the replica language. The question is asked where in the typology of MAT and PAT borrowing (Gardani 2020) these cases find their proper place. In this way, new light is shed on phenomena

|| 2 For ease of recognition, the three function words are represented consistently as bai, para, and siempre throughout the text except in direct quotes and in the sentential examples.
|| 3 The limited time-depth and size of the documentation of Chamorro notwithstanding, it is nevertheless possible to confirm that when the first signs of borrowings in the domain of tense (future) appeared in the 19th century, borrowings in the domain of modality were already firmly established. Thus, Chamorro nicely conforms to Matras’ (2007, 2009) borrowing hierarchy.


which are interesting not only for language contact studies in general (Matras 2009) but also for the research program dedicated to Romancization processes in general (Stolz 2008) and Hispanization in particular (Stolz et al. 2008). Furthermore, taking a closer look at bai, para, and siempre also benefits Chamorro descriptive linguistics as such. To familiarize ourselves with the phenomena, I present the three Hispanic elements as used in a Chamorro context in examples (1)–(3). For reasons to be explained subsequently, the three function words under scrutiny are glossed BAI, PARA, and SIEMPRE, respectively.⁴ In this study, small caps are also used for the putative future marker u, which accordingly appears as U in the glosses although it has no Spanish background.⁵

(1) [Taimanglo et al. 1999: 64]
    Bai dalalak hao para i metkåo på’go,
    BAI accompany 2SG.ABS PARA DEF.CN market today
    ya siempre un li’e’
    and SIEMPRE 2SG.ERG see
    na tåya’ ni unu hu na’-då~dañu.
    SUBORD nothing NEG one 1SG.ERG CAUS-RED~harm
    ‘I will accompany you to the market today and you will surely see that I am hurting not a single one.’

(2) [Taimanglo et al. 1999: 85]
    Malago’ yu na para bai hu faisen hao
    want 1SG.ABS SUBORD PARA BAI 1SG.ERG ask 2SG.ABS
    kao siña un po’lo yu’ tåtte gi tasi
    Q be_able 2SG.ERG put 1SG.ABS back in sea
    ‘I want to ask you whether you can put me back in the sea [].’

|| 4 Unless otherwise stated, the English translations and glosses are mine. I keep the (varying) original orthography of my contemporary sources whereas those from the 19th century are presented first in their original shape which is then subjected to moderate modernization. Those elements which are of interest for the discussion in the ensuing argumentative text are marked out in boldface.
|| 5 Many questions relating to the structural properties of Chamorro are controversial. To avoid burdening this paper with too many side-issues which cannot be settled satisfactorily here, I adopt a pragmatically motivated compromise approach to the hotly debated problems. This is why inter alia the opposition ergative vs. absolutive is postulated pace Cooreman (1987) and Chung (1998). At the same time, I occasionally mention syntactic functions such as subject and object. In addition, I assume two passives – the ma-passive and the in-passive (Chung 2020: 211–221) – and speak of oblique case (Chung 2020: 94) in lieu of accepting Topping and Dungca’s (1973) actor-goal distinction in a Philippine-style focus system.


(3) [Saint-Exupéry 2021, ch. I]
    Lao todo i tiempo siempre man oppe, tuhong ayo.
    but all DEF.CN time SIEMPRE AP answer hat this
    ‘But surely he would always answer: ‘This is a hat.’’

What the three examples (1)–(3) tell us is not immediately clear without further explanations. In the immediately subsequent paragraphs of this introduction, the phenomena of interest are only identified superficially. However, they will be in the centre of the discussion in the remainder of this paper where additional information about them is provided. To start with a general characterization of the above examples, it needs to be mentioned that, in (1) and (3), beyond bai, para, and siempre, there are further lexical Hispanisms such as metkåo ‘market’ (< Spanish mercado), ni ‘not even’ (< Spanish ni), tiempo ‘time, weather’ (< Spanish tiempo), todo ‘all’ (< Spanish todo), unu ‘one’ (< Spanish uno), and dåñu ‘harm’ (< Spanish daño) (Rodríguez-Ponga 1995: 225, 468, 486, 636, 641, 659). The three function words thus fit in nicely with the outwardly Hispanic character of large parts of the written register of Chamorro. Bai in (1)–(2), siempre in (1), and para in (2) – but not in (1) – are connected to the irrealis. In (1), para functions as directional preposition with allative meaning, i.e. as a translation equivalent of English to. The function of siempre in (3) is that of an adverbial modifier corresponding to English surely. In (2), bai co-occurs with hu, the pronoun of the 1st person singular ergative, whereas in (1), siempre combines with un, the pronoun of the 2nd person singular ergative. In (2), para and bai are direct syntactic neighbours. These distributional facts do not reveal too much about the grammar of the function words. Their properties need to be scrutinized further. In the descriptive-linguistic literature dedicated to Chamorro, there is disagreement as to the proper way of glossing these function words. 
For Topping and Dungca (1973: 261–262), all three function words fall under the rubric of future markers (alongside u) so that it would seem to make sense to indiscriminately use the gloss FUT[ure].⁶ According to Pagel (2010: 98–101), however, bai should be glossed as IRR[ealis]1 for the 1st person (singular and dual/plural exclusive) with FUT[ure] being the appropriate gloss for both para (excluding its prepositional and conjunctional usages) and siempre. The third option is that of Chung (2020: 28–39) who uses the gloss AGR[eement] for bai, FUT[ure] for para (except when it is used as preposition), and surely for siempre. What the three

|| 6 Note that in their reference grammar of Chamorro, Topping and Dungca (1973) refrain from adding morpheme glosses to the examples.


proposals have in common is that at least one of the function words is assigned the task of coding future – and this function word is para. Accordingly, Gibson (1992: 28) classifies para as a preverbal future marker. For the two remaining function words, the authors’ views differ widely in the sense that Topping and Dungca and Pagel allocate para and siempre to the domain of grammar whereas Chung reserves this interpretation for para only and considers siempre to be part of the lexicon. The most striking difference between Pagel’s (2010) point of view and that shared by Topping and Dungca (1973), Gibson (1992), and Chung (2020) is the unexplained refusal of the latter group of four to acknowledge that one has to distinguish the purposive conjunction para from the homophonous TMA marker para. That this distinction is necessary, however, will be shown in the empirical centrepiece of this study.

The above multitude of glossing practices and analyses impels me to adopt a different approach in order to avoid making prejudgments about the range of functions for which bai, para, and siempre are employed, respectively. This is why the morpheme glosses of these function words are identical with the glossed items in the original example except that the glosses appear in small caps. In the case of PARA and SIEMPRE, this means that no distinction is made between those functions which associate the function words with the domain of the irrealis and those which belong to other sectors of Chamorro grammar. The different functions are lumped together exclusively on the level of the glosses. That we are dealing with multifunctional or distinct albeit homophonous elements will by no means be ignored. On the contrary, their multifunctionality or homophony will be addressed time and again throughout this study.

The Chamorro irrealis has been paid due attention in a number of previous studies. Topping and Dungca (1973: 261) assume a basic tense-based distinction of future vs. non-future for the verb system of Chamorro although they emphasize that “[i]f a verb phrase has the future tense markers, it will be considered future tense even though it may not always translate into the future tense of another language”. Chung and Timberlake (1985) pave the way for Chung’s (2020: 35) alternative analysis according to which we are not dealing with a tense-based system in the first place. Similar thoughts can already be found in Costenoble (1940: 297–298) who discards the idea that we are facing a grammatical tense because he is in favor of an interpretation as mood. In Chung’s (2020: 35) interpretation, the future is only a sub-category of the irrealis mood which “is used for situations that are not actual – situations that are not presented as facts.” Nonactual situations cover categories such as future, necessity, obligation, optative, imperative, prohibitive, desiderative, etc. (Chung 2020: 35–37). The dichotomy of tenses assumed by Topping and Dungca (1973) is replaced

96 | Thomas Stolz

with a dichotomy of moods, namely realis vs. irrealis. The latter interpretation is adhered to also in my study. Embracing the irrealis hypothesis does not automatically imply that it is accepted blindly with all its implicit and explicit entailments. As comes to the fore in the above comment on the glossing practice of Pagel (2010), the author distinguishes markers of the future (= para and siempre) from the marker of the irrealis (= bai). In line with Cooreman (1987: 38) and ultimately with Chung (1998: 26–27), Pagel (2010: 91–92) argues that bai functions exclusively as agreement marker of the 1st person (exclusive) (of all numbers) in the irrealis. Does this mean that bai is more of a mood marker than para and siempre whose glosses invoke a tense-based function? This and further issues will be raised again in due course below.

This study is motivated by three factors. First of all, there is hard evidence of variation as to the shape the irrealis constructions can take. This variation calls for being described and explained. Secondly, the linguistic experts disagree as to the most appropriate way of analysing the empirical facts. This controversy is suggestive of the possibility that the phenomena have not been evaluated exhaustively yet. Thirdly, Pagel’s (2010) largely convincing (language-contact-oriented) account of the roles played by bai, para, and siempre in Chamorro grammar is an invitation to study this network of function words in depth to exactly determine their position within the irrealis category. To achieve this, it is necessary to also take account of further aspects of the system under scrutiny independent of the presence of Spanish-derived elements.
The range of uses to which bai, para, and siempre are put in Chamorro is determined primarily on the basis of the empirical data collected from a small corpus of written Chamorro prose of different genres (both fiction and nonfiction, original and translated) from the late 20th and early 21st centuries.7 The texts are representative of the variety of Chamorro spoken on Guam. The synchronic facts are complemented with a selection of examples from printed texts dating back to the second half of the 19th century. The primary sources are identified in a separate section at the end of the paper. Additional evidence of given phenomena stems from the extant descriptive-linguistic studies of Chamorro.

7 The four texts which constitute the corpus for this study are in chronological order: Perez and Faustino (1975) – a collection of children’s stories designed for use in primary education [3,290 words], Taimanglo et al. (1999) – traditional legends and other short stories told by adult native speakers of Chamorro [37,709 words], Onedera (2007) – the first MA thesis ever written and defended in Chamorro dedicated to a special literary genre [29,308 words], Saint-Exupéry (2021) – the Chamorro translation of the modern French classic Le Petit Prince [16,847 words]. With altogether 87,154 words, this corpus is still relatively small. Its moderate size makes it necessary that the results of this study are tested against a much larger corpus in the future.
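The corpus size stated in footnote 7 can be checked with a few lines of Python. The sketch below is only an illustration: the word counts are those given in the footnote, while the dictionary layout and the percentage shares are additions made here for convenience and are not part of the study.

```python
# Word counts per corpus text as reported in footnote 7.
corpus = {
    "Perez and Faustino (1975)": 3_290,
    "Taimanglo et al. (1999)": 37_709,
    "Onedera (2007)": 29_308,
    "Saint-Exupéry (2021)": 16_847,
}

# Total corpus size; footnote 7 reports 87,154 words.
total = sum(corpus.values())

# Share of each text in the corpus, in percent (an added convenience, not in the source).
shares = {text: round(100 * words / total, 1) for text, words in corpus.items()}

print(f"total: {total:,} words")
for text, share in shares.items():
    print(f"{text}: {share}%")
```

The shares make visible that the corpus is dominated by a single source, which is one reason why any frequency counts based on it should be read with caution.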


The methodology is predominantly qualitative with the occasional very simple frequency count added in support of the arguments put forward. My approach is indebted to Haspelmath’s (2010) concept of framework-free grammatical theory and, at the same time, it is firmly rooted in linguistic functionalism. The phenomena of interest are illustrated with examples from Chamorro and Spanish. Further languages will be touched upon only in passing, if at all. The study is organized as follows. In Section 2, a sketch of the irrealis is provided on the basis of the extant descriptions (including those published prior to 1950). This sketch serves as background for the subsequent Section 3 which focuses on the three function words each of which is the topic of a separate subsection. The initial part of Section 3 informs the reader about the token frequency with which the function words are registered in the corpus. Sections 3.1–3.3 are dedicated to para, bai, and siempre (in this order) as they are employed in written Chamorro and as to their interpretation in the descriptive-linguistic literature. For each of the function words, the major properties of its Spanish etymon are presented and compared. Similarly, for each of the function words, I extensively discuss historical data from the 19th century. Section 3.4 summarizes the historical processes in which these function words were involved. In partial revision of the picture painted in Section 2, the involvement of siempre, para, and bai in the wider context of the irrealis category is looked into in Section 4. The evaluation of the findings in terms of the typology of MAT/PAT-borrowing is provided in Section 5 whereas Section 6 contains the general conclusions.

2 The irrealis construction

To state that Chamorro distinguishes two major mood categories – realis vs. irrealis – seems to suggest that there is a straightforward bipartition. On closer scrutiny, it turns out that the irrealis is a relatively colourful category whose internal make-up is not easy to capture. In contrast to the realis which can be considered the unmarked partner of the opposition (but cf. below), the irrealis is equipped with a battery of morphological means which one way or another serve to encode this mood and/or further grammatical categories associated with it. How exactly the division of labour between these markers is organized is still a matter of debate as will transpire from the discussion in this section.

The constructions used to express categories within the domain of the irrealis can be understood as multi-layered structures which consist of a nucleus and ever more peripheral extensions (cf. Figure 1). For the purpose at hand, it makes sense to start at the centre and proceed from there layer by layer to ultimately reach the outer margins of the construction (this is done in Sections 2.1–2.4). For reasons of prototypicality, it also makes sense to look at verbal predicates. Nonverbal predication is not specifically addressed in this study although some of the examples illustrate this kind of predication. Section 2.5 presents the intermediary results and briefly goes back in time to determine how the irrealis system looked at earlier stages of Chamorro.

2.1 The stem-initial m/f-alternation

There is a class of verbs in Chamorro whose stems are sensitive to the mood distinctions of the language. These verbs have different initial segments in the realis and irrealis, namely /m/ and /f/, respectively. Topping and Dungca (1973: 83) assume that these verbs contain a fossilized erstwhile prefix ma-. Synchronically, this replacement affects the segmental chain of the stem itself and does not involve affixation of any kind. Thus, we are dealing with a process of inner modification. In contrast to Costenoble (1940: 294) and Topping and Dungca (1973: 92), Chung (2020: 32) assumes that the /f/-initial form of the irrealis is the basic one from which the realis with initial /m/ is derived although the /m/-initial form is registered in the lexicon. The class of predicates which gives evidence of this alternation is said to be small and exclusively composed of intransitive verbs and adjectives. These predicates stand out further from the bulk of intransitive predicates insofar as they do not take the nonplural infix -um- (Chung 2020: 31, fn. 7).8 Examples (4)–(5) illustrate the m/f-alternation with the intransitive verb malingu ‘disappear’ which comes as falingu in the irrealis in (5). The stem-initial segments which are subject to replacement are underlined.

(4)

[Taimanglo et al. 1999: 93]
Ha~hasso yu’ gi todu i kalamten-mu
RED~think 1SG.ABS in all DEF.CN move-POR.2SG
sino malingu hao patgon-hu
or_else REAL.disappear 2SG.ABS child-POR.1SG
‘Keep thinking about me in all your movements or else you disappear, my child.’

|| 8 Whether this constraint is obeyed strictly remains doubtful since there are occasional examples of -um- being used with verbs like malingu ‘disappear’ (irrealis: falingu) such as shown in (i): (i) Chamorro [Taimanglo et al. 1999: 117] Lao håfa mohon, Marina, yanggenmalingu håo? but what desire Marina when disappear 2SG.ABS ‘But what, Marina, were you wishing for when you disappeared?’ Note that with sonorant-initial stems, the infix -um- may also be used as prefix mu-.


(5)

[Taimanglo et al. 1999: 117]
Tatnai måtto gi hasso-ku
almost_always arrive in think-POR.1SG
na bai falingu.
SUBORD BAI IRR.disappear
‘It almost always comes to my mind that I will disappear [].’

The irrealis marking on the innermost layer of the construction is severely restricted by factors from domains as diverse as phonology and transitivity. The vast majority of Chamorro predicates is not affected by the m/f-alternation of stems even if they meet the phonological conditions and are intransitive (Chung 2020: 34). Thus, the irrealis needs to be expressed by different means. This applies to all verbs and adjectives borrowed from Spanish. It is most likely that the stem-initial m/f-alternation ceased to be productive before the massive borrowing of lexical items from Spanish gained momentum.
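The lexically restricted character of the stem-initial alternation can be sketched in Python. This is a toy illustration only: the function name and the mini-lexicon are hypothetical, and the three stems are taken from the examples and tables of this study (malingu in (4)–(5), måtto in Table 3, and maigo’ ‘sleep’, which keeps its /m/ in the irrealis clause in (19)). The sketch starts from the /m/-initial citation form for convenience and takes no stand on which form is basic.

```python
# Hypothetical mini-lexicon of stems that undergo the m/f-alternation.
# Membership is lexical: an initial /m/ alone does not trigger it.
ALTERNATING_STEMS = {"malingu", "måtto"}


def irrealis_stem(realis_stem: str) -> str:
    """Return the irrealis shape of a stem given its realis (citation) shape."""
    if realis_stem in ALTERNATING_STEMS and realis_stem.startswith("m"):
        # Inner modification: the initial segment is replaced, nothing is affixed.
        return "f" + realis_stem[1:]
    # All other stems stay unchanged; irrealis is marked elsewhere in the construction.
    return realis_stem


for stem in ("malingu", "måtto", "maigo’"):
    print(stem, "->", irrealis_stem(stem))
```

Keeping the alternating stems in an explicit set mirrors the observation that most predicates are unaffected even when they meet the phonological conditions.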

2.2 The prefix-initial m/f-alternation

On the next layer, there is no escape from irrealis marking, in a manner of speaking. There are two homophonous prefixes man1- and man2- which may occur together as direct neighbours in the same word-form and whose functions are very different. Man1- is used to indicate plural of the subject with intransitive predicates whereas man2- marks the antipassive under detransitivization (Chung 2020: 32). What makes these prefixes interesting for the issue under scrutiny is their involvement in the m/f-alternation. The leftmost of the two prefixes shows initial /m/ in the realis as opposed to initial /f/ in the irrealis (Chung 2020: 33). This pattern holds for the bulk of the predicates in Chamorro including those borrowed from Spanish so that one may claim that the alternation of man- ~ fan- forms the morphological foundation of the coding system of the irrealis. Examples (6)–(7) illustrate this alternation with the verb espiha ‘seek, search for’ (< Spanish espia(r); Rodríguez-Ponga 1995: 278).

(6)

[Taimanglo et al. 1999: 151]
man-espiha un trongko-n pågu
REAL.AP-seek INDEF tree-LINK hibiscus
ya ha papa pokse’ para godde.
and 3SG.ERG strip_off fibre PARA string
‘He searched for a hibiscus plant and stripped off some fibre to use as a string.’


(7)

[Taimanglo et al. 1999: 115]
Kåo bai fan-espiha yu’os i sabåna
Q BAI IRR.AP-seek god DEF.CN mountain
para u ayuda yu’?
PARA U help 1SG.ABS
‘Will I find a mountain god to help me?’

Chung (2020: 33–34) defends her idea that for the m/f-alternation the /f/-initial forms are basic also in the case of the above prefixes. Her approach seems to be inspired ultimately by comparative Austronesian linguistics, given that this prefix historically undergoes nasal substitution. Safford (1909: 79) assumes that original fan- changes to man- under certain conditions. The problem of determining the appropriate “underlying” form of the prefixes and stems undergoing the m/f-alternation is not essential to the topic of this study. It suffices to note that there are means of encoding the irrealis which are independent of Spanish influence and whose existence is acknowledged by all descriptive linguists of Chamorro. Starting with the next layer(s) we enter grounds for which no general consensus seems to hold.
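The generalization that only the leftmost man- prefix alternates can be sketched as follows. This is a minimal illustration under simplifying assumptions: the function names are hypothetical, morphemes are simply joined with hyphens, and any morphophonemic adjustment at the prefix-stem boundary (such as nasal substitution) is ignored. The stacked form with the placeholder STEM is schematic and not an attested word.

```python
def realis_word(prefixes: list[str], stem: str) -> str:
    """Join prefixes and stem without any mood-related change."""
    return "-".join(prefixes + [stem])


def irrealis_word(prefixes: list[str], stem: str) -> str:
    """Replace initial /m/ by /f/ on the leftmost man- prefix only."""
    if prefixes and prefixes[0] == "man":
        prefixes = ["fan"] + prefixes[1:]
    return "-".join(prefixes + [stem])


# Attested pair from examples (6) and (7): man-espiha vs. fan-espiha.
print(realis_word(["man"], "espiha"), "~", irrealis_word(["man"], "espiha"))

# Schematic stacking of plural man1- and antipassive man2- (placeholder stem).
print(irrealis_word(["man", "man"], "STEM"))
```

Because the rule refers only to the prefix and not to the stem, it applies equally to inherited predicates and to loans such as espiha.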

2.3 Person marking (and beyond)

In point of fact, the next two layers are tightly interconnected with each other so that it becomes necessary to look at some of the aspects they share in one go. On the left of the verb stem and its prefixes, there are two slots which host morphemes which some authors interpret as bound to the verb whereas others consider them to be free morphemes. It is not the task of this study to solve this puzzle since it has no further bearing on the issues raised here. Therefore, I present the morphemes under scrutiny as free elements without claiming that this analysis corresponds to the morphological reality.9

In the irrealis, the basic dichotomy of transitive vs. intransitive verbs is neutralized insofar as a set of preverbal person markers is exclusively made use of which overlaps with the set of ergative person markers which are obligatorily employed to encode the agent with fully transitive but pragmatically neutral verbs. In this study, these identical markers are consistently glossed as ergative, independently of the mood they are employed for. The two sets are not absolutely identical – a fact which comes to the fore in the 3rd person of all numbers. Especially in the 3rd person singular the boundary between the layer of person marking and that of mood marking is blurred as shown in Table 1.

9 Sandra Chung (p.c.) explains that the issue is actually prosodic, since even those who treat these as free morphemes believe they are prosodically dependent, i.e. proclitics.

Table 1 hosts four proposals which are meant to capture the paradigmatic relations which involve those grammatical elements which occupy slots to the left of the verb stem in irrealis constructions. Grey shading identifies cases of variation across the four proposals. Pagel’s (2010: 90) account follows that of Chung (1998) which in turn is at the basis of Chung (2020). Gibson (1992: 28) provides a paradigm which resembles Chung’s (2020) proposal but does not include the dual number.

Table 1: Paradigms of morphemes occupying slots to the left of the verb stem in the irrealis.

| Person | Safford (1909: 74) | Topping and Dungca (1973: 263) | Cooreman (1987: 39) | Chung (2020: 29) |
|---|---|---|---|---|
| 1SG | hu- | bai hu | (bai)hu- | (bai) hu ~ bai |
| 2SG | un- | un | un- | un |
| 3SG | u- | u | u- | u |
| 1DU.INCL | uta- | (u) ta | (u)ta- | (u)ta |
| 1DU.EXCL | in- | bai in | (bai)in- | (bai) in |
| 2DU | en- | en | en- | en |
| 3DU | uha- | u ha | u- | u |
| 1PL.INCL | uta- | u ta | (u)ta- | (u)ta |
| 1PL.EXCL | in- | bai in | (bai)in- | (bai) in |
| 2PL | en- | en | en- | en |
| 3PL.ITR | uha- | u | u- | u |
| 3PL.TR | uha- | u ma | uma- | uma |

Table 1 calls for explanations. First of all, the ternary number system of singular vs. dual vs. plural follows the principles of constructed numbers (Corbett 2000: 169–171) in the sense that the dual is expressed by way of combining nonsingular pronouns with nonplural verb forms (Stolz 2019). This is why we find 1NSG.INCL ta, 1NSG.EXCL in, and 2NSG en not only in the dual but also in the plural; only in the latter number, however, these pronouns combine with the pluralizing verb prefix fan- which is thus relevant for the distinction of the two nonsingular numbers. The pronouns 1SG hu, 2SG un, 1NSG.INCL ta, 1NSG.EXCL in, 2NSG en, and 3PL ma are identical to the corresponding pronouns in the ergative set. Ha is also taken from this set. However, ha in the irrealis and ha in the realis display different distribution profiles. Moreover, ha in the irrealis is registered


only by Safford (1909) and Topping and Dungca (1973) whereas it is absent from the paradigms put forward in the two more recent studies by Cooreman (1987) and Chung (2020).10 With regard to the blurred boundary between the layers of person marking and irrealis marking, this point is crucial and thus requires closer scrutiny. According to Table 1, Safford (1909) identifies the presence of ha in the 3rd person dual and plural for both transitive and intransitive verbs. Ha has vanished from the plural in Topping and Dungca’s (1973) proposal and disappears completely from the paradigms of Cooreman (1987) and Chung (2020). What is additionally interesting is the absence of ha in the 3rd person singular irrealis of all proposals surveyed in Table 1 although in the realis, the domain of ha comprises the 3rd person singular and dual, i.e. the 3rd person nonplural (Stolz 2019). The slot of the preverbal pronoun is empty so that its neighbour to the immediate left (which belongs to the next layer) seems to take over the function of encoding the 3rd person singular. For Chung (2020: 30) – in accordance with Cooreman (1987) – u serves as person and number agreement marker not only in the 3rd person singular as in (8) but also in the 3rd person dual and plural (for intransitives) as in (9)–(10).11 The symbol Ø marks the absence of a separate pronoun.

(8)

[Perez and Faustino 1975: 43]
Si Tata-n Osu matto para u Ø amotsa.
DEF.PN father-LINK bear arrive PARA U Ø breakfast
‘Father Bear arrived to have breakfast.’

(9)

[Perez and Faustino 1975: 33]
Annai masa i sena mata’chong papa’
when cook_ready DEF.CN dinner sit down
i dos para u Ø sena.
DEF.CN two PARA U Ø dinner
‘When dinner was ready the two of them sat down to eat.’

(10)

[Saint-Exupéry 2021, ch. VIII]
Po’lo ya u Ø fan matto i tigre siha
put and U Ø IRR.PL come DEF.CN tiger PL
yan i kaka’guas-ñiha!
and DEF.CN scratcher-POR.3PL
‘Let the tigers come with their claws!’

10 According to Sandra Chung (p.c.) there is probably a dialectal difference here: Safford (1909) and Topping and Dungca (1973) are primarily describing the Guam dialect, whereas Cooreman (1987) and Chung (2020) are primarily describing the Saipan dialect.

11 A similar conclusion can be found in Kats (1917: 111) who assumes that u is but the marker of the 3rd person(s).

What comes to mind first is the possibility that u is insensitive to number and only marks the 3rd person. Since in the function of person marking, u contrasts with ha and ma in the realis, one might want to consider it a portmanteau morph which expresses 3rd person and irrealis jointly. This hypothesis is not particularly convincing, as the ensuing discussion shows. To start with, I present examples (11)–(12) which prove that it is still possible to use the combination u + ha in the 3rd person nonplural.

(11)

[Saint-Exupéry 2021, ch. II]
Kao un po’lo na u ha nesesita
Q 2SG.ERG put SUBORD U 3NPL.ERG need
meggai na chå’guan este na kinilo?12
many LINK grass this LINK sheep
‘Do you think that this sheep will need a lot of grass?’

(12)

[Taimanglo et al. 1999: 145]
Dos na hobensit-a ma-tågo’
two LINK teenager-F PASS-command
para u ha atan i mannok
PARA U 3NPL.ERG look_at DEF.CN chicken
gi halom tinanom mai’es
in inside plant corn
‘Two teenage girls were told to look for the chicken in the cornfield.’

In (11), the pronoun refers to an individual – the sheep – whereas in (12) reference is made to the two teenage girls. In the latter case, the absence of the prefix fan- from the verb atan ‘look at’ indicates that we are facing a constructed dual. I have not found any evidence of u + ha being employed in the plural. Thus, it seems that the cells of the 3rd persons nonplural in Table 1 should host the pronoun ha in brackets alongside u.

Topping and Dungca (1973: 262) register u as marker of the future tense with restricted distribution. What further impairs the analysis of u as agreement marker is the use of u in combination with other pronouns. This is optionally the case in the 1st person inclusive of the nonsingular numbers and obligatorily also in the 3rd person plural of transitive verbs, i.e. combinations of u + ta (as in (13)) and u + ma (as in (14)) are attested in which the association of u with a certain person and number category is doubtful. The zero sign signals the absence of u in the second clause in (13).

12 Sandra Chung (p.c.) mentions the possibility that ha might be the stressed prefix há- ‘often’.

(13)

[Saint-Exupéry 2021, ch. XI]
Ya håfa debe de u ta cho’gue
and what must U 1NSG.INCL do
para Ø ta na’ tunok i tihong?
PARA Ø 1DU.INCL CAUS get_down DEF.CN hat
‘And what must we two do to let the hat come down?’

(14)

[Taimanglo et al. 1999: 4]
I famagu’on para u ma fa’tinas i Belen
DEF.CN PL:child PARA U 3PL.ERG make DEF.CN Christmas_crib
‘The children are going to build a Christmas crib.’

The two clauses in (13) give evidence of the optional character of u in irrealis constructions which involve the pronoun ta. It is clear that defining u as marker of the 3rd person creates a conflict with the optional use of the same marker in association with the 1st person nonsingular inclusive. In combinations of this kind – no matter how optional they are – u cannot be an agreement marker in the first place. This is not much different for the combination u + ma in (14). The task of u is by no means that of encoding a person category. To the contrary, it helps to distinguish realis from irrealis constructions because the only difference that exists between the two moods in the 3rd person plural (apart from the verb prefixes) is the presence or absence of u. Thus, u does not belong to the layer of person marking but has to be moved to the layer of irrealis marking – which leaves us with the problem of postulating a zero pronoun for those cases which we have reviewed in (8)–(10). Note, however, that examples (11)–(12) show that the zero pronoun is not compulsory since it competes with ha. At this point, it makes sense to turn our attention back to Table 1. The proposals are ordered chronologically from left to right. The differences which distinguish Safford’s paradigm from those of his successors could be interpreted diachronically, i.e. they are perhaps the result of language change. Costenoble (1940: 296–297) assumes that the irrealis marker u has coalesced with the pronoun in several steps. According to Costenoble’s reconstruction, the order of irrealis marker and pronoun must have been variable in the past so that u could be absorbed by those pronouns which end in a vowel as well as by those whose initial segment is a vowel. On an intermediate stage, the contraction of u with


the vowel of the pronoun resulted in an excrescent glottal stop which originally served to separate the two vowels. At the terminus of the process, all traces of u outside the 3rd persons have vanished as has ha (in the plural only optionally). In support of his hypothesis, Costenoble (1940: 295) provides examples whose correctness is rather doubtful since the author admits that he quotes from memory (Costenoble 1940: v). These doubtful examples are reproduced as (15)–(16) with their original German translation added in square brackets.

(15)

[Costenoble 1940: 295]
? U hu li’i haw agupa’?
U 1SG.ERG see 2SG.ABS tomorrow
‘Will I see you tomorrow?’ [German: ‘Werde ich dich morgen sehen?’]

(16)

[Costenoble 1940: 295]
? Hago u un koṇe’?
2SG.EMPH U 2SG.ERG catch
‘Will you catch him?’ [German: ‘Wirst du ihn fangen?’]

It is likely that these and other examples of this kind do not reflect the actual language use of native speakers of Chamorro in the first decades of the 20th century. Nevertheless, it should not go unmentioned that Matsuoka (1926: 222) assumes the optional presence of the irrealis marker u (future marker according to the Japanese author) in all cells of the paradigm as shown in Table 2. The boldface, brackets, and morpheme segmentation are mine.13

Table 2: Irrealis paradigms (transitive/intransitive) according to Matsuoka (1926).

| | taitai ‘read’ | hanau ‘go’ |
|---|---|---|
| 1SG | (u)-fu-taitai | (u)-fu-hanau |
| 2SG | u-on-taitai ~ un-taitai | u-on-hanau ~ un-hanau |
| 3SG | u-(ha)-taitai | u-(ha)-hanau |
| 1PL.EXCL | (u)-in-taitai | (u)-in-hanau |
| 1PL.INCL | u-ta-taitai | u-ta-hanau |
| 2PL | (u)-en-taitai | (u)-en-hanau |
| 3PL | u-ha-taitai | u-ha-hanau |

|| 13 The initial fricative of the prefix fu- represents the voiceless bilabial [ɸ]. Its use reflects a Japanese phonological rule according to which [f], [ç], and [h] are positional allophones of each other whose choice depends on the following vowel (Hinds 1986: 393–394). The Hiragana symbol for the syllable fu used by Matsuoka (1926) is ふ.


The irrealis marker is depicted as compulsory component of the construction in the 3rd persons and in the 1st person nonsingular inclusive. In all other cells, the presence of u is marked as optional. It is not clear to me whether the author postulates variation of actually realized word forms or assumes that the u-less constructions are derived from “underlying” constructions equipped with a virtual irrealis marker. Matsuoka speaks of reductive euphonic processes. Whether the word forms in Table 2 are reliable at all cannot be determined on the basis of my present corpus. In the absence of further empirical corroboration of Matsuoka’s analysis, I register these examples as questionable.

Costenoble (1940: 296–297) goes on to sketch two possible origins of u, namely a) an erstwhile disyllabic transitive verb *uhu or *uʔu or b) the particle tu which is said to serve as future marker also in unidentified other languages. No matter how speculative Costenoble’s reasoning might seem to be, it is clear that on a par with the above m/f-alternation, u must be a very old constituent of the irrealis construction which cannot be connected to any processes of grammatical Hispanization at all. What additionally strikes the eye in Costenoble’s account is the absence of the combinations u + ta and u + ma. The former is only admitted as a reconstructed stage. The latter does not figure anywhere in his grammar. This absence of u + ma concurs with the description in Safford (1909) so that it cannot be ruled out that u + ma in the 3rd person plural of transitive verbs is a relatively recent innovation in the grammatical system of Chamorro (Sandra Chung, p.c.).

2.4 Mood marking (and beyond)

On account of the foregoing paragraphs, we are prepared for the next layer which is responsible for mood marking. We know already that marking the irrealis is the task of u. However, this irrealis marker does not combine with each of the pronouns as shown in Table 1. There are huge gaps in the distribution of u over the cells of the irrealis paradigm. The 2nd person of all numbers is exempt from combining with u as is the 1st person singular and dual/plural exclusive. The absence of a dedicated irrealis marker notwithstanding, the construction is distinctive nevertheless for intransitive and detransitivized verbs as illustrated with the motion verb måtto ‘arrive’ in Table 3.


Table 3: Realis vs. irrealis in the 2nd person of all numbers with intransitive verbs.

| | REALIS: V | REALIS: PRO | IRREALIS: PRO | IRREALIS: V |
|---|---|---|---|---|
| 2SG.ITR | måtto | hao | un | fåtto |
| 2DU.ITR | måtto | hamyo | en | fatto |
| 2PL.ITR | man-måtto | hamyo | en | fan-måtto |
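The way Table 3 combines pronoun choice, pronoun position, the m/f-alternation, and the constructed dual can be made explicit in a short Python sketch. The function and dictionary names are hypothetical. The sketch assumes that the irrealis stem is fåtto in both the singular and the dual and that the plural prefix attaches to the /m/-initial stem, as in the table.

```python
# Postverbal absolutive pronouns (realis) and preverbal ergative pronouns (irrealis).
# Dual and plural share the nonsingular pronoun: the dual is a constructed number.
ABSOLUTIVE = {"2SG": "hao", "2NSG": "hamyo"}
ERGATIVE = {"2SG": "un", "2NSG": "en"}


def second_person_form(number: str, mood: str, realis_stem: str = "måtto") -> str:
    """Build a 2nd-person intransitive clause for number SG/DU/PL and mood REALIS/IRREALIS."""
    person = "2SG" if number == "SG" else "2NSG"
    plural = number == "PL"
    if mood == "REALIS":
        # Verb first, pronoun second; only the plural takes man-.
        verb = ("man-" if plural else "") + realis_stem
        return f"{verb} {ABSOLUTIVE[person]}"
    if plural:
        # The plural prefix carries the alternation (man- ~ fan-).
        verb = "fan-" + realis_stem
    elif realis_stem.startswith("m"):
        # Otherwise the stem itself alternates (assumed for this lexical class).
        verb = "f" + realis_stem[1:]
    else:
        verb = realis_stem
    # Pronoun first, verb second.
    return f"{ERGATIVE[person]} {verb}"


for number in ("SG", "DU", "PL"):
    print(number, "|", second_person_form(number, "REALIS"), "|",
          second_person_form(number, "IRREALIS"))
```

In this sketch, dual and plural differ only in the presence of the pluralizing prefix, which is the sense in which the dual is constructed from a nonsingular pronoun and a nonplural verb form.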

The linear order of pronoun and verb is already sufficient to distinguish the two moods. In terms of coding economy, there is no need for any additional morphological marking. Nevertheless, there is the m/f-alternation as well. With fully transitive predicates, however, the formal distinction of realis and irrealis cannot be achieved in this way. For a verb like loffan ‘transport’ the 2nd persons are identical across the moods: un loffan ‘youSG (will) transport (it)’, en leffan ‘youNSG (will) transport (it)’ (Stolz 2015: 487). This is where the Spanish function words come into play (cf. Section 3.2).

The use of u in the 1st person nonsingular inclusive is optional. Compare example (17) below with example (13) above. The empty slot of the irrealis marker is signalled by Ø.

(17)

[Taimanglo et al. 1999: 43]
Håfa para Ø ta cho’gue put este, atungo’?
what PARA Ø 1NSG.INCL.ERG do for this acquaintance
‘What shall we two do about this, my comrade?’

According to Cooreman (1987) and Chung (2020), in the 1st person nonsingular inclusive, u is only optional. In contrast, Topping and Dungca (1973) assume that u must be used if the intended meaning is that of plural (and not dual).14 In (13), u is present although the dual applies. In (18), u is missing although the plural applies.

(18)

[Taimanglo et al. 1999: 148]
Ilek-mu nai na
say-2SG when LINK
para Ø ta fan-deskånsan-ñaihon
PARA Ø 1NSG.INCL.ERG IRR.PL-rest-a_moment
‘You tell when we will rest for a while.’

14 Topping and Dungca (1973: 263) claim the marker to be obligatory in the paradigm, but on the preceding page (Topping and Dungca 1973: 262) they say that in the first person, u is “not obligatory”.


Like in the above cases of the 2nd person, the irrealis construction is sufficiently distinctive also in the absence of u provided the verb is intransitive or detransitivized because the combination with a preverbal pronoun and the presence of the /f/-initial plural prefix on the verb leave no doubt that we are dealing with the irrealis. Again, these possibilities of retrieving the irrealis meaning from pronoun-verb order and the m/f-alternation are blocked for fully transitive predicates. The next optional irrealis marker is the first of the three Spanish-derived function words which are focused upon in this study. In contrast to the earlier reference grammar (Topping and Dungca 1973), both Cooreman (1987) and Chung (2020) consider the use of bai for the 1st person (exclusive) of all numbers to be optional. Moreover, Chung (2020) assumes that in the singular, neither bai nor the pronoun hu is mandatory though the presence of at least one of them seems to be required as the examples (19)–(21) suggest. (19)

[Perez and Faustino 1975: 42]
Bai hu maigo’ gi siya-n dankolo.
BAI 1SG.ERG sleep in chair-LINK big
‘I will sleep in the big chair.’

(20)

[Perez and Faustino 1975: 6]
Bai Ø po’lo papa’ i kanastra-ku,
BAI Ø put down DEF.CN basket-POR.1SG
ya bai Ø hokka i dikike’ na babui.
and BAI Ø pick_up DEF.CN small LINK pig
‘I will put down my basket and pick up the little pig.’

(21)

[Taimanglo et al. 1999: 105]
Bai hu tattiyi i kayon
BAI 1SG.ERG follow DEF.CN path
siempre Ø hu sodda’ si Tåta.
SIEMPRE Ø 1SG.ERG find DEF.PN father
‘I will follow the path, I will certainly find daddy.’

In (20) as well as in (21), there are two irrealis clauses. The pronoun hu is absent from both these clauses in (20) whereas bai is missing from the second one in (21). That the presence of siempre might be favourable for bai-dropping will be discussed below (cf. Sections 3.2–3.3). The situation is similar to that of u + ha addressed in the previous paragraphs. Just as ha alternates with zero, so does hu. In analogy to the u-case, I assume that the primary function of bai is that of an irrealis marker. Its status as agreement marker is doubtful at least with regard


to a functional monopoly. I will come back to this intricate issue in Section 3.2. The examples (22)–(24) support the idea that bai first of all marks the irrealis. (22)

[Taimanglo et al. 1999: 117] Månu na bai in LINK BAI 1NSG.EXCL.ERG where ‘Where will we two find you?’

sedda’ find

(23)

[Taimanglo et al. 1999: 148] håyi tågo’ ham 1NSG.EXCL.ABS who command para bai in fan-deskånsa? PARA BAI 1NSG.EXCL.ERG IRR.PL-rest ‘[] who commands us to take a break?’

(24)

[Taimanglo et al. 1999: 159] Ø in fan-kene’ Ø 1NSG.EXCL.ERG IRR.PL-catch ‘[] we will be caught by the dwarf.’

ni’ by

hao? 2SG.ABS

duendes. dwarf

There is again variation. This time it is the 1st person nonsingular exclusive which gives evidence of the alternation of bai and zero. Agreement is marked solely by the pronoun in whereas bai has the function of marking the irrealis – a task it shares with m/f-alternation of the plural prefix in (23). Example (24) stands out insofar as there are no preverbal irrealis markers in the first place. The mood category results from the combination of the (detransitivized) passive verb and the ergative pronoun on the one hand and the presence of the plural prefix on the verb which is subject to the m/f-alternation. This situation is not absolutely exceptional but constitutes a minority option. In most of the examples given in this section, the left margin of the irrealis construction is occupied by para (examples (8), (9), (12), (14), (18), and (23)), siempre (example (21)), or debe de (example (13)) all of which – together with bai – belong to the Spanish component in the lexicon and grammar of Chamorro. These elements are situated on the fifth and outermost layer of the irrealis construction. This layer and the function words populating it will be put under the microscope in Section 3 – together with bai.

2.5 Looking back twice

To close this section, I first present the five layers of the irrealis construction in the shape of a template in Figure 1. The position of bai and siempre is provisional as it needs to be inquired into again in Sections 3.2–3.3. Final comments relating to the template will be given in Section 4.


Layer    V                  IV          III                                II                      I
Slot     left margin        irrealis    pronoun                            m/f-alternation prefix  stem
Fillers  (para), (siempre)  (u), (bai)  (hu), un, (ha), ta, in, en, ma, Ø  antipassive; plural     [m…]ADJ/V.ITR

Figure 1: The five layers of the irrealis construction and their fillers.
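The linear order expressed by the template can be illustrated with a minimal sketch. The Python snippet below is an illustration only and not part of the original analysis: the class name IrrealisClause and its field names are hypothetical labels for the five layers, and the placement of bai in the irrealis slot reflects the provisional position assumed in this section. The two test strings reproduce the irrealis clauses of examples (19) and (23).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IrrealisClause:
    """Hypothetical container for the five layers of the template."""
    stem: str                          # layer I: verb stem
    prefix: Optional[str] = None       # layer II: antipassive/plural prefix
    pronoun: Optional[str] = None      # layer III: preverbal pronoun
    irrealis: Optional[str] = None     # layer IV: u or bai (bai placement is provisional)
    left_margin: Optional[str] = None  # layer V: para, siempre

    def linearize(self) -> str:
        # The prefix attaches to the stem; all other fillers are separate words.
        verb = f"{self.prefix}-{self.stem}" if self.prefix else self.stem
        slots = [self.left_margin, self.irrealis, self.pronoun, verb]
        # Optional slots that are left empty are simply skipped.
        return " ".join(slot for slot in slots if slot)


# Example (19): bai + hu + stem
ex19 = IrrealisClause(stem="maigo’", pronoun="hu", irrealis="bai")
# Example (23): para + bai + in + plural prefix + stem
ex23 = IrrealisClause(stem="deskånsa", prefix="fan", pronoun="in",
                      irrealis="bai", left_margin="para")

print(ex19.linearize())  # bai hu maigo’
print(ex23.linearize())  # para bai in fan-deskånsa
```

The sketch only models the ordering of the slots and their optionality. It does not model the restrictions discussed above, such as the requirement that the irrealis remain recoverable when u or bai is dropped.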

What we have learned so far from examining the data of my corpus of literary Chamorro is that the last word on the paradigm of the irrealis has not been spoken yet. Table 1 seems to require a number of adjustments. The phenomena discussed in the foregoing paragraphs invite a construction-based approach in lieu of a morpheme-oriented one. The grey-shaded parts of the fourth and fifth layers in Figure 1 play a particular role in the architecture of the irrealis construction. This role is addressed in the subsequent main part of this study. The position of the Spanish-derived elements on the periphery of the template iconically reflects their status as latecomers, i.e. bai, para, and siempre have been added to a mood category that was already in full bloom before language contact with Spanish set in. The genuinely Austronesian pre-contact component can be identified with the positions I–III and u in Figure 1. This means that Spanish influence has not created the Chamorro irrealis but probably enriched it by way of providing means to differentiate otherwise indistinguishable functions. As to the diachronic succession of events, it is hard to pinpoint when exactly the Spanish admixtures established themselves in Chamorro.15 In Sanvitores’s short grammar of Chamorro written in 1668, there is no trace of the irrealis construction as of today. In the paragraph dedicated to the future tense, the author provides examples of the incompletive aspect (Burrus 1954: 946). There is a gap of some 200 years before the first printed matter in Chamorro saw the light of day. Among the first booklets to be published in Chamorro is the translation of a pedagogical grammar of Spanish by Ibáñez del Cármen (1865) who provides paradigms of Spanish verbs with the appropriate Chamorro version.16 By way of example, the future tense of Spanish amar ‘love’ and its Chamorro equivalent guflíi = gofli’e’ is given in Table 4 (Ibáñez del Cármen 1865: 21). 
|| 15 For critical appraisals of the early texts in and on Chamorro (prior to 1950), the reader is referred to the contributions in Reid et al. (2011) and Winkler (2016).
|| 16 The authorship of the 19th century texts in Chamorro which bear Ibáñez del Cármen’s name on the title page is controversial. It is possible that José Palomo, a Chamorro native speaker and presbyter on Guam in the 19th century, was involved crucially in the production of these booklets (Enciclopedia Universal 1920: 503). If José Palomo was indeed responsible for the Chamorro parts of the publications, their linguistic reliability increases substantially.

Table 4: Spanish and Chamorro paradigms of future/irrealis as of the mid-19th century.

       Spanish    Chamorro (original)  Chamorro (segmented)  Chamorro (modernized)
1SG    amaré      juguflíi             ju-guflíi             hu gofli’e’
2SG    amarás     unguflíi             un-guflíi             un gofli’e’
3SG    amará      uguflíi              u-guflíi              u gofli’e’
1PL    amaremos   utaguflíi            u-ta-guflíi           u ta gofli’e’
2PL    amaréis    enguflíi             en-guflíi             en gofli’e’
3PL    amarán     ujaguflíi            u-ja-guflíi           u ha gofli’e’

The pattern of the paradigm is determined by Spanish to the detriment of dual and clusivity in Chamorro. However, the data resemble those reported by Safford (1909) in Table 1. The irrealis marker u combines with ta, there is no pronoun in the 3rd person singular, and in the 3rd person plural the pronoun ha is used where contemporary Chamorro would prefer ma. We can interpret the fragment of a paradigm given for Chamorro in Table 4 as follows. The first three layers of the irrealis construction were in place by the 19th century and there is no good reason to doubt that the system had been operative already in the more distant past. This hypothesis receives support from the use of the m/f-alternation on the prefix in verb forms like utafanmaañao ‘we(INCL) will fear’ as translation of Spanish temeremos (Ibáñez del Cármen 1865: 25), which can be segmented into u-ta-fan-maañao and is equivalent to contemporary Chamorro u ta fan-ma’a’ñao. Note also that obligation is already expressed with debe de ‘must’ as in debe de ufanguflíi ‘s/he must love someone’ (= Spanish tiene de amar) (Ibáñez del Cármen 1865: 32) with m/f-alternation on the antipassive prefix (segmented u-fan-guflíi = modernized u fanggofli’e’). Ibáñez del Cármen’s paradigms do not involve any of the three function words of interest. Their absence from the paradigms does not, however, preclude the possibility that they were in use at the time of the Spanish cleric’s writings. In point of fact, in another (this time bilingual Chamorro-Spanish) publication of Ibáñez del Cármen’s addressed to a Chamorro audience, the author himself employs the Spanish voy ‘I go’ (> bai) in the Chamorro version of the text as shown in (25). For the reader’s convenience, the original orthographic representation is given between pointed brackets followed by a version in modernized spelling reflecting the norms of Topping et al. (1975). The original Spanish parallel version is added in the translation in square brackets.

(25)  [Ibáñez del Cármen 1887: 25]
      Bai  hu       sangani  hamyo
      BAI  1SG.ERG  tell     2PL.ABS
      hafa  taimanu  risibi-ta              maolek
      what  how      receive-POR.1NSG.EXCL  good
      as          Jesukristo    gi  komuñón.
      DEF.PN.OBL  Jesus Christ  in  communion
      ‘I am going to tell you how to receive Jesus Christ well during Holy Communion.’
      [Spanish version: Voy á deciros, còmo recibiréis bien ó dignamente á Jesucristo al comulgar.]

Para and siempre are also around at the same period but on the basis of Ibáñez del Cármen’s writings it is difficult to determine to what extent, if at all, their use was independent of that of their etymological sources in the parallel Spanish version of the texts. As will transpire from the discussion in Section 3.1, in the 19th century, para occurred frequently also in contexts which invoke an irrealis interpretation whereas siempre seems to be restricted to the use as temporal adverb. In the light of Pagel’s (2010: 91–93, 98–103) discussion of bai, para, and siempre as potential cases of grammaticalization, it is important to keep in mind that the documented history of the three function words starts at very different stages of development. Furthermore, Pagel (2010: 92) agrees with Chung and Timberlake (1985) when he denies that there was a proper category FUTURE on older stages of Chamorro. I raise this issue again in Section 3.

To complement this section, a glimpse at what the grammarians of Chamorro of the early 20th century had to say about the irrealis is in order. Except for bai in the 1st person singular, Fritz (1903: 16) does not recognize distinct forms of the irrealis but assumes that buente ‘perhaps’ may serve as a marker of this category (Fritz 1903: 17). Para is given as a purposive conjunction and benefactive preposition (Fritz 1903: 24 and 26). Only outside the paradigms does Safford (1909: 101) mention béa (= bai) as a defective verb whose domain he restricts to the 1st person singular with the verb “following it [being] in the future.” Siempre ‘always’ is listed among the Spanish loan-adverbs (Safford 1909: 108). Para ‘to, for, in order to’ ranges among those Hispanisms which were introduced by the early missionaries because of their limited understanding of the principles of Chamorro grammar. Lopinot (1910: 11–13) provides paradigms which replicate those of Safford but mentions that the purposive conjunction para requires the future tense of the verb (Lopinot 1910: 12). In the dictionary to which the grammatical sketch is just an annex, Lopinot (1910: 95) registers bai as prefix of the 1st person singular in the future tense. Siempre on the other hand is translated into German as immer ‘always’ without further comments. Kats (1917: 116) has nothing to add to Safford’s paradigm. The same holds for Von Preissig (1918: 18). As a separate lexicon entry, para is inexistent whereas siempre is glossed with English always (Von Preissig 1918: 222). Matsuoka (1926: 228) assumes that bai is restricted to the 1st person singular and expresses strong intention on the part of the speaker. Para is mentioned only as preposition; no mention is made of siempre. In his Chamorro-Spanish dictionary, De Vera (1932: 29) mentions that bai (~ bae ~ boi) is not only prefixed to the 1st person singular but also to the 1st person plural exclusive in the future tense. This is the first mention of bai being used outside the singular. In contrast, De Vera (1932: 218 and 251) has nothing remarkable to report on para and siempre. Costenoble (1940: 305–307) introduces a mixed bag of “Hilfsmittel zur Kennzeichnung von Tempus, Aspekt und Modus” [ancillary tools for marking tense, aspect, and mood] which inter alia contains para and bai (bay ~ boy) with the former being defined as a marker which emphasizes the imminent start of a given action whereas the latter is used with the 1st person singular and plural exclusive to indicate the volitional future. Siempre is not mentioned.
Izouî (1940: 18–19) proposes a category termed “intentionnel” (aka “futur”) whose paradigm looks like a transition between the stage described by Safford (1909) and that described by Topping and Dungca (1973) because the optional presence of bai in 1st person singular and dual/plural exclusive is acknowledged whereas the 3rd person nonsingular irrealis is still expressed by u + ha. What these pieces of evidence boil down to is the strong possibility that the irrealis mood has undergone substantial changes not only since the first printed records of the language. These changes comprise the entrance of the Spanish function words on the scene as well as the replacements/disappearance of pronouns on the third layer of the construction. In what follows, I concentrate on the Spanish impact.

3 On the left margin

The corpus yields a turnout of 1,746 tokens for the three function words added up. Some 79% of these tokens go to the credit of para, second best is bai with almost 16%, whereas siempre is responsible for the remaining 5% of the tokens. As can be gathered from Figure 2, the ranking order para > bai > siempre holds also for each individual text of the corpus.

[Bar chart: token counts (vertical axis 0–800) of para, bai, and siempre in Taimanglo et al. (1999), Onedera (2007), Saint-Exupéry (2021), and Perez and Faustino (1975).]

Figure 2: Token frequency of para, bai, and siempre in the corpus.

The dominant position of para can be explained by way of referring to the range of functions to which the function word is put. Not all of para’s functions lend themselves to an analysis as categories of the irrealis. It is even possible to assume the co-existence of several functionally distinct but homophonous function words. In contrast to para, bai is monofunctional (but perhaps a case of a portmanteau morph) whereas the extent of the functional load of siempre is not as clear as that. Sections 3.1–3.3 feature the three function words in the order of their decreasing token frequency. The initial paragraph of each of these sections contains a quote from Rodríguez-Ponga (1995, 2009) which serves as a reference point for the ensuing discussion. Within the sections, there are further subdivisions (not necessarily presented as separate sections) which inform briefly about the use of the function words in the donor language Spanish and in more detail about the situation in the older Chamorro texts. Section 3.4 sketches the possible diachronic developments as a prelude for the comments on the above template of the irrealis construction. As to the comparison with Spanish, a word of caution is called for. The Spanish data are taken from contemporary Spanish – Peninsular and other – although they are meant to serve the purpose of determining whether the uses the function words are put to in Chamorro can be explained as direct copies from the donor language. Since the Hispanization of Chamorro took place during the 18th and 19th centuries, the comparison is thus anachronistic because the modern situation is projected into the past. It cannot be ruled out that the Spanish items under scrutiny underwent (albeit only minimal) changes in this period of time, too. I am aware of the methodological pitfalls of reconstructing the past solely on the basis of modern data.

3.1 para

The multifunctionality of Chamorro para is clearly visible from the extended paragraph Rodríguez-Ponga (1995: 512) dedicates to this function word:

    pára. [] (preposición) para, a, hacia. Funciona libremente. [] Indica dirección ‘hacia’, en el espacio y en el tiempo: Kínse minútos para las siéte, quince minutos para las siete (06,45 h.). Para mánu hao? ¿para dónde vas? I asaguá-hu humánao para i tenda, mi cónyuge fue para la tienda. Indica destino, finalidad o utilidad: Un regálo para i familia, un regalo para la familia. Dos sapátos para i lancho, dos zapatos para (trabajar en) el rancho. En combinación con otros elementos gramaticales, forma el futuro, equivalente al futuro perifrástico español o a construcciones finales [added underlining]: Pára bai hú máigo (lit. ‘para voy yo dormir’), yo dormiré. Para ún máigo (lit. ‘para tú dormir’), tú dormirás. Para bai in saga giya Agáña (lit. ‘para voy nosotros residir…’), residiremos en Agaña. Si Diégo ha espípiha manéra siha para u sétbe i Mariános, Diego busca las maneras para servir a los marianos. La marca de futuro con el pronombre de segunda persona ún puede formar una contracción: para un > pa un > pon. Aparece en expresiones fijas [] y en alguna expresión idiomática [].17

|| 17 My translation: ‘pára. [] (preposition) for, to, towards. It functions freely. [] It indicates direction ‘towards’, in space and time: Kínse minútos para las siéte, fifteen minutes to seven (06.45 hrs.). Para mánu hao? Where are you going? I asaguá-hu humánao para i tenda, My spouse has gone to the shop. It indicates destination, purpose or use: Un regálo para i familia, a present for the family. Dos sapátos para i lancho, two shoes to work at the farm in. In combination with other grammatical elements, it forms the future, equivalent to the Spanish periphrastic future or to purposive constructions: Pára bai hú máigo (lit. ‘for go I sleep’), I will sleep. Para ún máigo (lit. ‘for you sleep’), you will sleep. Para bai in saga giya Agáña (lit. ‘for I go we reside…’), we will live in Agaña. Si Diégo ha espípiha manéra siha para u sétbe i Mariános, Diego looks for ways and means to serve the people of the Marianas. The future marker and the pronoun of the 2nd person ún may contract: para un > pa un > pon. It also occurs in fixed expressions [] and in some idioms [].’

The underlined sentence in the above quote is especially interesting because it supposes that para always combines with other functional elements to express categories of the irrealis. In a posterior publication, however, Rodríguez-Ponga (2009: 153) assumes that “[i]ncluso para puede ser marca de futuro sin que haya otras marcas” [para may even be the future marker without any other markers being present]. The author has evidently revised his original hypothesis. Given that the new idea meets the empirical facts, does this mean that para renders superfluous the use of those elements which occupy the slots on layers III–IV in Figure 1? And if the answer to this question is positive, does para oust the other markers indiscriminately or depending on the function and/or marker? Similar to what Costenoble (1940: 184–185) postulates in his grammar of Chamorro, Pagel (2010: 98) assumes four major functions for para, namely a) directional preposition (both spatial and temporal), b) benefactive preposition, c) discourse particle, d) marker of future, subjunctive, and purposive. Except (c)18, all these functions deserve being inquired into. Note also that Chung (1998: 25) speaks of two distinct homophonous function words, namely para1, the marker of the uncertain future (the neutral future in Pagel’s (2010: 95) terms) and subjunctive, vs. para2, the directional-benefactive preposition equivalent to English to and for. The question arises whether we are dealing with a single but multifunctional/polysemous element or several homophonous but functionally disjunct elements.

3.1.1 para as preposition

For a start, we look at instances of para with a complement NP. According to the Real Academia Española (2009: 2270) the basic function of the Spanish preposition para is that of marking the destination of a physical or figurative movement. All other functions associated with para are derived from there metonymically or metaphorically, namely recipient (“destinatario”), purpose (“utilidad, servicio”), and orientation/intention (“orientación”, “intención”). Table 5 reproduces the examples given in Real Academia Española (2009: 2270).

|| 18 In point of fact, it is not entirely accurate to classify para as discourse particle. What Pagel (2010: 98) alludes to are examples like Para hafa? ‘So what?’ in which para is only part of a binary or even more complex expression.

Table 5: Major meanings of para in contemporary Spanish.

Meaning       Example – Glosses – Translation

Destination   Voy     para  mi       casa.
              go.1SG  PARA  POR.1SG  house
              ‘I go to my house.’

Purpose       Para  el     viaje    he        preparado      una      merienda
              PARA  DEF.M  journey  AUX.1SG   prepare:PTCPL  INDEF:F  snack
              ‘For the journey, I have prepared a snack [].’

Benefactive   Lo        compré        para  ella
              3SG.M.ACC buy:1SG.PAST  PARA  3SG.F
              ‘I bought it for her.’

Intention     Estudia    para  médico.
              study.3SG  PARA  doctor
              ‘S/he is studying to become a doctor.’

In case of substantial changes over time and variation in space, the Real Academia Española normally complements the purely descriptive paragraphs of the grammar with the appropriate information as to earlier stages of the language and further diatopic phenomena. What the Real Academia Española (2009: 2270) finds worth mentioning is the attestation of para in spatial functions as early as the Middle Ages. Since no further diachronic information is given in connection to para, it can be taken for granted that the present situation is not much different from that of several centuries ago. Several uses of para in contemporary Chamorro are discussed in Topping and Dungca (1973: 122–126). They are largely in line with those listed in Table 5 for modern Spanish. The spatial and benefactive functions stand out because of their high frequency. In (26), the spatial meaning is obvious.

(26)  [Taimanglo et al. 1999: 1]
      siempre  man-hånao  i       man-åmko’  na    la~låhi
      SIEMPRE  PL-go      DEF.CN  PL-old     LINK  RED~man
      yan  si      Tatan-ñiha      para  i       lanchon-ñiha.
      and  DEF.PN  father-POR.3PL  PARA  DEF.CN  farm-POR.3PL
      ‘[] the adult men and their father would go to their farm.’

Topping and Dungca (1973: 122–123) ponder the idea – originally put forward by Costenoble (1940: 184) – that there might have been a directional marker in Chamorro prior to contact which was subsequently ousted by the imported preposition. They hypothesize that there already existed a homophonous Austronesian function word para to whose domain the functions of Spanish para were added. Examples (27)–(28) illustrate the use of para as marker of a relation of purpose and/or intention.

(27)  [Taimanglo et al. 1999: 2]
      man-mamo~moksai  månnok,  babui  yan  guaka
      PL-AP:RED~raise  chicken  pig    and  cow
      para  nengkanno’  i       familia.
      PARA  food        DEF.CN  family
      ‘[] they were raising chickens, pigs, and cows as food for the family.’

(28)  [Onedera 2007: 253]
      Ma       håtsa  un     palåsyo  gi  iya      Hagåtña  para
      3PL.ERG  lift   INDEF  palace   in  DEF.PLN  Hagåtña  PARA
      guma’  yan  ufisina-n    i       Gibetno-n        Espåña
      house  and  office-LINK  DEF.CN  government-LINK  Spain
      ‘They built a palace in Hagåtña as house and office of the Spanish governor.’

The benefactive function of para is illustrated in (29).

(29)  [Taimanglo et al. 1999: 68]
      Ha       sokne   si      Juan  i       amot      na
      3SG.ERG  accuse  DEF.PN  Juan  DEF.CN  medicine  SUBORD
      ti   gof   macho’cho’  para  i       Españot
      NEG  very  work        PARA  DEF.CN  Spaniard
      taiguihi  i       Chamoru.
      the_way   DEF.CN  Chamorro
      ‘Juan blamed the medicine for not working so much for the Spaniard in the Chamorro way.’

With regard to the benefactive, Topping and Dungca (1973: 124) assume that

    the Spanish form (using para) is replacing the Chamorro form, at least with some verbs. [] The fact that the Spanish form follows the structure of English grammar very closely probably helped to establish the form using para as the more common form used by younger Chamorro speakers who learn English at a very early age.

In this context they support their hypothesis with the coexistence of two options for expressing the benefactive relation as in (30a–b) [original English translations]. For further details on the alternation captured by (30a–b), the reader should consult Gibson and Raposo (1986: 314–315).

(30)  [Topping and Dungca 1973: 124]
  a.  Ha       sangan-iyi  yo’      ni          estoria.
      3SG.ERG  tell-BEN    1SG.ABS  DEF.CN.OBL  story
  b.  Ha       sangan  i       estoria  para  guahu.
      3SG.ERG  tell    DEF.CN  story    PARA  1SG.EMPH
      ‘He told the story for me.’

The supposedly ongoing replacement of the Austronesian benefactive construction with the Spanish equivalent is an example of combined MAT-borrowing and PAT-borrowing (cf. Section 5), i.e. not only has the phonological chain para been borrowed but also the morphosyntactic pattern in which it normally occurs in the donor language. This can be considered a case of strong Hispanization. The abundance of attestations of para in the Chamorro texts from the 19th century notwithstanding, it is difficult to present evidence for each of the above major functions of the function word in Spanish. This holds in particular for the spatial function of para, for whose illustration I had to sieve the extant texts of that period very thoroughly.19 The spatial function is illustrated in example (31), purpose in (32), and benefactive in (33).

(31)  [Ibáñez del Cármen 1887: 65]
      i       gipot  i       Asunsion   pat  i       hanao
      DEF.CN  feast  DEF.CN  Ascension  or   DEF.CN  go
      santa María  para  i       langhet  gi  15 de Agosto
      St. Mary     PARA  DEF.CN  heaven   in  15 August
      ‘The feast of Ascension or the ascension of St. Mary to the heavens on 15 August.’

(32)  [Forbes 2009: 27] (original English translation)
      Ha       po’lo  i       sakramento-ña      siha
      3SG.ERG  put    DEF.CN  sacrament-POR.3SG  PL
      para  i       suette-ta
      PARA  DEF.CN  fortune-POR.1PL.EXCL
      ‘He instituted the sacraments for our good fortune [].’

(33)  [Ibáñez del Cármen 1887: 4]
      Hu       tuge   para  hamyo     ini   na    dikike’  leblo
      1SG.ERG  write  PARA  2PL.EMPH  this  LINK  small    book
      ‘[] I have written this booklet for you [].’
      [original Spanish version: ‘he escrito para vosotros este librito’]

|| 19 An abundance of instances of spatial para can be found in Fritz (1907). This sketch of the history of the Marianas before imperial Germany took control of the islands north of Guam is difficult to judge because it is unclear whether the German governor was the sole author so that the text would count as a piece of non-native use of Chamorro. It is equally possible that native speakers of the language narrated the story and Fritz noted it down. The authorship is nowhere disclosed (Stolz 2007).

On this basis and further examples of the same kind, I assume that the basic functions of para were already firmly established in the mid-19th century although some of them fail to show up frequently in the available texts of that time because of the genre and topic of these writings. Given that this assumption holds, it can be concluded that Chamorro para and Spanish para covered very similar ranges of functions when used prepositionally.

3.1.2 para as conjunction

This section is divided into two parts, viz. a synchronic account of the conjunction para in Chamorro and, to a minor extent also, in Spanish (= Section 3.1.2.1) and a diachronic review of the properties of para as attested in a bilingual text of the 19th century (= Section 3.1.2.2).

3.1.2.1 The present

In Spanish, para may also take a verbal complement. The latter can come as infinitive if there is a shared participant between the higher and the subordinated verb (Real Academia Española 2009: 1988) as shown in (34).

(34)  Spanish [Tristante 2011: 34]
      me       envían    para  avisarlos
      1SG.ACC  send:3PL  PARA  warn:INF:3PL.M.ACC
      ‘[] they send me to warn you.’

In case the subjects of the two verbs are different, a biclausal solution with two finite verbs is triggered. The two clauses are joined to each other by a binary purposive conjunction, namely para que ‘so that, in order to’ (Real Academia Española 2009: 3456). This conjunction requires the finite verb in the subordinate clause to be in the subjunctive as shown in (35).

(35)  Spanish [Tristante 2011: 36]
      Se      lo         han      llevado     a   su     cuarto,
      REFL.3  3SG.M.ACC  AUX.3PL  take:PTCPL  to  POR.3  room
      para  que     lo         atendiera                  su     médico
      PARA  SUBORD  3SG.M.ACC  take_care:3SG:SBJ.PRETII   POR.3  doctor
      ‘They have taken him to his room in order for his doctor to take care of him.’

Topping and Dungca (1973) mention para ki ‘so that’ as Spanish-derived subordinator in Chamorro. Their sole example is reproduced as (36) [original English translation].

(36)  [Topping and Dungca 1973: 152]
      Para  ki  ti   un       matmas,  usa  i       floater.
      PARA  KI  NEG  2SG.ERG  drown    use  DEF.CN  floater
      ‘So that you won’t drown, use the floater.’

Rodríguez-Ponga (1995: 513) registers para ké ‘so what?’ as idiomatic expression. In my sample, the evidence for para ki is scarce. There are only two tokens – both from the same source as shown in Table 6.

Table 6: Para ki in Saint-Exupéry (2021).

Chapter  Example – Glosses – Translation

IV       Lao  naturåt  hame            ni   kompre~rende    i       lina’la’
         but  natural  1NSG.EXCL.EMPH  REL  RED~understand  DEF.CN  life
         para  ke  ham            ni   numero  siha!
         PARA  KI  1NSG.EXCL.ABS  REL  number  PL
         ‘But, naturally, what life means for us is numbers!’

VII      Para  ke  yo’      ni    mattiyu-ho,
         PARA  KI  1SG.ABS  with  hammer-POR.1SG
         ni    i       totniyu-ho,   ni    i       mina’ho  pat  finatai?
         with  DEF.CN  bolt-POR.1SG  with  DEF.CN  thirst   or   death
         ‘For what was I there with my hammer, my bolt, thirst or death?’

These are emotionally loaded utterances in which para ki does not function as a full-blown subordinator or conjunction. What it seems to fulfil is an expressive-pragmatic function. There is no discernible connection to the main topic of this study. Therefore, para ki will not be discussed further. In contrast, there is ample evidence of the presence of para taking a predicate as complement. This is the pattern that will be of interest to the discussion throughout the remainder of this section.

The donor language Spanish and the replica language Chamorro differ considerably as to the morphosyntax of their versions of para. Given that para que is not an established grammatical Hispanism in Chamorro, one would expect to find evidence of para + INFINITIVE in Chamorro in accordance with the Spanish model. The existence of infinitives in Chamorro notwithstanding (Chung 2020: 454–460), this verbal category is never used in combination with para. What we find instead are finite clauses in the irrealis, i.e. constructions which resemble the Spanish subordination with para que + SUBJUNCTIVE. What comes to mind in connection to this resemblance is the possibility that Chamorro para may retain properties of a conjunction in analogy to Spanish para que. In (37) for instance, the first two instances of para feature this function word in subordinate clauses which are connected to the higher predicate via the subordinator na. In both cases, para is followed by the irrealis marker u.

(37)  [Onedera 2007: 241]
      ha       disidi  na      para  u  fama’nå’gue  gi  iya
      3SG.ERG  decide  SUBORD  PARA  U  AP:teach     in  DEF.PLN
      Unibetsedåt  ya   malago’  na      para  u  cho’gue  i
      university   and  want     SUBORD  PARA  U  make     DEF.CN
      Teahouse of the August Moon  para  tinituhon  cho’cho’-ña    guihi.
      Teahouse of the August Moon  PARA  start      work-POR.3SG   there
      ‘He decided to teach at the university and wanted to do Teahouse of the August Moon as an opener for his work there.’

Chung (2020: 75) argues that the general subordinator (“complementizer”) na may be dropped “if it is immediately followed by para.” The embedded clause remains finite nevertheless. In a way, under na-deletion, para assumes the functions which are otherwise fulfilled by the subordinator. Alternatively, we could assume that superordinated and subordinated predicates may be asyndetically juxtaposed. In this way, para remains what it is supposed to be, namely a TMA marker. In contrast to the accounts of Rodríguez-Ponga (2009: 53) and Pagel (2010: 95), para is featured neither in Topping and Dungca’s (1973: 145–147) nor in Chung’s (2020: 432–433) list of subordinating conjunctions of Chamorro. For Gibson (1992: 60) too, para is never a conjunction but always either a preposition with a complement NP or a TMA marker. In the remainder of this section, I address the problem of the functional ambiguity of para with particular focus on the retention of structural properties of a purposive conjunction.

Chung (2020: 74) states that “para can also occur in irrealis clauses that are embedded under verbs of desire, commitment, permission, effort, and so on.” The use of the modal verb can in this quote suggests that para is optional in these contexts. The same seems to hold for the subordinator na. If we focus on the modal verb malago’ ‘want’, we notice that there is considerable variation as to how superordinate and subordinate predicates are linked to each other. Table 7 discloses the frequency of the linking strategies found with the modal verb malago’ in the corpus.

Table 7: Token frequency of linking strategies with the modal verb malago’.

Source                      Ø    na   para  para + na  Other  Sum
Taimanglo et al. (1999)     7    13   8     3          1      32
Saint-Exupéry (2021)        13   6    1     –          –      20
Onedera (2007)              2    –    1     2          –      5
Perez and Faustino (1975)   3    1    –     –          –      4
Total                       25   20   10    5          1      61

In (38)–(39), examples are given for the absence of any clause linking (= infinitival construction), the subordinator na, and the use of para in the absence of na.

(38) [Perez and Faustino 1975: 4 and 33]
a. Malago’ yo’ chocho
   want 1SG.ABS eat
   ‘I want to eat.’
b. Yanggen esta saga hao un semana gi siuda,
   when already stay 2SG.ABS INDEF week in city
   ni ngai’an ta’lo nai un malago’ na
   NEG when again when 2SG.ERG want SUBORD
   un cha’ka-n la~lancho ha’.
   2SG.ERG rat-LINK RED~farm INTENS
   ‘When you have lived in the city for a week, never will you want to remain [= keep on being] a country rat.’

(39) [Taimanglo et al. 1999: 71]
malago’ para u gai patgon.
want PARA U have child
‘[] she wanted to have a child.’

The constructions in (38)–(39) are representative of 90 % of all instances of malago’ serving as modal verb superordinated to a lexical verb. Of the 61 malago’-constructions, 41 % go to the credit of the juxtaposition of modal and lexical verbs. The shares of the syndetic construction with na (= 33 %) and para (= 16 %) are considerably smaller. With a share of some 8 %, the combination na + para following malago’ is relatively infrequent and shows up in only two of four sources. This small turnout makes it necessary to check whether, independent of the modal verb malago’, there is more evidence of the co-occurrence of na and para in the corpus. The primary source Perez and Faustino (1975) does not contain any evidence of this combination. In the three remaining sources, however, we find altogether 59 cases, 46 (= 78 %) of which attest to the linkage of two predicates. Figure 3 reveals the token frequencies per individual source. These absolute numbers alone do not tell us very much. To get a grasp of the phenomenon under investigation, we need to know which higher predicates allow for the co-occurrence of na and para – and, what is more, to what extent there is variation as to the linkage of the higher and the lower predicate.

[Stacked bar chart, 0–100 %, showing the categories verb and other for Taimanglo et al. (1999), Onedera (2007), and Saint-Exupéry (2021).]

Figure 3: Attestations of na + para in combination with verbs and other elements in the corpus.
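The percentages quoted for Table 7 and for the na + para attestations can be recomputed from the absolute numbers. The following is a minimal Python sketch under the assumption that shares are rounded to whole percentages; the names TABLE_7_TOTALS and shares are illustrative and not part of the original study.

```python
# Column totals of Table 7 (linking strategies after the modal verb malago').
TABLE_7_TOTALS = {"Ø": 25, "na": 20, "para": 10, "para + na": 5, "other": 1}


def shares(counts):
    """Return each category's share of the total as a rounded percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


malago_shares = shares(TABLE_7_TOTALS)

# Share of the three constructions illustrated in (38)-(39): Ø, na, and para.
illustrated = sum(TABLE_7_TOTALS[k] for k in ("Ø", "na", "para"))
illustrated_share = round(100 * illustrated / sum(TABLE_7_TOTALS.values()))

# Share of na + para attestations that link two predicates (46 out of 59).
verb_linking_share = round(100 * 46 / 59)

for label, pct in malago_shares.items():
    print(f"{label}: {pct} %")
print("Ø + na + para:", illustrated_share, "%")
print("na + para linking two predicates:", verb_linking_share, "%")
```

The sketch only restates the arithmetic behind the reported shares; it does not add any data beyond the counts given in the text.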

There are altogether 24 different verbs and adjectives on the side of the higher predicate. There are 17 hapaxes: cho’gue ‘do’, deseha ‘desire’, enkåtga ‘commission’, entensiona ‘intend’, essalao ‘shout’, fa’nu’i ‘show’, guaha ‘have; exist’, hasso ‘think’, hongge ‘believe’, kontento ‘content’, malagu ‘run’, månda ‘order’, nahong ‘enough’, pega ‘place’, prisisu ‘urgent’, promesayi ‘promise’, and propiu ‘proper’. In what follows I will take account only of those seven verbs which combine with na + para at least twice in the corpus (no matter in how many sources the combination is attested), namely disidi ‘decide’, ilek ‘say’, malago’ ‘want’, sangåni ‘tell’, tago’ ‘order’, tungo’ ‘know’, and ya ‘like’. Figure 4 reveals how important the different kinds of linkage are for each of these predicates.20 The absolute numbers given in Figure 4 cover subordinate predicates in both moods.

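[Stacked bar chart, 0–100 %, one bar per verb; the underlying token counts are given in the data table.]

| Linkage | ilek | sangani | tungo’ | malago’ | tago’ | ya | disidi |
|---|---|---|---|---|---|---|---|
| na | 34 | 27 | 37 | 20 | 4 | 11 | 6 |
| asyndetic | 0 | 0 | 15 | 25 | 0 | 32 | 1 |
| para | 2 | 2 | 0 | 10 | 8 | 0 | 11 |
| na + para | 2 | 5 | 3 | 8 | 2 | 3 | 8 |

Figure 4: Frequency of linkage types over selected superordinated verbs.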

For all verbs in Figure 4, there is evidence of na alone as well as in combination with para. Two verbs do not yield any example for para whereas three verbs do not allow for asyndetic combinations of higher and lower predicates. This result is of course dependent on the size and composition of the corpus. Nevertheless, I take it to be indicative of certain preferences. Only for two verbs do para and na + para claim shares which oust the other options. In the case of disidi, 42 % of all attestations as higher verb involve para alone and another 31 % go to the credit of na + para, i.e. para is involved in 73 % of all those cases in which disidi is superordinated to another predicate. As to tago’, para alone is responsible for 57 % of all examples of this verb in the function of higher predicate. The combination na + para is attested only twice with this verb which equals a share of 14 %. With 15 %, sangåni also displays a relatively high percentage of na + para combinations. Figures 5–6 identify the shares which the individual higher predicates have of combinations with para alone and combinations with na + para. Note again that all seven verbs are attested in combination with na + para but only five of them combine with para alone. Disidi stands out in both figures as the verb with the biggest share.

20 Some of the higher predicates allow for additional options which I have chosen to ignore for the purpose of this study. On account of this practical decision, I had to remove the single case in the column OTHER in Table 7 for malago’ as well. Thus, Figure 4 assumes 60 tokens for malago’ as higher verb in lieu of the 61 tokens mentioned in Table 7.

[Pie chart: disidi 34 %, malago’ 30 %, tago’ 24 %, ilek 6 %, sangani 6 %.]

Figure 5: Shares of verbs combining with para.

[Pie chart: disidi 26 %, malago’ 26 %, sangani 16 %, tungo’ 10 %, ya 10 %, ilek 6 %, tago’ 6 %.]

Figure 6: Shares of verbs combining with na + para.
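The per-verb percentages discussed for Figure 4 follow directly from the token counts in its data table. The sketch below is a minimal Python illustration, assuming whole-number rounding and using the counts exactly as printed in Figure 4; the names FIGURE_4, share, preferred_linkage, and verbs_lacking are hypothetical helpers, and the verb labels are written with a plain ASCII apostrophe.

```python
# Token counts per superordinate verb, read off the data table of Figure 4.
FIGURE_4 = {
    "ilek":    {"na": 34, "asyndetic": 0,  "para": 2,  "na + para": 2},
    "sangani": {"na": 27, "asyndetic": 0,  "para": 2,  "na + para": 5},
    "tungo'":  {"na": 37, "asyndetic": 15, "para": 0,  "na + para": 3},
    "malago'": {"na": 20, "asyndetic": 25, "para": 10, "na + para": 8},
    "tago'":   {"na": 4,  "asyndetic": 0,  "para": 8,  "na + para": 2},
    "ya":      {"na": 11, "asyndetic": 32, "para": 0,  "na + para": 3},
    "disidi":  {"na": 6,  "asyndetic": 1,  "para": 11, "na + para": 8},
}


def share(verb, *linkages):
    """Rounded percentage of a verb's tokens that use any of the given linkages."""
    counts = FIGURE_4[verb]
    return round(100 * sum(counts[l] for l in linkages) / sum(counts.values()))


def preferred_linkage(verb):
    """Linkage type with the highest token count for the verb."""
    counts = FIGURE_4[verb]
    return max(counts, key=counts.get)


def verbs_lacking(linkage):
    """Verbs for which the given linkage type is unattested in the counts."""
    return [verb for verb, counts in FIGURE_4.items() if counts[linkage] == 0]


print("disidi:", share("disidi", "para"), share("disidi", "na + para"),
      share("disidi", "para", "na + para"))
print("tago':", share("tago'", "para"), share("tago'", "na + para"))
print({verb: preferred_linkage(verb) for verb in FIGURE_4})
print(verbs_lacking("para"), verbs_lacking("asyndetic"))
```

Keeping the counts in one mapping makes it easy to check both the shares per verb and the gaps (unattested linkage types) against the prose.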

The quantitative data give evidence of a high degree of variation. The verbs tungo’, ilek, and sangåni prefer na over all other options of linkage. Malago’ and ya are clearly in favour of asyndetic subordination whereas disidi and tågo’ lean towards the use of para. For none of the verbs is the combination na + para blocked. In contrast, para alone is not attested with tungo’ and ya. Asyndetic constructions are not registered in the case of ilek, sangåni, and tågo’. Since the latter three are speech-act verbs, it might be the case that the differential behaviour of the above seven verbs depends at least partly upon the semantic class to which a given verb belongs. This issue has to be investigated more thoroughly in a separate study. For the topic at hand, however, it suffices to have a look at the variation associated with disidi because, except malago’, disidi is the only verb in the above selection which allows for all four possible options. In (40), the zero symbol indicates the site where a potential subordinator could be placed. The realized option is that of an asyndetic juxtaposition of higher and lower predicate. (40)

[Taimanglo et al. 1999: 12]
I ufisiåt siha ma disidi Ø u ma na’hånao tåtte
DEF.CN officer PL 3PL.ERG decide Ø U 3PL.ERG direct back
gi sagå-ña los uttemos.
in place-POR.3SG last
‘The officers decided to send it back to his last address.’

In (41), the general subordinator na occupies the interclausal position. (41)

[Perez and Faustino 1975: 32]
Annai matto i ora-n sena si Cha’kan Lancho
when arrive DEF.CN time-LINK dinner DEF.PN Country Rat
ha disidi na u planta-yi i amigu-ña
3NPL.ERG decide SUBORD U set_up-BEN DEF.CN friend-POR.3SG
i mas mannge’ na nenkanno’ ni’ siña ha sodda’.
DEF.CN more delicious LINK food REL be_able 3SG.ERG find
‘When dinner time approached, Country Rat decided to prepare for his friend the most exquisite food he could find.’

In (42), the higher verb disidi is followed by na which in turn precedes para. (42)

[Taimanglo et al. 1999: 58]
ma disidi na para u ma nå’i palu
3PL.ERG decide SUBORD PARA U 3PL.ERG give some
i atungo’-ñiha
DEF.CN acquaintance-POR.3PL
‘[] they decided to give some [fish] to their acquaintance [].’

Example (43) illustrates the absence of na and at the same time attests to the presence of para in the slot to the right of disidi. In addition, there are two further instances of para (identified by underlining) in the same sentence which need to be taken account of in the ensuing discussion. (43)

[Taimanglo et al. 1999: 23]
Esta para u malachai i nengkanno’, ya kada
already PARA U exhaust DEF.CN food and each
maga’låhi ma disidi para u fan-hånao guatu gi
chief 3PL.ERG decide PARA U IRR.PL-go there in
otro bånda isla para u fan man-a’ligåo nengkanno’.
other side island PARA U IRR.PL AP-search_for food
‘The food was becoming scarce and all the chiefs decided to go to the other side of the island to search for food.’

The extant descriptions of the morphosyntax of para in Chamorro emphasize its role as TMA marker (cf. Section 3.1.3). The emphasis put on this function is so strong that other functions are glossed over by several authors. I fully agree with the idea that one of the functions of para is that of a TMA marker – but at the same time, I assume that there is also a conjunction para which has to be distinguished from the homophonous TMA marker although making this distinction often proves to be difficult. I thus take sides with Rodríguez-Ponga (1995) and Pagel (2010) who make this distinction but, surprisingly, do not argue explicitly against the competing analysis which denies the existence of the purposive conjunction para. With regards to the three cases of para in (43), I argue that only one of them can be considered a bona fide TMA marker whereas the other two are better classified as purposive conjunctions. As to esta para u malachai i nengkanno’ ‘the food was becoming scarce’, the TMA-status of para can be taken for granted since the predicate in which it occurs is not subordinated to any other predicate. In the two remaining cases, para introduces a predicate that is structurally/semantically connected to a preceding predicate in the same sentence. There is a ternary chain of events starting with the decision (disidi) which envisages a subsequent motion event (fanhånao) which in turn is necessary to facilitate the third step, namely the search for food (fanmana’ligao). At the time of the decision the two subsequent events were still nonactual situations and thus require being expressed in the irrealis. As we know from Figure 4, there is only a single uncontroversial case of an asyndetic combination of disidi with a lower verb. In 25 (= 96 %) out of 26 cases, the slot to the immediate right of disidi is occupied by na or para. If we assume

that para is a TMA marker and nothing else, we create further 11 cases of asyndesis whose share thus rises from 4 % to 46 % of all instances of disidi functioning as higher predicate. What supports this analysis is the plethora of combinations of na + para identified for disidi. For these cases, it can be argued that the subordinator na precedes the TMA marker which belongs entirely to the subordinated predicate in the irrealis. In accordance with Chung’s (2020: 431) above statement that na is optional in the presence of para, the eleven instances of disidi + para can be classified as cases of na-deletion. Since na is attested half a dozen times as subordinator for a predicate in the irrealis without para intervening, one may conclude that the presence of na can have a blocking effect on para, too. In a way, disidi requires something to occupy the slot to its right in order to avoid asyndesis. What is required there is an element which links the higher to the lower predicate – and a good candidate for this linking function is a conjunction. Na is a general subordinator devoid of any semantics of its own (Chung 2020: 430). Para on the other hand can be understood to convey the meaning of a purposive conjunction along the lines of English (in order) to which corresponds closely to the goal-oriented meanings of the spatial and benefactive uses of the preposition para as sketched in Section 3.1.1. It is worth noting that Chung’s (2020: 432–433) stock of Chamorro conjunctions does not contain any dedicated purposive conjunction, in the first place. Topping and Dungca (1973: 151–152) review all Spanish-derived conjunctions (“connectors”) without mentioning para. Remember, however, that, in their catalogue of conjunctions, there is para ki ‘so that’ whose exclusion from this study has been argued above in connection with the data in Table 6. There are good reasons to add para to the lists to fill the gap. To prove my point, I have to take a detour. 
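The effect of the TMA-only analysis on the share of asyndesis with disidi, as stated above, can be made explicit with a small calculation. This is a minimal Python sketch based on the disidi counts of Figure 4; the function name asyndetic_share and the flag para_is_tma_only are illustrative assumptions.

```python
# Token counts for disidi as higher predicate, as given in Figure 4.
DISIDI = {"na": 6, "asyndetic": 1, "para": 11, "na + para": 8}


def asyndetic_share(counts, para_is_tma_only):
    """Rounded percentage of asyndetic linkage among all tokens.

    If para is analysed as a TMA marker and nothing else, every token with
    para alone counts as asyndetic, because no linker intervenes.
    """
    asyndetic = counts["asyndetic"]
    if para_is_tma_only:
        asyndetic += counts["para"]
    return round(100 * asyndetic / sum(counts.values()))


print(asyndetic_share(DISIDI, para_is_tma_only=False))
print(asyndetic_share(DISIDI, para_is_tma_only=True))
```

The two calls correspond to the two competing analyses: para as a possible conjunction versus para as TMA marker only.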
First of all, there is the third instance of para in example (43), namely para u fan mana’ligåo nengkanno’ ‘to search for food’ which is the motivation for the motion event u fanhånao ‘they will go’. Given that the use of para is optional generally, one might want to ask why it is realized in this particular case. A possible explanation assumes that para is needed there to clarify the relation between the chronologically prior motion event and the posterior search for food. Without para, the latter could be (mis-)understood as an event which is causally independent of the motion event. Para connects the two events to each other logically in the sense that facilitating the search for food constitutes the goal of the motion event. The search party moves to the other side of the island in order to look for food there. To my mind, the assumption that even in this case, para marks the future or the subjunctive is not compelling. If it is correct, on the other hand, that para makes the purposive relation explicit whose interpretation would otherwise depend

entirely on context, it suggests itself that para can fulfil the tasks of a purposive conjunction. It must be mentioned that the motion verb hånao ‘go’ does not frequently combine directly with the conjunction para in the corpus. There are nine tokens of the co-occurrence of hånao + para with para introducing a predicate. Hånao takes an infinitive as in (44) (= 25 times) or is juxtaposed asyndetically to the subordinated predicates as in (45) (= 3 times) or the co-ordinating conjunction ya ‘and’ links the two predicates to each other as in (46) (= 14 times). (44)

[Taimanglo et al. 1999: 58]
ha disidi si Pedro para u hånao peska
3SG.ERG decide DEF.PN Pedro PARA U go hunt
‘[] Pedro decided to go hunting [].’

(45) [Taimanglo et al. 1999: 9]
bai hu hånao Ø bai hu o’mak
BAI 1SG.ERG go Ø BAI 1SG.ERG bathe
‘[] I am going to have a bath [].’

(46) [Taimanglo et al. 1999: 45]
Låo si Juan Måla tåya’ intension-ña para u hånao
but DEF.CN Juan Mala NEG intention-POR.3SG PARA U go
ya u espiha i tres dikike’ na babui.
and U look_for DEF.CN three little LINK pig
‘However, Juan Mala by no means had the intention to go looking for the three piglets.’

In all these and similar cases, the nuclei of the two predicates are syntactically adjacent or relatively close to each other. Under this condition, the use of para is exceptional in the corpus. Para turns up more often, however, if other elements are intercalated between the two nuclei or the linear order of the clauses is marked. There are altogether fifteen examples of this situation. In (47)–(48), we witness the presence of a postverbal21 lexical NP referring to the agent of the motion verb which separates the higher predicate from the lower one. The intercalated elements are highlighted in boldface. The motion verb and the conjunction are marked out by underlining.

21 In this study, the terms postverbal and preverbal cover all kinds of predicates (verbal and nominal), i.e. they refer to positions on the right or the left of a predicate.

(47) [Saint-Exupéry 2021, ch. XV]
Hanao i taotao para u li’e’?
go DEF.CN people PARA U see
‘Have people gone to see?’

(48) [Onedera 2007: 255]
Despues di hånao i nebi-a yan i
after go DEF.CN bride-F and DEF.CN
nebi-u para u komfesat ma tutuhon i gipot
bridegroom-M PARA U confess 3PL.ERG start DEF.CN party
‘After the bride and the bridegroom have gone to confess, they start the party.’

The same effect of separating the two predicates can be achieved by way of intercalating the adverbials as in (49). (49)

[Perez and Faustino 1975: 19]
Pues si Jose hanao na maisa
then DEF.CN Jose go LINK alone
para u aligao si Señot Osu.
PARA U look_for DEF.CN Mr Bear
‘Then Jose went alone to look for Mr Bear.’

This pattern is especially recurrent in the collection of stories edited by Taimanglo et al. (1999). In Table 8, I present all twelve examples of para being separated from the motion verb by intercalated elements. As in the above examples, these intercalations appear in boldface whereas hånao and para are underlined. Table 8: Para at a distance from hånao in Taimanglo et al. (1999).

| Page | Example + Translation | Intercalated |
|---|---|---|
| 10 | I Pale’ ha na’hånao i imåhen para Roma para u mana’gåsgas ‘The priest sent the statue to Rome in order for them to clean it [].’ | NP + PP |
| 18 | manhånao i tatan i matai yan i taotåo sengsong para u mana’tachu i latte. ‘[] the father of the deceased and the village people go to erect the stone.’ | NP + NP |
| 27 | I sigente ogga’an, humånao si Mataquåna para Inalåhan para u espiha si Gadåo. ‘The next morning, Mataguana went to Inalahan to look for Gadao.’ | NP + PP |
| 29 | Humånao si Alu para u espiha ya para u atungo’ yan si Pang. ‘Alu went to look for and get acquainted with Pang.’ | NP |
| 44 | humånao tåtte gi palåsyo para u chule’ i rementå-ña ni’ ha nisisita. ‘[] he went again to the palace to fetch the tools he needed.’ | ADV + PP |
| 58 | humånao para i gima’ para u fa’nu’i si Jose ni’ pinino’-ña. ‘[] he went to the house to show Jose what he had killed.’ | PP |
| 61 | i taotåo Guåhan manhånao para i gima’ Yu’os para u fan mannå’i grasia. ‘[] the people of Guam went to the Church to give their thanks.’ | PP |
| 111 | Pues humånao tåtte para u sangåni i pale’ ni’ todu i lini’e’-ña. ‘Then he went again to tell the priest everything he had seen.’ | ADV |
| 128 | Un ogga’an humånao i tatan Leon para Barrigada para u fañule’ odda’ para tinanom. ‘One morning, Leon’s father went to Barrigada to fetch soil for planting.’ | NP + PP |
| 137 | humånao yu’ para bai hu påsto i ga’-måmi karabåo. ‘[] I went to graze my water buffalo.’ | PRO |
| 149 | humånao yu’ para bai espiha i buruka. ‘[] I went to see what the noise was.’ | PRO |
| 167 | Humånao si Thomas yan i asaguå-ña para u fanakudi. ‘Thomas and his wife went to assist her.’ | NP + NP |

In several of the examples in Table 8, the intercalations involve a PP headed by the preposition para. I doubt, however, that the presence of the homophonous preposition motivates the use of para on the subordinated clause. In the same source, para combines directly with hånao only four times. Moreover, if the subordinate predicate is not placed in the vicinity of the motion verb, para is always used. I take this to mean that at least in these cases, para’s function is primarily that of expressing the purposive relation and not that of marking the future or subjunctive. The intercalation of meaningful elements renders the parsing of discontinuous parts of the utterance difficult and thus calls for the presence of para to ensure understanding. This hypothesis receives support from example (50). (50)

[Taimanglo et al. 1999: 6]
Un natibu ha sodda’ i lassas dos gå’ga’ ya
INDEF native 3SG.ERG find DEF.CN skin two animal and
[ha disidi [na [para ha akachayi i dos]
[3SG.ERG decide [SUBORD [PARA 3SG.ERG tease DEF.CN two]
u ha tulaika i lassas-ñiha]]
U 3SG.ERG exchange DEF.CN skin-POR.3PL]]
‘A native found the skin of the two animals and [decided [to exchange their skins [in order to tease them]]].’

The square brackets identify the boundaries of the relevant clauses. In this complex sentence the chain of events is not straightforwardly presented. The protagonist decides to exchange the hides of the two animals in order to make fun of them. The goal of his planned action is mentioned before the action itself is introduced. Para is not there to mark that something will happen in the relative future but to indicate the purpose for which the action will be taken. The para-clause comes too early in a manner of speaking and thus gives rise to a marked order. It seems that syntactic markedness (inverted order, intercalation) is favourable to the employment of para as conjunction. What example (50) also shows is that there are false friends, meaning: the adjacency of na and para is purely incidental. Neither the one nor the other of the two is responsible for the presence of the one or the other. One might want to know whether this has always been the case in Chamorro. As last piece of evidence for para functioning as purposive conjunction, I present the dialogue between Little Red Riding Hood (= A) and the wolf (= B) who pretends to be the girl’s grandmother in Table 9. The examples are taken from Perez and Faustino (1975: 28). Table 9: Dialogue of Little Red Riding Hood and the wolf.

Speaker: Example – Gloss – Translation

A: sa’ hafa Guella na sen anakko’ i kannai-mu
   because what grandma SUBORD INTENS long DEF.CN hand-POR.2SG
   ‘How is it, grandma, that your hands are so very long?’

B: para bai hu la-toktok hao
   PARA BAI 1SG.ERG more-embrace 2SG.ABS
   ‘To better embrace you [].’

A: ya Guella i na man-in-anakko’ talanga-mu
   and grandma EXCLAM SUBORD NSG-ADJVZ-long ear-POR.2SG
   ‘[] and, grandma, oh your ears are so long!’

B: para bai hu la-hungok hao
   PARA BAI 1SG.ERG more-hear 2SG.ABS
   ‘To better hear you [].’

A: lao Guella i na man-dankolo siha atadok-mu
   but grandma EXCLAM SUBORD NSG-big PL eye-POR.2SG
   ‘But, grandma, oh your eyes are so big!’

B: para bai hu la-li’e hao
   PARA BAI 1SG.ERG more-see 2SG.ABS
   ‘To better see you [].’

A: lao Guella i na man-dankolo nifen-mu
   but grandma EXCLAM SUBORD NSG-big tooth-POR.2SG
   ‘But, grandma, oh your teeth are so big.’

B: para bai hu la-kanno’ hao
   PARA BAI 1SG.ERG more-eat 2SG.ABS
   ‘To better eat you [].’

Little Red Riding Hood is stunned by the enormous size of her supposed grandmother’s body-parts and she wonders about the reasons for these abnormities. In the four answers by the fake grandmother the wolf explains what the extralarge body-parts are good for, meaning: the wolf mentions the purposes these body-parts serve. The initial para in the above B-sentences does not situate the predicate in the sphere of the future/subjunctive but introduces the identification of the use to which the body-part can be put. Thus, para marks a purposive relation. To analyse it as TMA marker of the future or subjunctive in these contexts would not make much sense.

3.1.2.2 The past

If we look back in time, we notice that there is not a single case of na and para co-occurring as direct syntactic neighbours in the writings of Ibáñez del Cármen from the 19th century. This observation holds, for instance, for constructions with the modal verb malago’ ‘want’. As shown in Table 10, the source Ibáñez del Cármen (1887) contains four tokens of this modal verb combining with a subordinated predicate. In the Chamorro and Spanish versions, the relevant elements are identified by indexes.

Table 10: Constructions with the modal verb malago’ in Ibáñez del Cármen (1887).

| Page | Example – Translation – Spanish version | Pattern |
|---|---|---|
| 53 | Si Jesucristo malágô_MOD matae_INF gui quiluus para unafanlibrejit nu y ísao – ‘Jesus Christ wanted to die on a cross to liberate us from the sin.’ – Quiso_MOD Jesucristo morir_INF en una cruz para librarnos del pecado. | infinitive |
| 57 | ya malagô_MOD unaejit_IRR ni y lang̃etña – ‘[] and he wants to give us his heaven’ – y nos quiere_MOD dar_INF su Gloria | irrealis |
| 59 | y santa madre iglesia sumen malagô_MOD, na_SUBORD todosjit utafalálagüe_IRR cada rato y sen gufliion nanata as Santa María – ‘[] the Holy Mother Church wants very much that we all go and get each moment our much beloved mother Saint Mary [].’ – la santa madre iglesia nos recomiende_V con tanto encarecimiento, el que_SUBORD recurramos_SBJ con frecuencia á nuestra amantísima madre María | subordinator |
| 66 | yan y manmalagô_MOD man-ásagua_INF sin y bendisionñija – ‘[] and those wanting to marry without his benediction []’ – y los que quieren_MOD casarse_INF sin su bendicion | infinitive |

The linkage between the two predicates comes in three shapes, namely (i) the subordinate verb is in the infinitive (cf. example (37)), (ii) the two predicates are asyndetically juxtaposed with the subordinate one appearing in the irrealis, and (iii) the two predicates are linked to each other by the subordinator na and the second predicate is in the irrealis. There is no evidence of para in the slot to the right of malago’ as is attested in contemporary Chamorro (cf. example (39)). As mentioned above, there is no example of the co-occurrence of na and para in combination with malago’ as in (37). In the text under review, there is a plethora of attestations of either na or para being used without the other being present, i.e., in principle, there would be ample opportunity for the two appearing together in the same syntactic environment. Since na and para miss this opportunity, I hypothesize that the cases in (37) and other examples constitute an innovation in Chamorro grammar. Combining na and para is a new option for native speakers of the language that was not available to the author of the printed Chamorro material of the 19th century. This tentative conclusion gives rise to the question how old other aspects of the synchronic grammar of para in the verbal domain are. In Ibáñez del Cármen (1887), para is attested 104 times in combination with a verbal predicate (another 35 tokens go to the credit of para as benefactive preposition and only one case illustrates the use of para as directional preposition). Of these 104 attestations, exactly 100 are found in the bilingual main body of the publication (the remaining four occur in the short Chamorro-only prayers section). What is telling about this turnout is the fact that for the vast majority of the cases, Chamorro para corresponds directly to para (que) in the Spanish version of this bilingual text. 
On the other hand, there is only a handful of examples of para (que) + V on the Spanish side which are not translated as para in the Chamorro version. The numerical facts are presented in Table 11.

Table 11: Parallel occurrence of para in the Chamorro and Spanish parts of Ibáñez del Cármen (1887).

| Chamorro | Spanish: para | Spanish: other | Sum |
|---|---|---|---|
| para | 69 | 25 | 94 |
| other | 6 | – | 6 |
| Total | 75 | 25 | 100 |
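The two perspectives on Table 11, from the Chamorro side and from the Spanish side, amount to a row share and a column share of the same cell. The following minimal Python sketch illustrates this under the assumption of whole-number rounding; TABLE_11, row_total, and column_total are hypothetical names, and the cell values are taken from the table as printed.

```python
# Cells of Table 11, keyed by (Chamorro form, Spanish form).
TABLE_11 = {
    ("para", "para"): 69,
    ("para", "other"): 25,
    ("other", "para"): 6,
}


def row_total(chamorro):
    """All tokens with the given form on the Chamorro side."""
    return sum(n for (cham, _), n in TABLE_11.items() if cham == chamorro)


def column_total(spanish):
    """All tokens with the given form on the Spanish side."""
    return sum(n for (_, span), n in TABLE_11.items() if span == spanish)


both = TABLE_11[("para", "para")]

# Chamorro perspective: how often Chamorro para matches Spanish para (que).
chamorro_share = round(100 * both / row_total("para"))

# Spanish perspective: how often Spanish para (que) is rendered by Chamorro para.
spanish_share = round(100 * both / column_total("para"))

print(chamorro_share, spanish_share)
```

Reading the same cell against its row and against its column keeps the two directions of comparison apart.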

In 73 % of all attestations of Chamorro para, this function word corresponds to para (que) in the Spanish version. From the Spanish perspective, the parallelism is even stronger because 92 % of all attestations of para (que) in the Spanish part of the text correspond to the use of para in the Chamorro part. This in turn means that, in the vast majority of the cases, the function attributed to para in Chamorro is that of a purposive conjunction in accordance with the functional domain of its translation equivalent in the donor language. Typical examples of this equivalence are (51) for Spanish para + INFINITIVE and (52) for Spanish para que + SUBJUNCTIVE. (51)

[Ibáñez del Cármen 1887: 6]
un na’i yo’ grasia para hu fañotsot ya
2SG.ERG give 1SG.ABS grace PARA 1SG.ERG IRR.AP:repent and
para hu si~sigi ha’ gef-setbe hao
PARA 1SG.ERG RED~continue INTENS very-serve 2SG.ABS
asta i hekkok i ha’ani
until DEF.CN end DEF.CN day
‘[] you give me grace to repent and continue to serve you till the end of my days.’
[Spanish version: me daréis gracia para enmendarme y para preseverar hasta el fin de mi vida.]

(52) [Ibáñez del Cármen 1887: 14]
para u ta tungo’ i mi-bale-n
PARA U 1NSG.ERG.INCL know DEF.CN have_plenty-value-LINK
anti-ta basta u ta atituyi i sigi
soul-POR.1NSG.INCL enough U 1NSG.ERG.INCL heed DEF.CN follow
‘In order for us to know the great value of our souls it suffices to pay attention to what follows [].’
[Spanish version: Para que conozcamos el gran valor de nuestras almas basta que nos fijemos en lo siguiente [].]

It strikes the eye that the incidence of para is higher in the Chamorro part than it is in the Spanish part of the text. The surplus on the Chamorro side needs to be explained. The 25 cases of Chamorro para for which Spanish does not offer para (que) as equivalent form two almost equally big classes. There are twelve cases in which Spanish employs a different preposition to link the higher verb to an infinitive, namely a ‘to’ (6 times), por ‘because of, for’ (twice), de ‘of’ (twice), and en ‘in’ (twice). These prepositions govern the infinitive like Spanish para and are functional equivalents thereof, meaning: they encode a purposive relation as shown in (53).

(53)

[Ibáñez del Cármen 1887: 55]
Desde ayu u ma~maila’ para u sentensia
from that U RED~come PARA U sentence
gi vaye-n Josafat todu i taotao siha
in valley-LINK Josafat all DEF.CN people PL
‘He will be coming from there to judge all people in the valley of Josaphat [].’
[Spanish version: Desde allí vendrá á juzgar en el valle de Josafat á todas las gentes [].]

Except Spanish por which has been borrowed as pot ‘because of, about’, none of the above Spanish prepositions has been fully integrated into the grammatical system of Chamorro (Rodríguez-Ponga 1995: 125–126, 133–141, 143–144). Because of their functional equivalence with para in Spanish, these cases can nevertheless be added to the instances of parallel behaviour of Spanish and Chamorro. Thus, 81 attestations of Chamorro para correspond to a purposive marker in Spanish. Accordingly, only 19 tokens of Chamorro para (+ PREDICATE) have no equivalent function word in the Spanish version. Most of the remaining cases of disagreement between the two versions as to the use of the function word para are easy to explain in terms of the different stylistic preferences of author and translator. In (54) for instance, the Chamorro version depicts the relation between the two predicates as purposive whereas the Spanish version avoids subordination by way of employing the coordinating conjunction y ‘and’.

(54) [Ibáñez del Cármen 1887: 11]
a. Chamorro
   u na’i yo’ grasia-ña para hu setbe gue’
   U give 1SG.ABS grace-POR.3SG PARA 1SG.ERG serve 3SG.ABS
   ‘[] he will give me his grace in order for me to serve him.’
b. Spanish
   me concede su gracia y le sirva
   1SG.ACC grant:3SG POR.3 grace and 3SG.DAT serve:1SG.SBJ
   ‘[that] he has mercy on me and I serve him.’

Grammatically, nothing speaks against linking the two predicates in (54b) by way of using the conjunction para que + SUBJUNCTIVE. It is possible, however, that the author of the Spanish section wanted to avoid a purposive interpretation for religious reasons. In (55), we witness another case of coordination. The Chamorro version features para having scope over both of the coordinated predicates whereas in the Spanish version, para is repeated after the disjunctive ó ‘or’.

(55) [Ibáñez del Cármen 1887: 75]
a. Chamorro
   Lokkue’ siña u mahgong i kilisyanu gi
   also be_able U calm_down DEF.CN Christian in
   pagat i mediku gigon i konfesót
   advise DEF.CN doctor together_with DEF.CN confessor
   para u munga um-ayunat pat Ø u munga
   PARA U NEG INF-fast or Ø U NEG
   chocho guihan.
   eat fish
   ‘The Christian can also calm down on account of his/her doctor’s and the priest’s advice not to fast or Ø eat fish.’
b. Spanish
   Tambien puede aquietarse el Cristiano con
   also be_able:3SG calm_down:INF:REFL3 DEF.M Christian with
   el parecer del medico juntamente con el
   DEF.M opinion of:DEF.M doctor together with DEF.M
   consejo del confessor para no ayunar ó para
   advice of:DEF.M confessor PARA NEG fast:INF or PARA
   no comer de pescado.
   NEG eat:INF of fish
   ‘The Christian can also calm down on account of the opinion of the doctor together with the advice of the priest not to fast or not to eat fish.’

Neither is there a structural constraint which would block the repetition of para under coordination in Chamorro, nor is it mandatory in Spanish to repeat the preposition on each of the conjuncts. These differences between the Chamorro and the Spanish version cannot be explained by referring to rules of grammar. They are a matter of style. Further examples of this kind are presented in the Appendix at the end of this study. There are two cases – (56)–(57) – which, at least superficially, seem to host Chamorro para as marker of the future. (56) a.

Chamorro [Ibáñez del Cármen 1887: 56]
lao  i       otro   para  siempre  u  dura
but  DEF.CN  other  PARA  SIEMPRE  U  last
‘[] but the other one will last forever.’

b. Spanish
y    la     otra     vida  durará        siempre
and  DEF.F  other:F  life  last:3SG.FUT  always
‘[] and the other life will last forever.’

In (56b), we find the verb form durará ‘(it) will last’ (← durar ‘last’) which is accompanied by the temporal adverb siempre ‘always’. The presence of siempre in the Spanish version is replicated by the co-presence of siempre in the Chamorro version (56a). It is safe to assume that in this example, Chamorro siempre is not a marker of the certain future but the direct translation of its Spanish equivalent. If siempre does not express the future, what about para then? Since para does not occur in the Spanish version, it might look like Chamorro para is used independently of the Spanish patterns and thus, could be associated with the function of future marking. However, there is the lexicalized Spanish prepositional phrase para siempre ‘forever’ which would also be fine stylistically in (56b). This means that we cannot be absolutely certain as to the function para fulfils in (56a). To my mind, we are facing an instance of the collocation para siempre which is also attested in contemporary Chamorro so that TMA-status is ruled out for para. What corresponds to the Spanish durará is Chamorro u dura with para siempre functioning as temporal adverbial. Example (57) is equally ambiguous.

(57) a. Chamorro [Ibáñez del Cármen 1887: 74]
para  u  gef   um-ayunat  i       kilisyano
PARA  U  very  NPL-fast   DEF.CN  Christian
u  kanno’  gi  ogga’an  onsa i media  na    chokolati
U  eat     in  morning  1.5_ounces    LINK  chocolate
(i) ‘For the Christian to fast well s/he shall consume, in the morning, one and a half ounces of chocolate [].’
(ii) ‘The Christian will fast well consuming, in the morning, one and a half ounces of chocolate [].’

b. Spanish
Será        bueno  el     ayuno    del       Cristiano,
be:3SG.FUT  good   DEF.M  fasting  of:DEF.M  Christian
tomando    por  la     mañana   onza y media  de  chocolate
drink:GER  for  DEF.F  morning  1.5_ounces    of  chocolate
‘The Christian’s fasting will be good if s/he drinks one and a half ounces of chocolate in the morning [].’

Spanish uses the copular verb form será ‘(it) will be’ (← ser ‘be (permanently)’) in (57b). There is no copula in Chamorro (Topping and Dungca 1973: 238–239). A nominal predicate with maolek ‘good’ would have been possible grammatically. What we find in (57a) instead is syntactically more complex. The question arises what the function of para is in para u gef umayunat i kilisyano which could be translated either as a purposive clause or as a straightforward predication with future reference as shown by the alternatives (i) and (ii) above. It is tempting to go for option (ii) because of the Spanish future tense. However, the Chamorro para-clause is followed by a second clause whose nuclear predicate is u kanno’ ‘s/he will eat’. If option (ii) is chosen, the problem arises that a succession of two asyndetically juxtaposed clauses in the irrealis is produced while the logical relation that ties the one to the other is not made explicit. Two unrelated events are going to take place in the future – the Christian will fast the proper way and someone will drink a certain amount of chocolate in the morning. However, if we interpret (57a) along the lines of (50) and (52) above, the supposedly unexpressed relation between the two clauses becomes directly visible. The purposive clause precedes the clause in which the action is identified which has to be taken to achieve what is said in the para-clause. To summarize the above discussion, in the written sources of the 19th century, the functional domain of Chamorro para was still largely identical to that of Spanish para (que), i.e. Chamorro para behaved like a purposive conjunction. If in the 19th century texts, para cannot be confirmed in the function of a TMA marker, it is legitimate to ask how future and/or subjunctive were expressed in this period, if at all. The answer is straightforward. These categories were expressed by the core of the irrealis construction as constituted by the layers I–IV in Figure 1. Layer V on the other hand, was not yet fully established. The sequence of three predicates in the irrealis in (58) is a case in point. (58) a.

Chamorro [Ibáñez del Cármen 1887: 27]
i       tinaitai  hinengge  u  na’-libre  i       malangu
DEF.CN  prayer    belief    U  CAUS-free  DEF.CN  sick
ya   u  in-alibia  as          Yu’us  yanggen  gá~gaige
and  U  PASS-ease  DEF.PN.OBL  God    when     RED-be_at
ha’     gi  isao  siha  u  fan-in-asi’i
INTENS  in  sin   PL    U  IRR.PL-PASS-forgive
‘[] the prayer of faith will liberate the sick person and s/he will be alleviated by God; when s/he is living in sins, they will be forgiven.’

b. Spanish
la     oración  de  la     fé     salvará       al        enfermo,
DEF.F  prayer   of  DEF.F  faith  save:3SG.FUT  to:DEF.M  sick
y    el     Señor  le       aliviará;          y    si  se
and  DEF.M  Lord   3SG.DAT  alleviate:3SG.FUT  and  if  REFL.3
halla     con   pecados,  se      le       perdonarán.
find.3SG  with  sin:PL    REFL.3  3SG.DAT  forgive:3PL.FUT
‘The prayer of faith will save the sick person and the Lord will alleviate him/her, and if s/he is involved in sins, they will be forgiven.’

All Chamorro verbs in the irrealis in (58a) correspond to Spanish verbs in the synthetic future in (58b). This is the dominant pattern throughout the reference text: the synthetic future in the Spanish section of the bilingual sources translates or is translated into the Chamorro irrealis. Para is additionally present in Chamorro only if the purposive relation between two predicates is made explicit. The Spanish periphrastic future and its equivalent in Chamorro will be scrutinized in Section 3.2 below. The parallelism of Spanish synthetic future and Chamorro irrealis is, however, not a one-to-one correspondence. It is not only the Spanish synthetic future which associates with the Chamorro irrealis. There is also the Spanish subjunctive as in (59). (59) a.

Chamorro [Ibáñez del Cármen 1887: 28]
kombieni    na      u  kónfesat
convenient  SUBORD  U  confess
‘[] it is appropriate that s/he confesses [].’

b. Spanish
es      conveniente  que     se      confiese
be.3SG  convenient   SUBORD  REFL.3  confess:3SG.SBJ.PRETI
‘[] it is appropriate that s/he confesses [].’

In other words, the data collected from Ibáñez del Cármen (1887) suggest that in the 19th century future and irrealis were not formally distinguished since no dedicated marker of the former seems to have existed at that time (but cf. Section 3.2). Chamorro para was still employed exclusively in contexts in which it received a purposive interpretation. Its contemporary status as TMA marker must therefore be considered a relatively late innovation.

3.1.3 para as TMA marker

What we have learned from the foregoing discussion is that not every nonprepositional instance of para in contemporary Chamorro is a TMA marker. The purposive conjunction para has to be separated from para as TMA marker no matter how difficult their separation might be in specific contexts. Chung (2020: 73–76) provides an overview of the meanings and uses of para in the TMA domain. More generally, the meanings of para as TMA marker are defined as future and subjunctive (Chung 2020: 70). Para serves to mark relative future and non-actual situations which are “wanted, promised, allowed, or which effort is devoted to bringing about” (Chung 2020: 74). In (60), para is employed to mark the future. (60)

[Perez and Faustino 1975: 4]
Un     manada-n     taotao  para  u  fan-malak      i       tenda
INDEF  plenty-LINK  person  PARA  U  IRR.PL-arrive  DEF.CN  shop
pa’go  na    ha’ani.
now    LINK  day
‘Many people will come to the shop today.’

There is no preceding predicate on which the para-clause could depend. The sentence is mono-clausal. The only verb is malak ‘arrive’ which takes the appropriate f-initial plural prefix. The verb is thus already marked for the irrealis. The mood is additionally expressed by u. We know from the above analysis of texts from the 19th century that this once was sufficient to encode the future or the subjunctive. Example (61) from contemporary Chamorro shows that para does not have to be repeated in a multi-clausal sequence of predicates. (61)

[Perez and Faustino 1975: 24]
Para  u  rompe        halom   i       gaddon     cha’guan,
PARA  U  cut_through  inside  DEF.CN  entangled  grass
pues  u  nangu  gi  hagoe,  ya   u  utot      halom
then  U  swim   in  lake    and  U  shortcut  inside
chalan-ña     gi  halomtano’
path-POR.3SG  in  forest
‘He will cut through the thicket, then swim across a lake and take a shortcut on his way through the forest.’

All three successive events are going to happen in the future but only the first predicate is introduced by para. This sentence-initial para seems to have scope over all following clauses. In contrast to para, however, u is repeated on each of the predicates. This behavioural difference speaks against an identical status of the two markers in the sense that u appears to be more tightly connected to the verb/predicate whereas the link between para and verb/predicate is relatively loose. Following Pagel (2010: 92), I interpret these differences in the degree of bonding as indicators of the “older age” of u as opposed to the relatively younger para. In Section 3.1.2.1, I have quoted Chung (2020: 74) as to the occurrence of para in the predicate subordinated to certain classes of verbs. These cooccurrences would constitute the domain of para as subjunctive marker. On account of the evidence of para functioning as purposive conjunction exposed in the previous two sections, I assume that certain combinations of HIGHER PREDICATE + PARA + LOWER PREDICATE feature the conjunction and not the subjunctive marker. In other cases, para is ambiguous as to its status. Uncontroversial evidence of para as subjunctive marker stems from examples which illustrate a different pattern, namely HIGHER PREDICATE + CONJUNCTION/SUBORDINATOR + PARA + LOWER PREDICATE with para belonging to the subordinate predicate in lieu of linking the two predicates. Of eighteen subordinating conjunctions identified by Chung (2020: 432–433), twelve never combine with a following para in the corpus. These are in alphabetical order achuk ‘although’ (28 tokens), åntis di ‘before’ (61 tokens), asta ki ‘until’ (10 tokens), disdi ki ‘(ever) since’ (19 tokens), dispues di ‘after’ (43 tokens), kada ‘whenever, each time that’ (104 tokens), kosa ki ‘so that’ (17 tokens), maseha ‘even though’ (46 tokens), mientras ‘while’ (40 tokens), putnó ‘so that … not’ (3 tokens), sa’ ‘because’ (412 tokens), and sin ‘without’ (38 tokens). The turnouts for the six conjunctions which combine with para are disclosed in Figure 7. These conjunctions are annai ‘when’, gigun ‘as soon as’, inlugåt ‘instead of’, kumu ‘if’, put ‘because, so that’, and yanggen ‘if, when’. The shares for the pattern CONJUNCTION + PARA range from 1% to 10% and are thus comparatively small. The highest number of tokens is registered for put + para which occurs 46 times in the corpus. Most of the combinations are attested much less frequently. The different combinations are illustrated in (62)–(67) and commented upon step by step. (62)

[Taimanglo et al. 1999: 48]
I       dos  umasagua  ma       disidi  para  u  saga
DEF.CN  two  married   3PL.ERG  decide  PARA  U  stay
ha’     gi  tasi  annai  para  u  ma~magof   ha’.
INTENS  in  sea   where  PARA  U  RED~happy  INTENS
‘The two married people decided to stay in the sea where they would be very happy.’

Conjunction   without para   with para
annai              470            4
kumu                99            2
yanggen            248           18
put                539           46
gigon               29            3
inlugat             10            1

Figure 7: Subordinating conjunctions combining with para in the corpus.
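
The shares of CONJUNCTION + PARA can be recomputed from the token counts displayed in Figure 7. The following Python sketch is only an illustration: the names `FIGURE_7`, `para_share`, and `para_shares` are hypothetical labels introduced here, and the counts are simply those shown in the figure.

```python
# Token counts as displayed in Figure 7: (without para, with para).
# The dictionary and function names are illustrative assumptions.
FIGURE_7 = {
    "annai": (470, 4),
    "kumu": (99, 2),
    "yanggen": (248, 18),
    "put": (539, 46),
    "gigon": (29, 3),
    "inlugat": (10, 1),
}


def para_share(without_para: int, with_para: int) -> float:
    """Percentage of a conjunction's tokens that are followed by para."""
    return 100 * with_para / (without_para + with_para)


def para_shares(counts=FIGURE_7) -> dict:
    """Share of CONJUNCTION + PARA for every conjunction in the table."""
    return {conj: para_share(*pair) for conj, pair in counts.items()}


if __name__ == "__main__":
    # Print the conjunctions ordered from the lowest to the highest share.
    for conj, share in sorted(para_shares().items(), key=lambda kv: kv[1]):
        print(f"{conj:<8} {share:4.1f}%")
```

On these counts, the share stays below 10% for every conjunction, with annai at the lower end of the range.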


In this sentence, there are two instances of para. I interpret the underlined para as conjunction which links the higher verb disidi ‘decide’ to the lower predicate u saga ‘will stay’ in analogy to the analysis I have proposed for (43) above. In contrast, the second instance of para has to be analyzed differently. Para follows the temporal-spatial relative pronoun annai ‘where, when’ which links the relative clause to the preceding PP. There is thus no need for an additional element of clause linkage. In this case, it makes perfect sense to consider para to form part of the irrealis construction to which it contributes the meaning of the relative future. In (63) too, there are two instances of para to which the same analysis as in the previous case can be applied. (63)

[Taimanglo et al. 1999: 97]
Gigon       para  bai  go’te,
as_soon_as  PARA  BAI  hold
kanna’  ha        åkka’  yu’
almost  3NPL.ERG  bite   1SG.ABS
pues  hu       gacha’  para  bai  konne’.
then  1SG.ERG  detect  PARA  BAI  catch
‘As soon as I was about to get hold of it, it nearly bit me, then I found out how to catch it.’

In my opinion, the underlined para is a case of the conjunction which links the higher predicate hu gacha’ ‘I found out’ to the lower predicate bai konne’ ‘I will catch (it)’. In the position between these two predicates, para cannot serve as mere TMA marker because then we would have an asyndetic succession of two predicates whose logical connection remains unclear. Para makes this relation explicit. The event of finding out what to do precedes the naming of the purpose for which the newly gained insights can be used with para linking the former to the latter. As to the first para (in boldface) in (63), the situation is completely different since there is no higher predicate to which para could possibly link the clause because the linking is done by gigon ‘as soon as.’ The presence of gigon blocks the possibility of para functioning as conjunction. Para belongs to the irrealis construction in the temporal clause. Its contribution to the meaning of this clause consists in situating the event in the relative future. The next case is more difficult to judge. In (64), there is only one instance of para this time. (64)

[Taimanglo et al. 1999: 15]
Man-daña’  i       pumalu  siha  na    maga’låhi  gi  singsong  siha
PL-join    DEF.CN  some    PL    LINK  chief      in  village   PL
ya   ilek-ñiha    na      kumu  para  guiya     u  i
and  say-POR.3PL  SUBORD  if    PARA  3SG.EMPH  U  DEF.CN
etmas  takhelo’  na    må’gas,  pues  u  nisisita  u
most   eminent   LINK  master   then  U  need      U
kumple      tres   ni’         ga~gågao-ñiha.
accomplish  three  DEF.CN.OBL  RED~request-POR.3PL
‘Some of the village chiefs gathered and said that for him to become the highest master three requests of theirs needed to be fulfilled.’

There is a nominal predicate in the irrealis, namely u i etmas takhelo’ na må’gas ‘he will be the highest master’ with the irrealis marker u preceding the definite NP. Para is separated from the predicate by the emphatic pronoun of the 3rd person singular guiya. Note that “[t]he emphatic pronouns serve as pronominal Obliques and as the objects of prepositions” (Gibson 1992: 19). This means that para guiya ‘for him’ could be a regular PP so that para could be counted out as a TMA marker in the first place. I prefer this analysis over the alternative interpretation of para as TMA marker, not least because intercalations inside the irrealis construction according to the template in Figure 1 are exceptional, if they occur at all. The next three cases pose no problems. In (65)–(67), para (in boldface) cannot execute the task of joining two clauses since this is already done by the dedicated conjunction positioned to its immediate left. This is enlugåt di ‘instead of; in lieu of’ in (65) and pot ‘about, so that’ in (66). In both cases para belongs to the subordinated clause. (65)

[Onedera 2007: 274]
Debidi  u  la-meggai  ni’         este
must    U  more-many  DEF.CN.OBL  this
enlugåt di  para  ma       sångan  ha’
instead     PARA  3PL.ERG  tell    INTENS
empottante-ña      ya   despues     tåya’  piku~kura.
important-POR.3SG  and  afterwards  NEG    RED~procure
‘There should be more of this instead of emphasizing its importance and afterwards nothing is done.’

(66) [Saint-Exupéry 2021, ch. 5]
Ya   ha       afuetsa  yo’      mu-nå’e   ånimo
and  3SG.ERG  force    1SG.ABS  INF-give  effort
pot      para  bai  komprende   este  na    problema
so_that  PARA  BAI  understand  this  LINK  problem
guåho     ha’     na    maisa.
1SG.EMPH  INTENS  LINK  alone
‘And he forced me to make an effort so that I would understand this problem by myself.’

In both (65) and (66), there are gaps in the irrealis construction. The absence of hu in (66) is nothing to worry about since this is what we expect as a possibility on the basis of the information given in Table 1. Interestingly, the very same Table 1 does not prepare us for the absence of u in example (65). According to all sources consulted for Table 1, the irrealis marker u is mandatory in the 3rd person plural (independent of transitivity). There is homophony of the pronoun of the 3rd person plural ergative ma and the ma-passive marker which are probably connected to each other diachronically (Topping and Dungca 1973: 79). It is not always easy to tell the one from the other. However, even if ma has to be reinterpreted as passive marker in (65), the basic problem remains the same – the supposedly compulsory u is missing. Further research has to determine whether cases like (65) feature para taking over the functions of u. In (67), we have once more two cases of para. (67)

[Taimanglo et al. 1999: 1]
Yanggen  para  u  ma’-åfte   i       gima’-ñiha,
when     PARA  U  PASS-roof  DEF.CN  house-POR.3PL
man-måtto  i       bisinu,   i       parientes  yan  i
PL-arrive  DEF.CN  neighbor  DEF.CN  relative   and  DEF.CN
atungo’       siha  para  u  fan-man-ayuda
acquaintance  PL    PARA  U  IRR.PL-AP-help
yan  lokkue’  siempre  mang-gupot.
and  also     SIEMPRE  PL-party
‘When the house is going to be roofed, the neighbors, relatives, and acquaintances come to help and they will surely also party.’

In analogy to the cases presented in Table 8, I interpret the underlined para as a purposive conjunction which is separated from the motion verb to which it links the subordinated predicates by the lexical subject NPs. Para encodes the logical relation between the predicates. The motion event is carried out in order for the relatives, neighbours, and acquaintances to lend a hand and party afterwards. In this case, para is not a TMA marker. This function applies, however, to the first para (in boldface) which occurs in the initial temporal clause. The clause linkage is achieved by the conjunction yanggen ‘when’. There is thus nothing left to link for para. Para forms part of the irrealis construction. The roofing of the houses is depicted as taking place only in the relative future – and this temporal relation is indicated by para. Chung (2020: 432–433) distinguishes realis anai ‘when’ from yanggin ‘if, when’ which is said to be associated with future, irrealis, and habitual. As examples (62) and (67) show, both of the conjunctions are compatible with clauses in the irrealis which involve para as TMA marker. However, anai is ambiguous as to a spatial or temporal/conditional interpretation. In (62), the spatial interpretation applies. Examples (68)–(69) illustrate that even in the function of temporal conjunction, anai and yanggin can be followed by PARA + irrealis. (68)

[Taimanglo et al. 1999: 80]
Annai  para  u  laknos     i       kannai-ña     ti
when   PARA  U  put_forth  DEF.CN  hand-POR.3SG  NEG
siña     sa’      mampos  ma’i’ot  i       pachot  i       taru.
be_able  because  very    narrow   DEF.CN  mouth   DEF.CN  jar
‘When he was about to put forth his hands, he could not (reach it) because the mouth of the jar was very narrow.’

(69) [Taimanglo et al. 1999: 3]
Desde ki  ayu   tiempo,  i       peskadot   siha  ma       a’~atan
since     this  time     DEF.CN  fisherman  PL    3PL.ERG  RED~look
este  na    åcho’  yanggen  para  u  fameska
this  LINK  rock   when     PARA  U  IRR.PL:hunt
‘Since this time the fishermen keep watching out for this rock when they are going to catch fish [].’

I assume that the use of the irrealis in the subordinate clauses is independent of the inherent realis/irrealis distinction of the two conjunctions. Since in Spanish – Peninsular or other – para never fulfils the functions of a TMA marker, it is legitimate to attribute the emergence of the future marker para to processes which affected the Hispanism only in the replica language. This scenario is sketched by Pagel (2010: 100) who hypothesizes that Spanish para has been copied into Chamorro at first

in ihren [] Bedeutungen als direktionale und benefaktivische Präposition sowie als finaler Junktor []. Aus analogen Finalsatzkonstruktionen mit pära im Chamoru wäre dann über die [] reanalytischen und analogiebildenden Prozesse eine weitere, futurische Bedeutung von pära emergiert. [] Ein ursprünglich finalsatzeinleitendes pära [] wird per Metonymie futurisch reanalysiert []. Später wird es in futurischer Bedeutung über metaphorische Prozesse auch auf andere syntaktische Kontexte übertragen []. Die Markierung für Irrealis am Prädikat u- verweist dabei auf die gemeinsame modale Grundlage von Finalität und Futur zurück. Bemerkenswert ist daran nicht allein die Grammatikalisierung von Ch[amoru] pära, sondern vor allem, dass diese [] weitgehend unabhängig vom Spanischen im Chamoru selbst vollzogen wurde, denn eine im eigentlichen Sinne futurische Bedeutung von para ist im Spanischen nicht belegt.22

Pagel’s (2010) ideas are based exclusively on synchronic data. My diachronic evidence strongly supports his hypothesis that the rise of the TMA marker para is a relatively recent development which culminated only after Spanish had ceased to co-exist as an adstratum in the Marianas. The starting point and the further course of the grammaticalization process are captured by Figure 8. Figure 8 assumes that both the Spanish preposition and the conjunction served as initial input for the process of grammaticalization in Chamorro. The first step to nativize the Spanish elements consisted in the deletion of que so that homophonous forms for preposition and conjunction arose no later than the 19th century. One might want to call this a merger across (Spanish!) wordclass boundaries. In the next phase – posterior to the first printed documents of Chamorro – the conjunction was subject to a functional split which gave rise to the TMA marker para. However, the emergence of this TMA marker has not sealed the fate of the conjunction para. The latter is still there and fully functional in the grammatical system of Chamorro. The TMA marker is an offspring of the conjunction. The TMA marker neither ousts nor replaces the conjunction. The long-term coexistence of source and target of a given grammaticalization process is the default pattern in cross-linguistic perspective. This means that the scenario in Figure 8 is in line with what we expect of grammaticalization processes in general. What happened to the conjunction para is that it was reanalyzed in certain contexts (and only there) as TMA marker. Reanalysis leads to functional splits which allow source and target to continue to coexist (Heine et al. 1991: 215–220).

|| 22 My translation: ‘in its meanings as directional and benefactive preposition as well as purposive connector. A further future meaning would then have emerged from analogous purposive constructions with para in Chamorro via processes of reanalysis and analogy. A para which originally introduced purposive clauses is reanalyzed via metonymy as future marker. Later on, it will be transferred to other syntactic contexts via metaphorical processes. The irrealis marker u- on the predicate refers back to the common modal basis of purposive and future. What is remarkable about this is not only the grammaticalization of Chamorro para as such, but especially the fact that it took place within Chamorro itself largely independently of Spanish because a genuinely future meaning of para is not attested in Spanish.’

Spanish:             PREP para  ≠  CONJ para que
Chamorro (19th c.):  PREP para  =  CONJ para
Chamorro (21st c.):  PREP para  =  CONJ para  =  TMA para

Figure 8: The grammaticalization of para in Chamorro.

We now know that there are three instead of just two function words para in Chamorro. There is compelling empirical evidence for separating the purposive conjunction para from the homophonous TMA marker para and both from the preposition para. The descriptions of the irrealis system and of clause linking put forward by Topping and Dungca (1973), Cooreman (1987), Gibson (1992), and Chung (1998, 2020) have been shown to oversimplify matters and need to be revised. Given that a revision is called for in the case of para, the possibility arises that the same holds for bai and siempre.
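
The diagnostics applied to examples (60)–(67) can be condensed into a small decision procedure. The Python sketch below is a deliberately simplified illustration and not an analysis tool: the function `classify_para`, its three hand-annotated inputs, and the two word lists are assumptions made here for the sake of the example, and genuinely ambiguous cases are not modelled.

```python
from typing import Optional

# Subordinators attested immediately before para in the examples above
# (spellings as in the examples; the set itself is an illustrative assumption).
SUBORDINATORS = {"annai", "gigon", "kumu", "pot", "put", "yanggen",
                 "enlugåt di", "inlugåt"}

# Emphatic pronouns glossed EMPH in the examples; they can be objects of prepositions.
EMPHATIC_PRONOUNS = {"guiya", "guåho"}


def classify_para(left_linker: Optional[str],
                  next_word: str,
                  has_higher_predicate: bool) -> str:
    """Return 'PREP', 'CONJ', or 'TMA' for one hand-annotated token of para.

    left_linker: subordinator directly to the left of para, if any.
    next_word: the word directly to the right of para.
    has_higher_predicate: whether a preceding predicate is available for linking.
    """
    # para + emphatic pronoun can be read as a regular PP, as argued for (64).
    if next_word in EMPHATIC_PRONOUNS:
        return "PREP"
    # A dedicated subordinator already links the clauses, as in (62)-(63) and (65)-(67).
    if left_linker in SUBORDINATORS:
        return "TMA"
    # para itself links a higher predicate to a lower one: purposive conjunction.
    if has_higher_predicate:
        return "CONJ"
    # Mono-clausal sentence with nothing to link, as in (60).
    return "TMA"
```

For instance, the second para of (62), which follows annai, comes out as TMA marker, whereas the first one, which links disidi to u saga, comes out as conjunction.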

3.2 bai

In the following quotes from Rodríguez-Ponga (1995), two forms of the function word under scrutiny are presented, namely bai and boi, with the latter being described as the chronologically earlier form:

bái. voy a. Partícula utilizada en el futuro de primera persona (singular y plural exclusive). Cf. bói. [] Para bai hu sasága giya Agáña, (lit. ‘para voy yo vivir en Agaña’), voy a vivir en Agaña, viviré en Agaña. Para bai hu péska agúpa’, voy a pescar mañana, pescaré mañana (Rodríguez-Ponga 1995: 168).23

bói. (< voy. La perífrasis española voy a tiene claro uso de future). [P]artícula para marcar el futuro en la primera persona. Forma antigua de bái. Cf. bái. (Rodríguez-Ponga 1995: 200).24

|| 23 My translation: ‘bái. To be going to. Particle used in the future of the 1st person (singular and plural exclusive). Cf. bói. Para bai hu sasága[sic!] giya Agáña, (lit. ‘in order to go I live in Agaña’), I am going to live in Agaña, I will live in Agaña. Para bei hu péska agúpa’, I am going to fish tomorrow, I will fish tomorrow.’

Matsuoka (1926: 223) ignored the Spanish origin of Chamorro bai which he assumed to be a reduced form of a supposed *baya ‘trend’ whose etymology is opaque. I have not been able to trace it in any of the Chamorro dictionaries. The Spanish background of bai was still not entirely clear to Topping and Dungca (1973: 262) who mention an alternative hypothesis. Bai would go back to baimbai (bye’m bye in their spelling) which originates from an unidentified English-based Pidgin in Melanesia or Hawaii. This hypothesis cannot explain why the domain of Chamorro bai is restricted to the 1st person singular and dual/plural exclusive. In Tok Pisin for instance, the future tense marker bai is compatible with all persons (Todd 1984: 194). Moreover, the existence of bói on earlier stages of Chamorro speaks against the Neo-Melanesian origin of bai. It is safe to follow Rodríguez-Ponga (1995) and others and assume that Chamorro bai is historically connected to Spanish voy. Spanish voy is the 1st person singular of the motion verb ir ‘go’. This motion verb is employed in Spanish to form the periphrastic future25 which encodes the proximate future as opposed to the synthetic future which is neutral or conjectural (Real Academia Española 2009: 2155–2156). The construction involves the spatial preposition a ‘to’ and the infinitive of the lexical verb as shown in (70)–(71). (70) a.

Spanish [Tristante 2013: 26]
¿Y   este  hombre  me       [va      a   ayudar]?
and  this  man     1SG.DAT  [go.3SG  to  help:INF]
‘And this man is going to help me?’

b.
Voy     a   la     debacle.
go.1SG  to  DEF.F  debacle
‘It will be a fiasco [lit. I go to the debacle].’

(71) Spanish [Tristante 2011: 196]
Voy     a   avisarlo.
go.1SG  to  inform:INF:3SG.ACC
‘I am going to inform him.’

|| 24 My translation: ‘bói. (< voy. The Spanish periphrasis voy a clearly has a use as future). Particle for marking the future in the 1st person. Old form of bái. Cf. bái.’
|| 25 Not to be confounded with the compound future (“futuro compuesto”) haber + PARTICIPLE (Real Academia Española 2009: 1791–1795).

In (70a), the bracketed part of the interrogative sentence illustrates the periphrastic future which involves a form of ir as auxiliary whereas the use of ir as proper motion verb is shown in (70b). In (71), we find the auxiliary in the 1st person singular (present tense) voy, i.e. the morphological form which has been borrowed into Chamorro. What needs to be taken account of is the fact that in Spanish, there are forms of the auxiliary not only for all persons of the paradigm but also for several tenses and moods as shown in Table 12 (based on an extract from Real Academia Española (2009: 2154–2160)). The category labels classify the forms of the auxiliary and not the entire construction.

Table 12: Selection of forms of the auxiliary of the Spanish proximate future.

Person  PRESENT  IMPERFECT  PRETERIT  POTENTIAL  PERFECT
1SG     voy      iba        fui       iría       he ido
2SG     vas      ibas       fuiste    irías      has ido
3SG     va       iba        fue       iría       ha ido
1PL     vamos    íbamos     fuimos    iríamos    hemos ido
2PL     vais     ibais      fuisteis  iríais     habéis ido
3PL     van      iban       fueron    irían      han ido

All forms are followed by a VINF. Grey-shaded cell in the original: 1SG PRESENT voy.

The grey-shaded cell hosts the only member of this extended paradigm of verb forms which has been borrowed into Chamorro. Apart from voy > bai, the Spanish motion verb ir has left only marginal traces in Chamorro.26 In (25), I have given an example of the use of bai (voy) in the Chamorro of the 19th century. The second example of voy as future marker I have been able to identify so far for this earlier period stems from the same printed source and is reproduced in (72). (72)

[Ibáñez del Cármen 1887: 19]
bai  hu       sangani  hafa  hu       fa’tinas
BAI  1SG.ERG  tell     what  1SG.ERG  make
‘[] I am going to tell [you] what I [would] do [].’
[Spanish version: voy á deciros lo que yo haría]

|| 26 Rodríguez-Ponga (1995: 167) reports that the Spanish imperative va ‘go!’ is reflected in Chamorro bá’ which is used to address cattle. Similarly, the author assumes that the homophonous verb bá’ (ba’) ‘crawl on all fours’ might be etymologically connected to the 3rd person singular va ‘s/he goes’ of ir.

Structurally, Spanish and Chamorro differ markedly as to the form of the construction. The Spanish preposition a ‘to’ has no functional equivalent in Chamorro. Moreover, Spanish makes use of the infinitive of the lexical verb whereas Chamorro employs a finite verb. The ergative pronoun separates bai from the lexical verb in Chamorro. No pronoun is admitted between the constituents of the periphrastic construction in Spanish. Simplifying, one may claim that the Spanish auxiliary retains many properties of a full verb. In contrast, Chamorro bai has undergone decategorialization from verb to particle. Pagel (2010: 91–92) emphasizes that voy was already part of a highly grammaticalized construction in Spanish when it entered the grammatical system of Chamorro. What happened to voy once it was accepted in Chamorro can be understood as a kind of secondary grammaticalization the course of which was largely independent of the original pattern in the donor language. The indirect evidence of the diachronic spread of Chamorro bai from the 1st person singular to the 1st person dual/plural exclusive presented in Section 2.5 confirms Pagel’s (2010: 92) purely synchronic line of argumentation. The process starts from a stage on which bai was restricted to the 1st person singular which probably was typical of Chamorro in the 19th century. By the mid-20th century, bai had diffused to the 1st person dual/plural exclusive. It is now associated with the 1st person non-inclusive. Pagel (2010: 92) characterizes this development as a “kognitiv plausibler Schritt” [‘a cognitively plausible step’] which is in no way connected to the rules which are in vigour for the proximate future in Spanish. With a view to determining the role of bai in the system of the Chamorro irrealis, Pagel (2010: 92–93) raises the issue of the optional use of this marker (cf. Sections 2.4–2.5). 
Topping and Dungca (1973: 262) argue that it is often omitted “in normal speech.” Pagel (2010: 93) rightly states that none of the elements in an irrealis construction is absolutely compulsory. What he considers to be necessary is that

[d]as Prädikat jeder Futurphrase muss im Chamorro wenigstens einmal für Futur oder Irrealis und einmal, per Kongruenzmarker, für Person markiert sein. Bai kommt hier eine interessante Brückenposition zu []. [B]ai markiert Person, Modus und gewissermaßen auch Tempus. (Pagel 2010: 93)27

154 | Thomas Stolz

On account of what I put forward throughout Section 2, I take issue with some of the conclusions drawn in the above quote. Figure 9 serves as reference point for the subsequent discussion. The figure takes stock of the (co-)occurrences of para, bai, hu, and in in irrealis constructions in the corpus of contemporary Chamorro. In the case of para, I exclusively register direct combinations with the pronouns. Across the four primary sources, only the option bai + hu is attested ubiquitously. With 165 tokens, it covers 55% of the 301 tokens attested in the corpus. Second best is bai alone, i.e. in the absence of a pronoun, which is the case for exactly 100 tokens (= 33%). No other option exceeds the 8% mark.28 Many of those cases which involve para are problematic because they could also be interpreted as hosting the purposive conjunction para. In (73), I contrast two sentences from the same short dialogue. The zero symbol marks the potential position of absent elements of the irrealis construction.

(73) [Taimanglo et al. 1999: 37]
a. Para Ø hu fangonne’ pånglao.
   PARA Ø 1SG.ERG IRR.AP:catch crab
   ‘I will catch crabs.’
b. Ø bai Ø konne’ i ayuyu.
   Ø BAI Ø catch DEF.CN coconut_crab
   ‘I will catch the coconut crab.’

Superficially, examples of this kind seem to prove Pagel’s hypothesis. Three of four primary sources attest to variation between bai occurring on its own and bai being accompanied by the appropriate pronoun. Onedera (2007) is exceptional insofar as in this text every instance of bai implies the co-presence of hu. The differences between the sources can be attributed to individual stylistic

27 My translation: ‘in Chamorro, the predicate of each future construction must be marked at least once for future or irrealis and, via agreement markers, for person. Bai occupies an interesting bridging position. Bai marks person, mood and in a way also tense.’

28 The eleven tokens for bai + in include four tokens of bai + en, a combination which is exclusively attested in one of the texts in Taimanglo et al. (1999). It seems to be the case that the author generally replaces in with en, meaning that en encodes the 1st person nonsingular exclusive and not the 2nd person nonsingular. Note that Matsuoka (1926) already observed that in everyday speech, Chamorro speakers would make no difference between in and en. Chung’s (1998: 26–27) paradigms display in for both the 1st person nonsingular exclusive and the 2nd person nonsingular.

Three Spanish-derived function words and the Chamorro irrealis | 155

preferences of the authors. However, in the next paragraph, it is shown that there is an alternative interpretation of the facts.

| Construction | Taimanglo et al. (1999) | Onedera (2007) | Saint-Exupéry (2021) | Perez and Faustino (1975) |
| bai (alone) | 65 | 0 | 27 | 8 |
| bai + hu | 59 | 62 | 21 | 23 |
| bai + in | 11 | 0 | 0 | 0 |
| para + hu | 9 | 0 | 11 | 3 |
| para + in | 1 | 0 | 0 | 1 |

Figure 9: Co-occurrences of para, bai, hu, and in in irrealis constructions.
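The proportions cited for Figure 9 can be recomputed from the token counts. The following is a minimal Python sketch for illustration only: the counts are transcribed from Figure 9, while the names SOURCES, COUNTS, option_totals, option_shares, and attested_everywhere are hypothetical helpers introduced here and are not part of the original study.

```python
# Token counts transcribed from Figure 9.
# Column order (assumed to match the figure): the four primary sources.
SOURCES = [
    "Taimanglo et al. (1999)",
    "Onedera (2007)",
    "Saint-Exupéry (2021)",
    "Perez and Faustino (1975)",
]

COUNTS = {
    "bai (alone)": [65, 0, 27, 8],
    "bai + hu": [59, 62, 21, 23],
    "bai + in": [11, 0, 0, 0],
    "para + hu": [9, 0, 11, 3],
    "para + in": [1, 0, 0, 1],
}


def option_totals(counts):
    """Sum the tokens of each construction across all sources."""
    return {option: sum(per_source) for option, per_source in counts.items()}


def option_shares(counts):
    """Return each construction's share (in percent) of all tokens."""
    totals = option_totals(counts)
    grand_total = sum(totals.values())
    return {option: 100 * n / grand_total for option, n in totals.items()}


def attested_everywhere(counts):
    """List the constructions with at least one token in every source."""
    return [
        option
        for option, per_source in counts.items()
        if all(n > 0 for n in per_source)
    ]


if __name__ == "__main__":
    totals = option_totals(COUNTS)
    shares = option_shares(COUNTS)
    for option in COUNTS:
        print(f"{option:12s} {totals[option]:4d} tokens  {shares[option]:5.1f}%")
    print("Attested in all sources:", attested_everywhere(COUNTS))
```

Run as a script, the sketch prints one line per construction with its token total and share, followed by the constructions attested in every source, which makes the figures quoted in the running text easy to check against the chart.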

Pagel’s generalization according to which it is obligatory for future/irrealis to be marked overtly at least once in a construction does not do justice to the 2nd person nonplural. With intransitive verbs which do not undergo m/f-alternation, there simply is no dedicated phonologically realized marker of the irrealis on Layer IV. As argued in Section 2, the use of the ergative pronouns un and en (in addition to their preverbal position also with intransitive verbs) alone can be indicative of the irrealis nature of the construction. In (74), for instance, there is no irrealis marker on the left of the pronoun of the 2nd person singular ergative. (74)

[Perez and Faustino 1975: 4]
Hagu Ø un attok gi halomtano' gi kanto-n chalan.
2SG.EMPH Ø 2SG.ERG hide in forest in side-LINK road
‘As for you, you will hide in the forest at the side of the road.’

Examples of this kind are not numerous. In the majority of the cases, para precedes un or en. Therefore, it is possible that the original absence of a dedicated irrealis marker for the 2nd persons supported the reanalysis of para as TMA marker. Nevertheless, para is not mandatory with 2nd persons either. Following Pagel’s above train of thoughts, one would have to assume that, in the absence of a dedicated irrealis marker, un and en are portmanteau morphs which co-encode person, number, and mood/tense. This solution


would, however, lead to further complications since the same pronouns are used in the realis, too. Similarly, the 1st person nonsingular inclusive only optionally takes the irrealis marker u (cf. examples (17)–(18) above). Does it follow that ta takes over this function in the absence of u? The answer to this (rhetorical) question is negative. These are typical problems one has to face when a morpheme-based approach is applied. In Section 2.5, I already mentioned that it makes more sense to take a construction-based approach at least to complement the ideas one has built on the grounds of a purely morphological analysis. We now begin to tread on slippery ground. In the light of Chung’s (2020: 182–188) null-pronoun analysis, the idea that bai assumes the function of person marker in the absence of hu might sound unconvincing. Simplifying, Ø can function as null-pronoun for each person provided that the context allows the identification of the participants. Superficially, this possibility even casts doubt on the supposed bridging function of bai. However, the alternative hypothesis creates problems, too. Layer IV (as of Figure 1) is populated by functionally homogenous but formally heterogeneous elements as shown in Table 13. Table 13: Paradigm of markers on Layer IV.

| | 1st person exclusive | 1st person inclusive | 2nd person | 3rd person |
| SINGULAR | (bai) | | Ø | u |
| DUAL | (bai) | (u) | Ø | u |
| PLURAL | (bai) | (u) | Ø | u |

All bracketed markers alternate with zero. What makes the situation as depicted in Table 13 special is the fact that we have different expressions for identical grammatical purposes. Since the differences affect members of the same paradigm, the phenomenon of marker suppletion would apply. In Corbett’s (2007: 30) taxonomy of morphological mismatches, there is no place for marker suppletion. The paradigm is thus highly marked. The most serious problem is not so much the absence of a dedicated marker in the 2nd persons but the distribution of bai and u. The latter cannot be counted as agreement marker of the 3rd persons because it is attested also in the 1st person nonsingular inclusive (cf. Section 2.4). On account of its presence in the cells of different person categories, u behaves more like a general irrealis marker. Bai, on the other hand, cannot be the agreement marker of the 1st persons because it is excluded from the inclusive category. If at all, it encodes the 1st person non-inclusive – and should be


glossed accordingly. If, however, bai is accepted as agreement marker of the person category alone then it is only logical to apply the same analysis also to u because bai and u are fillers of the same slot on Layer IV. To take up an argument of Section 2.2, this solution in turn would be in conflict with the combination u + ta in which the agreement marker of the 3rd persons joins the pronoun of the 1st person nonsingular inclusive. Given that u is not a person marker, bai’s chances of being a person marker diminish too. However, the phonological differences between bai and u as well as the restriction of bai to the 1st persons non-inclusive cause serious problems for an analysis of both bai and u as exclusive irrealis markers if we adopt a morpheme-based approach. To complicate things further, it is hard to deny that bai at least indexically (Dressler et al. 1987: 17) signals the 1st person non-inclusive. In contrast, the indexicality of u is severely restricted since it signals ex negativo that neither a 2nd person nor a 1st person non-inclusive is the subject of the clause. To cut a potentially very long story short, none of the hypotheses is fully convincing. This means that bai and the problems related to it deserve to be investigated in more detail in a separate study. For the time being, I conclude that bai is first and foremost an irrealis marker on a par with u. In contrast to the latter, bai co-encodes the 1st person non-inclusive – either as an index or as a portmanteau morph which is responsible also for expressing the irrealis. Thus, several of the ideas put forward by Pagel (2010: 92–93) have been confirmed. Owing to the small number of attestations of bai in the Chamorro texts from the 19th century, it is difficult to tell whether it encoded a special nuance of the range of meanings of the future although Matsuoka (1926) and Costenoble (1940) argue along these lines. 
The context of examples (25) and (72) is such that a high degree of immediacy of the envisaged action seems to be involved. The author announces that he is about to deliver certain information to his readership in the immediately following paragraph. If this interpretation is valid, the Chamorro use of bai in the 19th century would resemble that of the proximate future in the donor language. It is an intriguing question whether this potential meaning component of the early period of written Chamorro can be causally connected to the optional use of bai in contemporary Chamorro. Is there a (however subtle) semantic difference between irrealis constructions with and without bai in the 1st person non-inclusive? To answer this question, it is necessary to enlarge the empirical basis for the research. This can only be achieved in a follow-up study.


3.3 siempre

The third function word to look into is siempre. Rodríguez-Ponga (1995: 605) characterizes this Hispanism as follows:

siémpre. Siempre, en todo momento, en cualquier caso; siempre, definitivamente (como en el uso mexicano) []. Se usa como adverbio: Para siémpre, para siempre, interminable, eterno. Aparece como marca de futuro que indica seguridad o intención firme, en combinación o no con pára: Siémpre [para] bai hú hánao agúpa’, definitivamente iré mañana.29

Two major functions are mentioned. On the one hand, siempre is classified as temporal adverb with a possible modal interpretation. It is this modal interpretation that displays a connection to the use of siempre as TMA marker of the certain future. What we find in Peninsular Spanish is the temporal-adverbial function as shown in (75). (75)

Spanish [Tristante 2011: 29]
Siempre has sido un tipo listo.
always AUX.2SG be:PTCPL INDEF type clever
‘You have always been a clever guy.’

Pagel (2010: 100–101) states that in the European standard there is nothing which would connect siempre directly to the TMA function although the adverb is used frequently as modifier of verbs inflected for the future in order to emphasize certainty and volition on the part of the speaker. Moreover, in the above quote, Rodríguez-Ponga (1995) mentions Mexican Spanish where siempre can be used with the meaning of English definitely. Based on prior work by Curcó (2004) and Lipski (1994), Pagel (2010: 101–102) mentions two cases from Latin American varieties of Spanish which seem to be closer to Chamorro siempre. In Bolivian Spanish, siempre can be translated as ‘actually, really’ whereas in Mexican Spanish, siempre is used as discourse particle with the meaning ‘ultimately, in the end.’ Still, a translation with always is possible as in (76).

29 My translation: ‘siémpre. Always, at all times, in any case; always, definitely (as in Mexican usage). It is used as adverb: Para siémpre, for ever, interminably, eternally. Appears as future marker indicating certainty or firm intention, with or without pára: Siémpre [para] bai hú hánao agúpa’, I will definitely go tomorrow.’


(76)

Mexican Spanish [Kany 1976: 383]
Siempre se tendrá que llevarla a él
always REFL.3 have:3SG.FUT SUBORD take:INF:3SG.F.ACC to 3SG.M
‘One will always [= in the end] have to take it to him [].’

For historical reasons, Mexican Spanish served as role model in the Marianas at least during the first 120 years of the Spanish colonial rule. A plethora of lexical Americanisms (“americanismos”) provides ample evidence of the Mexican influence in the Marianas (Albalá Hernández 2000: 67–106). It is therefore possible that the meaning range of Chamorro siempre reflects the impact of Mexican Spanish. The grammaticalization of siempre as TMA marker thus started from Mexican Spanish patterns. According to Chung (2020: 396), Chamorro siempre is an adverb with the meaning ‘certainly, surely’ which can be used to indicate the certain future. This latter function is presented as follows: Clauses that describe situations in the future are typically in the irrealis mood. But the realis mood can be used for future situations that the speaker believes are certain to occur. In such cases, the realis mood occurs along with the adverb siempri ‘surely, certainly’ to present the future situation as virtually a fact. (Chung 2020: 37)

It is clear that under this condition it is not always easy to distinguish properly between an intended future reading and alternative interpretations of a given utterance. In (77), the wolf assumes (= epistemic modality) that Little Red Riding Hood is still staying behind to pick strawberries while he is going to vanish from the scene. (77)

[Perez and Faustino 1975: 26]
Hallom gue’ lokkue’ na siempre i patgon
surmise 3SG.ABS also SUBORD SIEMPRE DEF.CN child
saga para u famfe’ strawberries gi halomtano’
stay PARA U IRR.pick strawberries in forest
‘He also surmised that the child would certainly stay behind to pick strawberries in the forest.’

In this sentence, siempre occurs in a realis clause as adverbial modifier of the verbal predicate. Only in the subordinate clause introduced by the purposive conjunction para do we find an overtly marked irrealis. From the immediate context of the episode, it results that the siempre-clause is a case of the certain (relative) future because at the time of the wolf getting the above idea Little Red Riding Hood had not yet begun to pick strawberries.


What we read in the older reference grammar by Topping and Dungca (1973: 261) differs from Chung’s above account when the authors argue that “[t]he word siempre is sometimes used in place of para when the speaker wishes to indicate strong determination.” If siempre replaces para then the interchangeable function words occupy the same slot in the irrealis construction and this in turn means that siempre combines with the same battery of mood-sensitive markers as is the case for para. In examples with a transitive verb and a subject in any of the 2nd persons this possibility cannot be proved because of the formal neutralization of the moods. This is illustrated in (78). (78)

[Perez and Faustino 1975: 22]
Hu hungok na malangu pues hanao ha’ sa’
1SG.ERG hear SUBORD ill then go INTENS because
siempre un sodda’ deska~kansa gi katre.
SIEMPRE 2SG.ERG find RED~rest in bed
‘I heard that she is ill, then just go because you will surely find her resting in bed.’

In the corpus, there is nevertheless evidence of siempre as part of a clearly identifiable irrealis clause. With reference to the four primary sources, Figure 10 features the quantities for the two options, viz. siempre + REALIS vs. siempre + IRREALIS.

| Mood | Taimanglo et al. (1999) | Onedera (2007) | Saint-Exupéry (2021) | Perez and Faustino (1975) |
| irrealis | 10 | 4 | 3 | 2 |
| realis | 37 | 38 | 13 | 4 |

Figure 10: Token frequency of siempre + REALIS vs. siempre + IRREALIS.
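The token counts of Figure 10 lend themselves to a quick tally of how often siempre combines with an overtly marked irrealis. The following Python sketch is purely illustrative: the counts are transcribed from Figure 10, the assignment of columns to sources is an assumption (the order is taken to be the same as in Figure 9), and the names SIEMPRE_COUNTS, mood_totals, irrealis_share, and sources_with_both_moods are hypothetical.

```python
# Token counts transcribed from Figure 10.
# ASSUMPTION: the columns follow the same source order as in Figure 9.
SIEMPRE_COUNTS = {
    "Taimanglo et al. (1999)": {"irrealis": 10, "realis": 37},
    "Onedera (2007)": {"irrealis": 4, "realis": 38},
    "Saint-Exupéry (2021)": {"irrealis": 3, "realis": 13},
    "Perez and Faustino (1975)": {"irrealis": 2, "realis": 4},
}


def mood_totals(counts):
    """Sum the siempre tokens per mood across all sources."""
    totals = {"irrealis": 0, "realis": 0}
    for per_mood in counts.values():
        for mood, n in per_mood.items():
            totals[mood] += n
    return totals


def irrealis_share(counts):
    """Share (in percent) of siempre tokens with an overtly marked irrealis."""
    totals = mood_totals(counts)
    return 100 * totals["irrealis"] / (totals["irrealis"] + totals["realis"])


def sources_with_both_moods(counts):
    """List the sources in which both options are attested at least once."""
    return [
        source
        for source, per_mood in counts.items()
        if per_mood["irrealis"] > 0 and per_mood["realis"] > 0
    ]


if __name__ == "__main__":
    print("Totals per mood:", mood_totals(SIEMPRE_COUNTS))
    print(f"siempre + IRREALIS: {irrealis_share(SIEMPRE_COUNTS):.1f}% of all tokens")
    print("Both options attested in:", sources_with_both_moods(SIEMPRE_COUNTS))
```

The overall totals and the overall share do not depend on the assumed column order; only the per-source breakdown does.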


The predominance of constructions in which siempre combines with the realis (including ambiguous cases) is uncontroversial. Of the 111 tokens, 17% are accounted for by the combination of siempre with an overtly marked irrealis. What is more, there is evidence for both options in each of the primary sources. Interestingly, all examples of siempre + IRREALIS involve a subject in the 3rd person (all numbers) whereas there is no proof of the combinations siempre + u + ta and siempre + bai (+ hu/in). In (79), I present three cases of siempre-clauses which appear in one and the same short story. Example (80) is from the same collection but belongs to a different story.

(79) [Perez and Faustino 1975: 4]
a. Siempre u fan-lahu gi chalan.
   SIEMPRE U IRR.PL-walk in road
   ‘They will surely walk on the road.’
b. Siempre ma li’e’ yo’.
   SIEMPRE 3PL.ERG see 1SG.ABS
   ‘They will surely see me.’
c. Siempre guaha na’-mami katne-n babui
   SIEMPRE EXI CL.FOOD-POR.1PL.EXCL meat-LINK pig
   para pa’go na ha’ani.
   PARA now LINK day
   ‘We will surely have pork for tonight.’

(80) [Perez and Faustino 1975: 38]
I amotsa-ta siempre ti u gos maipe.
DEF.CN breakfast-POR.1PL.INCL SIEMPRE NEG U very hot
‘Our breakfast will surely not be very hot.’

In (79a) and (80), siempre is involved in irrealis constructions whereas in (79b–c), no additional irrealis marker is present. This variation needs to be scrutinized further in a separate study. For the time being, I register the above cases as instances of free variation. Superficially, all siempre-clauses in (79)–(80) convey the same semantic nuance of the future. It is, however, possible that the alternation of the moods in combination with siempre is connected to different degrees of certainty or other, pragmatic, factors which have yet to be discovered.

In my reference text for the Chamorro of the 19th century, there are eighteen instances of siempre. Three cases are of no interest for us because they reflect the formulaic address to the Virgin Mary which is siempre Bithen María ‘eternal Virgin Mary’. In eleven cases, Chamorro siempre corresponds to the same adverb in the Spanish version; in three cases, Chamorro siempre corresponds to the adverb eternamente ‘eternally’ in the Spanish version. In two cases, siempre itself is the predicate (Chamorro ti siempre = Spanish no siempre ‘(it is) not always (the case)’). There is only one case where the Chamorro version contains an instance of siempre whereas there is no adverbial equivalent in the Spanish version. There is no siempre in the Spanish version that is not paralleled by siempre in the Chamorro version. Example (81) shows how siempre is used in a realis clause with reference to past events. (81)

[Ibáñez del Cármen 1887: 24]

sa ha puni siempre gi konfesión
because 3SG.ERG deny SIEMPRE in confession
uhi i ma’gas na isao-ña
that DEF.CN chief LINK sin-POR.3SG
‘[] because she always denied this biggest sin of hers during confession.’
[Spanish version: por haber callado siempre en la confesion aquel pecado mortal]

The function of siempre is clearly that of a temporal adverb which expresses permanence. The syntactic position siempre occupies is the same in both languages since it follows the lexical verb. In this example, Chamorro siempre does not resemble its later TMA version very much. The postverbal position of Chamorro siempre is characteristic for the bulk of the cases identified in my 19th century source. The relative clause in (82) features siempre undergoing partial reduplication which is impossible in Spanish. This strategy is used repeatedly to translate Spanish eternamente ‘eternally.’ (82)

[Ibáñez del Cármen 1887: 7]

na la~la’la’ sié~siempre ha’ giya guiya mismo
SUBORD RED~live RED~siempre INTENS in:DEF.PLN 3SG.EMPH self
‘[] who is living eternally in himself.’
[Spanish version: que vive eternamente en sí mismo]

What is addressed in this example is the Lord’s eternal existence. From a religious point of view, we are dealing with an atemporal fact so that what is described for the present will also hold for the future. Again, siempre does not occupy the pre-verbal TMA position. Moreover, there is no additional marker of the irrealis. Similarly, in (83), we have a generalization that is supposed to hold at all times.


(83)

[Ibáñez del Cármen 1887: 67]

sa kasi siempre man-ba~basnak gi isao
because almost SIEMPRE PL-RED~fall in sin
‘[] because they almost always commit sins.’
[Spanish version: porque casi siempre caen en pecado]

This time, siempre winds up in the preverbal slot (in both languages) but it is accompanied by a modifier of its own – kasi ‘almost’ – which strongly suggests that we are dealing with the temporal adverb and not the TMA marker. In (84) too, siempre is once again outside the TMA position and no further indicator of the irrealis mood is co-present in the sentence. (84)

[Ibáñez del Cármen 1887: 53]

Si santa María sigi siempre bi~bithen ha todu i tiempo
DEF.PN St Mary continue SIEMPRE RED~virgin INTENS all_the_time
‘St Mary remained a virgin all the time.’
[Spanish version: María Santísima permaneció siempre Vírgen.]

The everlasting virginity is expressed twice, namely once by siempre and by the sentence-final adverbial todu i tiempo ‘all the time.’ The Spanish parallel version lacks an equivalent for the latter. Does this mean that Chamorro siempre alone failed to convey the intended meaning sufficiently? In this case, one might want to assume that the temporal-adverbial semantics of siempre had already begun to bleach out (to the benefit of a subsequent reanalysis as TMA marker). However, this is an ad-hoc hypothesis. The sole example of Chamorro siempre without adverbial equivalent in Spanish is given in (85). (85)

[Ibáñez del Cármen 1887: 88]

Si pale’ Mariano sen magahet na misioneru, bú~bula
DEF.PN priest Mariano very authentic LINK missionary RED~plenty
ha’ siempre piti yan in-adahi para todos
INTENS SIEMPRE ache and ADJVZ-care PARA all
‘Father Mariano was a very authentic missionary always intensively grieving and caring for everybody.’
[Spanish version: El padre Fr. Mariano fué un verdadero misionero lleno de caridad y celo con todos]

In this example too, the best solution is to interpret siempre as temporal adverbial although its preverbal position conforms with that of the contemporary TMA marker. As can be gathered from Table 14, the postverbal position of siempre is the dominant pattern also in connection with predicates in the irrealis. The square brackets single out the predicate (nucleus) and the modifying adverb.

Table 14: Postverbal siempre in the 19th century.

| Page | Example | Translation | Spanish version |
| 24 | lao [mamàjlao siempre] ucómfesat uje y isaoña | ‘[] but [she was always (too) ashamed] to confess that sin of hers.’ | pero [se abstuvo siempre] de confesar su pecado |
| 26 | [Umatuna siésiemprejá] y santos naanmo. | ‘Your name [be eternally blessed].’ | [Sea bendecido eternamente] tu santo nombre. |
| 32 | pues, gui jinanao pat jinagô y tiempo [uguaja siempre] jayí ufamanagüe ni y taotao sija | ‘[] then, in the course or succession of time [there will always be] someone who instructs the people []’ | pues en el decurso ó sucesion del tiempo [habrá siempre] quien enseñe á las gentes |
| 54 | para [ufanmasásapetjá siempre] güije | ‘[] for them [to be tormented eternally] there.’ | para [ser allí eternamente] atormentados |
| 59 | [Tafan dévoto siempre] nu si santa María. | ‘[We will be always devout] to St Mary.’ | [Seamos siempre devotos] de María Santísima. |
| 60 | lao y mas debe utarespeta yan [ta adora siempre] ayo y Lagen Yuus | ‘[] but the one we must respect and [adore always] is this Son of God.’ | pero, lo que deberemos reverenciar y [adorar siempre], es al Hijo de Díos |

The only irrealis clause with a preceding siempre from this source has been presented and analyzed as (56) above. In this case, the combination para siempre ‘for ever’ can be considered to be a lexicalized adverbial without TMA function. This leaves us with one last example from Ibáñez del Cármen (1887) which is presented in (86).


(86)

[Ibáñez del Cármen 1887: 85]

siempre mama’na’~nague nu todos siha i taotao
SIEMPRE AP:RED~teach OBL all PL DEF.CN person
‘[] he always taught all the people.’
[Spanish version: simpre enseñando á todos]

The example has been extracted from a paragraph in which the author describes the missionary activities of Diego Sanvitores which required of him to travel from island to island in order to baptize as many people as possible. The absence of dedicated irrealis morphology and function words indicates that these actions were actually accomplished. Siempre is in the preverbal position not only in the Chamorro version but also in the Spanish version. It can be stated that in the source from the 19th century, there is no hard proof of siempre being used as TMA marker. Its use in the Chamorro version largely parallels that of siempre in the Spanish version – and this parallelism mostly includes the position of siempre relative to the lexical verb. In the 19th century, Chamorro siempre functioned as temporal adverb. The best translation for it in English is ‘always’ because none of the above examples illustrates the specifically Mexican Spanish meaning (corresponding to English definitely, in the end) to which Rodríguez-Ponga (1995) and Pagel (2010) refer. On account of the absence of bona fide examples of siempre as TMA marker in the early text, it must be concluded that the grammaticalization of siempre is a rather recent development in the history of Chamorro. Sandra Chung (p.c.) draws my attention to another possible dialectal difference because in the Saipan variety of Chamorro the postverbal position of siempre can be frequently observed.

3.4 Summing up the diachronic developments

Before we turn our attention back to the present state of affairs in Section 4, it is in order to recapitulate the (potential) diachronic developments from which the contemporary system has arisen. Most of what I have to say in the subsequent summary of past events remains hypothetical for obvious reasons. The documentation of the earlier stages of Chamorro is far too sparse to allow for watertight conclusions. Yet, it is at least possible to sketch a relatively plausible succession of processes. Prior to contact with Spanish, there was the irrealis category in Chamorro with no further subdivisions. Formal marking of the irrealis was restricted to the m/f-alternation and the marker u in the 3rd persons. In the period between the


Spanish conquest at the end of the 17th century and the publication of the first printed material in Chamorro in the second half of the 19th century, Spanish voy (> bai) entered the scene where it functioned as marker of the proximate future with a high degree of volitionality. The latter criterion is causally connected to the restriction of the expression of the proximate future to the 1st person singular. The primary source of this period on which I have focussed for the empirical part of this study suggests that the irrealis consisted of two sub-categories, namely the general irrealis with a broad range of functions and the functionally narrowly delimited proximate future of EGO. The irrealis marker u already co-occurred with the 1st person nonsingular inclusive. Para was in use as preposition and conjunction whereas siempre was employed adverbially (with a predominantly temporal meaning). By the mid-20th century, bai must have diffused to the 1st person dual/plural exclusive and the association of this marker with the proximate future must have loosened so that it could function as irrealis marker on a par with u. In the second half of the 20th century, the reanalysis of the purposive conjunction para as a marker of the future/subjunctive in certain contexts must have taken place. From these contexts, the reanalyzed para began to spread to other contexts from which the original conjunction would have been excluded. The purposive conjunction continues to co-exist with its grammaticalized offspring. The grammaticalization of para to a TMA marker progressed further so that the irrealis system of the 1970s contained the general irrealis and the future/subjunctive. Since bai had lost its former function as marker of the proximate (and thus certain) future, the system only included an unspecific or neutral future marked by para. As to siempre, the nontemporal meanings must have gained ground to the detriment of the temporal ones. Siempre moves from the formerly preferred postverbal position to that on the left of the predicate. Its meaning change facilitates its being reanalyzed as marker of the certain future whose grammaticalization is probably not fully accomplished yet. It seems that the change of siempre from modal adverb to TMA marker first affected realis clauses, from where the future marker siempre extended its domain to irrealis clauses with 3rd person subjects in partial analogy to the TMA marker para. Presently, the irrealis system is divided into the general irrealis, the neutral future/subjunctive, and the certain future. I repeat: the previous paragraph is largely speculative. However, what can be concluded with a relatively high degree of certainty is that the irrealis system of the 19th century differed markedly from that of the 21st century. What the latter looks like is the topic of the subsequent section.


4 One last look at the irrealis in contemporary Chamorro

This section recapitulates the insights we have gained as to the internal structure of the irrealis construction in modern Chamorro. Figure 1 in Section 2.5 features a template of the irrealis construction which is subject to modifications in what follows. The modifications result from the discussion of the properties of the three function words bai, para, and siempre in Section 3. Owing to the intricacies of the rules determining the shape of the construction, it is no longer advisable to use a templatic format for the presentation. To guarantee transparency, the crucial phenomena are discussed separately. The most serious problem is posed by the differential behaviour of para and siempre as TMA markers. In Figure 1, both TMA markers occur on Layer V and are thus mutually exclusive.30 This incompatibility is corroborated by my data from modern Chamorro. The difficulties arise when we look at the mood of the predicate. Figure 11 shows that, as to the formally expressed mood of the predicate for which they serve as TMA markers, para and siempre go separate ways.

PREDICATE → REALIS / siempre ___
PREDICATE → IRREALIS / siempre ___ (3rd person subject)
PREDICATE → IRREALIS / para ___

Figure 11: Choice of mood.
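The distribution summarized in Figure 11 can be restated as a small lookup rule. The Python sketch below is a toy encoding of the corpus-based generalization only (para with irrealis predicates; siempre with realis predicates, and with irrealis predicates only when the subject is a 3rd person). The function name admissible_moods and its parameters are hypothetical, and the rule describes what is attested in the corpus rather than what is grammatical in Chamorro at large.

```python
def admissible_moods(marker, person):
    """Return the predicate moods attested with a TMA marker in the corpus.

    marker: "para" or "siempre" (the two Layer V markers of Figure 11)
    person: grammatical person of the subject (1, 2, or 3)

    ASSUMPTION: this encodes only the corpus generalization of Figure 11;
    number and clusivity are ignored because the figure does not refer to them.
    """
    if person not in (1, 2, 3):
        raise ValueError("person must be 1, 2, or 3")
    if marker == "para":
        # para combines with predicates in the irrealis.
        return {"irrealis"}
    if marker == "siempre":
        # siempre combines with the realis throughout; the irrealis is
        # attested only with 3rd person subjects.
        return {"realis", "irrealis"} if person == 3 else {"realis"}
    raise ValueError("marker must be 'para' or 'siempre'")


if __name__ == "__main__":
    for marker in ("para", "siempre"):
        for person in (1, 2, 3):
            moods = sorted(admissible_moods(marker, person))
            print(f"{marker:8s} person {person}: {', '.join(moods)}")
```

Keeping the rule as an explicit function makes the asymmetry visible at a glance: only siempre returns more than one mood, and only for the 3rd person.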

The possibility of a TMA marker of a category which belongs to the domain of the irrealis combining with a predicate in the realis is not included in Figure 1. One might even object to the TMA-marker status of siempre in combination with realis predicates so that this status would apply only if siempre takes a predicate

30 This incompatibility does not hold for the lexicalized case of para siempre ‘for ever’ as in (ii). For an example from the 19th century, the reader is referred to (56).
(ii) [Saint-Exupéry 2021, ch. XXVI]
Amigu-ho hao para siempre.
friend-POR.1SG 2SG.ABS PARA SIEMPRE
‘You are my friend for ever.’


that is formally marked for the irrealis. More interestingly, the split of siempre in the 1st and 2nd persons which never co-occur with a predicate in the irrealis and the 3rd persons which optionally also allow for combinations with a predicate in the irrealis can be indirectly connected to the mandatory presence of the irrealis marker u only with the 3rd persons. Speakers may feel certain about what they themselves will do in the future whereas they cannot be as sure about the future actions of persons they talk about. How the addressee (= 2nd persons) fits into this picture is unclear to me. Discounting the mood problem captured by Figure 11, we can accept the general scheme represented in Figure 1 as largely correct. There remain, however, several open questions which are real brain teasers if one inspects them closely enough. As discussed in Section 3.2, bai can be analyzed in multiple ways. I have argued against the portmanteau analysis (Pagel 2010) according to which bai is responsible for expressing simultaneously future/subjunctive, irrealis, and 1st person non-inclusive. The major obstacle for interpreting bai as monofunctional irrealis marker is the unlikelihood of marker suppletion to apply. Bai is too different phonologically from u (and Ø, for that matter). Moreover, its (optional) presence is tightly linked to the 1st person non-inclusive. The decision to distinguish primary and secondary functions allows me to continue to classify bai as irrealis marker at least for the time being. Since historically it is possible that bai originally marked only the strongly volitional proximate future, it might be the case that the alternations bai ~ Ø and para bai ~ bai reflect a former distinction of different categories of the future. Whether this distinction is absolutely obsolete nowadays needs to be investigated in a follow-up study. 
Layer IV in Figure 1 is populated by bai and u with the latter being characterized as optional only on account of the alternation u ta ~ ta in the 1st person nonsingular inclusive. As argued in connection with example (74) and Table 13, there is no dedicated irrealis marker with the 2nd persons. This means that Ø joins bai and u on Layer IV. At the same time, the corpus data show that for subjects in the 2nd person of all numbers the co-presence of para seems to be the norm. If one does not adhere to the principle of Absence of Material Exponence (AOME) (Stolz and Levkovych 2019), it becomes possible to assume that para invades the domain of the irrealis markers by way of turning into a portmanteau morph for future/subjunctive and irrealis. The paradigm of the irrealis markers would then consist of bai for the 1st persons non-inclusive, para for all 2nd persons, u for all 3rd persons and the 1st person nonsingular inclusive. I do not subscribe to this analysis but mention it nevertheless because none of the proposals is absolutely convincing – mine included.

Along the lines of the above approach which does not accept zeroes to occupy slots in a template, the optionality of hu, ha, and ma in the context of 3rd person subjects could be understood as proof of the “pronominal” functions of bai and u, respectively. The major flaw of this possible train of thought is that the supposed variation of u ~ u ma in the 3rd person plural is not a case of unconditioned variation in the first place. The presence/absence of ma depends on the transitivity of the verb/predicate. The choice of option is thus grammatically determined. In this case, it does not make sense to assume that for intransitive predicates u is an agreement marker whereas it is not for transitive predicates. I am not convinced that the problem can be made to disappear by analysing the combination u + ma as a unit uma- (and analogously bai + hu > baihu-, u + ta > uta-, bai + in > baiin-) as shown in Table 1 (Cooreman 1987: 39; Pagel 2010: 92). It is doubtful that this practice is meant to depict the new units as monomorphemic elements. The presence and absence of ha in combination with u (as discussed in Section 2.3) is of a different nature because it reflects the transition stage from an earlier system in which ha was also admitted in the plural – and this transition has not been fully accomplished yet. Furthermore, null pronouns are commonplace in Chamorro. Thus, the optional presence of ha and hu can be accounted for without promoting bai and u to full-blown subject pronouns.

In Table 15, I present the synopsis of the insights I have gained. The system represented in Table 15 is that of contemporary Chamorro. The gloss-column hosts my proposals for solving the problem of the competing glossing practices mentioned in Section 1.

Table 15: Synopsis of results.

Function word | Word class  | Function                   | Gloss       | Translation
para1         | preposition | spatial                    | LOC         | ‘in’
para1         | preposition | benefactive                | BEN         | ‘for’
para2         | conjunction | purposive                  | PURP        | ‘to’
para3         | TMA marker  | neutral future/subjunctive | SBJ         |
bai           | TMA marker  | irrealis                   | IRR.1.NINCL |
siempre1      | adverb      | modal                      |             | ‘certainly’
siempre2      | TMA marker  | certain future             | FUT         |

The leftmost column hosts indexed versions of para and siempre. This means that it is advisable to assume the existence of three distinct homophonous function words para1–3 and two homophonous function words siempre1–2. In the case of para, this solution is immediately compelling because the functions covered by para are too diverse to be classified as an instance of polysemy. This is different for the spatial and benefactive functions of para1 (= the preposition) because the different meanings can be understood to be close enough to each other to be subsumed under one umbrella category. This is a case of polysemy then. In contrast, para1 ≠ para2 ≠ para3 is a ternary chain of cases of homophony. As to siempre1–2, the situation is less clear because the adverbial meaning ‘certainly’ and the TMA-function of marking the certain future are directly related to each other. For the time being, I treat this case in analogy to that of para1–3. However, the problems posed by this solution need to be looked into in a follow-up study.

The three Spanish-derived function words are indeed crucially involved in the make-up of the irrealis construction. Before bai, para, and siempre were available to speakers of Chamorro, the irrealis category looked considerably different from its present-day successor. The borrowing of bai, para, and siempre has made it possible to formally distinguish sub-categories of the irrealis which prior to contact depended entirely on contextual information. The Spanish-derived function words provided the input for the grammaticalization processes which ultimately resulted in the emergence of the categories of the neutral future/subjunctive and the certain future. The grammaticalization processes of para and siempre in their entirety, however, took place in Chamorro. On the basis of indirect and circumstantial evidence, Rodríguez-Ponga (1999) considers the mid-19th century to be the culmination point of Spanish-Chamorro language contacts with processes of incipient Creolization taking place in that period.
It is worth noting, however, that the historical data discussed in Section 3 do not support the idea that what we find in contemporary Chamorro now goes back to direct Hispanization. As it seems, some of the crucial processes which have transformed the irrealis system have occurred only after Spanish ceased to be a factor in the linguistic landscape of the Marianas. In addition to being involved in the internal differentiation of the irrealis system, the function words under scrutiny display distributional properties which render an erstwhile transparently organized system relatively heterogeneous since the description of the morphosyntactic behaviour of bai, para, and siempre requires the formulation of a series of rules whose interaction is not always entirely clear. It is legitimate to ask whether the heterogeneity is owed to transfer of Spanish rules into the grammar of the replica language.

5 Type(s) of borrowing

It is clear that Chamorro bai, para, and siempre have been transferred materially from the donor language Spanish into the replica language. They are thus instances of MAT-borrowing according to Sakel’s (2007) binary distinction of MAT-borrowing and PAT-borrowing. In a recent attempt to create a more fine-grained typology of borrowing, Gardani (2020: 266) criticizes Sakel (2007: 26) for assuming that MAT-borrowing goes normally along with PAT-borrowing and that exceptions from this close association of the types of borrowing are largely confined to the lexicon. The borrowed function words featured in this study strongly support Gardani’s hypothesis according to which MAT-borrowing is frequent also without parallel PAT-borrowing. Gardani (2020: 269) illustrates his hypothesis with examples from the domain of morphology but proposes a “typology of matter and pattern borrowing that applies to all areas of grammar.” Therefore, bai, para, and siempre as free morphemes also fall within the scope of his approach.

A word of caution is called for at this point. To properly judge the issue, we cannot compare modern Chamorro and Spanish because MAT-borrowing and PAT-borrowing are categories which must be applied to the situation that arose when the borrowed items first entered the replica language’s grammar. What happened later to the very same elements within the replica language is not a case of borrowing but of language-internal processes. For our case, this means that we have to content ourselves with looking at the primary source from the 19th century – under the proviso that even this early written document does not coincide with the inception of Spanish-Chamorro contacts. It cannot be ruled out that bai, para, and siempre had already undergone changes in the replica language which altered their properties such that the original state of affairs is obscured.
To repeat what was said above, I start from the self-evident fact that the phonological chains of the Spanish-derived function words have been borrowed materially into Chamorro. Therefore, exclusive PAT-borrowing is no option. We are facing either MAT-borrowing or MAT&PAT-borrowing (Gardani 2020: 265). Since Gardani’s (2020: 270–272) own typology takes account of the specific phenomena of borrowing in the realm of morphology, it is not possible to apply his definitions of sub-types directly to the function words under scrutiny. For the empirical details, the reader is referred back to Sections 3.1–3.3.

Para as spatial preposition is unproblematic. Chamorro has always been a prepositional language. The borrowing of Spanish para ‘to’ does not introduce a new structural pattern into the replica language. This is thus a case of MAT-borrowing (Gardani’s type 1.3). The benefactive function of Spanish para is a different story because there was and still is dedicated benefactive morphology on verbs. As argued by Topping and Dungca (1973: 124) and Gibson and Raposo (1986: 314–315), the autochthonous benefactive inflection tends to be replaced with a PP whose head is para (cf. Section 3.1.1). The function as such was already established in Chamorro grammar. The borrowed pattern created an alternative way of expressing this function. In this case, the combination MAT&PAT-borrowing applies (Gardani’s type 3.1).

Para as conjunction is similar to the case of para as spatial preposition. It is very likely that Chamorro had subordinating conjunctions prior to contact with Spanish. No new word-class was created in the course of integrating para as purposive conjunction into the system. It is unclear whether borrowed para ousted a pre-existent autochthonous conjunction with the same function. If there had been an autochthonous precursor of the borrowed conjunction, then we would have another case of MAT-borrowing. What if not? Combined MAT&PAT-borrowing is an option though a problematic one. Chamorro para – as purposive conjunction with the following predicate in the irrealis – conflates two patterns of the donor language, namely para + INFINITIVE and para que + SUBJUNCTIVE. Is Chamorro para + IRREALIS a case of MAT&PAT-borrowing? The answer is a lukewarm yes-and-no. The irrealis existed already in pre-contact times. It is possible that this mood was always used to express purposive relations between two predicates. The contact-borne innovation is the conjunction para. If we assume that prior to its borrowing the two predicates used to be asyndetically juxtaposed, then a new pattern has been introduced which resembles that of the donor language without being absolutely identical with it. This is an example of Gardani’s type 3.2. If, however, para has superseded an autochthonous conjunction, there is only MAT-borrowing, i.e. another case of type 1.3.
The next case is interesting, too. Chamorro bai represents a new category within the irrealis system. On all accounts, there was no formally distinct category future in the replica language before borrowings from Spanish entered the domain of grammar. We are dealing with an instance of Gardani’s type 3.1.3 which is typically associated with verbal categories such as tense and aspect. However, the Spanish pattern as such has not been taken over. There is no subordinator, the subordinated predicate is not in the infinitive, and the borrowed auxiliary is restricted to the 1st person singular. It is therefore doubtful whether the classification of this case as MAT&PAT-borrowing is fully appropriate. However, it cannot be MAT-borrowing alone either. Perhaps cases of this and similar kind show that the typology of MAT-borrowing and PAT-borrowing is not yet fine-grained enough. Partial or selective MAT&PAT-borrowing where only certain aspects of the donor language’s patterns are replicated is probably difficult to classify according to the extant typology.

Finally, Chamorro siempre is not as difficult to account for as the previous cases. In the 19th century, siempre is documented almost exclusively as temporal adverb as in Peninsular Spanish. The similarity between Chamorro siempre and Spanish siempre holds also for the preferred postverbal position of the adverb. I do not know how the concept of ALWAYS used to be expressed in pre-contact Chamorro. It is therefore possible that expressing it adverbially represented a novel way of verbalizing the concept. In this case, we have MAT&PAT-borrowing. MAT-borrowing alone would also be a possible classification provided that pre-contact Chamorro had an adverb-like means of expressing the concept under inspection.

As the brief discussion in this section shows, the Chamorro data more often than not are hard to force into the typology of MAT-borrowing and PAT-borrowing. I assume that posing these difficulties is no monopoly of Chamorro. To the contrary, it is to be expected that in many language-contact situations phenomena emerge which are not as straightforwardly classifiable as the typology seems to suggest. It follows that we need many more reliable data from as many contact scenarios as possible to develop the typology further. What this means is that there is ample opportunity for work in the future.

6 Conclusions

The principal goal of this investigation has been reached. I have shown empirically what the tasks of the Spanish-derived function words bai, para, and siempre are in Chamorro grammar. The three function words have in common that they are put to service in the domain of the irrealis mood. In the case of bai, this is exclusively so whereas para and siempre fulfil other functions also outside the irrealis. As to para and to some extent also as to siempre, one might prefer to assume the existence of several homophonous but functionally distinct function words in lieu of treating them as single polysemous units. I have proved that the TMA marker para must not be lumped together synchronically with the purposive conjunction para whose existence is not acknowledged in all descriptions of Chamorro. On this account, it is advisable to use different glosses for the different functions of para. The indiscriminate use of FUT will never do justice to the structural reality of the language. Similarly, it is possible that the TMA marker siempre has to be distinguished from the adverb siempre although the case is less obvious than that of para.

The evaluation of a particular primary source from the 19th century has helped me to trace possible diachronic developments leading from an earlier stage with a differently organized irrealis system via several intermediate stages to the present situation. In accordance with Pagel (2010), it could be demonstrated that the grammaticalization of the TMA markers para and siempre happened largely independently of Spanish patterns and in relatively recent times. This means that the Spanish-derived function words have contributed substantially to the reorganization of the Chamorro irrealis system but the donor language Spanish has not necessarily imposed its own system on the replica language. This conclusion receives support from my attempt to classify the data from the 19th century according to the categories of MAT-borrowing and PAT-borrowing on the basis of Gardani’s (2020) typology. New light has been shed on the internal organization of the irrealis system in modern Chamorro. I have shown that a number of hypotheses as to the functions and properties of members of the irrealis construction cannot be upheld in their present form. Several problem zones have been identified and the pros and cons of analysing the facts according to one model or another have been weighed. Admittedly, there remain numerous open questions which I was unable to answer in this study. More generally, the cases of the Chamorro irrealis and the Spanish-derived function words are by no means closed. I understand my study as a first tentative step towards an exhaustive description and evaluation of these phenomena both in synchronic and diachronic perspective. The continuation of this line of research is unavoidable because of the following aspects. First of all, the database is still far too small. The small size of the Chamorro text production notwithstanding, there are many more specimens of written Chamorro that should form part of an enlarged corpus.
Secondly, I have only looked at data from the Guam variety. Some of the issues I raised in connection to Chung’s (2020) account of the Chamorro irrealis might turn out to reflect differences between the varieties of Guam and Saipan. Thirdly, one should also seize the opportunity to admit more text material from the period prior to World War II to the corpus to strengthen the diachronic side of the investigation. However, this cannot be done blindly because many if not most of these older texts are suspected of reflecting the non-native use of Chamorro. Fourthly, the exclusive reliance on the written register is inadequate if we want to make general statements about a given language. Therefore, spoken language data should also be fed into the database. Only in the next phase of the project will it be possible to claim that the topic of this study has finally been exhausted.

Acknowledgments: This study is a spin-off of previous projects conducted in the framework of Colonial and Postcolonial Linguistics – especially those which have focused on the reedition and evaluation of grammatical descriptions, dictionaries, and texts written in or about Chamorro prior to the 1950s. I am especially grateful to Sandra Chung who agreed to comment on the draft version of this paper. Her remarks have saved me from making a number of mistakes. A further word of thanks goes to Francesco Gardani, Sonja Kettler, Nataliya Levkovych, Paula Müller, Hitomi Otsuka, Vanessa Pauls, and Beke Seefried for their help with the logistics needed for this study. As to the contents and the form of this contribution, I assume the full and exclusive responsibility.

Abbreviations

?         doubtful
*         reconstructed/unattested
1/2/3     1st/2nd/3rd person
ABS       absolutive
ACC       accusative
ADJ       adjective
ADJVZ     adjectivizer
ADV       adverb
AP        antipassive
AUX       auxiliary
BAI       bai
BEN       benefactive
CAUS      causative
CL.FOOD   food classifier
CN        common noun
CONJ      conjunction
DAT       dative
DEF       definite article
DU        dual
EMPH      emphatic
ERG       ergative
EXCL      exclusive
EXCLAM    exclamation
EXI       existential
F         feminine
FUT       future
GER       gerund
INCL      inclusive
INDEF     indefinite
INF       infinitive
INTENS    intensifier
IRR       irrealis
ITR       intransitive
LINK      linker
LOC       locative
M         masculine
MAT       matter
MOD       modal verb
NEG       negation
NINCL     non-inclusive
NMLZ      nominalizer
NP/NP     noun phrase
NPL       nonplural
NSG       nonsingular
OBL       oblique
PARA      para
PASS      passive
PAST      past tense
PAT       pattern
PL        plural
PLN       place name
PN        person name
POR       possessor
PP/PP     prepositional phrase
PREP      preposition
PRETI     preterite I
PRETII    preterite II
PRO       pronoun
PTCPL     participle
PURP      purposive
Q         question marker
REAL      realis
RED       reduplication
REFL      reflexive
REL       relativizer
SBJ       subjunctive
SG        singular
SIEMPRE   siempre
SUBORD    subordinator
TMA       tense/mood/aspect
TR        transitive
U         u
V         verb

Primary Sources

Forbes, Eric. 2009. I kuatro man uttimo na kosas. The four last things. Spanish era Chamorro sermons on Death, Judgment, Hell, and Heaven. Agaña Heights: Capuchin Friars.
Fritz, Georg. 1907. Kurze Geschichte der Marianen. Mitteilungen des Seminars für Orientalische Sprachen der Friedrich-Wilhelms-Universität 10. 218–228.
Ibáñez del Cármen, P. Fr. Aniceto. 1887. Devoción as San Francisco de Borja, patron Luta. Manila: Amigos del País.
Onedera, Peter Robert. 2007. Egge’ gi i Kestumbren CHamoru. Unpublished MA-thesis. University of Guam.
Perez, Remedios L. G. & Rogelio G. Faustino. 1975. Estera si rai. Agana: Government of Guam.
Saint-Exupéry, Antoine [translated by Eric Forbes]. 2021. I Dikkiki’ Na Prinsipi. Hagåtña.
Taimanglo, Roland L. G., Aline Yamashita & Maria A. T. Rivera (eds.). 1999. Mandidok yan mamfabulas na hemplon Guåhan. Hagåtña: Government of Guam.
Tristante, Jerónimo. 2011. El enigma de la calle Calabria. El detective Víctor Ros en Barcelona. Madrid: Embolsillo.
Tristante, Jerónimo. 2013. La última noche de Víctor Ros. Madrid: Debolsillo.

References

Albalá Hernández, Paloma. 2000. Americanismos en las Indias del Poniente. Voces de origen indígena americano en las lenguas del Pacífico. Frankfurt a.M. & Madrid: Vervuert.
Burrus, E. J. 1954. Sanvitores’ grammar and catechism in the Mariana (or Chamorro) language (1668). Anthropos 49. 934–960.
Chung, Sandra. 1998. The design of agreement. Evidence from Chamorro. Chicago & London: The University of Chicago Press.
Chung, Sandra. 2020. Chamorro grammar. Santa Cruz: University of California. [https://scholarship.org/uc/item/2sx7w4h5]
Chung, Sandra & Alan Timberlake. 1985. Tense, aspect, and mood. In Timothy Shopen (ed.), Language typology and syntactic description. Vol. III: Grammatical categories and the lexicon, 202–258. Cambridge: Cambridge University Press.
Cooreman, Ann M. 1987. Transitivity and discourse continuity in Chamorro narratives. Berlin & New York: De Gruyter.
Corbett, Greville G. 2000. Number. Cambridge: Cambridge University Press.
Corbett, Greville G. 2007. Deponency, syncretism, and what lies between. In Matthew Baerman et al. (eds.), Deponency and morphological mismatches, 21–44. Oxford: Oxford University Press.
Costenoble, Hermann. 1940. Die Chamoro Sprache. ’s-Gravenhage: Nijhoff.
Curcó, Cármen. 2004. Procedural constraints on context selection: siempre as a discourse marker. In Rosina Márquez Reiter & María Placencia (eds.), Current trends in the pragmatics of Spanish, 179–201. Amsterdam & Philadelphia: John Benjamins.
De Vera, P. Roman María. 1932. Diccionario chamorro-castellano. Manila: Germania.
Dressler, Wolfgang U. et al. 1987. Introduction. In Wolfgang U. Dressler (ed.), Leitmotifs in Natural Morphology, 1–23. Amsterdam & Philadelphia: John Benjamins.
Enciclopedia Universal. 1920. Enciclopedia universal ilustrada europeo-americana. Tomo XLI. Bilbao, Madrid & Barcelona: Espasa-Calpe.
Fritz, Georg. 1903. Chamorro-Grammatik. Mitteilungen des Seminars für Orientalische Sprachen 6. 1–27.
Gardani, Francesco. 2020. Borrowing matter and pattern in morphology. An overview. Morphology 30. 263–282.
Gibson, Jeanne D. 1992. Clause union in Chamorro and in Universal Grammar. New York & London: Garland.
Gibson, Jeanne D. & Eduardo Raposo. 1986. Clause union, the stratal uniqueness law and the chômeur relation. Natural Language and Linguistic Theory 4. 295–331.
Haspelmath, Martin. 2010. Framework-free grammatical theory. In Bernd Heine & Heiko Narrog (eds.), The Oxford handbook of linguistic analysis, 287–310. Oxford: Oxford University Press.
Heine, Bernd, Ulrike Claudi & Frederike Hünnemeyer. 1991. Grammaticalization. A conceptual framework. Chicago & London: The University of Chicago Press.
Hinds, John. 1986. Japanese. London & New York: Routledge.
Ibáñez del Cármen, P. Fr. Aniceto. 1865. Gramática chamorra. Manila: Ramirez y Giraudier.
Izouî, H. 1940. Le système verbal du chamorro actuel de l’île Saïpan. Gengo Kenkyu 6. 14–27.
Kany, Charles E. 1976. Sintaxis hispanoamericana. Madrid: Gredos.
Kats, J. 1917. Het Tjamoro van Guam en Saipan vergeleken met einige verwante talen. ’s-Gravenhage: Nijhoff.
Lipski, John M. 1994. Latin American Spanish. London: Longman.
Lopinot, P. Callistus. 1910. Chamorro-Wörterbuch enthaltend I. Deutsch-Chamorro, II. Chamorro-Deutsch nebst einer Chamorro-Grammatik und einigen Sprachübungen. Hongkong: Typis Societatis Missionum ad Exteros.
Matras, Yaron. 2007. The borrowability of structural categories. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 31–73. Berlin & New York: De Gruyter.
Matras, Yaron. 2009. Language contact. Cambridge: Cambridge University Press.
Matsuoka, Shinzuo. 1926. Chamorro-go no kenkyū [Study of the Chamorro language]. Shigaku 5(2). 187–264.
Pagel, Steve. 2010. Spanisch in Asien und Ozeanien. Frankfurt am Main: Lang.
Real Academia Española. 2009. Nueva gramática de la lengua española. Madrid: Asociación de Academias de la Lengua Española.
Reid, Lawrence A., Emilio Ridruejo & Thomas Stolz (eds.). 2011. Philippine and Chamorro linguistics before the advent of structuralism. Berlin: Akademie Verlag.
Rodríguez-Ponga, Rafael. 1995. El elemento español en la lengua Chamorro (Islas Marianas). Unpublished Doctoral Thesis. Madrid: Universidad Complutense, Facultad de Filología.
Rodríguez-Ponga, Rafael. 1999. ¿Qué se hablaba en las Islas Marianas a finales del siglo XIX? In Miguel Luque Talaván et al. (eds.), 1998: España y el Pacífico. Interpretación del pasado, realidad del presente, 521–527. Madrid: Asociación Española del Pacífico.
Rodríguez-Ponga, Rafael. 2009. Del español al Chamorro. Lenguas en contacto en el Pacífico. Madrid: Gondo.
Safford, William Edwin. 1909. The Chamorro language of Guam. A grammar of the idiom spoken by the inhabitants of the Marianne, or Ladrones, Islands. Washington/DC: Lowdermilk & Co.
Sakel, Jeanette. 2007. Types of loan: Matter and pattern. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 15–30. Berlin & New York: De Gruyter.
Stolz, Thomas. 1998. Die Hispanität des Chamorro als sprachwissenschaftliches Problem. Iberoamericana 70(2). 5–38.
Stolz, Thomas. 2007. The Kurze Geschichte der Marianen by Georg Fritz. A commented reedition. In Martina Schrader-Kniffki & Laura Morgenthaler García (eds.), La Romania en interacción: entre historia, contacto y política. Ensayos en homenaje a Klaus Zimmermann, 307–349. Frankfurt a.M.: Vervuert.
Stolz, Thomas. 2008. Romancization world-wide. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 1–42. Berlin & New York: De Gruyter.
Stolz, Thomas. 2015. Chamorro inflection. In Matthew Baerman (ed.), The Oxford handbook of inflection, 465–489. Oxford: Oxford University Press.
Stolz, Thomas. 2019. The naked truth about the Chamorro dual. Studies in Language 43(3). 533–584.
Stolz, Thomas, Dik Bakker & Rosa Salas Palomo (eds.). 2008. Hispanisation. The impact of Spanish on the lexicon and grammar of the indigenous languages of Austronesia and the Americas. Berlin & New York: De Gruyter.
Stolz, Thomas & Nataliya Levkovych. 2019. Absence of material exponence. Language Typology and Universals/STUF 72(3). 373–400.
Todd, Loreto. 1984. Modern Englishes. Pidgins & Creoles. Oxford: Blackwell.
Topping, Donald M. & Bernadita C. Dungca. 1973. Chamorro reference grammar. Honolulu: University of Hawaii Press.
Topping, Donald M., Pedro M. Ogo & Bernadita C. Dungca. 1975. Chamorro-English dictionary. Honolulu: University of Hawaii Press.
Von Preissig, Edward Ritter. 1918. Dictionary and grammar of the Chamorro Language of the island of Guam. Washington/DC: Government Printing Office.
Winkler, Pierre. 2016. Missionary pragmalinguistics. Father Diego Luis de Sanvitores’ grammar (1668) within the tradition of Philippine grammars. Utrecht: LOT.

Appendix

(i) Additional cases of surplus para in the Chamorro part of Ibáñez del Cármen (1887)

Columns: page | Chamorro – translation | Spanish – translation

p. 15
Chamorro: matae para unâyejit ni y minaaseña nii taijinecog, yan y grasiaña para umung̃a mayoga jayi ‘[…] he died to apply to us his forgiveness which has no end and his mercy in order for nobody to be punished’
Spanish: murió para aplicarnos su inifinita misericordia y gracia, y nadie se condenara ‘[…] he died to apply to us his infinite mercy and grace and nobody was condemned.’

p. 24
Chamorro: sa sen mamájlao, pinelóloña basta usapet duro y tataotaoña para utaca misericordia ‘[…] since she felt very ashamed she presumed that it sufficed to let her body suffer hard to receive mercy’
Spanish: porque tuvo gran rubor, pareciéndola que macerando duramente su cuerpo, alcanzaría misericordia ‘[…] because she felt great shame while it seemed to her that castigating her body heavily she would find mercy.’

p. 30
Chamorro: Fatinas taigüine para injájasoyô ‘Do it this way to keep thinking of me!’
Spanish: Haced esto en memoria de mí. ‘Do this in my memory.’

p. 33
Chamorro: y santa Iglesia na maisa gai siña para udeclara pat usentensia, cao vale pat tivale ‘[…] the Holy Church alone has the power to declare and judge whether it is valid or not […]’
Spanish: la santa Iglesia es la única que puede declarer ó resolver, si fué válido ó nulo ‘[…] the Holy Church is the only one that can declare or solve whether it was valid or not.’

p. 35
Chamorro: yan yiniusan sija na dinesea para uepog y minalag̃ôta guato gui mauleg ‘[…] and the saintly desires to entice our wishes for the good’
Spanish: y unos santos deseos, con que escita nuestras voluntades para el bien. ‘[…] and some holy desires by which he incites our will for the good.’

p. 40
Chamorro: Y Sacramenton umacamo setbe para unafan-etnon ya unae grasia y man-asagua para ujafan asung̃ong̃ ya ujapogsae y famaguonñija para y lang̃et. ‘The Sacrament of Matrimony serves to unite and give grace to the married couple to elevate each other and raise their children for the Heavens.’
Spanish: El Sacramento del matrimonio sirve para unir y dar gracia á los casados, con la cual se sobrelleve el uno al otro y crien hijos para el cielo. ‘The Sacrament of Matrimony serves to unite and give grace to the married couple by which the one elevates the other and they raise children for the Heavens.’

p. 64
Chamorro: lao yaguin guaja dángculo na rason para umung̃a jumosme misa, pat para ufachocho, tiísao ‘[…] but when there is an important reason not to attend Mass or to work, this is no sin’
Spanish: pero, cuando hay ó media alguna razón ó motive poderoso, no es pecado dejar la misa ó trabajar. ‘[…] but when there is half a reason or a powerful motive, then it is not a sin to skip Mass or work.’

p. 85
Chamorro: pot énao mina guaja sen tunas na razon para umafanaan as respetayon pale Diego Luis de Sanvitores y apostol gui ya Marianas. ‘[…] for this reason, there is very good reason to call the respectable Father Diego Luis Sanvitores the Apostle in the Marianas.’
Spanish: por lo que, con justa razon, se llama al respectable padre Diego Luis de Sanvitores el apostol de Marianas. ‘[…] therefore, with good reason, the respectable Father Diego Luis de Sanvitores is called the Apostle of the Marianas.’

(ii) Additional cases of surplus para in the Spanish part of Ibáñez del Cármen (1887)

p. 31
Chamorro: Lii güine y numâsiña manconsagra pat maselébra misa. ‘See there the one with the power to consecrate or celebrate Holy Mass.’
Spanish: Hed aqui el poder para consagrar ó celebrar misa. ‘Have here the power to consecrate or celebrate Holy Mass.’

p. 31
Chamorro: Lii güine y numâsiña ma-asii y ísao taotao todo. ‘See there the one with the power that all sins of man be forgiven.’
Spanish: Hed aqui el poder para perdonar los pecados. ‘Have here the power to pardon the sins.’

p. 37
Chamorro: Maexaminan ísao y uprocura y taotao cuméjaso sija y isaoña ‘Examination of sins means that the person perseveres in trying to think about his/her sins.’
Spanish: examen de conciencia es hacer las diligencias para acordarse uno de los pecados ‘[…] examination of conscience is trying hard to remember one’s sins.’

p. 69
Chamorro: yan y mamayon manmatdise ya tijaprocúcura minaquejanao ‘[…] and those who have the habit of cursing and do not try hard to quit it […].’
Spanish: asi como los que tienen costumbre de maldecir y no hacen diligencias ó no ponen los medios para dejarla ‘[…] like those who have the habit of cursing and do not try hard nor provide the means to quit it.’

Nicole Hober

On the borrowing of the English adversative connector but

Abstract: This paper contributes to the ongoing study of function word borrowing in situations of language contact. Specifically, the borrowing of the English adversative connector but into the languages of the world is investigated. To categorize the ways in which English but is borrowed, the matter and pattern framework is adopted. I demonstrate that, in contrast to the wealth of attestation elsewhere, only a few languages have borrowed and integrated the English adversative connector. Consequently, I discuss a number of possible explanations for the limited evidence.

Keywords: adversativity; Anglicization; borrowing; connectors; language contact

1 Introduction

In this contribution, I explore the borrowing of the English adversative connector but into several languages of different genetic, typological, and geographic backgrounds. The motivation for this study stems from the well-established and empirically backed fact that adversative connectors are among the most frequently and widely borrowed function words. Evidence from contact scenarios with e.g. Spanish, Italian, French, Arabic, or Russian as source languages abounds. At the same time, the impact of English on the past and present linguistic world has been emphasized time and again, and its domination appears unparalleled by any other international language. We would thus expect to observe a continuing and increasing Anglicization of many languages of the modern world – an Anglicization that is in line with what we presume to be universally true of contact-induced language change. Therefore, according to the predictions of current language contact theory (cf. Thomason 2001; Matras 2009), there should be numerous instances of the borrowed adversative connector but. This hypothesis is not borne out by the empirical facts as is shown in this paper.

|| Nicole Hober, University of Bremen, FB 10: English-Speaking Cultures, Universitäts-Boulevard 13, 28359 Bremen, Germany. https://doi.org/10.1515/9783110785517-005


Since the 1980s, researchers of the English language have recognized the plurality of the different varieties of English and turned their attention to studying their similarities and differences (Kachru 1985). While the influence that the contact languages exert on English has been analyzed in some depth (cf. Onysko 2016; Lim 2020), the reverse, i.e. the impact of English on the contact languages’ structures, is less well described. The little that is to be found in the literature leaves the impression that English only affects the content word side in the lexicon of other languages. But are we to believe that the structures of the languages that come into contact with English remain largely unchanged despite the implantation of the English language around the world and the emergence of new varieties of English? Whatever the reason behind the lack of evidence of grammatical borrowing from English, be it academic neglect or a systematic absence due to intra- or extralinguistic factors, the diverse nature of contact scenarios involving varieties of English serves as an ideal testing ground for our current theories of language contact and fertile ground for further exploration. With this study, I will not only make first strides towards such detailed investigations on English as a source language but also towards situating English and its contact behavior within the wider context of language contact research.

To categorize how BUT¹ is borrowed, I adopt Matras and Sakel’s (2007a) framework, which distinguishes the borrowing² of matter (MAT), “transfer of morphological material and its phonological shape”, pattern (PAT), “transfer of organization, distribution and mapping of grammatical or semantic meaning”, or a combination thereof (MAT&PAT) (Sakel 2007: 15). The borrowing of BUT falls under one of these three categories. MAT applies if a formally and functionally equivalent conjunction was already present in the recipient language.
Borrowed BUT either co-exists alongside the recipient language’s form or replaces it. MAT&PAT applies if no formally and functionally equivalent conjunction existed in the recipient language, and other strategies were employed to encode an adversative connection between clauses.³ The Uto-Aztecan language Pipil is a case in point. Prior to coming into contact with Spanish and borrowing its set of conjunctions wholesale, Pipil speakers expressed coordination by means of zero-juxtaposition or, for coordinating nominals, the relational noun i-wan ‘POSS.3SG-with’. All subordinate clauses were marked by the particle ne (Campbell 1989: 97). Besides the morphological material and phonological shape of the Spanish conjunctions, the strategy of marking coordinate and subordinate clauses by overt conjunctions was borrowed simultaneously (MAT&PAT). PAT alone applies if no phonological substance was taken from the source language but rather an abstract usage pattern was replicated with the recipient language’s own material. For instance, in addition to the borrowing of conjunctions, contact with Spanish also initiated the grammaticalization of Pipil relational nouns into conjunctions (wan ‘and’ < i-wan ‘POSS.3SG-with’). This development is an instantiation of the cross-linguistically identified grammaticalization path WITH > AND (Kuteva et al. 2020: 108–111).⁴ What is more, an increase in overall usage frequency of BUT could also belong to the PAT category (cf. Gardani 2020: 271). Blake (2001: 1017), citing Diller (1993: 403–408), states, for example, that Thai tæ: ‘but’, along with læ ‘and’, has been employed more frequently in a westernized register since the spread of English to Southeast Asia. Furthermore, changes to the adversative constructions themselves could also be envisaged, such as a reconfiguration of the linear sequence of constructional elements.⁵ Due to the limited scope of this study and scarcity of data, I will primarily be concerned with MAT- and MAT&PAT-borrowings of BUT. The data for this explorative study is taken from grammars, descriptive secondary sources, corpora, and online databases.

|| 1 I use BUT in small capitals as a shorthand to refer to the functional class of adversative connectors equivalent to the English item, and but in small letters and italics to refer to the English-specific lexeme.
2 The term ‘borrowing’ is established in the literature committed to the MAT and PAT framework (Matras and Sakel 2007a, b; Sakel 2007; Gardani 2020). I too will use it as an umbrella term to refer to the adoption of some lexical or grammatical material or structure from one language to another. Still, it would generally be preferable to opt for more precise and less ambiguous notions. ‘Borrowing’ carries a lot of terminological confusion, i.e. it is employed by different researchers to mean different things and leaves open the social and cognitive dominance relations between languages in contact, which may or may not play a crucial role in explanations of language contact. Considerations of terminological preciseness need to be postponed to another time and paper.
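The three-way distinction described above can be restated as a small decision rule. The following is only a minimal sketch under stated assumptions: the function name, the parameter names, and the `NONE` fallback are hypothetical and are not part of the Matras and Sakel framework itself; the rule simply mirrors the wording of the definitions given in the text.

```python
from enum import Enum


class BorrowingType(Enum):
    MAT = "MAT"
    MAT_PAT = "MAT&PAT"
    PAT = "PAT"
    NONE = "no borrowing"  # hypothetical fallback, not a category of the framework


def classify_borrowing(form_transferred: bool,
                       equivalent_existed: bool,
                       pattern_replicated: bool) -> BorrowingType:
    """Illustrative restatement of the MAT / MAT&PAT / PAT criteria.

    form_transferred:   phonological substance was taken from the source language
    equivalent_existed: the recipient language already had an equivalent conjunction
    pattern_replicated: a usage pattern was replicated with native material
    """
    if form_transferred:
        # Matter was borrowed: MAT if an equivalent conjunction already existed,
        # MAT&PAT if the strategy of overt marking came along with the form.
        return BorrowingType.MAT if equivalent_existed else BorrowingType.MAT_PAT
    if pattern_replicated:
        # No source-language substance, only the abstract pattern.
        return BorrowingType.PAT
    return BorrowingType.NONE


# Pipil, as described in the text: Spanish conjunctions borrowed where
# coordination had previously been expressed by juxtaposition.
print(classify_borrowing(True, False, True).value)   # MAT&PAT
# Pipil wan 'and' < i-wan: native material, replicated pattern.
print(classify_borrowing(False, False, True).value)  # PAT
```

The sketch deliberately ignores gradient cases such as the partial application of PAT mentioned in the footnotes; it only covers the clear-cut definitions.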
There is an abundance of models and theories of the processes and outcomes of language contact in the literature. In this paper, I confine myself to considering Matras’ (1998, 2009, 2012) activity-oriented approach to contact-induced language change. Here, purely formal-structural or social explanations are largely abandoned and cognitive-pragmatic factors take centre stage. The model and its implications for the borrowing of adversative connectors are discussed in Section 2.2. Beforehand, I make necessary introductory remarks on the functional class of discourse connectors, zooming in on the form and function of English but as the borrowed item in question (Section 2.1). This is followed by an exploration of the borrowing of BUT from a cross-linguistic perspective (Section 2.3). Subsequently, I present evidence of English but as a loan (Section 3). I then review the findings by evaluating potential explanations for the emerging picture (Section 4). Finally, I offer some concluding remarks on the findings and Anglicization on the whole.

|| 3 PAT is argued to apply only partially here if the recipient language already had intersentential connectors elsewhere in its coordination system.
4 BUT is derived from CAUSE, MORE, or TEMPORAL (Kuteva et al. 2020: 111, 84, 424–425).
5 For the possible linear sequences and properties of adversative constructions that are cross-linguistically attested, see Section 2.1.

2 Setting the stage

The label ‘discourse connector’ (DC) is used as a convenient umbrella term and understood as a functional category comprising word classes as diverse as conjunctions, adpositions, discourse markers, and adverbs. The word class membership of a given DC depends not only on the connector’s status in a given language, i.e. it is not only language-specific, but often also on individual authors’ terminological preferences. It is, therefore, necessary to stipulate a broad functional category including all devices that indicate the semantic and/or syntactic relationship between two or more propositions explicitly (cf. Müller 2018; Stolz in press).

2.1 Adversative connectors and English but

Adversative discourse connectors form a special subclass of DCs. For their definition, I avail myself of Stolz’s (in press) account based on a survey of adversative constructions across the world’s languages. The syndetic relationship between two propositions (connect 1 + connect 2) is marked explicitly by an adversative connector whereby “der Inhalt des einen Konnekts mit dem Inhalt des mit ihm verbundenen anderen Konnekts in wenigstens einer Bedeutungskomponente nicht übereinstimmt” [‘the proposition of one connect does not conform to the proposition of the other connect in at least one meaning component’] (Stolz in press). In other words, the connector signals that the interpretation of connect 1 contrasts with the interpretation of connect 2. The three parts together form a grammatical construction in discourse. Languages may have a number of connectors that belong to the functional realm of adversativity. In English, an array of adversative connectors can be identified: but, alternatively, although, contrariwise, contrary to expectations, conversely, despite (this/that), even so, however, in spite of (this/that), in comparison (with this/that), in contrast (to this/that), instead (of this/that), nevertheless, nonetheless, (this/that point) notwithstanding, on the other hand, on the contrary, rather (than this/that), regardless (of this/that), still, though, whereas, yet (Fraser 2009: 300).

Given this multitude of expressions, the scope of the research object needs to be sufficiently focused. Thus, only the most frequent⁶ and prototypical adversative connector, referred to as the primary member of the class by Fraser (2009: 301), is investigated here: the conjunction but. This procedure follows earlier studies on the borrowing of adversative connectors where translational equivalents of but, e.g. German aber, Spanish pero, French mais, and their grammaticalization are considered (on borrowing: cf. Stolz and Stolz 1996, 1997; Matras 1998; Grant 2012; Stolz in press; on grammaticalization: cf. Mauri 2008; Giacalone Ramat 2012; Giacalone Ramat and Mauri 2012). The structural templates for the BUT-construction types identified by Stolz (in press) based on a cross-linguistic comparison are shown in Figure 1. Connect 1 is represented by P, connect 2 by Q, and the adversative connector by ℛ. For language examples and their discussion see the respective publication.

ADVERSATIVE  {[P ℛ Q], [P Q ℛ], [[P ℛ] Q], [P [x y]ℛ Q], [P Q-ℛ], [P Q]}

Figure 1: Types of BUT-constructions.

Crucially, languages differ as to the linear sequence of P, Q, and ℛ as well as to the boundedness and complexity of the connector. Also, as mentioned for Pipil in the introduction, adversativity between two propositions need not be indicated by an overt marker; simple juxtaposition suffices. The English conjunction but prototypically surfaces in a [P ℛ Q] construction. An example is given in (1).⁷

(1) but in a [P ℛ Q] construction
[The woman likes the man]P butℛ [she does not want to marry him]Q.

As a free morpheme, but appears as an intersentential conjunction and establishes a relation between the P that precedes it and the Q that follows it. Adversativity is instilled into the latter. Of course, but has other incidental or non-defining properties and is decidedly polyfunctional, e.g. it can be used as a discourse marker with various meanings (cf. Fraser 2009: 301–305, 310–316). For the purpose of this study, the focus lies on the borrowing of its core meaning and main function as an adversative conjunction. Where feasible, I take evidence of other functions into account.

|| 6 In the Corpus of Contemporary American English (COCA; Davies 2008–), but boasts a frequency of over 4,500 occurrences per million words, whereas yet and however feature only approximately 340 and 330 occurrences per million words, respectively. A similar picture arises from searches in the British National Corpus (BNC; Davies 2004), where but has 4,339, yet 335, and however 590 hits per million.
7 The bracket notation used to group constituents into connects follows Stolz (in press). The notation is also used in examples of additive and disjunctive coordination.

2.2 Borrowability of categories: Where does the adversative connector fit in?

There has been a longstanding interest in the borrowability of categories and contingent universals of MAT-borrowing. Generalizations on the borrowability of categories put forward in the literature are either based on frequency, i.e. how often the respective categories have been borrowed cross-linguistically, or implicational, i.e. if category X is borrowed then category Y is borrowed as well. Two non-implicational hierarchies are given by way of example in Figures 2 and 3. While the proposals by Thomason (2001) and Matras (2007) present slight differences as to the borrowability of individual categories, it becomes evident that conjunctions, adverbial particles, and discourse markers, all subsumed under the notion of DCs in this approach, rank among the most frequently borrowed categories and require only a low intensity of contact.

casual contact: nouns, verbs, adverbs, adjectives >
slightly more intense contact: conjunctions, adverbial particles >
more intense contact: pronouns, numerals, derivational affixes >
intense contact: inflectional affixes

Figure 2: Borrowing hierarchy considering intensity of contact (cf. Thomason 2001: 70–71).

nouns, conjunctions > verbs > discourse markers > adjectives > interjections > adverbs > other particles, adpositions > numerals > pronouns > derivational affixes > inflectional affixes

Figure 3: Borrowing hierarchy based on frequency (Matras 2007: 61).

Numerous cases of the borrowing of conjunctions have already been reported in the literature (cf. Section 2.3). The proposed explanations as to why conjunctions are so frequently borrowed are based on structural/functional, social, or, more recently, cognitive-pragmatic factors. As they are usually free morphemes at clause boundaries, conjunctions are readily borrowed and integrated into the recipient language’s structure. Traditionally, gap-filling motivations were popular. However, empirical data debunked this claim, attesting to several cases of borrowed conjunctions in languages that already had equivalents. This led scholars, amongst others Stolz and Stolz (1996), to shift the focus to prestige-related accounts according to which conjunctions and other DCs are borrowed to confer authority and high status upon the speakers of the recipient languages. Wary of social explanations, having observed that elements were also taken from source languages of low social prestige, Matras (1998, 2009, 2012) suggests that the true explanation can be found on a cognitive and pragmatic level. In Matras’ cognitive-pragmatic approach, communicative interaction is understood as a goal-oriented task. Bilingual speakers have to balance the selection of appropriate linguistic structures and the (creative) exploitation of their linguistic repertoire (Matras 2012: 48). In this context, the source language in a situation of language contact is considered to be pragmatically dominant. Concerning the susceptibility of DCs, Matras (2009: 194) argues that their function in discourse – to monitor and direct the hearer’s participation in the interaction, and to process instances of a potential clash between hearer-sided expectation based on presupposition and the speaker’s message – […] makes connectors prone to selection errors, creates a free license for the insertion of foreign word-forms in the bilingual mode, and ultimately facilitates long-term borrowing.

By identifying selection errors in high-stakes communicative situations as a locus for the integration of DCs, Matras firmly establishes a link between spontaneous innovation in bilingual speech and long-term contact-induced change. He thus takes a step, deliberately or not, towards bridging the paradigm gap between language contact and bilingual language acquisition research by drawing strong parallels between borrowing and bilingual slips (and code-switching) – we will touch upon the merits and drawbacks of this claim throughout the discussion. In any case, the suggestion presupposes that in any situation of prolonged bilingual interaction, connectors are among the prime candidates to be borrowed irrespective of the typology or the social dominance relation of the languages in contact.

As for BUT’s borrowability within the category of conjunctions, cross-linguistic empirical evidence indicates that there appears to be a tendency for contrastive conjunctions to be borrowed more frequently across languages and earlier than the disjunctive kind, which in turn are more frequently borrowed than those marking addition. This results in a borrowability hierarchy of BUT > OR > AND. Matras (2009: 194) applies the proposal to this tendency and states that BUT is most often borrowed because adversative contexts are “‘vulnerable’ due to the clash of expectations” and thus “speakers must ‘work hardest’ in order to sustain their authority”. This prediction is made explicit in one of the three scales determining the likelihood of DC borrowing, see Figure 4. The semantic scale stipulates that DCs of contrast, restriction, or change are borrowed more often than DCs of addition, elaboration, or continuation, i.e. but is more susceptible to borrowing than or, and both are more susceptible to borrowing than and. The category-sensitive scale dictates that less lexical or deictic DCs are more likely to be borrowed than more lexical and deictic DCs, e.g. well is more susceptible to borrowing than then. Last, the pragmatic scale specifies that turn-related DCs are more readily borrowed than content-related DCs, e.g. so is more susceptible to borrowing than although. The central idea underlying the scales is what Matras (1998: 309) calls the “principle of detachability” according to which “[g]rammatical elements that organize the speech event are perceived as gesture-like, situation-bound devices and are therefore detachable from the content message of the utterance”. Effectively, DCs are good candidates for borrowing because of their communicative function and easy integration into another linguistic system.

semantic scale: contrast, restriction, change > addition, elaboration, continuation
category-sensitive scale: less lexical or deictic > more lexical or deictic
pragmatic, “operational” scale: more turn-related > more content-related

Figure 4: Likelihood of DC borrowing (Matras 1998: 309).

Overall, based on these previous insights and the cognitive-pragmatic explanation, one would expect to find ample examples of borrowed but.

2.3 Connectors as loans in cross-linguistic perspective

The borrowing of function words in general and DCs in particular continues to attract attention among scholars of language contact. While many cases of the borrowing of BUT, along with AND and OR, are reported, specifications as to whether they are instances of MAT- or MAT&PAT-borrowings are often missing. Only occasionally are comments made and evidence presented that clarify the precise nature of borrowing (cf. Campbell (1989: 97) and Du Feu (1996: 86), establishing pero as a MAT&PAT-borrowing in Pipil and Rapanui, respectively). However, the vast majority of cases appear to be MAT-borrowings. Both the co-existence of borrowed BUT alongside indigenous equivalents, e.g. in Punjabi (Bhatia 1993: 106) and Hindi (McGregor 1977: 182), and the successive replacement of the latter by the former, e.g. in Romani varieties (Matras 1998), are attested. PAT borrowing has received little attention so far, although it may be seen in contact-induced grammaticalization, as in Pipil, or an increased frequency of overt ℛ coordinative constructions in the recipient languages, as proposed for Swahili and Vai (Mithun 1988: 253–253). PAT borrowing of DCs, especially as far as an increasing usage frequency is concerned, requires more detailed analyses and will thus only be mentioned in passing in the present contribution. Before delving into the discussion of this study’s findings, it is vital to outline what we already know about BUT, along with AND and OR, as loans from a cross-linguistic perspective to motivate our expectations and possible predictions concerning the borrowing of English but. Specifically, I will center the discussion around Spanish as a source language which in many ways, i.e. typologically and sociohistorically, is similar to English.

2.3.1 AND & OR

Haspelmath and Tadmor’s World Loanword Database (WOLD; 2009b) and the respective published volume would appear to be a good starting point for the investigation of borrowed BUT from a cross-linguistic perspective. Without a doubt, the project presents an invaluable contribution to the study of borrowing. However, while the function words AND and OR were included in the database as loanwords of interest for the 41 recipient languages (Haspelmath and Tadmor 2009a: 32), borrowings of BUT were not collected – a missed opportunity. Proceeding from the assumption that the hierarchy of BUT > OR > AND holds, the languages for which the borrowing of additive and disjunctive connectors is confirmed are also very likely to attest to the borrowing of adversative connectors. Tables 1 and 2 give overviews of the languages that borrowed AND (borrowed score⁸ = 0.19) or OR (borrowed score = 0.39). The data is taken from the WOLD. The cases for which the borrowed status was indicated as ‘clearly borrowed’, ‘probably borrowed’, and ‘perhaps borrowed’ are listed. Information on the recipient language, the source language, the form of the loanword (both as found in the source language [input loan] and the recipient language [output loan]), and, where available, the effect (co-existence, insertion, replacement) is provided. To each overview, I added a column ‘borrowed BUT’ to test the above-mentioned prediction. Again, only available grammars and secondary descriptive resources were consulted. The picture thus remains fragmentary.

|| 8 In the WOLD, the borrowed score (0–1) measures the average borrowability of the words corresponding to a specific meaning, “the higher the average borrowed score of a meaning, the greater its borrowability” (WOLD; https://wold.clld.org/terms).

Table 1: Borrowed AND in the WOLD (https://wold.clld.org/meaning/17-51#2/24.3/-7.8).

Recipient | Output loan | Source | Input loan | Effect | Borrowed BUT
Tarifiyt Berber | u | Arabic | u | coexistence | yes: wælækin < Arabic lākinna (McClelland 1996: 29)
Kildin Saami | ja | Finnish | ja | n.a. | yes: ne, a < Russian no, a (Rießler 2007: 239)
Manange | rʌ | Nepali | ra | insertion | ?no (Hildebrandt 2009: 455)
Ket | a, i | Russian | a, i | coexistence | yes: no, a < Russian no, a (Werner 1997: 318)
Indonesian | serta | Sanskrit | sārtha | n.a. | yes: (te)tapi < Sanskrit tathāpi (Tadmor 2007: 318)
Saramaccan | ku | Suriname Portuguese | com | replacement | yes: ma < Dutch máar (Smith 2012: 95)
Kali’na | nanka | Sranan | nanga | coexistence | no (Rose and Renault-Lescure 2008: 370–371; Renault-Lescure 2009: 985)
Vietnamese | và | Old Chinese | *wA: | n.a. | yes: nhưng < Chinese réng (Alves 2007: 351)
Iraqw | nee | Bantu | na | replacement | no (Mous 1993; 2004)
Zinacantán Tzotzil | ‘i | Spanish | y | replacement | yes: pero < Spanish pero (Brody 1987: 519)


Table 2: Borrowed OR in the WOLD (https://wold.clld.org/meaning/17-54#2/24.3/-4.8).

Recipient | Output loan | Source | Input loan | Effect | Borrowed BUT
Swahili | au | Arabic | ʔau | n.a. | yes: ama, lakini < Arabic ˀammā, lākinna (Schadeberg 2009: 91)
Kanuri | â | Arabic | au or lâ | coexistence | yes: ʔàmmaa < Arabic ˀammā (Cyffer et al. 1991: 340)
Selice Romani | vad’ | Hungarian | vagy | replacement | yes: dë < Hungarian de (Elšík 2007: 273)
Kildin Saami | ele / vaj | Russian / Finnish | íli / vai | n.a. / n.a. | yes: ne, a < Russian no, a (Rießler 2007: 239)
Bezhta | ya / yagi / yałuni | Avar or Persian / Avar / Avar | ya / yagi / yałuni | n.a. | n.a.
Archi | ja | Persian | ya | n.a. | ?no (Ganenkov and Maisak 2021: 134–135)
Ket | ili | Russian | íli | insertion | yes: no, a < Russian no, a (Werner 1997: 318)
Vietnamese | hoặc | Chinese | huò | coexistence | yes: nhưng < Chinese réng (Alves 2007: 351)
Ceq Wong | ʔantaw | Malay | atau | n.a. | n.a.
Indonesian | atau | Malay | atau | n.a. | yes: (te)tapi < Sanskrit tathāpi (Tadmor 2007: 318)
Yaqui | o | Spanish | o | insertion | no (Estrada Fernández 2009: 837)
Otomí | o | Spanish | o | coexistence | yes: pero < Spanish pero (Hekking and Muysken 1995: 105)
Imbabura Quechua | o | Spanish | o | n.a. | yes: pero < Spanish pero (Gómez Rendón 2007: 506–507)
Hup | ʔó | Tukano or Portuguese | ʔo or ou | insertion | no (Epps 2008: 816–818)
Gurindji | waku | Jaru | waku | n.a. | n.a.
Takia | o | Tok Pisin or English | o or or | replacement | no (Ross 2002b: 242, 2009: 763)
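As a rough cross-check of the prediction that languages borrowing AND or OR also borrow BUT, the ‘Borrowed BUT’ columns of Tables 1 and 2 can be tallied mechanically. The following is a minimal sketch: the status values are transcribed from the two tables, whereas the dictionary names, the function `summarize`, and the three group labels are illustrative assumptions rather than part of the WOLD or of the original study.

```python
# "Borrowed BUT" status per recipient language, transcribed from Tables 1 and 2.
# "yes" = attested, "no" = reportedly absent, "?no" = uncertain, "n.a." = no data.
TABLE_1_AND = {
    "Tarifiyt Berber": "yes", "Kildin Saami": "yes", "Manange": "?no",
    "Ket": "yes", "Indonesian": "yes", "Saramaccan": "yes",
    "Kali’na": "no", "Vietnamese": "yes", "Iraqw": "no",
    "Zinacantán Tzotzil": "yes",
}
TABLE_2_OR = {
    "Swahili": "yes", "Kanuri": "yes", "Selice Romani": "yes",
    "Kildin Saami": "yes", "Bezhta": "n.a.", "Archi": "?no", "Ket": "yes",
    "Vietnamese": "yes", "Ceq Wong": "n.a.", "Indonesian": "yes",
    "Yaqui": "no", "Otomí": "yes", "Imbabura Quechua": "yes",
    "Hup": "no", "Gurindji": "n.a.", "Takia": "no",
}


def summarize(table):
    """Group languages by whether they support the BUT > OR > AND prediction."""
    groups = {"confirmed": [], "counterexample": [], "unclear": []}
    for language, status in table.items():
        if status == "yes":
            groups["confirmed"].append(language)
        elif status == "no":
            groups["counterexample"].append(language)
        else:  # "?no" and "n.a." are both treated as insufficient evidence
            groups["unclear"].append(language)
    return groups


for name, table in (("AND", TABLE_1_AND), ("OR", TABLE_2_OR)):
    result = summarize(table)
    print(name, {label: len(langs) for label, langs in result.items()})
    print("  counterexamples:", ", ".join(result["counterexample"]))
```

Treating ‘?no’ and ‘n.a.’ alike as unclear is a simplifying choice of this sketch; a stricter reading could count ‘?no’ as a tentative counterexample.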


Tables 1 and 2 show that the predictions made by the BUT > OR > AND hierarchy largely hold. Nevertheless, there is also a non-negligible number of exceptions. Despite the borrowing of OR, Yaqui, Hup, and Takia reportedly did not borrow BUT. Other such cases might include Archi, Bezhta (but cf. Stolz and Levkovych this volume), Ceq Wong, and Gurindji. Here, the descriptions do not offer sufficient evidence. Furthermore, Iraqw, Kali’na, and possibly Manange did not borrow BUT or OR despite borrowing AND, rendering the claim “no languages borrow ‘and’ without also borrowing ‘but’ and ‘or’” (Matras 2007: 54) invalid, or at least calling for further investigation. Of course, a lack of adequate, balanced, and more substantial data might skew the picture. Moreover, contact is often still ongoing, and the borrowing of the adversative connector might be registered in the near future. In any case, the alert reader will notice that borrowings from English are hardly anywhere to be found. English is among the 395 source languages in the WOLD. Although it is featured with 1,103 entries overall, i.e. there are 1,103 cases where a language borrowed an item from the English lexicon, English function words are generally not among the loans. A possible exception is Takia, which might have borrowed the English disjunctive connector or indirectly via Tok Pisin (borrowed status = ‘perhaps borrowed’). Unfortunately, Ross (2009) does not provide an illustration of the sentential usage, and I also did not come across any examples elsewhere. As Takia does not have any conjunctions for coordination, but resorts to parataxis, conjoining, and clause chaining (Ross 2002b: 241), the proposed borrowing of o(r) would constitute a case of MAT&PAT-borrowing. In (2)⁹, an example of the indigenous disjunctive construction is given. The clause boundary of P is marked by an enclitic =k which may also be omitted. The precise coordinative relation between the two sentences can only be inferred from context.

(2) Takia [Ross 2002b: 241]
[Mao mi-sapal da=k]P [fud mi-sapal da]Q
taro 1PL.EXCL-mix IPFV=BDRY banana 1PL.EXCL-mix IPFV
‘We mix taro or we mix bananas.’

|| 9 For the interlinear abbreviations, the Leipzig Glossing Rules were applied throughout. Whenever no abbreviation suggestions were provided, the most commonly found option was chosen. All abbreviations are listed below. The interlinear glossing from the original sources was largely kept and only adjusted to conform to the Leipzig Glossing Rules. A small number of modifications were thus made. For cases where the glossing is entirely mine, an indication can be found in a footnote. Whenever the interlinear glossing and the idiomatic translation were given in a language other than English in the original source, the English equivalent was provided instead.


In Takia’s Oceanic sister languages Kairiru and Sudest, the borrowing of o from English or via Tok Pisin is also reported (Ross 2002a: 215; Anderson and Ross 2002: 330, 345). While I can give no text example for Kairiru, the usage of o in Sudest is shown in (3). In contrast to Takia, Sudest already had other intersententially occurring coordinating conjunctions, na ‘and’ (also borrowed from Tok Pisin na) and ko ‘but’. Thus, o in Sudest is a MAT-borrowing.

(3) Sudest [Anderson and Ross 2002: 345]
[Methï=wo-giya e ghïn]P oℛ [methï=vareghare?]Q
IMM.PAST.3PL=carry-give PREP 2SG or IMM.PAST.3PL=keep
‘Did they give it to you or did they keep it?’

The borrowing of BUT is not found in Takia (Ø) or Sudest (ko ‘but’). However, in Kairiru, the adversative connector tab is reported (Ross 2002a: 215). The function word might be derived from Tok Pisin tasol ‘but’ which, in turn, comes from English that’s all. There are potentially a few additional examples of English function word borrowing in the literature. According to Grant (2012: 333, 348), Garifuna, as spoken in Belize, has borrowed English an < and. Grant does not give any examples of the usage, and it is also unclear where the analysis is taken from. Cayetano (1993) and Taylor (1958, 1977) are cited as sources. Cayetano’s dictionary on Belize Garifuna gives the conjunction ani ‘and’ and the preposition luma ‘with (him), and’ as markers of AND coordination (Cayetano 1993: 65, 96); it is not specified that ani would be borrowed from English. In Suazo (2002: 221), an ti is mentioned as encoding ‘and, but’. Taylor (1958: 39) discussing Garifuna’s close sister language Island Carib writes that AND takes the form ábą. Again, no specification as to an apparent English origin is given. Haurholm-Larsen (2016: 251–253) states that additive coordination between two independent declarative clauses is marked by an unstressed clause-initial proclitic aban= ‘and’ in Honduras Garifuna. Although there might be two other function words that were borrowed from English, i.e. do < though and den < then (Haurholm-Larsen 2016: 189), aban= is not named among the English loans. Thus, the analysis of AND as borrowed from English into Garifuna is uncertain. The picture on OR is similarly inconclusive. While Haurholm-Larsen (2016: 81) asserts that the only way to express disjunction is via Spanish o, the indigenous form o(di) ‘or’ is listed as a disjunctive conjunction in Cayetano (1993: 76), Suazo (2002: 208), and Grant (2012: 333). 
Rather than presenting a case of borrowing, Garifuna o might merely be the phonologically reduced variant of odi – a reduction possibly accelerated by Spanish. By contrast, BUT is clearly taken from Spanish. In Garifuna, Spanish pero co-exists with the

196 | Nicole Hober

morphologically complex ahe-yn. Both the loanword and indigenous equivalent are used in free variation to mark adversative coordination. In Upper Kuskokwim Athabaskan, English o < or is sometimes used to express disjunction (4a) (Kibrik 2004: 548–551). Although there is no indigenous way to encode disjunctive coordination, additive coordination is marked by the particles ts’e‘ (4b) and ‘ił. The particles ‘edinh (4c), chu‘(da), or deno encode adversativity. As the [P ℛ Q] pattern is already firmly established within the realm of coordination, we are dealing with a case of MAT-borrowing only here; but see the discussion of Isthmus Zapotec (5) and Supyire (8) below on why I consider such cases to be bordering on MAT&PAT-borrowing.

(4) Upper Kuskokwim Athabaskan [Kibrik 2004: 547, 543, 549]
    a. [help k’a ‘iszriłc]P oℛ [sileka dilghwsr]Q
       help want I.am.yelling or my.dogs they.howl
       ‘I was calling for help or my dogs were howling.’
    b. [“hondenh ghwla‘ sidadza‘” yinezinh]P ts’e‘ℛ [hwts’its’ay’nełghwt]Q
       where unknown my.sister he.thought and he.took.off.pulling.a.sled
       ‘He wondered where his sister was and took off with a sled.’
    c. [hiyoko tsiłdilghwsr]P ‘edinhℛ [mikwl]Q
       for.her they.are.sobbing but she.is.gone
       ‘They are bemoaning her but she is gone.’

All in all, the resistance of English function words to borrowing presents an interesting conundrum. I repeat my question from the introductory paragraph: Why would the language that exerts such a strong influence on the lexicon of its contact languages elsewhere give so little evidence of lending its connectors, especially if connectors are among the most frequently borrowed elements (cf. Figures 2 and 3)? By comparison, Spanish, featured as a source language with 2,014 entries in the WOLD, lends its additive and/or disjunctive connector to Otomí, Imbabura Quechua, Hup, and Zinacantán Tzotzil. Except for Hup, the borrowing of pero is also attested in these languages. We will see shortly that the borrowing of Spanish pero is indeed ubiquitous (cf. Stolz et al. 2021). The following quote from Epps (2009) on contact between Hup and Tukano provides a window into a possible sociolinguistic reason for the absence of borrowing in some cases. Although Tukano brought about contact-induced changes in Hup in several ways, these changes generally belong to the realm of PAT rather than MAT-borrowing. Epps (2009: 1010) argues that

On the borrowing of the English adversative connector but | 197

[t]his is undoubtedly due to the relative salience of form to speakers, whereas they tend to be much less conscious of categories and patterns. A very similar state of affairs is described for Tariana, whose speakers have experienced long-term bilingualism in Tukano and cultural constraints against language mixing much like those experienced by Hup speakers.

Such purist attitudes may be part of the explanation for a few scenarios involving English as a source language. But how can we reconcile them with the countless borrowings of nouns, verbs, etc.? Even Hup, despite its reluctance to borrow forms, has taken 49 words from Tukano. Perhaps they are all motivated by lexical gaps. In any event, the sociolinguistic dimension of language contact must not be discarded entirely. I will return to this issue in Section 4.

2.3.2 BUT

In Matras (2009), Grant (2012), Stolz (2007, in press), Stolz et al. (2021), and Stolz and Levkovych (this volume), a large number of BUT borrowings are collected. Although the collections present snapshots and not exhaustive lists of the ever-growing database on function word borrowing, the findings illustrate several aspects of the borrowing behavior of BUT. First, its borrowing is indeed ubiquitous and attested repeatedly across continents and phyla. Evidence of some 25 source languages and, depending on whether a token language is classified as a variety or separate language, 150–165 recipient languages is given. The total of recorded BUT borrowings is slightly higher than the number of recipients due to repeated borrowing from different languages. Source and recipient languages come into contact in Africa, the Americas, Eurasia, and Oceania. Second, and more importantly, the constellation of contact languages and their typological profile does not appear to play a critical role in the diffusion of adversative connectors. Source languages belong to various phyla, i.e. Slavic, Chukotko-Kamchatkan, Germanic, Hellenic, Indo-Aryan, Kwa, Mande, Romance, Semitic, Sinitic, Turkic, and Uralic. The genealogical affiliation of the recipient languages is even more diverse. For instance, Spanish pero alone has found its way into Araucanian, Arawakan, Austronesian, Mayan, Mixe-Zoquean, Oto-Manguean, Quechuan, Tarascan, Tequistlatecan, Totonacan, Tupian, and Uto-Aztecan, as well as the isolates Huave and Mosetén (cf. amongst others, Brody 1987; Stolz and Stolz 1997; Stolz et al. 2021). Isthmus Zapotec constitutes an example of a recipient language that borrowed Spanish pero. The Oto-Manguean language spoken in Mexico features a phonologically adapted form of the connector, i.e. peru, surfacing in the general pattern of marking adversative constructions overtly by a conjunction, compare (5a) and (5b). The employment of peru is not obligatory; there is variation between the encoding of adversativity via juxtaposition [P Q] and the BUT-construction [P ℛ Q]. One can argue that we are dealing with a case of MAT&PAT-borrowing here if we think of adversative coordination as independent from the two other types of coordination. In this view, PAT applies partially because the pattern of overt coordination for adversative constructions was borrowed along with pero, even though the overall pattern of overt coordination was not borrowed.¹⁰ Indeed, coordination by means of overt marking was already attested in Isthmus Zapotec with the additive ne ‘and’ (5c) as well as the disjunctives pacaa and o¹¹ ‘or’. For additive coordination, zero-juxtaposition is also an option (5d). It would perhaps be misleading to analyze borrowed pero as a genuine MAT&PAT as the pattern itself was already present in Isthmus Zapotec. However, if it were analyzed as a MAT-borrowing only, we would gloss over the difference between cases like Isthmus Zapotec, where zero-juxtaposition was the default in adversative coordination, and cases where the [P ℛ Q] construction was already employed with an indigenous adversative connector, as applies to Cebuano (6) discussed below. Concerning the rise of [P peru Q] in Isthmus Zapotec, I assume an interplay of internal and external factors.

(5) Isthmus Zapotec [Pickett and Embrey 1974: 121; Pickett et al. 2001: 102, 103]
    a. [Či uǰi’ba Juan giiña’]P Ø [naa ko‘]Q
       go.INTEN sow Juan chilli Ø 1SG NEG
       ‘Juan is going to sow chilli but I am not.’
    b. [Ké zuǰiiba Juan]P peruℛ [naa zuǰiiba’]Q
       NEG sow.FUT Juan but 1SG sow.1SG.FUT
       ‘Juan is not going to sow but I will.’
    c. [Ti dxi guyé Lexu ra nuu Diux]P neℛ [rábime laa]Q …
       ART.PL day COMPL.go rabbit LOC ART.PL God and HAB.say.3SG 3SG.M
       ‘One day the Rabbit went where God was and said to him: …’
    d. [Bireebe]P Ø [zebe]Q
       COMPL.leave.3SG Ø COMPL.go.3SG
       ‘He left and went away.’

|| 10 A similar argument is made for Supyire, see (8).
|| 11 The connector o ‘or’ as one of the two options expressing disjunctive coordination in Isthmus Zapotec also constitutes a borrowing from Spanish.


While the Bisayan language Cebuano spoken in the Philippines has borrowed pero as MAT too, an indigenous adversative connector existed prior to contact. Here, the Spanish conjunction is slowly but surely replacing the indigenous equivalent apan ‘but’. Tanangkingsing (2009: 93) argues that apan is now considered formal and largely restricted to written registers, while pero is the preferred choice in conversation. Examples (6a) and (6b) substantiate the claim. The former is taken from an article published in the Philippine newspaper Sun Star. The latter is an excerpt from a recorded conversation.

(6) Cebuano [Tanangkingsing 2009: 429, 34]
    a. [gi-pangayoʔ-an ang kompanya]P apanℛ [walaʔ kini mi-hatag]Q
       PFV-ask.for-LV NOM company but NEG this AV-give
       ‘(The caller) asked for (bribe) from the company but this (company) didn’t give.’
    b. [ganahan=gyud=ko ana-ng restaurant-a oy]P peroℛ [mahal mga pag-kaʔon diraʔ]Q
       like=EMPH=1SG.NOM that-LK restaurant-DEF INTERJ but expensive PL NMLZ-eat there
       ‘I really like that restaurant oy, but (the) food there is expensive.’

In such cases of adversative connector doubling, it appears that there are different possible developmental trajectories in terms of the diffusion of BUT. Either the loan relegates the indigenous word to informal, spoken registers (‘change from above’; often found in ‘translatese’), or the loan itself becomes the preferred choice in conversation and the indigenous word becomes increasingly restricted to formal, written registers (‘change from below’). Elsewhere, a functionally determined division of labor between indigenous and borrowed grammatical words may be observed, as demonstrated for Supyire below (8).¹² In the Oceanic language Rapanui spoken on Easter Island, pero constitutes a genuine case of MAT&PAT-borrowing, contrary to the borderline case of Isthmus Zapotec. All three coordination types were exclusively expressed by zero-juxtaposition before contact with Spanish (Du Feu 1996: 84–88). Pero was borrowed to mark adversative, e < y to mark additive, and o to mark disjunctive coordination. The indigenous adversative [P Q] structure is illustrated in (7a), while [P pero Q] is shown in (7b). Speakers may choose between the two options to express adversativity. Other adversative particles conveying more nuanced and specific shades of adversativity are also found, e.g. ‘ina ta’au ‘all the same’ or mau ena ‘to tell the truth’ (Du Feu 1996: 86). Whether these were also options before the introduction of pero is not specified by Du Feu. Notice further that e, borrowed as an additive marker from Spanish y ‘and’, can also be used to mean ‘well, then, but’, as exemplified by (7c). Similar discourse functions of y are also attested in Spanish (Real Academia Española 2010: 608–613, 632, 900, 903, 922).¹³ The polysemy of e in Rapanui is therefore either based on the Spanish model, or we are dealing with a functional extension of the loanword after its integration. In any case, the polysemy or ambiguity of coordinating connectors is found in a number of genealogically diverse languages, e.g. Daakaka (Austronesian) a ‘and, but’ (von Prince 2015: 286) or Russian (Slavic) a ‘and, but’ (Stolz in press; Stolz and Levkovych this volume).

(7) Rapanui [Du Feu 1996: 86–87]
    a. [He to'o mai te take he hor te take era ai he
       ACTN take ALL DEF root ACTN cut DEF root POST.DET DEM ACTN
       hoa]P Ø [he to'o mai te uru he 'oka haka'ou]Q
       throw Ø ACTN take ALL DET shoot ACTN plant again
       ‘You take the root, you cut the root and throw it away but you take the shoot and plant it again.’
    b. [E ai ro a te puka inei]P peroℛ [he haka reo~reo mai]Q
       STAT EXI REAL RES DEF book here but ACTN CAUS lie~RED ALL
       ‘The book is here but they are telling us lies about it.’
    c. [Mo rahi o te taŋata mo kai ka oti te kai]P
       BEN much POSS DEF man BEN eat MOM finish DEF food
       eℛ [mo ta’e rahi ‘ina ko oti]Q
       well/then/but BEN NEG much NEG NEG finish
       ‘If there are a lot of people for the meal it is all eaten up, but if not, it is not finished.’

|| 12 This claim can be extended to and tested against other instances of grammatical borrowing.
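The contrast drawn between the three pero cases can be restated as a small decision rule. The following Python sketch is only an illustration of that reasoning, not a method taken from the sources: the class `Recipient`, its two fields, and the function `classify_pero_loan` are hypothetical names, and the label "bordering on MAT&PAT" paraphrases the borderline status assigned to Isthmus Zapotec. Where the text does not say whether overt additive or disjunctive marking existed before contact, the value is left as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Recipient:
    """Hypothetical record of a recipient language before contact."""
    name: str
    # Was there an indigenous overt adversative connector (a BUT word)?
    overt_adversative: bool
    # Was AND or OR coordination overtly marked? None = not stated in the text.
    overt_other: Optional[bool] = None


def classify_pero_loan(r: Recipient) -> str:
    """Illustrative decision rule for the type of a borrowed BUT connector."""
    if r.overt_adversative:
        # The loan only adds a form to an existing adversative pattern.
        return "MAT"
    if r.overt_other is None:
        return "undetermined"
    if r.overt_other:
        # Overt marking existed, but not for adversative coordination.
        return "MAT, bordering on MAT&PAT"
    # Juxtaposition only: form and pattern arrive together.
    return "MAT&PAT"


CASES = [
    Recipient("Cebuano", overt_adversative=True),                 # apan 'but'
    Recipient("Isthmus Zapotec", overt_adversative=False, overt_other=True),
    Recipient("Rapanui", overt_adversative=False, overt_other=False),
]

for case in CASES:
    print(f"{case.name}: {classify_pero_loan(case)}")
```

The rule deliberately ignores finer distinctions such as the clause-internal position of Supyire sí, which is why Supyire is not included among the cases.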

The discussed examples of Isthmus Zapotec, Cebuano, and Rapanui demonstrate that Spanish pero can be borrowed as MAT or MAT&PAT. I did not come across cases of exclusive PAT borrowing. Still, as the grammaticalization of wan ‘and’ from a relational noun i-wan ‘POSS.3SG-with’ in Pipil based on the Spanish model shows (cf. Campbell 1987, 1989: 97), PAT borrowing of connectors, in the form of contact-induced grammaticalization, is theoretically possible. What is more, the structure of the recipient languages and their coding strategies for coordination appear irrelevant to the possibility of borrowing a function word from the same source language, although it seems that borrowing is more frequently found for languages that already had an equivalent function word, i.e. cases of MAT-borrowing predominate. Region and genealogical affiliation appear to be immaterial parameters. The individual contact scenarios point to the diversity in function word borrowing, which needs to be analyzed in the context of the wider functional domain of coordination.

|| 13 I am grateful to Thomas Stolz for bringing the polysemy of Spanish y to my attention.

Moving on to the third general observation regarding the borrowing of BUT, a recipient language may take BUT from more than one source language and thus show repeated borrowing of the same function word. The Volta-Congo language Supyire borrowed both Bambara ŋ̀kàà < ǹka and French mɛ̀ɛ̀ < mais (cf. Stolz in press); the borrowing of the former preceded the borrowing of the latter. Supyire already had an indigenous adversative connector sí (cf. 8a), which is special in that it follows the subject in Q.¹⁴ The loans mɛ̀ɛ̀ and ŋ̀kàà, by contrast, surface as intersentential connectors and thus precede the subject in Q. Again, the intersentential position of connectors itself is not alien to Supyire and is found in additive and disjunctive coordination. As for Isthmus Zapotec above (cf. 5), I suggest that the older loan ŋ̀kàà was borrowed as a MAT&PAT, where PAT partially applies because the pattern of overt adversative coordination was borrowed along with the loan. French mɛ̀ɛ̀ constitutes a case of MAT only. Although all types of coordination are typically expressed by juxtaposition in Supyire, the narrative conjunctions kà and mà ‘and then’ as well as the disjunctive conjunction làa ‘or’ can alternatively be employed and are used intersententially (Carlson 1994: 596–589).
The pattern of the Bambara (and French) loan has played a role, perhaps together with the analogical extension of the intersentential usage pattern of kà, mà, or làa, in the spread of the [P ℛ Q]-type in Supyire adversatives. Notice further that the Bambara-derived ŋ̀kàà, as the older and more integrated loan, can be used together with sí (cf. 8b), on its own (8c), or together with the French loan (cf. 8d). French-derived mɛ̀ɛ̀, as a more recent loan, does not (yet) occur alone.

|| 14 Supyire sí constitutes an example of Wackernagel’s Law according to which certain clitics occur as the second constituent within their clause (cf. Goldstein 2014). Based on this example, one could add yet another adversative construction type to the list shown in Figure 1 whereby the connector stands Q-internally (or theoretically also P-internally), yielding a structure akin to [P Qℛ] (or [Pℛ Q]). At present, the only two other attestations of this type in clausal coordination that I came across are δε ‘but’ in ancient Greek (Smyth 1956: 484) and =hno ‘and’, =sgini ‘but’, and =le yigi ‘or’ in Cherokee (Pulte and Feeling 1975: 343; cf. (27)), which arguably does not provide sufficient justification to posit the additional type. For the time being, the Supyire structure falls under the category of Q-relator, disregarding its exact position.


(8) Supyire [Carlson 1994: 593–594]
    a. [U yyáhe ɲyɛ á tɔɔn mέ]P [ku síℛ ɲyɛ a pèla a tòrò mέc]Q
       3SG face.DEF NEG PERF long NEG it but NEG PERF be.fat SVC pass NEG
       ‘Her face isn’t long, but/on the other hand it isn’t very fat (either).’
    b. [Νá u à pa nɔ wùù cévóó wí]P ŋ̀kààℛ
       if 3SG PERF come man 1PL.POSS friend COP but
       [u sí ká ḿ-pá ceewe wùù cwó wí]Q
       3SG but COND INTR-come woman PL.POSS wife COP
       ‘If it is a boy, he will be our friend, but if it is a girl, she will be our wife.’
    c. [U mpyi náhá]P ŋ̀kààℛ [u a kàrè mέŋ̀i i]Q
       3SG was here but 3SG PERF go there.DEF LOC
       ‘He was here, but he went over there.’
    d. [Yi ɲyɛ Sìrigè nùmbwuuní i]P ŋ̀kààℛ mɛ̀ɛ̀ℛ [Siŋkare ɲyɛ Sìrigè lakyááre]Q
       3PL COP Sirige head.gourd.DEF LOC but but Singkare COP Sirige antidote.DEF
       ‘They are in Sirige's skull, but Sinkare is the antidote for Sirige (to get them out).’

Also note that according to Carlson (1994: 594–595), there is a functionally determined division of labor between ŋ̀kàà and sí. The former always indicates a strong contrast between P and Q, whereas the latter expresses weak adversativity. Fourth, international languages such as Arabic, Russian, Spanish, and French leave their lexical and structural marks on many languages around the globe (cf. Stolz et al. 2021; Stolz in press; Stolz and Levkovych this volume). Blake (2001) discusses current global trends in the developmental directions of the world’s languages’ grammars which are said to directly result from contact with international languages. At present, Spanish and English especially have a firm grip on the linguistic landscape. Consequently, Blake (2001: 1025) argues that European-typical features will emerge globally: “numerous languages are endangered and it is likely that those that remain will assimilate in various degrees to the type represented by languages such as English and Spanish”. Among these European-typical features, Blake lists the overt marking of coordination. As we have seen, there is ample evidence of BUT borrowed in contact, along with OR and AND. But, Blake’s claim not only suggests that the overall coding strategy of zero-juxtaposition is on the decline but also that other possible linear sequences, i.e. [P Q ℛ], [[P ℛ] Q], [P [x y]ℛ Q], or [P Q-ℛ], succumb to


the pressure of the European-typical [P ℛ Q] pattern. Although the current data show that zero-juxtaposition is indeed giving way to overtly marked coordination constructions (cf. Isthmus Zapotec in (5) and Rapanui in (7)), in most of these cases, one of the other coordination types, i.e. additive or disjunctive coordination, already had the overtly marked variant. The overall pattern within the coordination domain was thus already available to speakers, and one cannot refer to PAT or MAT&PAT only but must factor in language-internal forces, i.e. analogical extension. The same applies to the more specific trend towards [P ℛ Q] (cf. Supyire in (8)). The most frequently observed type of borrowing, however, involves languages that already had overt connectors (MAT-borrowing). Finally, a brief observational remark on English and Spanish with a view to the borrowing of BUT needs to be made. For Spanish, a great number of borrowed peros have already been collected; the same applies to Russian and Arabic. Now, in addition to being exceedingly typologically close, the sociohistorical development and global spread of Spanish and English since the beginning of the 16th and 17th centuries appear similar at first sight.¹⁵ Thus, we might expect to observe comparable processes and outcomes of Hispanization and Anglicization. As far as the current data indicate, this is not the case. The lack of borrowing of but is striking. One wonders: Why does but not behave like pero? Without a doubt, the representation and attestation of a phenomenon strongly depend on the size of the sample and the extent to which the languages or rather their contact situations are documented. No records of borrowed but are given in the descriptive literature on the borrowing of connectors. Perhaps the answer lies in the imbalance in description, or perhaps the infrequent borrowing has something to do with English but, or English as a contact language, itself.
The upcoming sections will hopefully shed some light on the borrowing of the English adversative connector.

|| 15 Crucially, however, at least one potentially important parameter of contact is often different: time-depth. Given the limited scope of this paper, I cannot provide a detailed account of why and how time-depth might have contributed to the diverging pictures for English vs. Spanish function word borrowing. References to the parameter are, however, made at several stages of this paper (see e.g. Sections 3.4, 3.7, and 4.2).

3 Evidence of borrowed but

There is only limited evidence of borrowed but. The few cases that I came across in my scouring of databases, grammars, corpora and other descriptive secondary sources are MAT-borrowings and include the following recipient languages: Pennsylvania German, Low German, Louisiana French, Prince Edward Island French, Chiac, Michif, US Spanish, Shona, Angloromani, and Taglish. I did not find any instances of MAT&PAT-borrowing, and there is only thin evidence of PAT borrowing of the frequency- or productivity-increasing kind in Thai, Vai, Cherokee, and Swahili. There are several studies on codeswitching where but (and BUT) along with other connectors is scrutinized; I will turn to those at different stages in Sections 3 and 4 and comment on the relationship between connectors in borrowing and codeswitching. Table 3 summarizes the findings. In addition to the parameters of ‘language’ and ‘loan’, the loan type (MAT, MAT&PAT, PAT), the indigenous function word equivalent(s) and the adversative constructions in which they appear (cf. Figure 1), as well as the effect are given. The feature ‘effect’ is divided into the two values ‘replacement’ and ‘co-existence’. Where possible, I specify the degree of the co-existence (LW>INDW ‘the loan is more frequent than the indigenous word’, LW=INDW ‘the loan and indigenous word are equally frequent’).

Table 3: Borrowing of English but.

Language                                 Loan  Loan type  Indigenous equivalent(s)  Construction type  Effect
Pennsylvania German (Fuller 1999, 2001)  but   MAT        aber                      [P ℛ Q]            co-existence (LW […]

[…] OR > AND postulated by Matras (1998: 301–305) according to which adversative loan conjunctions precede alternative loan conjunctions which in turn are borrowed before combination conjunctions.⁷ In terms of explanation, Matras (2009: 136–145) focusses on the role of bilingual speakers and their repertoire of options which allows them to find the adequate choice of linguistic means in a given communication situation. The empirical illustration of the phenomena more often than not involves data from contemporary language contacts. Loan conjunctions are, however, nothing new, in a manner of speaking. Coptic borrowed seventeen conjunctions from Greek in antiquity as shown in Table 1 (Plisch 1999: 27–28). Grey shading marks the cells of the adversative and alternative loan conjunctions.

Table 1: Loan conjunctions of Greek origin in Coptic.

Coptic         Greek          meaning
alla           allá           ‘but’
ē              ē              ‘or’
eimēti         ei mē ti       ‘unless’
epei           epeí           ‘because’
kaiper         kaíper         ‘although’
kan            kan            ‘even if’
mēpote         mēpote         ‘so that not’
mēpōs          mēpōs          ‘so that not’
šina~hina      hina           ‘so that’
hōste          hōste          ‘so that’
heōs           heōs           ‘until’
hoson          hoson          ‘as long as’
hōs            hōs            ‘when’
hotan          hotan          ‘when, if’
oute...oute    oute...oute    ‘neither...nor’

|| 7 See Hober (this volume) for a thorough problematization of the implications connected to this hierarchy.

On loan conjunctions | 267

Note that (besides auō ‘and’) Coptic had several autochthonous conjunctions like ešōpe ‘if’, je ‘so that’, jn ‘or’, etc. which covered functions similar to those of the loan conjunctions (Plisch 1999: 26). What we can see is that Coptic meets the expectations of Matras’s above hierarchy since BUT and OR are among the loan conjunctions whereas AND is not affected by borrowing. What we also see is that the domain of loan conjunctions is by no means limited to the three categories involved in the same borrowing hierarchy. If it is possible to put forward an implicational pattern for the borrowing of BUT, OR, and AND, is it equally feasible to assume a hierarchy which goes beyond the ternary set of categories? Stolz et al. (2021) take the widely common borrowing of Spanish pero ‘but’ into languages of Mesoamerica as point of departure for determining whether parallel borrowing of a given item also means that the different replica languages make the same use thereof. It can be shown that there is a wide range of variation as to the uses the loan conjunction is put to in the replica languages independent of their genetic affiliation, typological classification, and geographic location. The differences are visible from token frequencies as well as from the range of relations for which borrowed pero is employed. This is a case of areal micro-variation because the sample languages are all spoken in Mexico. Moreover, the Mexican case has impelled us to scrutinize further areas with a small set of shared donor languages in contact with a sizable number of replica languages which represent different language families, types, and sub-regions. The Soviet Union as successor of the former Russian Tsarist Empire seems to meet the necessary criteria as is argued in Section 3. 
Given the shared interest of typology and language-contact studies in this subject matter, it is hardly surprising that proponents of the one approach also refer to the other research paradigm when they look into the structural and functional properties of conjunctions. It is also no surprise to see that others before us have studied conjunctions areal-linguistically and have put forward ideas as to the contact-borne diffusion of certain types of conjunctions across a given area.
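What it means for a language to meet the expectations of the hierarchy can be spelled out as a simple check. The Python sketch below is an illustration under stated assumptions rather than a procedure proposed in the literature cited here: it reads the hierarchy as ordering adversative (BUT) before alternative (OR) before combination (AND) conjunctions, following the prose gloss given above, and the names `HIERARCHY` and `conforms` are hypothetical. A set of borrowed categories conforms if borrowing a lower-ranked category never occurs without all higher-ranked ones.

```python
# Order assumed from the prose description of the hierarchy:
# adversative before alternative before combination conjunctions.
HIERARCHY = ["BUT", "OR", "AND"]


def conforms(borrowed: set) -> bool:
    """Return True if the borrowed categories respect the implicational order.

    Once a category in the hierarchy is NOT borrowed, no category ranked
    below it may be borrowed either.
    """
    gap_seen = False
    for category in HIERARCHY:
        if category in borrowed:
            if gap_seen:
                return False  # lower category borrowed without a higher one
        else:
            gap_seen = True
    return True


# Coptic as described in the text: BUT and OR are loans, AND (auō) is not.
coptic = {"BUT", "OR"}
print("Coptic conforms:", conforms(coptic))

# A constructed counterexample (not an attested language): only AND borrowed.
print("AND only conforms:", conforms({"AND"}))
```

Such a check only covers the ternary set of categories; as the Coptic data show, loan conjunctions such as ‘unless’ or ‘until’ fall outside it.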

268 | Thomas Stolz and Nataliya Levkovych

2.2 Europe – from hardly any to many loan conjunctions

Mauri’s (2008) book-length account of coordination relations is a case in point. Her focus is on the European situation but languages from outside this continent also form part of her sample. The author claims that across the 37 languages from Europe, there is a high degree of uniformity (in contrast to the less homogeneous picture resulting from her extra-European control sample). Especially in the western half of Europe, the label AND-BUT-OR-language can be applied almost without exception (Mauri 2008: 289–293). This label indicates that a given language displays three distinct markers for the relations combination (= AND), contrast (= BUT), and alternative (= OR). Moreover, the languages which belong to this class make use of free (inter-conjunct) morphemes as markers, i.e. they employ conjunctions which resemble English and, but, and or morphosyntactically. They thus reflect the canonical type. To explain the striking pan-European resemblances in this domain, Mauri (2008: 276) argues that

[t]he overwhelming predominance of overt markers in the languages of Europe could be related to the high degree of written language tradition that characterizes this area. In principle, it could also be a consequence of the high degree of contact among the languages, which could have caused the borrowing of overt markers in those languages that did not have any, given that combination markers are easily borrowed across languages. However, the only two European instances of borrowing are attested in Dargi and Lezgian, which, besides their native combination markers, normally employ the Arabic connective wa. Borrowing phenomena of markers expressing combination are more widespread outside Europe.

Her assumption as to the almost total absence of borrowings in the domain of AND in Europe fits in with the mention of only two cases of borrowing of BUT, namely amma ‘but’ (< Arabic) in Dargwa (Mauri 2008: 281) and mee ‘but’ (< French mais) in Luxembourgish (Mauri 2008: 291). Moreover, there is no mention at all of any case of borrowing of OR. Superficially, loan conjunctions seem to be exceptional in Europe. On the basis of examples like those from the moribund Baltic language Curonian in (5), we take issue with Mauri’s conclusions.

(5) Curonian [El Mogharbel 1993: 195]
    a. AND
       Bij gālts un štūo̯lis un bēņķis un skape
       be.PAST.3 table:NOM and chair:NOM and bench and cupboard:NOM
       ‘There was a table and a chair and a bench and a cupboard.’
    b. BUT
       dāu̯g tie̯ nivarij ā:spæ̂lnat aber tie̯ me:s varijam dzîe̯vuo̯t
       much there NEG:can:PAST earn:INF but there 1PL can:PAST:1PL live:INF
       ‘You couldn’t earn much there, but we could live there.’
    c. OR
       Â:du šlukas bij blā:v o:der zaļ
       leather:GEN.PL clog:NOM.PL be.PAST.3 blue or green
       ‘The leather slippers were blue or green.’

Curonian gives evidence of the borrowing of und ‘and’, aber ‘but’, and oder ‘or’ from German, i.e. there are loan conjunctions in the domains of combination, contrast, and alternative. In (5a), the loan conjunction un (~unt) ‘and’ is employed three times to coordinate mono-word NPs. In (5b), aber ‘but’ joins two clauses whereas in (5c), o:der ‘or’ connects two predicative adjectives to each other. The Curonian usages of the loan conjunctions correspond to those reported for the donor language. Note that in Curonian, synonymous autochthonous conjunctions coexist with the borrowings from German, namely ir ‘and’ and va ‘or’ (El Mogharbel 1993: 195). In the case of Curonian alě ‘but’, a Polish origin is most likely (Polish ale ‘but, however’). As a matter of fact, Curonian is not the sole European language to borrow conjunctions. In the subsequent paragraphs in this section, we show that loan conjunctions are frequently attested in Europe. They are numerous enough to deserve an in-depth study of their own. For the time being, however, we content ourselves with providing an incomplete and unsystematic enumeration of pertinent cases without discussing the details related to them. The phenomenon is richly documented for the Romani languages. According to the survey provided by Matras (2002: 201), the same set of German conjunctions is attested in the Romani varieties Manush (un ‘and’, aver ‘but’, otar ‘or’) and Sinti (und ‘and’, aber ‘but’, oder ‘or’) with aber ‘but’ being attested also in Lovari (Germany). At the same time, other European members of the Romani branch of Indo-Aryan such as Lovari (Poland), Romungro (Slovakia), Gurbet (Serbia), Bugurdži, Polska Roma, and North Russian Romani use Slavic loan conjunctions for the three functions under review, viz. i ‘and’, a ‘and’, ale/ali ‘but’, no ‘but’, ili/czy ‘or’ from Polish, Russian or a South Slavic donor language. 
What is more, Hungarian vagy ‘or’ is attested as a loan conjunction in two of thirteen varieties included in Matras’s synopsis. Hungarian de ‘but’ is attested once (in Romungro (Hungary)) as are Greek i ‘or’ (Agia Varvara), Greek alá ‘but’, Albanian po ‘but’ (Bugurdži), French mais ‘but’ (Lovari – French (Norway)), and ham ‘but’ (Roman) whose etymological origin is probably Hungarian hanem ‘but’ or results from a blend with Persian ham (Hober this volume). The Balkans host also a plethora of cases of loan conjunctions. Sephardic gives evidence of the adversative loan conjunction ama ‘but’ (Marín Ramos 2014: 86) borrowed from Arabic via Turkish. In Aromanian, we find the same loan conjunction ama ‘but’ (< Turkish ama < Arabic amma) and in addition i ‘or’


(< Greek i) (Caragiu Marioţeanu 1975: 256). In the same manual of Romanian dialectology, Megleno-Romanian is shown to borrow not only áma ‘but’ and em...em ‘not only...but also’ from Turkish (hem...hem < Persian hem...hem) but also ácu~túcu ‘if’ and dáli~ili ‘or’ from Bulgarian or Macedonian (ako ‘if’, ili ‘or’) (Caragiu Marioţeanu 1975: 286). Croatian ali ‘but’ is frequently attested in Istro-Romanian (Kovačec 1968: 104–105). Macedonian and (colloquial) Bulgarian have borrowed ama/ami ‘but, however’ from Arabic via Turkish (Foulon-Hristova 1998: 242; Radeva 2003: 322). Bulgarian gives evidence of borrowed xem...xem ‘not only...but also’ and ja...ja ‘either...or’ from Arabic/Persian via Turkish (Radeva 2003: 321); the latter correlative or, in Haspelmath’s (2007: 6) terminology, bisyndetic loan conjunction is also attested in Macedonian. For the Albanian diatopic system, Boretzky (1975: 246) registers ten loan conjunctions whose close donor is Turkish, whereas some stem from the distant donors Arabic or Persian; not all of them belong to the standard register of modern Albanian: am(m)a ‘but’, anxhak ‘however’ (< Turkish ancak), çynqi/çimçi ‘because’ (< Turkish çünkü < Persian çun ki), demek ‘thus’ (< Turkish demek), dilmi ‘because’ (< Turkish değilmi ki), gjyja ‘as if’ (< Turkish göya), hem...hem ‘not only...but also’, ja...ja ‘either...or’, madem ‘because, although’ (< Turkish madem ki ‘since’), vellakin ‘but, however’ (< Turkish velâkin < Arabic wa-laakin). Loan conjunctions also abound in the Italo-sphere, i.e. in the area of strong influence of Italian and of Italo-Romance varieties on co-territorial languages of different genetic background. As to Maltese, Mauri (2008: 291) only takes account of the adversative conjunction imma ‘but’ although the language also has iżda ‘but’ and, more importantly, però ‘but, however’, a loan conjunction corresponding to Italian però ‘but, however’.
In her account of Italian-Maltese language contacts, Krier (1976: 104) mentions mentri ‘while’ (< Italian mentre) and the hybrid formation tant li ‘so that’ (< Italian tanto (che) + Semitic li ‘that’). For the Italo-Albanian variety of Falconara Albanese, Camaj (1977: 98) registers the loan conjunction o ‘or’ (< Italian o ‘or’). Stolz (2005: 49 and 55) identifies the Italian loan conjunctions modokè ‘so that’ (< Italian di modo che) and però ‘but’ for the Italo-Albanian variety of San Demetrio Corone and e ‘and’ (< Italian e), ma ‘but’ (< Italian ma), and semaj ‘so that’ (< Italian semmai ‘if at all’) for Slavomolisano. Further cases are mentioned in Stolz (2007: 92–93), namely o ‘or’ (< Italian o) in Italo-Greek and Slavomolisano, ma ‘but’ in Italo-Albanian and Italo-Greek, and again però ‘but’ in Cymbrian and Slavomolisano. In the Slavomolisano dictionary, there are further cases of loan conjunctions such as si ‘if’ (Breu and Piccoli 2000: 183). It is interesting to see that there are several complex formations like aje-ka ‘because’, dòp-ka ‘although’, fina-ka ‘until’, and zašto-ka ‘because’ (Breu and Piccoli 2000: 3, 32, 39, 253) which involve the general complementizer ka ‘that’, whose origin is not entirely clear, although it is conceivable that it is an adaptation of Italian che. The complex cases come in two different varieties: (a) two Romance elements are combined (Italian dopo ‘after’ and Italian fino ‘until’ + Italian che) and (b) a hybrid combination of Slavic and Italian yields the complex conjunctions (Slavic aje ‘because’ and zašto ‘because’ + Italian che). In the case of (b) it is also possible to drop ka without change of meaning. In addition, there is the correlative ne...ne ‘neither...nor’ from Italian né...né which co-exists with Slavic ni...ni (Breu and Piccoli 2000: 123). In Cymbrian, Tyroller (2003: 181–182) finds not only adversative ma ‘but’ and the general complementizer ke ‘that’ (< Italian che) but also the correlative ne...ne ‘neither...nor’ (< Italian né...né) and hybrid formations like dopo ass ‘after’ (< Italian dopo ‘after’ + Germanic ass ‘that’) and intanto ass ‘while’ (< Italian intanto ‘meanwhile’ + Germanic ass ‘that’). Similarly, Italo-Slovenian gives evidence of adversative ma ‘but’, parɔ́ ‘but’ (< Italian però), perké ‘because’ (< Italian perché), ki ‘that’ (< Italian che) and several binary loan conjunctions involving ki such as invé̤ci (ki) ‘in lieu of’ (< Italian invece di), apéna ki (< Italian appena ‘hardly’), dɔ́po ki ‘after’, fí̤n ki ‘until’ (< Italian fino che/finché), sebɛ́n ki ‘although’ (< Italian sebbene), sikome ki ‘because’, ší̤n ki ‘until’ (< Italian sino che), etc. (Steenwijk 1992: 173–181).

French has also successfully exported conjunctions. Apart from Luxembourgish mee (< French mais ‘but’) we find a parallel in Breton. Breton mes ‘but’ goes back to Old French mais ‘but’ and was borrowed at a time when the final sibilant of the French conjunction was still pronounced (Rheinfelder 1967: 70).
Ephemeral loan conjunctions are also attested for earlier stages of Maltese, where Italian e ‘and’ and o ‘or’ occurred in religious texts of the 18th–19th centuries (Stolz 2007: 93–94)8 and, in the same genre, Romanian featured the Slavic loan conjunction i ‘and’ in the 16th–18th centuries (Dimitrescu et al. 1978: 361). None of these cases has survived into the contemporary varieties of the replica languages. There is thus ample evidence of loan conjunctions on European soil. The list is not closed yet. Its present size suffices, however, to prove that the borrowing of conjunctions is a widespread phenomenon at least in certain regions of Europe. We assume that Mauri’s failure to acknowledge this fact is mainly caused by the composition and size of her sample. Another reason might be that Heine and Kuteva (2006: 204–228) discuss the parallel grammaticalization of interrogatives to subordinators in European languages – and in this context PAT-borrowing is clearly in the foreground of their line of argumentation. What is more, Kortmann (1996), in his areal-typological study of adverbial subordination in the languages of Europe, duly pays attention to grammaticalization processes and language-internal changes in the history of the subordinators (which fall under the rubric of conjunctions in our approach) but does not look deeply into the issue of MAT-borrowing. He cursorily mentions the existence of loan conjunctions especially in languages of the former Soviet Union and in the Balkans (Kortmann 1996: 50–51). In the conclusions, the author prognosticates that

on the basis of a considerably enlarged European sample we may be in the position to say more about borrowings in the domain of adverbial subordinators (e.g. for which interclausal relations, in particular?), […] and thus about the significance of language contact for the development and composition of the inventories of adverbial subordinators in Europe. (Kortmann 1996: 350)

|| 8 Aquilina (1959: 322) still lists o ‘or’ among the Italian loan conjunctions in modern Maltese.

In Kortmann (1998: 503–504 and 554–555), we find references to the diffusion of Turkish çünkü ‘because’ (< Persian çun ki) into the Balkans and the Caucasian region as well as that of Greek-derived makar ‘although, even though’ throughout the Mediterranean9 and the borrowing of French en cas que/Spanish en caso que ‘if’ as enkas in the Navarro-Labourdinian variety of Basque.10 These isolated facts are suggestive of a rich phenomenology waiting to be explored. It seems that the story of conjunctions in Europe has not been told in its entirety since the chapter on loan conjunctions still needs to be written.

|| 9 This case has recently been reopened by Ramat (2020) who traces the diffusion of the Greek function word from Lisbon to Bucharest and across what lies between.
|| 10 As to the possibility of using inter-clausal conjunctions, Hurch (1989: 19) assumes that there is PAT-borrowing of the Spanish model in Basque. Instances of MAT-borrowing are not mentioned. Under the proviso that he might be witnessing codeswitching, Haase (1993: 153–154) takes note of instances of French parce que ‘because’ and mais ‘but’ in spontaneous discourse of his Basque-speaking informants.

Most of the cases presented in the foregoing paragraphs are well-behaved in the sense that they obey Matras’s above hierarchy. Borrowed BUT can be found practically everywhere in the languages we mentioned. If there is evidence of borrowed OR or borrowed AND, the adversative conjunction is also among the loan conjunctions of a given replica language. Moreover, if languages have borrowed conjunctions other than those of the ternary set of BUT, OR, and AND, then BUT is also a loan conjunction (but not necessarily OR and AND). It is important to note that, probably in the vast majority of the cases reported above, the loan conjunction does not fill a gap but enters a system in which there was already a functional equivalent prior to contact. The systems in contact were structurally similar to each other from the beginning. Are conjunctions equally borrowable in situations where donor language and replica language differ in terms of their morpho-syntactic strategies in the domain of clause linkage, coordination, and subordination? To answer this question empirically, it makes sense to focus on a comparatively large area of long-term language contacts involving a limited set of donor languages and a much greater number of potential replica languages which represent different language types also in the domain under review. To our mind, the erstwhile Soviet Union and its internal linguistic diversity meet these criteria perfectly, not least because its territory spans the European East and the entire Siberian part of Asia. With its geographic share of Europe, the former USSR also provides an additional corrective for the point of view defended by Mauri (2008).
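The implicational pattern summarized in this section (any borrowed conjunction, be it OR, AND, or a member outside the ternary set, presupposes borrowed BUT) can be spelled out as a small consistency check. The following Python sketch is purely illustrative: the function name and the category label OTHER are hypothetical, and the sets record only the loan conjunctions mentioned in this section for a handful of replica languages, not their full inventories.

```python
# Hypothetical category labels: BUT, OR, AND, and OTHER (any loan conjunction
# outside the ternary set, e.g. 'if' or 'so that').

def respects_but_implication(borrowed):
    """Return True if the set of borrowed conjunction types is empty
    or includes BUT, i.e. if no borrowing occurs without borrowed BUT."""
    return not borrowed or "BUT" in borrowed

# Loan conjunction types as mentioned in this section only (assumed to be
# incomplete; they serve as illustration, not as a database).
attested = {
    "Aromanian": {"BUT", "OR"},                      # ama, i
    "Megleno-Romanian": {"BUT", "OR", "OTHER"},      # áma, dáli~ili, ácu~túcu 'if'
    "Slavomolisano": {"BUT", "OR", "AND", "OTHER"},  # ma, o, e, semaj 'so that'
    "Breton": {"BUT"},                               # mes
}

# Languages whose recorded loans would contradict the implication.
violations = [lang for lang, cats in attested.items()
              if not respects_but_implication(cats)]
print(violations)
```

A replica language for which only borrowed OR were recorded would be flagged by this check, which is why such cases call for a closer look at the available descriptions before they are counted as counterexamples.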

3 The languages of the Soviet Union

In Comrie’s (1981) survey of the languages of the Soviet Union, there are several paragraphs in which subordination is mentioned. The scene is set with the following quote:

Since much of the literature in languages of the U.S.S.R. is translated from Russian, there has also been Russian influence on the syntax of many languages, especially those whose basic syntactic structure differs most from Russian. Here we may mention in particular those languages where the basic means of expressing subordination is not by means of subordinating conjunctions, but rather by means of special verbal forms, usually verbal adverbs (gerunds), verbal adjectives (participles) or case-forms of verbal nouns (nominalisations). [T]his is particularly characteristic of Altaic languages, North Caucasian languages, and the more easterly Uralic languages. Under Russian influence, subordinating constructions have come to play a much more important role in such languages. In most instances it is not a case of actually borrowing a conjunction from Russian in its Russian form, but rather of calquing a conjunction in the language in question on the basis of the morpheme structure of the Russian conjunction […]. Sometimes even the Russian form is borrowed. (Comrie 1981: 34)

Comrie’s line of argumentation suggests that Russian influence normally comes in the guise of PAT-borrowing whereas MAT-borrowing is the marked case. Anderson (2005) confirms the ever-increasing Russification of the autochthonous languages of South Central Siberia (with special focus on Khakas). For the easterly branches of Uralic, Comrie (1981: 134) assumes the emergence of finite subordinate clauses under Russian influence without mentioning loan conjunctions. This hypothesis receives support not only from Majtinskaja (1993: 29) but also from Riese’s (1998: 273) statement on the Russification of Permian syntax:

The past decades have seen the rapid advance of the ‘Indo-European’ model of sentence extension via subordinate clauses, the catalyst being chiefly Russian. Today all Permian literary languages use a goodly number of both subordinating and co-ordinating conjunctions, many of which have been borrowed from Russian, e.g. и ‘and’ > Z[yrian]/Vo[tyak] i, но ‘and, but’ > Z/Vo no, будто ‘as if’ > Z bit’t’e, Vo budlo, etc. […] Russian influence on Permian sentence structure is by no means new; it has simply become more marked in recent years. What Hungarian underwent in the Middle Ages is now occurring in Zyrian and Votyak, namely a radical restructuring of traditional sentence patterns.

According to Laanest (1982: 291), Finnic languages (i.e. more westerly Uralic languages) have also borrowed conjunctions from Russian on a massive scale:

Die Konjunktionen gehören zu den am spätesten entstandenen Redeteilen. In der früheren Entwicklungsperiode der finnisch-ugrischen Sprachen wurden zusammengesetzte Sätze entweder überhaupt nicht verwendet oder ohne Konjunktionen. […] Im Karelischen, Wepsischen, Ingrischen und Wotischen werden viel aus dem Russischen entlehnte Konjunktionen verwendet.11

|| 11 Our translation: ‘The conjunctions belong to the most recently developed parts of speech. On the earlier stages of development of the Finno-Ugric languages, compound sentences were either not used at all or without conjunctions. […] In Karelian, Veps, Ingrian, and Votic, many conjunctions borrowed from Russian are employed.’

In Comrie (1981), loan conjunctions which conform to the canonical type put forward in the introduction are referred to four times:
– in connection with the replacement of formerly well-established loan conjunctions from Arabic and Persian with Russian equivalents in Uzbek (Comrie 1981: 48),
– more generally for Turkic languages which give evidence of loan conjunctions of Arabic and/or Persian origin (Comrie 1981: 84–85),
– for the isolate Ket which is said to integrate Russian conjunctions of late (Comrie 1981: 265–266), and
– for Siberian Eskimo – a formerly conjunction-less language – which gives evidence of several loan conjunctions borrowed from Chukchi (Comrie 1981: 257).

The high prestige and dominance of Russian over the centuries of Tsarist and ultimately Soviet rule notwithstanding, there are several other donor languages which have contributed to the diffusion of loan conjunctions in the area of interest as transpires from Johanson (2002: 130) where the author states that

[i]n older developmental stages especially, the Turkic languages show few free junctors and, above all, no subjunctors. […] The Turkic languages have also globally copied many Iranian, Arabic, Slavic and other free junctors and used the copies productively […]. The junctor ki, copied from Iranian, is the most widespread of these, serving everywhere as a very general kind of connector with a broad functional scope […]. Even contemporary Turkish almost exclusively employs copied free junctors: ne...ne ‘neither...nor’, çünkü ‘as, for’, gerçi ‘though’, eğer ‘if’, etc. As for the Turkic languages under Russian influence, Baskakov describes an ‘increase in the number of various auxiliary words, in particular of conjunctions, of which there were remarkably few in the old language’ (1960: 30). The same is true of many non-Turkic languages […]. Karaim displays a number of globally copied Slavic junctors.

In Stolz’s (in press) paper on adversatives, the borrowability of BUT features prominently. Russian and Arabic can be shown to be particularly successful exporters of adversative conjunctions. In the case of Arabic, the diffusion of the conjunctions is predominantly a matter of intermediaries, i.e. Persian and Ottoman Turkish have been instrumental in the transfer of originally Arabic conjunctions to replica languages which have never been in direct contact with Arabic. Interestingly, the spheres of influence of Russian and Arabic collide and partly overlap in the southerly regions of the former Soviet Union – and this especially in the Caucasian region, Central Asia, and South Siberia.

It is important to note that if we take the territory of the Soviet Union as our geographical frame of reference, this does not mean that the contacts between the donors and the replicas happened during the Soviet period. Far from that – many of the contact-induced processes were triggered many centuries before the foundation of the USSR. The start of the ultimately wide geographic distribution of grammatical Arabisms in these areas is largely an epiphenomenon of the Islamization whose point of departure coincides with the fall of pre-Islamic Persia in the 7th century and the subsequent Arabic intrusion into Central Asia in the 8th century. Russian entered the Asian scene in the 16th century and became an important factor in the Caucasus from the 18th century onwards. Intensive contacts with Uralic and Turkic languages in the European part of the Russian Empire started considerably earlier (13th century); the Baltic languages were exposed to Russian influence already long before the 18th century when Russia took effective control of the region. The chronological differences notwithstanding, it is clear that Russian, Arabic, Persian, and Turkish influence on the languages in the area of interest spans several centuries. However, the time-depth of the contact history does not allow us to assume that the loan conjunctions entered the scene already at its very beginning. As we will see in Sections 4–5, the presence of loan conjunctions in certain replica languages is a relatively recent phenomenon strongly connected to literacy. On Thomason’s (2001: 70–71) borrowing scale, many of the binary donor-replica combinations must have reached at least Stage 2 which assumes a

[s]lightly more intense contact (borrowers must be reasonably fluent bilinguals, but they are probably a minority among borrowing-language speakers) [and the borrowing of] function words (e.g. conjunctions and adverbial particles […]) (Thomason 2001: 70).

The important part of this quote for us is the mention of loan conjunctions. That the contacts between donors and replicas have advanced far beyond Stage 2 in many of the cases is, as yet, of no relevance for our project. The next step we take consists in characterizing the stock of conjunctions of the major donor languages (Section 3.1).

3.1 The donors

This section is anachronistic in the sense that we present contemporary data whereas the contacts of the donors and replicas look back on a long history which initially involved varieties which might have looked very different structurally from their modern successors. This anachronism cannot be helped because we do not have sufficiently reliable diachronic data which would allow us to conduct a historically well-founded comparison of the changing properties of the languages under review in chronological order. However, we are firmly convinced that the results and insights we gain on this basis are reliable and make sense linguistically, not least because many loan conjunctions seem to be relatively recent borrowings.

We open the presentation of the donor languages by way of sketching the system of conjunctions of modern Russian. This sketch and those which follow take account only of those conjunctions which consist of maximally two words (except correlatives). It is not our intention to account for all stylistic, diatopic, register-related, and quantitative aspects connected to the word-class conjunction in the donor languages. Moreover, it is possible that the replica languages use as conjunction a borrowed element which is not a proper conjunction in the donor language. The data presented in this section solely serve as informal reference point for the subsequent discussion in Sections 4–5. The lists given in (6), (8), (10), and (12) do not define the upper limit of donor-language input for the processes which resulted in the creation of loan conjunctions in the replica languages.

The inventory in (6) is extracted from Tauscher and Kirschbaum (1983: 408–415). Several of the conjunctions are multifunctional but are registered only in one category such as a ‘but, and’ which is both copulative and adversative but mentioned only under the heading adversative. Similarly, we refrain from adding all allomorphs.

(6) Russian conjunctions
(a) copulative: i ‘and’, da (i) ‘and also’, i...i ‘both...and’, kak...tak (i) ‘both...and’, ni...ni ‘neither...nor’, ne tol’ko...no i ‘not only...but also’
(b) adversative: a ‘but, and’, no ‘but’
(c) disjunctive: ili ‘or’, libo ‘or’, ili...ili ‘either...or’, libo...libo ‘either...or’, to...to ‘one moment...the next moment’, ne to...ne to ‘partly...partly, either...or’, to li...to li ‘partly...partly, either...or’
(d) temporal: kogda ‘when’, kak ‘when, since; as’, kak tol’ko ‘as soon as’, liš’ tol’ko ‘as soon as’, edva ‘as soon as’, poka ‘while, as long as’
(e) causal: potomu čto ‘because’, tak kak ‘because’, poskol’ku ‘because’, ibo ‘because’, ottogo čto ‘because’
(f) purposive: čtob(y) ‘in order to’
(g) conditional: esli (b(y)) ‘if’, eželi (by) ‘if’, koli ‘if’, raz ‘if’
(h) consecutive: tak čto ‘so that’
(i) concessive: xotja (b(y)) ‘although’, tol’ko by ‘if only’, razve tol’ko ‘unless’, pust’ ‘even if’
(j) modal: budto (b(y)) ‘as if’
(k) complementizer: čto(b(y)) ‘that’

The sentential examples in (7a–b) are representative of the use of coordinating and subordinating conjunctions in Russian.

(7) Russian
a. coordination [Tauscher and Kirschbaum 1983: 407]
On el i pil.
3SG.M eat:PAST and drink:PAST
‘He ate and drank.’
b. subordination [Tauscher and Kirschbaum 1983: 415]
Ja ne znal, čto on ėto skazal.
1SG NEG know:PAST that 3SG.M this say:PAST
‘I did not know that he said this.’

As can directly be seen, the Russian conjunctions conform to the canonical type. For the Arabic conjunctions we rely on Ryding’s (2005: 407–428) chapters on connectives and subordinating conjunctions in Modern Standard Arabic. The inventory in (8) also contains some cases which, from an Indo-European perspective, are atypical in the domain of conjunctions.

(8) Arabic conjunctions
(a) copulative: wa- ‘and’, fa- ‘and so’
(b) adversative: bal ‘but (actually)’, ˀinna-maa ‘but moreover’, laakinna ‘but’
(c) disjunctive: ˀaw ‘or’, ˀam ‘or’, ˀimmaa...ˀaw ‘either...or’
(d) temporal: bayn-a-maa ‘while, whereas’, baˁd-a-maa ‘after’, baˁd-a ˀan ‘after’, Hattaa ‘until’, Hiin-a(-maa) ‘when’, ˁind-a-maa ‘when’, qabl-a ˀan ‘before’
(e) causal: ˀidh ‘since, inasmuch as’, liˀanna ‘because’
(f) consecutive: bi-Hayth-u ‘so that, so as to’
(g) modal: rub-a-maa ‘perhaps’, la’alla ‘perhaps’
(h) equative: qadr-a-maa ‘as...as’
(i) complementizer: ˀinna ‘that’, ˀanna ‘that’

In (9a–b), we again recognize the close resemblance of the coordinating and complementation strategies to those postulated for the canonical type. It is of no consequence that the Arabic conjunction wa ‘and’ is a proclitic and not a free morpheme.

(9) Arabic
a. coordination [Ryding 2005: 410]
Min-haa miSr-u wa-l-urdunn-u
among-3SG.F Egypt-NOM and-DEF-Jordan-NOM
‘Among them are Egypt and Jordan […].’
b. complementation [Ryding 2005: 426]
Dhakar-a ˀanna l-ˁarab-a ˀaˁTaw-haa sm-a-haa
mention-3SG.M.PERF that DEF-Arab.PL-ACC give:3PL.PERF-3SG.F name-ACC-POR.3SG.F
‘He mentioned that the Arabs gave it its name.’

As a complementizer, ˀanna ‘that’, as in (9b), requires the subject of the complement clause to be in the accusative (Ryding 2005: 177). In addition to the conjunctions in (8), there is the topic shift construction ˀammaa...fa- ‘as for’ (Ryding 2005: 420) whose initial component ˀammaa (for practical purposes translated as ‘but’ in this paper) is one of the most successful loan conjunctions because it has been borrowed into many replica languages to fulfil the function of adversative conjunction (Stolz in press). One of the languages which have integrated Arabic ˀammaa into their system of conjunctions is Persian. The conjunctions of modern Persian in (10) are taken from Lazard (1989: 281–282). Arabisms are underlined.

(10) Persian conjunctions
(a) copulative: -o ‘and’, va ‘and’, ham ‘and’, ham...(va) ham ‘both...and’, na...na ‘neither...nor’
(b) adversative: valī(kan) ‘but’, ammā ‘but’, balke ‘but, however’
(c) disjunctive: yā ‘or’, xvāh...xvāh ‘either...or’
(d) temporal: čūn (ke) ‘when, since, because’, vaqt-ī (ke) ‘while’
(e) causal: zīrā ‘because’
(f) purposive: tā (ke) ‘in order to, until, so that, since’
(g) conditional: agar ‘if’, magar ‘if not’
(h) concessive: agar če ‘although’
(i) complementizer: ke ‘that’12

Except for enclitic -o ‘and’, which represents the monosyndetic postpositive pattern with the connective element forming part of the first conjunct (Haspelmath 2007: 6), all conjunctions meet the criteria of the canonical type as illustrated in (11a–b).

(11) Persian
a. coordination [Alavi and Lorenz 1988: 91]
Bidār-i jā xāb-i?
awake-be.2SG or asleep-be.2SG
‘Are you awake or asleep?’
b. complementation [Alavi and Lorenz 1988: 123]
Midānam ke mariz hasti
PROG:know:1SG that ill be:2SG
‘I know that you are ill.’

|| 12 Note that Brockelmann (1908: 503) registers an old Semitic purposive conjunction kī ~ ke ~ kai̭ ‘(in order) to’ which functions as complementizer in a number of Semitic languages. Whether there are any historical connections to the Persian complementizer cannot be determined in this study.

At this point, we have reached the crossroads, in a manner of speaking. Turkic languages are not only replica languages but also donor languages in the geographic context of the former Soviet Union. We try to do justice to this Janus-like character of the Turkic languages by way of splitting their sketch in two. In this section, we only look at modern Turkish (of Turkey, as successor of Ottoman Turkish) and its system of conjunctions of the canonical brand and those coming close to the canon. The conjunctions in (12) are drawn from Ersen-Rasch (2012: 119–126). Underlining identifies loans from Arabic and/or Persian (Johanson 1996). Like Persian -o above, the copulative da/de ‘and’ is an enclitic (also in its correlative use) and is subject to vowel harmonic processes. It does not realize the canonical type.

(12) Turkish canonical conjunctions
(a) copulative: ile ‘and’, ve ‘and’, da/de ‘and also, but’, da/de...da/de ‘both...and’, gerek...gerek ‘both...and’, hem...hem (de) ‘both...and’, kâh...kâh ‘one moment...the other moment’, ne...ne (de) ‘neither...nor’
(b) adversative: ama ‘but’, lakin ‘but’, fakat ‘however’, ancak ‘however’, ya ‘but still’
(c) alternative: yoksa ‘or’, veya(hut (da)) ‘or’, ya...ya (da) ‘either...or’, ha...ha ‘either...or’, ister...ister ‘either...or’, olsun...olsun ‘either...or’
(d) causal: çünkü ‘because’, zira ‘because’, mademki ‘since’
(e) concessive: meğer(se/ki) ‘although’, gerçi...ama ‘although...yet’
(f) conditional: meğerki ‘unless’, eğer ‘if’, şayet ‘if’
(g) complementizer: ki ‘that, so that’

Nineteen of the twenty-eight entries in the inventory are loan conjunctions, i.e. they form a majority of 68%. Some of the loan conjunctions in Turkish result from combinations of (originally) Arabic and Persian elements such as gerçi...ama ‘although...yet’ < Persian agar či ‘although’ + Arabic ˀammaa ‘but’. Ersen-Rasch (2012: 119) claims that most of the conjunctions in (12) can be classified as coordinating devices as in (13).

(13) Turkish – coordination [Ersen-Rasch 2012: 122]
Cem ve Ece Türk’tür
Cem and Ece Turkish:MOD
‘Cem and Ece are Turks.’

Properly subordinating strategies are of a different kind as will transpire from the discussion in Section 3.2.
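The share of loan conjunctions reported for the Turkish inventory in (12) follows from simple arithmetic. As a minimal sketch, with variable names chosen here purely for illustration, the category sizes of (12) can be summed and the figure of nineteen loans (taken directly from the text, not recounted) converted into a percentage:

```python
# Number of entries per category in inventory (12), counted from the list.
entries_per_category = {
    "copulative": 8,
    "adversative": 5,
    "alternative": 6,
    "causal": 3,
    "concessive": 2,
    "conditional": 3,
    "complementizer": 1,
}

total_entries = sum(entries_per_category.values())

# Figure stated in the text; which entries count as loans is not re-derived here.
loan_entries = 19

# Share of loans, rounded to the nearest whole percent.
loan_share = round(100 * loan_entries / total_entries)
print(f"{loan_entries}/{total_entries} = {loan_share}%")
```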

3.2 The potential replicas

The members of the replica-language sample are identified by glossonym in (14). The sample of potential replica languages comprises 137 individual languages, thirty of which are predominantly spoken in independent states which came into being after the disintegration of the Soviet Union in the early 1990s (= marked by underlining). Presently, the Russian Federation is the home of the other 107 potential replica languages. Geographically the sample is divided in two, namely the cis-Uralian group (92 languages) and the trans-Uralian group (45 languages) with the latter being identified by grey shading. The sample admits all those languages which were reported as (no matter how precariously) living languages at some point in time between 1917 and 1991; seven of them have experienced language death in the meantime (= marked by †).

(14) Replica-language sample
Abkhaz-Adyge (5): Abaza, Abkhaz, Adyghe, Kabardian, Ubykh†
Afro-Asiatic (2): Assyrian (Neo-Aramaic), Bohtan Neo-Aramaic
Chukotko-Kamchatkan (5): Alutor, Chukchi, Itelmen, Kerek†, Koryak
Eskimo-Aleut (5): Aleut (Mednyj), Aleut (Bering Island), Central Siberian Yupik (Yuit), Naukan, Sirenik†
Indo-European (20): Armenian, Belarusian, German (Volga)13, Kurdish, Kurmanji, Latvian, Lithuanian, Moldavian, Ossetic, Pontic Greek, Romani Kalderash (Kotljary), Romani Lithuanian, Lomavren, Romani North Russian, Romani Vlax, Tajiki, Tat (Judeo), Tat (Muslim), Ukrainian, Yiddish
Isolates (including Yeniseian) (5): Ainu†, Ket, Koryo-mar (= Korean), Nivkh, Yugh†
Kartvelian (3): Georgian, Laz, Mingrelian
Mongolic (2): Buryat, Kalmyk
Nakh-Daghestanian (29): Aghul, Akhvakh, Andi, Archi, Avar, Bagvalal, Bezhta, Botlikh, Budukh, Chamalal, Chechen, Dargwa, Godoberi, Hinuq, Hunzib, Ingush, Karata, Khinalug, Khwarshi, Kryts, Lak, Lezgian, Rutul, Tabassaran, Tindi, Tsakhur, Tsez, Tsova-Tush (= Batsbi), Udi
Sino-Tibetan (1): Dungan
Tungusic (8): Even, Evenki, Nanai, Negidal, Oroch, Orok, Udege, Ulch
Turkic (24): Altay, Azerbaijanian, Bashkir, Chulym, Chuvash, Crimean Tatar14, Dolgan, Gagauz, Karachay-Balkar, Karaim, Karakalpak, Kazakh, Khakas, Kumyk, Kyrgyz, Nogai, Shor, Tatar, Tofalar (Karagas), Turkmen, Tuvin, Uyghur, Uzbek, Yakut
Uralic (= Uralo-Yukaghir) (28): Enets (Forest), Estonian, Ingrian, Kamas†, Karelian, Khanty, Komi-Permyak, Komi-Zyrian, Livonian†, Livvi, Ludian, Mansi, Mari Eastern (= Meadow), Mari Western (= Hill), Mordvin Erzya, Mordvin Moksha, Nenets, Nganasan, Saami Akkala, Saami Kildin, Saami Skolt, Saami Ter, Selkup, Udmurt, Veps, Votic, Yukaghir Northern, Yukaghir Southern

|| 13 The glossonym German (Volga) is employed as an umbrella for all German and Low German varieties (formerly) spoken in the Soviet Union. It is a deliberate simplification to indiscriminately classify them as cis-Uralian languages. Forced migration and other processes have scattered the German-speaking groups over trans-Uralian territories, notably Central Siberia and Kazakhstan. For the time being, we prefer the anachronistic geographic allocation in the erstwhile Volga German SR over trying to trace the whereabouts of each of the subgroups in other parts of the Soviet Union. In a more detailed successor to this study, we will certainly aim at a higher degree of geographic precision.
|| 14 On account of the annexation of the Crimean Peninsula by Russia, Crimean Tatar is considered to be a language of the Russian Federation.

Socio-linguistically, the sample languages cover a wide range of situations. There are fully-blown state languages with a venerable history of written documentation reaching back to Late Antiquity (e.g. Armenian, Georgian) side by side with minority languages on the verge of extinction with hardly any written tradition (e.g. Ket, Veps). Bilingualism with and language shift to Russian were widespread during the Soviet period (Comrie 1981: 301). We acknowledge that these and other socio-linguistic factors might influence the transfer of conjunctions in the course of language contacts. For obvious reasons, however, we have to turn a blind eye to this possibility in this study because it is meant primarily as a first approach to the empirical facts, the socio-linguistic interpretation of which is reserved for a follow-up study.

In terms of the structural properties in the domain of syntactic linkage, the replica languages can be divided into two major groups. On the one hand, there are those languages for which the canonical conjunction is the autochthonous default strategy to connect syntactic units to each other no matter whether we are looking at NPs or clauses. The second group, on the other hand, consists of languages which prefer different strategies of linkage over the canonical type in at least one of the domains under scrutiny. The division is by no means strict. Different strategies may co-exist and often do so in the languages of the sample. Since we do not aim at exhaustively describing the systems of linkage of the replica languages, we restrict the discussion to a selection of phenomena. Thus, our generalizations tend to simplify the at times more complex situation. The examples (15)–(17) illustrate the employment of canonical conjunctions in representatives of Chukotko-Kamchatkan, Indo-European, and Kartvelian.

(15) Chukchi [Kämpfe and Volodin 1995: 126]
әtljon ik=vˀm enmen ergatәk r=ekvet=gˀe
3SG say=PERF that tomorrow FUT=leave=PERF
‘He said that he would leave tomorrow.’

Comrie (1981: 251–252) remarks that

[u]nlike nearly all other languages of Siberia, Chukchi makes frequent and regular use of finite subordinate clauses, and has a wide range of native subordinating conjunctions. […] This seems to be a long-established traditional means of expressing subordination, free from foreign influence.

For a member of the Indo-European language family, it is hardly surprising to find a system of canonical conjunctions as is the case in Armenian.

(16) Armenian [Dum-Tragut 2009: 290]
Gn-um em t’atron isk du kino
go-PART.PRES be.1SG theatre.NOM but 2SG cinema.NOM
‘I go to the theatre, and/but you to the cinema.’

On loan conjunctions | 283

The presence of autochthonous conjunctions does not preclude the borrowing of additional conjunctions as we will see in Section 4.2. Armenian’s next-door neighbor Georgian likewise displays autochthonous means of coordination. (17)

Georgian [Fähnrich 1986: 175]
Me ċaval da šen ak darčebi
1SG PV:1SG:go and 2SG here PV:stay:PRS
‘I go and you stay here.’

In his summary of the syntactic properties of the different autochthonous language families in the Caucasian region, Comrie (1981: 230) argues that subordinating conjunctions are virtually confined to South Caucasian. The North Caucasian languages make extensive use of participles and other nonfinite verbal forms to express subordination. The role played by such nonfinite verb forms is one of the most striking features of North Caucasian syntax.

This statement leads us directly to the second group of replica languages. This second group diversifies the picture considerably. The relevant properties of Abkhaz – as a representative of Abkhaz-Adyge – are described by Hewitt (1989: 68–82). There is an impressive wealth of structural options including the canonical adversative axà ‘but’, juxtaposition with or without constraints on the TMA categories of the clauses, and the quotative marker ħoa. Furthermore, affixal marking is common not only for coordination but also for subordination. In (18), we present examples of three of these strategies in the domain of coordination.

(18)

Abkhaz [Hewitt 1989: 68]
a. asyndesis
Yә-nap’ә̀ (Ø-)ʒ˙oʒ˙oa-nә̀ a+k’rә̀fa-ra d-à+la-ga-yt’
his-hand (it-)wash-ABS something+eat-MASD he-it+PV-begin-AOR
‘Having washed his hands, he began to eat.’
b. coordination – bisyndetic I
Yә-nap’-g’ә̀ (Ø-)yә-ʒ˙oʒ˙oa-nә̀ a+k’rә̀fa-ra-g’ә̀ d-à+la-ga-yt’
his-hand-and (it-)he-wash-ABS something+eat-MASD-and he-it+PV-begin-AOR
‘He washed his hands and began to eat.’
c. coordination – bisyndetic II
Yә-nap’ә̀ (Ø-)ag’ә̀-y-ʒ˙oʒ˙oa-nә̀ a+k’rә̀fa-ra d-ag’-à+la-ga-yt’
his-hand (it-)and-he-wash-ABS something+eat-MASD he-and-it+PV-begin-AOR
‘He washed his hands and began to eat.’

284 | Thomas Stolz and Nataliya Levkovych

As to (18a), Hewitt (1989: 68) explains that [w]here two verbs share a subject and the first action precedes the second, instead of coordination as such the first verb will go into the Past Absolute or, if the tense is past, the Past Indefinite,

i.e. no phonologically realized element joins the two clauses to each other. This is different in (18b–c), where both conjuncts carry parallel morphologically bound markers (clitics), as in correlative, i.e. bisyndetic, constructions. There is -g’ә̀ ‘and’, which is enclitically attached to the first constituent of each conjunct as in (18b). Alternatively, -ag’- ‘and’ can be infixed into the verbal complex of each conjunct as in (18c). Conditional subordination via clause-final -r is illustrated in (19).

(19)

Abkhaz [Hewitt 1989: 76]
d-a:-r dә-s-š-wà-yt’
s/he-come.NON_FIN.AOR-if her/him-I-kill-DYN-PRS
‘If s/he comes, I shall kill him/her.’

The structural solutions found in Abkhaz are largely different from the canonical type. This judgment holds by and large also for the other members of the Abkhaz-Adyge language family. In the context of the Nakh-Daghestanian language family, Tsova-Tush serves as a representative example. Coordination mostly requires either that all conjuncts host the suffix -e ‘and’ (with variable vocalic realization) or that the free morpheme le ‘or’ be placed in clause-initial position, so that bisyndetic constructions arise which have the shape of correlatives. The final conjunct cannot host the suffix -e, which has to be moved to the syntactic word in the slot to the left, as is shown in (20).

(20)

Tsova-Tush [Holisky and Gagua 1994: 201]
nan-en badr-en Datxar xac‘-en-e ču-a Vax-en
mom-DAT child-GEN crying hear-AOR-and in-and go-AOR
‘Mother heard the child’s crying and went in.’

There are also more canonical conjunctions such as je~ne ‘and’, le ‘or’, and ma(gram) ‘but, however’ (< Georgian magram ‘but’) (Holisky and Gagua 1994: 200–201). The authors also state that [a]lthough finite subordinate clauses are possible in Tsova-Tush and are usually given freely in elicitation, at times they seem to be artificial or are calques from Georgian [...]. They occur much less often in texts than nonfinite clauses, which contain verbal nouns, infinitives, or participles. (Holisky and Gagua 1994: 201)


Example (21) illustrates the use of the past absolute in the subordinate clause. There is asyndesis. No dedicated morphological means connects the two clauses to each other. (21)

Tsova-Tush [Holisky and Gagua 1994: 204]
kalkI Jaix-čeħ-as šukia gu-as
Tbilisi go-PAST_ABS-1SG Shukia see-1SG
‘If I go to Tbilisi, I will see Shukia.’

A structural leitmotif connected to the potential replica languages throughout the entire macro-area under scrutiny is their reliance on functionally differentiated converbial constructions.15 For Turkic, Mongolic, and Tungusic (subsumed under the label Altaic), Comrie (1981: 77) states that conjunctions, whether coordinating or subordinating, tend to be absent, and either one finds finite clauses strung one after the other paratactically, or longer periods are built up in which all subordinate verbs are nonfinite, and only the last verb in the whole period is a finite form.

|| 15 In contrast, the limited stock of converbs is considered a minor common feature of Standard Average European whose members “tend to have adverbial conjunctions” (Haspelmath 2001: 1504).

The most informative in-depth study of converb constructions in trans-Uralian languages and their contact-induced changes under Russian influence is Anderson (2005). The author demonstrates that across South Central Siberia processes are observable which reflect the far advanced Russification in the domain of clause combining. Anderson’s focus is on PAT-borrowing in the course of which Russian construction patterns are replicated with autochthonous morphological means. There are, however, also instances of MAT-borrowing from Russian. These cases of loan conjunctions are dutifully registered in Section 4.2.

Converbs are by no means the privilege of the trans-Uralian languages. They are also common among cis-Uralian languages. Haspelmath (1995) describes different types of converbs for the Nakh-Daghestanian language Lezgian whereas Weiss (1995) looks at converbial constructions in Russian, which means that the donor language also displays this strategy of clause combining. Converbs are typical for Mongolic languages and their reconstructed proto-language (Janhunen 2003: 25–26). For Tungusic languages like Evenki, Nedjalkov (1995) provides the basic information about the converbial system. Similarly, Johanson (1995, 1998: 47) registers converbs as a shared feature of Turkic languages in general. Bereczki (2004: 168) sketches the Uralic proto-language as generally lacking conjunctions because paratactic strategies were employed. At the same time, “the Uralic proto-language was also characterized by simple participial constructions which fulfilled the function of subordinating clauses” (Bereczki 2004: 169). Klumpp (2002: 325–333) argues that Kamas is undergoing massive Russification which affects especially the productivity of the autochthonous system of converbs. For the Finnic branch of Uralic, Laanest (1982: 308–309) argues that asyndetic constructions represent the inherited strategy of linkage not only inter-clausally but also between and within NPs. Many of the examples correspond to Wälchli’s (2005) co-compounds which are widely common in languages throughout Siberia.

Kolyma Yukaghir has no dedicated AND. NP coordination is achieved by way of juxtaposing the conjuncts or by using either the comitative case-marker -n’e ‘with’ or the temporal adverb tāhile ‘then’ (Maslova 2003: 313–319). The asyndetic coordination of NPs is illustrated in (22).

(22)

Kolyma Yukaghir [Maslova 2003: 316]
Mēmē čugurubie tabun-get čied’e-me joŋžō-ŋi.
bear chipmunk that-ABL winter-TMP sleep-3PL.INTR
‘That is why the bear and the chipmunk hibernate.’

In accordance with the above preference for converbs, Kolyma Yukaghir makes extensive use of this strategy in the domain of subordination. Example (23) features the conditional converb. (23)

Kolyma Yukaghir [Maslova 2003: 394]
Tudel numø-ge kel-de-j-ne kudde-t
3SG house-LOC come-3SG-DS-COND kill-FUT(TR.1SG)
‘When he comes home, I will kill him.’

Uzbek – as a representative of Turkic – displays a dozen canonical conjunctions almost all of which are loan conjunctions of Persian and/or Arabic origin (cf. Section 4.2). The autochthonous strategy of linkage involves the use of converbs. Conjunctions and converbs may resemble each other functionally as shown in (24).

(24)

Uzbek [Landmann 2010: 85]
a. conjunction
Yuvin-dik va kiyin-dik
wash-1PL.PERF and dress-1PL.PERF
b. converb
Yuvin-ib kiyin-dik
wash-CONV dress-1PL.PERF
‘We have washed and dressed.’


The Mongolic language Kalmyk displays a number of so-called conjunctions whose emergence seems to be triggered by language contact with Russian (Benzing 1985: 70–71). Many of these secondary conjunctions can be shown to result from the grammaticalization of erstwhile converbs such as boln ‘and’ (NP-conjunction) which is identical to the modal converb of bol- ‘be’ and givčn ‘but’ which formally corresponds to the concessive converb of gi- ‘say’, etc. Under the rubric gerunds, Benzing (1985: 48–54) distinguishes eighteen different converbial forms. According to Bläsing (2003: 243–244), there are nine categories onto which these forms can be mapped. The concessive converb is illustrated in (25).

(25)

Kalmyk [Benzing 1985: 50]
Xurta bol-v čign zug dulan ödr
rain:SOC be-CONV yet warm day
‘It is a warm day although it is raining.’

Tungusic languages display very similar properties. In Even, for instance, there is the enclitic -dә ‘and, too’ which serves to connect any kinds of conjuncts; it may also be used as correlative -dә...-dә ‘both...and’.16 Malchukov (1995: 19) presents a list of enclitic particles some of which resemble conjunctions functionally as e.g. =gal~=gel ‘but’. Other so-called conjunctions are grammaticalized converbs or case-forms of pronominals (Benzing 1955: 111–112). Malchukov (1995: 17–18) distinguishes eleven productive converbs. In (26), the preceding converb denotes “a secondary event immediately preceding a primary event” (Malchukov 1995: 17). As in (23), (24b), and (25), we are facing a monosyndetic construction with the connector being part of the first conjunct (Haspelmath 2007: 6). (26)

Even [Malchukov 1995: 17]
n’eekičen-Ø tööre-se-mnin dege-l-re-n
duck-NOM quack-MOM-PRE_CONV fly-INCH-NFUT-3SG
‘The (wild) duck quacked and (immediately) flew away.’

The isolate Nivkh gives evidence of correlative-like constructions in the case of NP-coordination (Comrie 1981: 270–271). The comitative case-marker -xe (dual) / -xo (plural) is attached to each of the conjuncts as in (27). This is an example of a postpositive bisyndetic construction (Haspelmath 2007: 6).

|| 16 The enclitic -də and the correlative -də...-də are reminiscent of Turkic -da/-de and -da/-de ...da/-de.


(27)

Nivkh [Gruzdeva 1998: 17]
n’i ņaķr-Ø-ux k’eķ-xe hyjk-xe zif-ku n’řy-d’
1SG snow-SG-LOC fox-COM.DU hare-COM.DU track-PL see-FIN
‘I saw fox’s and hare’s tracks on the snow.’

Sentence particles are said to be overwhelmingly derived from the auxiliaries ha- ‘do so’ and hoĝa- ‘be so’ which host the appropriate converbial morphology (Gruzdeva 1998: 38). On the back-cover of Gruzdeva’s grammatical sketch, we read that “Nivkh is well-known for its numerous converbs” of which there are about thirty. In the domain of clause combining, Gruzdeva (1998: 48) mentions coordinating verbal forms which also reflect the format of correlatives because coordination has to be marked on each conjunct. Subordination is achieved by way of using converbs. The causal converb is illustrated in (28). This is another example of the pattern A-CO B (Haspelmath 2007: 6). (28)

Nivkh [Gruzdeva 1998: 51]
n’i ķ’o-xryry ys n’-za-d’
1SG sleep-CONV.CAUS master 1SG-beat-FIN
‘Because I fell asleep, (my) master beat me.’
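The pattern labels used in this section (A-CO B, A-CO B-CO, and so on) follow from two simple parameters: which conjunct carries a coordinator, and whether the coordinator precedes or follows it. The Python sketch below is only an illustration of that notation; the function names and the 'pre'/'post' encoding are assumptions introduced here, and the language-to-parameter assignments merely restate what the examples above show.

```python
from typing import Optional


def coordination_pattern(first: Optional[str], second: Optional[str]) -> str:
    """Build a label such as 'A-CO B' from the position of the coordinator.

    Each argument is None (no coordinator on that conjunct), 'pre'
    (the coordinator precedes the conjunct) or 'post' (it follows it).
    """
    def label(name: str, position: Optional[str]) -> str:
        if position is None:
            return name
        if position == "pre":
            return f"CO-{name}"
        if position == "post":
            return f"{name}-CO"
        raise ValueError(f"unknown position: {position!r}")

    return f"{label('A', first)} {label('B', second)}"


def syndesis(first: Optional[str], second: Optional[str]) -> str:
    """Name the construction type by the number of overt coordinators."""
    count = sum(position is not None for position in (first, second))
    return ("asyndetic", "monosyndetic", "bisyndetic")[count]


# Parameter settings read off the examples in this section (illustrative only).
examples = {
    "Kolyma Yukaghir NP coordination (22)": (None, None),
    "Even preceding converb (26)": ("post", None),
    "Nivkh comitative marking (27)": ("post", "post"),
    "Nivkh causal converb (28)": ("post", None),
}

for description, (first, second) in examples.items():
    print(description, "->", coordination_pattern(first, second), syndesis(first, second))
```

A canonical conjunction that precedes the second conjunct would be encoded as `coordination_pattern(None, "pre")`, which yields the label A CO-B.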

To close this section, we briefly look at Ket. According to Comrie (1981: 265), Ket is special insofar as [u]nlike many of the other languages of the area, Ket does not have a well-developed system of nonfinite forms, whereas it does have a number of conjunctions, including many native conjunctions in addition to a current tendency to borrow conjunctions from Russian.

Werner (1997a: 115–116) emphasizes that in lieu of making use of a dedicated AND-conjunction, Ket employs the comitative case or alternatively the adverb haj ‘in addition’. As far as we can judge, the comitative marker is not repeated on all conjuncts. Furthermore, Werner (1997a: 342–359) dedicates a chapter of his reference grammar of Ket to the problems posed by polypredicative sentences. There are several ways to link clauses to each other. Most importantly, Werner (1997a: 344) argues against classifying the Ket case in analogy to those of the previously discussed language families (Uralic, Turkic, Mongolic, Tungusic) because, in Ket, the predicates of the clauses are always finite. This means that the supposed postpositions which serve as connectors are homophonous conjunctions in Werner’s (1997a: 318–319) interpretation. In (29), the postposition dugde ‘while’ (< d- possessive prefix of the 3rd person singular feminine/neuter + ugde ‘long’) is used on the temporal clause and thus conforms to the areally preferred monosyndetic construction type with the connective occurring on the first conjunct.

(29)

Ket [Werner 1997a: 349]
Qima da-ukl’ivet-dugde dɨl’gat (t)-tɔl’damin
grandmother 3SG-cook_soup.PAST-while child:PL (3PL)-undress.PAST.REFL.PL
‘The children went to bed [lit. undressed] while the grandmother was cooking soup.’

Beyond this postpositional strategy, Werner (1997a: 344) mentions the existence of juxtaposition with accompanying prosodic means, case-inflexion on the subordinate clause, and the use of pronominal or adverbial word-forms.

The above selective presentation cannot do justice to the diversity of structural options the replica languages have in stock. We have picked a variety of properties which prove that the models of clause linkage and NP-coordination familiar from Standard Average European have no monopoly across the languages mentioned in (14). Conjunctions of the canonical type are not entirely alien to some of these languages. However, canonical conjunctions represent only a minor option which has to compete with several major strategies. The integration of loan conjunctions of the canonical type is therefore at least a partial innovation in the replica languages.

4 The catalogue

In this section, we present the empirical findings which result from our search of the extant literature on the languages of the former Soviet Union. Wintschalek (1993: 74) emphasizes the role of Russian as the common superstrate of the languages of the Volga-Kama region. Among many other shared properties, Turkic (Chuvash, Tatar) and Uralic languages (Mari, Udmurt) of this area borrow identical constructions materially from Russian as e.g. the correlative ni...ni ‘neither...nor’. More generally, Johanson (1997) briefly reports on Russian loan conjunctions in Turkic languages. Johanson (2006: 11–12) mentions a variety of cases of MAT-borrowing of conjunctions with Turkic and other languages of the former Soviet Union being situated at the receiving end of the process. Van den Berg (2004: 215) addresses the widely common borrowing of the copulative conjunction wa ‘and’ from Arabic in numerous Nakh-Daghestanian languages. These and other studies are suggestive of the existence of very fertile ground for language-contact related investigations on the basis of the autochthonous languages of the former Soviet Union.

The topic of language contact in the Russian Federation has been addressed twice in international conferences organized by the Vinogradov Institute for the Russian language of the Russian Academy of Sciences in Moscow, namely Indigenous Languages in Contact with Russian: Morphosyntactic and Semantic Interference (30 November – 1 December, 2018) and Indigenous Languages of Russia in Contact with Russian (11–13 February, 2021). Moreover, there is an edited volume dedicated explicitly to language contact in the former Soviet Union (Forker and Grenoble 2021b). Forker and Grenoble (2021a) emphasize the general linguistic importance of studying Slavicization processes also in comparative perspective. Loan conjunctions are a recurrent theme in these conferences, and comparative studies like that of Khomchenkova and Stoynova (2021) on the Russian influence in the domain of subordination in three different replica languages strongly suggest that the time is ripe for a large-scale comparison of the borrowing behavior of the replica languages in general. Our own study only marks the very beginning of the envisaged project.

Before the facts are disclosed in Section 4.2, it is necessary to mention potential pitfalls in Section 4.1 which have to be borne in mind when the data are analyzed. It is also necessary to explain that we have found reliable evidence of loan conjunctions in the majority of our sample. As shown in Figure 1, loan conjunctions are corroborated to exist in 97 of 137 sample languages, i.e. 71% of the members of the sample can be considered to be confirmed replica languages. Those 40 members of the sample for which the existence of loan conjunctions could not be confirmed on the basis of our sources are listed alphabetically in (30).

(30)

Sample languages without confirmed loan conjunctions

Ainu†, Assyrian (Neo-Aramaic), Belarusian, Buryat, Chukchi, Chulym, Dungan, Even, Georgian, Godoberi, Kamas†, Kerek†, Koryak, Koryo-mar (= Korean), Kurdish, Kurmanji, Laz, Lithuanian, Livvi, Lomavren, Ludian, Mingrelian, Moldavian, Naukan, Nenets, Nganasan, Nivkh, Oroch, Orok, Ossetic, Pontic Greek, Romani Vlax, Sirenik†, Tofalar (Karagas), Udege, Ukrainian, Ulch, Yakut, Yukaghir Northern, Yukaghir Southern


Figure 1: Share of confirmed conjunction-borrowers in the sample (confirmed: 97 languages, 71%; no evidence: 40 languages, 29%).
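The figures reported for the sample can be cross-checked mechanically. The following Python sketch does no more than recount the list in (30) and recompute the two shares shown in Figure 1; the variable and function names are assumptions made for this illustration, while the language names and the totals (137 sample languages, 97 confirmed) are taken from the text.

```python
# Languages listed in (30): sample members without confirmed loan conjunctions.
NO_EVIDENCE = [
    "Ainu", "Assyrian (Neo-Aramaic)", "Belarusian", "Buryat", "Chukchi",
    "Chulym", "Dungan", "Even", "Georgian", "Godoberi",
    "Kamas", "Kerek", "Koryak", "Koryo-mar (= Korean)", "Kurdish",
    "Kurmanji", "Laz", "Lithuanian", "Livvi", "Lomavren",
    "Ludian", "Mingrelian", "Moldavian", "Naukan", "Nenets",
    "Nganasan", "Nivkh", "Oroch", "Orok", "Ossetic",
    "Pontic Greek", "Romani Vlax", "Sirenik", "Tofalar (Karagas)", "Udege",
    "Ukrainian", "Ulch", "Yakut", "Yukaghir Northern", "Yukaghir Southern",
]

# Members of (30) that carry the dagger marking them as extinct.
EXTINCT = {"Ainu", "Kamas", "Kerek", "Sirenik"}

SAMPLE_SIZE = 137  # total number of sample languages, as stated in the text
CONFIRMED = 97     # languages with confirmed loan conjunctions, as stated in the text


def share(part: int, whole: int) -> int:
    """Return the percentage of `whole` made up by `part`, rounded to an integer."""
    return round(100 * part / whole)


# Consistency checks: the list and the reported totals must add up.
assert len(NO_EVIDENCE) == SAMPLE_SIZE - CONFIRMED
assert EXTINCT <= set(NO_EVIDENCE)

print(f"confirmed:   {CONFIRMED} ({share(CONFIRMED, SAMPLE_SIZE)}%)")
print(f"no evidence: {len(NO_EVIDENCE)} ({share(len(NO_EVIDENCE), SAMPLE_SIZE)}%)")
```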

Twenty-one of these languages are classified as trans-Uralian. Nine are spoken in contemporary independent states and another four languages are marked as extinct. The closest relatives of Russian – Belarusian and Ukrainian – are also part of this group, which is no surprise since their system of conjunctions and clause-linkage practices (including the relators themselves) can be taken to be very similar to that of Russian so that MAT-borrowing is hardly an issue. On the other hand, we assume that the appearance of languages like Chulym, Kamas, and Livvi in (30) is incidental in the sense that the sources we had access to do not go into the topic of interest. As to Kamas, Alexander Arkhipov (p.c.) argues that there is evidence of Russian loan conjunctions already in the corpus compiled by Kai Donner in the early 20th century. The existence of loan conjunctions could not be verified in time to integrate the data in this study, which is why we register Kamas as a language without evidence for loan conjunctions. Since Anderson (2005) counts Chulym among those languages in South Central Siberia which are especially exposed to Russian influence in syntax, it is to be expected that evidence of loan conjunctions will be found in future studies also for this and other languages in (30). The classification will be revised in future research on this subject.17

|| 17 Since parallel to this project there were other pressing issues of an academic nature to be seen to, we arbitrarily declared 15 March, 2021 the deadline for the admission of further cases in the database – be they new replica languages or further loan conjunctions. Before this deadline expired we had checked at least one source per language for the entire sample.


We exclude the languages listed in (30) from the further discussion. As mentioned in the previous paragraph, their exclusion does not preclude the possibility that they too have borrowed conjunctions – a fact which the linguistic descriptions pass over tacitly for whatever reasons. However, it would be methodologically unacceptable to generalize without material evidence over the borrowing behavior of languages in analogy to that of some of their relatives and/or neighbors. There is thus a good motive for continuing this research to determine whether the gaps in the distribution of loan conjunctions over the sample languages can be filled.

4.1 False friends

Conjunctions in different languages may resemble each other without being historically related. This is the case with the loan conjunction vaya ‘or’ which is of Persian origin but may be mistaken for an Arabo-Persian hybrid since Persian has borrowed wa ‘and’ as va from Arabic. However, vā ‘or’ was already firmly established in Old Iranian and later on underwent reinforcement by way of fusing with the synonymous yā ‘or’ (Skjærvø 2009: 150). This is not the only example of unrelated lookalikes which are easily mistaken for borrowings.

Several of the loan conjunctions are monosegmental or bisegmental chains. This phonological shape applies, for instance, to candidates like Russian a ‘and, but’, i ‘and’, da ‘and (also)’, no ‘but’. These very short phonological chains are difficult to judge if they show up in similar function in a given replica language because nothing prevents their contact-independent emergence in this language. Monosyllabic conjunctions are frequently attested cross-linguistically as shown by Stolz et al. (2012: 208–209) who list mono-vocalic cases from outside Europe such as Brahui o ‘and’, Chocho ā ‘or’, Kilivila e ‘and’, Limba o ‘and’, Mende ɔɔ ‘or’, and Somali oo ‘and, but’. It would be nothing out of the usual if a language of our sample developed monosegmental/bisegmental conjunctions of its own whose shortness might be explicable with reference to their supposed high token frequency. Given that the number of vowels which could possibly be part of the segmental chains is severely limited, chance similarity cannot be ruled out.

Some problems are easy to solve. This is true of the bisyndetic coordination with the enclitics =i...=i ‘both...and’ in the Abkhaz-Adyge language family as e.g. in Abkhaz sare=i bare=i ‘both me and you’ (Klyčev and Čkadua 2001: 124). There is the vague outward resemblance with Russian i ‘and’ but in contrast to the latter we are not dealing with a canonical conjunction in the first place.
In Abkhaz, there is no autochthonous canonical conjunction. The Abkhaz pattern reflects Haspelmath’s (2007: 6) pattern A-CO B-CO whereas the Russian pattern is a realization of A CO-B. The undeniable functional equivalence of the Russian and the Abkhaz case notwithstanding, there is no contact-related connection between them.

Similarly, Chechen =ʔa ‘and’ is superficially reminiscent of Russian a ‘and, but’. However, as in the previous case, we are dealing with an enclitic but this time it is monosyndetic and is attached to the first conjunct or to the word immediately preceding the finite verb (Nichols 1994a: 59–60). The morphosyntactic differences between Russian a ‘and, but’ and Chechen =ʔa ‘and’ are too big to be put aside. In this case too, we assume incidental similarity. This line of argumentation also holds for Chechen’s sister language Ingush where =ʔa ‘and’ is attested, too (Nichols 1994b: 126).

Furthermore, the Georgian conjunction da ‘and’ is canonical and looks like Russian da ‘and (also)’. However, the Georgian conjunction is attested already as early as the Old Georgian period (4th–11th century) (Fähnrich 1994: 197), i.e. its existence predates the beginning of Russian-Georgian language contacts by several centuries.

Serious difficulties arise again with Russian da ‘and (also)’ and its equivalents in Turkic replica languages. From the inventory of modern Turkish conjunctions in (12) we know that there is enclitic =da/=de ‘and also, but’. This element is widely common also in most of the Turkic languages of our sample (such as Turkmen, Yakut, Kyrgyz, etc.). In all of these cases, the connective is enclitic and belongs to the first conjunct. In most of the modern Turkic languages it is subject to vowel harmony. We take this to mean that the element under scrutiny is part of the common heritage of the Turkic languages and only incidentally resembles Russian da ‘and (also)’. This also holds for the correlative uses.

A particularly intriguing case is Russian ni...ni ‘neither...nor’.
From (10) and (12), we are familiar with the existence of the synonymous correlatives na...na (= [næ]...[næ]) and ne...ne (de) in Persian and Turkish, respectively. Turkish has borrowed this bisyndetic construction from Persian. Correlative constructions of this kind are reported for a dozen of the replica languages in our sample but it is not always clear which of the donor languages they have been borrowed from. As an ad-hoc shortcut solution, we apply the following rule of thumb. Unless the sources determine the donor language explicitly, if the correlative involves a high front vowel /i/ and/or palatalization of the preceding nasal, we assume a Russian origin. If, however, there is a low or mid-open front vowel (without palatalization), a Persian origin is postulated. It is possible that the phonological similarity of the Russian and Persian patterns has facilitated the replacement of the latter with the former in several Turkic languages. Further problematic issues will be mentioned separately in the empirical catalogue presented in the subsequent Section 4.2.
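The rule of thumb for correlatives of the ni...ni type can be restated as a small decision procedure. The Python sketch below is only an illustration under stated assumptions: the class, the function, the feature labels, and the sample records are all introduced here for the purpose of the example and are not part of the database. An explicit statement in the source always takes precedence; otherwise a high front vowel and/or a palatalized nasal points to Russian, and a low or mid-open front vowel without palatalization points to Persian.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Correlative:
    """A 'neither...nor' correlative as recorded for a replica language.

    All field names and value labels are assumptions made for this sketch.
    """
    form: str
    vowel: str                        # "high_front", "low_or_mid_open_front" or "other"
    palatalized_nasal: bool = False   # is the nasal before the vowel palatalized?
    donor_in_source: Optional[str] = None  # donor named explicitly by the source, if any


def assumed_donor(item: Correlative) -> str:
    """Apply the rule of thumb; return 'undetermined' if no clause applies."""
    # 1. An explicit identification in the descriptive source overrides the rule.
    if item.donor_in_source is not None:
        return item.donor_in_source
    # 2. High front vowel /i/ and/or palatalization of the nasal: Russian.
    if item.vowel == "high_front" or item.palatalized_nasal:
        return "Russian"
    # 3. Low or mid-open front vowel (palatalization already excluded above): Persian.
    if item.vowel == "low_or_mid_open_front":
        return "Persian"
    return "undetermined"


# Hypothetical records, invented to exercise each branch of the rule.
records = [
    Correlative("ni...ni", "high_front"),
    Correlative("nä...nä", "low_or_mid_open_front"),
    Correlative("n'e...n'e", "low_or_mid_open_front", palatalized_nasal=True),
    Correlative("ni...ni", "high_front", donor_in_source="Persian"),
]

for record in records:
    print(record.form, "->", assumed_donor(record))
```

The order of the checks matters: testing for palatalization before vowel quality is what implements the restriction "without palatalization" on the Persian branch.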


4.2 True friends

In this section, we uncover our loan-conjunction database (as of early 2021). Only those languages are taken into account which have been reported to borrow conjunctions.18 The cases are presented for each replica language separately. The replica languages are ordered alphabetically. For each of the replica languages, the stock of loan conjunctions is given. If possible, a sentential example is given for each individual replica language provided our sources illustrate the use of (loan) conjunctions in the first place. The full list of sentential examples is meant to cover as many different types of loan conjunctions as possible of those found in the database. We identify the donor languages with reference to the distant donor because many Arabisms have entered the replica languages via Persian or Ottoman Turkish. Grant (2012: 351) states that “[b]orrowing of dependent clause markers that were themselves borrowed from another language [...] seems to be especially frequent.” Our database contains also many examples of NP-conjunctions and coordinating conjunctions in general which look back on a chain of donors and replicas.

The catalogue is built on the principle of maximal coverage, i.e. we have tried to find as many different loan conjunctions as possible for each language. To this end we have consulted and exploited – wherever possible – several descriptive-linguistic sources. Note, however, that, except for two illustrative cases from Abkhaz (Section 4.2.2) and Tsez (4.2.85), we have excluded a number of doubtful cases from the discussion.19 It cannot be ruled out that on closer inspection some of these cases might re-enter the scene in a follow-up study. Far from all of the sample languages have the privilege of being described extensively and more often than once. Not all relevant texts dedicated to a given language were accessible to us. There are thus several logistic and technical factors which might skew the picture that results from the subsequent presentation.

|| 18 Thus, the use of Russian a exclusively as a discourse particle in Buryat means that this Mongolic language cannot be admitted to the catalogue. This function of the Russian loan comes to the fore in Buryat sentences like A ši juume vyigrelee xen guš? ‘And have you won anything?’ (Namdakova 2021: 63).
|| 19 A case in point is Karakalpak gä...gä ~ gej...gei ~ gej de...gej de ‘either...or’ for which Baskakov (1952: 521) assumes a Persian origin. We have come across similar references in connection with several other replica languages. It is not clear to us what the Persian original of this borrowing should be. We have decided therefore not to discuss this and similar cases in this study.


For the empirical part of our study we heavily rely upon the descriptive linguistic sources. We cannot be sure that the sources properly distinguish codeswitching phenomena from genuine borrowing. We are interested only in the latter but it cannot be ruled out completely that some of the data we present below fail to meet the criteria of borrowing in the first place. We admit all cases of loan conjunctions to the catalogue unless the source explicitly states that we are dealing with codeswitching. Furthermore, the loan conjunctions (and where necessary their donor language equivalents) are translated into English according to the principle of economy, i.e. we normally give only one or maximally two meanings – and the first meaning is usually the one which reflects best the coordinating or subordinating function of the item under review. This practice glosses over other (perhaps even more important) meanings associated with the loan conjunctions. Owing to the use of different sources for one and the same language, there is also variation as to the graphic representation of the loan conjunctions. The catalogue ends with Section 4.2.97 on Yugh. Immediately after the presentation of the evidence for loan conjunctions found in this Yeniseian language, we proceed to the systematic evaluation of the data to which Section 5 is dedicated. Thus, there will be no intermediate summary because the discussion in Section 5 will refer back time and again to the facts presented throughout Section 4.2 and its subdivisions.
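The conventions stated in this section (identification by distant donor, exclusion of cases flagged as codeswitching, exclusion of items attested only as discourse particles) can be pictured as a simple record type. The Python sketch below is a hypothetical illustration, not a description of how the database is actually stored: the class, its field names, and the `admitted` function are assumptions made here, and the exclusion of etymologically doubtful cases is deliberately not modeled. The two sample entries restate information given in Section 4.2.1 (Abaza ja ‘or’) and in the footnote on Buryat.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class LoanConjunction:
    """One catalogue candidate; all field names are assumptions of this sketch."""
    replica: str
    form: str
    gloss: str
    donor_chain: Tuple[str, ...]        # ordered from distant donor to immediate donor
    codeswitching_in_source: bool = False  # source explicitly speaks of codeswitching
    conjunction_use: bool = True           # False if attested only as discourse particle

    @property
    def distant_donor(self) -> str:
        """The language under which the item is filed (first link of the chain)."""
        return self.donor_chain[0]

    @property
    def immediate_donor(self) -> str:
        """The language from which the replica language took the item directly."""
        return self.donor_chain[-1]


def admitted(entry: LoanConjunction) -> bool:
    """Admit an entry unless it is flagged as codeswitching or lacks conjunction use."""
    return entry.conjunction_use and not entry.codeswitching_in_source


# Abaza ja 'or': loan from Turkish, ultimately of Persian origin.
abaza_ja = LoanConjunction("Abaza", "ja", "or", ("Persian", "Turkish"))
# Buryat a: Russian loan used exclusively as a discourse particle.
buryat_a = LoanConjunction("Buryat", "a", "and", ("Russian",), conjunction_use=False)

for entry in (abaza_ja, buryat_a):
    print(entry.replica, entry.form, entry.distant_donor, admitted(entry))
```

Keeping the whole chain rather than a single donor label makes it possible to file an item under its distant donor while still recording the mediating language.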

4.2.1 Abaza

According to Šagirov (1989), most of the members of the Abkhaz-Adyge language family give evidence of loan conjunctions. For Abaza, the author registers ja ‘or’ which is classified as a loan from Turkish (Šagirov 1989: 127). Here as elsewhere, Turkish is probably the immediate donor, but ya ‘or’ is ultimately of Persian origin. In (31), this loan conjunction is shown to link clauses in (31a) and NPs in (31b) alike.

(31)

Abaza [Arkadiev 2020: 62]
a. clause combining
w-á-ʒ.qa-gəla-ta ja w-á-ʒ.qa-gəla-mə-ztən
2SG.M.ABS-3SG.NH.IO-LOC-stand-ADV or 2SG.M.ABS-3SG.N.IO-LOC-stand-NEG-COND.REAL
‘[...] either while you are standing there or while you are not there [...]’
b. NP coordination
sará waʕa.qá-ta eseseser j-t-əw zəmʕʷá r-pnə
1SG nation-ADV USSR REL.ABS-be.in-PRS.NFIN all 3PL.IO-at
za-ḳə́~zaḳ warád ja kuplét-ḳ z-də́r-əj-d
one-INDF~DISTR song or verse-INDF 1S.ERG-know-PRS-DCL
‘I know one song or verse from all nations of the USSR.’

Arkadiev (2020: 61) mentions the coordinating conjunction wa ‘and’ which has to accompany each of the conjuncts as in (32). (32)

Abaza [Arkadiev 2020: 62]
wa kʷa-ṭáḳʷ, wa ʒ-ṭaḳʷ jálah hʷa h-nə́q̇ʷ-əw-n
and rain-little and water-little god QUOT 1PL.ABS-walk-IMPEPF-PAST
‘We were going around saying, God, a little rain, a little water!’

Our source does not address the issue of borrowing in the domain of conjunctions. Yet, this does not automatically mean that a foreign origin of Abaza wa is ruled out. On the contrary, we assume that we are facing an instance of Arabic wa- ‘and’ which has been borrowed materially but obeys the rules of Abaza syntax since its occurrence with each of the conjuncts is not a requirement of Arabic grammar. However, the autochthonous copulative conjunction jg’əj ‘and’ seems to be exempt from obligatory doubling (Arkadiev 2020: 61). In analogy to wa, we assume an Arabic origin for the adversative conjunction ma ‘but’ in (33).

(33)

Abaza [Arkadiev 2020: 61] d-ʕa-ḳʷ-ša-ṭ, ma w-qa-pssʕa-l! 3SG.H.ABS-CISL-LOC-go.around(AOR)-DCL but 2SG.ABS-LOC-fly-LAT(IMP) ‘He went around (the rock), but one can’t climb it up!’

It is possible that Abaza ma ‘but’ is the reduced form of Arabic amma’ ‘but’ transferred to Abaza via Persian and/or Turkish. That this interpretation is not entirely straightforward emerges from the discussion of the Abkhaz parallel in Section 4.2.2.

4.2.2 Abkhaz

For the Batumi variety of Abkhaz, Šagirov (1989: 130) registers the correlative construction nej...nej ‘neither...nor’, which stems from Persian but has entered the replica language via Turkish (Turkish ne...ne < Persian ne...ne). Also in Batumi Abkhaz, there is x‘em ‘and, also’, derived from Persian hem (and again mediated by Turkish) (Šagirov 1989: 137). It is doubtful whether the adversative conjunction amala ‘but’ (Klyčev and Čkadua 2001: 124) can be etymologically connected to Arabic amma’ (+ negation laa). Another puzzle is ma ‘or’ as illustrated in (34).

(34) Abkhaz [Chirikba 2003: 62]
ma wará wǝ-cá, ma sará s-ca-wá-jt‘
either you(H:M) you(H:M)-go either 1SG 1SG-go-PRES:DYN-FIN
‘Either you go, or else I shall go.’

In Chirikba’s (1996: 109) A dictionary of Common Abkhaz, the entry ma is translated as English or for Abkhaz but as English but for Abaza. There is thus an etymological connection between the two. However, it remains unclear whether we are dealing with a loan conjunction in the first place. On account of this uncertainty, we exclude both amala ‘but’ and ma ‘or’ from the list of loan conjunctions.

4.2.3 Adyghe

For the Abzakh variety of Adyghe, Paris (1989: 224) qualifies the correlative ye...ye ‘either...or’ as relatively infrequent. According to Šagirov (1989: 127), this bisyndetic construction contains ye ‘or’ borrowed from Persian via Turkish. We assume that the correlative itself has been borrowed from these sources, i.e. it is not an autonomous formation of the replica language. Paris (1989: 224) also mentions the equally infrequent correlative nəy...nəy~ney...ney ‘neither...nor’, for which we assume a Turco-Persian origin, too. The sources do not provide sentential examples.

4.2.4 Aghul

The Nakh-Daghestanian language Aghul gives evidence of the borrowing of at least three conjunctions. Besides the almost ubiquitous amma ‘but’ (Alekseev and Sulejmanov 2001: 405) and wa ‘and’ (van den Berg 2004: 215) from Arabic, there is also the originally Persian correlative ja...ja ‘either...or’ (Alekseev and Sulejmanov 2001: 405) as shown in (35).

(35) Aghul [Alekseev and Sulejmanov 2001: 405]
ja dad adese, ja bab
or father come:FUT or mother
‘Either the father or the mother will come.’

It is likely that these cases do not exhaust the list of loan conjunctions in Aghul. However, a larger database is needed to continue the search for additional cases. This situation is by no means limited to the Aghul case but is a recurrent trait across the vast majority of the languages of our sample.

4.2.5 Akhvakh

For Akhvakh (Nakh-Daghestanian), Magomedbekova (2001b: 253) mentions the Arabic loan conjunctions amma ‘but’ and va ‘and’. Example (36) illustrates the use of the adversative loan conjunction and is taken from an earlier publication of the same author.

(36) Akhvakh [Magomedbekova 1967: 138]
imixi č̣agoda biḳwari, amma č̣ilagune čanka
donkey alive was but from_home very
riḳǝal’ilajehe biḳwari.
moved_away was
‘The donkey was alive but he was far away from home.’

Whether there are further loan conjunctions in this replica language cannot be determined on the basis of the information provided in the sources we have consulted.

4.2.6 Aleut (Bering Island)

The Eskimo-Aleut language Aleut (Bering Island) is reported to borrow the copulative conjunction i ‘and’ as well as the adversative conjunction a ‘but’ from Russian.20 Examples (37)–(38) are taken from Golovko’s PhD thesis, which also covers the mixed language Mednyj Aleut to be addressed in Section 4.2.7.

|| 20 Golovko (2003: 180) claims that Aleut (Bering Island) has adopted the Russian irrealis marker by in the phonetically regular shape ku(-m), too.

(37) Aleut (Bering Island) [Golovko 2009: 254]
Taanja-n i Peetja-n ula-a
Tanja-REL and Petja-REL house-POSS
‘Tanja’s and Petja’s house’

(38) Aleut (Bering Island) [Golovko 2009: 362]
sag’alakag’ix a hilakux’ hin’a
sleep:NEG:3SG but read:3SG here
‘He doesn’t sleep, but he reads.’

Aleut boasts autochthonous means of expressing both functions, one of them being asyndetic juxtaposition. In addition, there are, for instance, the canonical conjunction (h)ama(s) ‘and’ (not to be mistaken for the Arabic adversative amma!) and the interjection tag’a, which is used to give orders to animals and as an intensifying particle in adversative contexts (Golovko 2009: 235–236).

4.2.7 Aleut Mednyj

Aleut Mednyj has been keeping scholars of language contact busy because of its mixed structure. The peculiar combination of Russian and Aleut components in the grammatical system of the language has been discussed time and again (e.g. by Thomason and Kaufman 1988: 233–237). In our context, it suffices to state that all conjunctions of the language are taken from Russian. As far as we can see, adversative relations are usually expressed by asyndetic sequences of clauses. In Golovko (2009), we have identified the following loan conjunctions: a ‘but’, i ‘and’, ili ‘or’, kogda ‘when’, ni...ni ‘neither...nor’, patomu šta ‘because’, što ‘that’, tak ‘so’, and tol’ka ‘but (then)’. There are possibly many more which have escaped our notice. In (39), we illustrate the use of the temporal conjunction kogda ‘when’.

(39) Aleut Mednyj [Golovko 2009: 447]
kamga-m ula-a kogda sixi-ča-l-i
prayer-REL house-POSS when destroy-CAUS-DISTR-PL
ikoona-n iksa-ča-l-i
icon-PL place-CAUS-DISTR-PL
‘When the church has been destroyed, the icons were buried.’

In contrast to Aleut (Bering Island), Aleut Mednyj has no autochthonous connectives to compete with the Russian loan conjunctions.


4.2.8 Altay

Direct and convincing evidence for the existence of loan conjunctions in Altay (Turkic) is scarce. According to Schönig (1998c: 414), the Russian adversative conjunction a ‘but’ should form part of the inventory of Altay connectives. The disjunctive conjunction ale ‘or’ probably goes back to Persian valī(kan) ‘but’, which is itself an Arabism in Persian. In the absence of sentential examples in the sources, we cannot illustrate the use of the loan conjunctions further.

4.2.9 Alutor

For this member of the Chukotko-Kamchatkan language family, Žukova (1980: 131) states that autochthonous conjunctions co-exist with loan conjunctions from Russian. Among the latter we find čtoby ‘in order to’, kogda ‘when’, čto ‘that’, i ‘and’. The list is probably much longer. The author presents the loan conjunctions in their Russian shape. In (40), the use of the general complementizer sjto ‘that’ is shown.

(40) Alutor [Nagayama 2003: 239]
ǝт use valum-ǝ-tke-nin ŋavakǝk
now already hear-E-IMPF-3SG.A/3SG.P daughter:ABS.SG
pǝrvisat-ǝ-lɁ-ǝ-n sjto ǝnnu asiqɁat pǝkir-ǝ-tkǝn.
speak-E-PART-E-ABS.SG that she:ABS.SG at_last arrive-E-IMPF:3SG.S
‘And [he] hears already the daughter’s voice that she is coming.’

4.2.10 Andi

Like many of its sister languages in the Nakh-Daghestanian group, Andi is reported to borrow amma ‘but’ and va ‘and’ from Arabic (Alekseev 2001a: 226). According to Salimov (2010: 215), there is also evidence of ja ‘or’ from Persian. The adversative loan conjunction is featured in example (41).

(41) Andi [Salimov 2010: 215]
Ješi išu, amma vošo k’vaṭi-la
girl at_home but boy street-LOC
‘The girl is at home, but the boy is outside.’

4.2.11 Archi

In her study of loanwords in Archi, Chumakina (2009: 445) identifies va ‘and’ from Arabic and the two Persian loan conjunctions ya ‘or’ and nagah ‘if, when’. In (42), the copulative conjunction va ‘and’ links two predicates to each other.

(42) Archi [Mikailov 1967: 143]
Tov lo vartɬir va irkkur
that child come and listen
‘That boy comes and listens (regularly).’

4.2.12 Armenian

Armenian has a rich stock of autochthonous conjunctions. In colloquial style, however, the inherited correlative ew...ew ‘both...and’ is often replaced with ham...ham ‘both...and’ (Dum-Tragut 2009: 289), which is a Persism that reached Armenian perhaps via Turkish. Example (43) suggests that the correlative construction, at least theoretically, puts no upper limit on the number of conjuncts.

(43) Armenian [Dum-Tragut 2009: 289]
ham čašaran a ham nnǰaran
and dining_room.NOM be.3SG and sleeping_room.NOM
a ham zugaran a
be.3SG and toilet be.3SG
‘[…] it is dining room, and it is sleeping room and it is toilet as well.’

4.2.13 Avar

Avar (Nakh-Daghestanian) attests to the following loan conjunctions. There are the Arabisms va ‘and’ and amma ‘but’ alongside the Persian-derived ya ‘or’, ya...ya ‘either...or’ (Alekseev et al. 2012: 239–240), nagah ‘if’, and ħamma ‘and even’ (Surxaeva 2007: 90). There are also two possible Persian-Avar hybrids, namely yaɬuni ‘or’ and yagi ‘or’, both of which contain initial ja- from Persian and a second component which is an autochthonous particle (Surxaeva 2007: 57). As we will see below, these Persian-Avar hybrids have been borrowed by a number of other Nakh-Daghestanian languages for which Avar is the near donor language.

The already familiar adversative conjunction forms part of example (44).

(44) Avar [Alekseev et al. 2012: 240]
Dun halt’-ana amma svak-a-č’o.
1SG work-PAST but get_tired-PAST-NEG
‘I have worked but I didn’t get tired.’

4.2.14 Azerbaijanian

“Azerbaijanian has numerous conjunctions, mainly of Arabo-Persian origin” (Schönig 1998a: 257). We have found evidence of eleven loan conjunctions which meet this description, namely amma ‘but’, läkin ‘but’, fäkät ‘but’, vä ‘and’ from Arabic and ki ‘that’, ya ‘or’, ya...ya da ‘either...or’, nä...nä dä ‘neither...nor’, ägär ‘if’, çünkü ‘because’, and häm...häm ‘on the one hand...on the other hand’ from Persian (Landmann 2013a: 91–92). In (45), the use of the conditional conjunction ägär ‘if’ is illustrated. In this example, the dependent clause precedes the main clause.

(45) Azerbaijanian [Landmann 2013a: 92]
əgər gəlmək istəyirsən gəl!
if come:INF want:PRS:2SG come.IMPV
‘If you want to come, just come!’

4.2.15 Bagvalal

In Bagvalal, we find those loan conjunctions which are also reported for many other Nakh-Daghestanian languages, namely Arabic amma ‘but’ and Persian ya ‘or’, ya...ya ‘either...or’, and the Persian-Avar hybrid yagi ‘or’ (Kibrik and Tatevosov 2001: 178 and 723). The Persian correlative is featured in example (46).

(46) Bagvalal [Kibrik and Tatevosov 2001: 178]
ja miq’ b=is-ē-n-ō=b, ja e=b=da
or way N=find-CAUS-IMPERF-PART=N or self=N=DA
b=ič’-ir-ō=b...
N=die-IMPERF-PART=N
‘[Young deer that has lost his way] either finds the way or dies.’

4.2.16 Bashkir

In alphabetical order, the Turkic language Bashkir is the first replica language which gives evidence of the co-existence of Arabo-Persian loan conjunctions on the one side and those of Russian origin on the other. The conjunctions borrowed from Arabic are ämmä ‘but’ and läkin ‘but’, whereas Persian has contributed häm ‘and’, jä...jä ‘either...or’, ägär ‘if’, sönki ‘because’ (< Persian čūn ke), and ki ‘that’ (Juldašev 1981: 456; Landmann 2015: 108–110). Landmann (2015: 108–110) also lists ä ‘and’ and ni...ni ‘neither...nor’ as Russian loan conjunctions in Bashkir. In (47), we provide an example for the Arabic adversative conjunction läkin ‘but’. The Russian correlative ni...ni ‘neither...nor’ is featured in (48).

(47) Bashkir [Landmann 2015: 109]
Min heððe anglajym läkin irken höjläšä
1SG 2PL:ACC understand:1SG but free speak
almajym
take:NEG:1SG
‘I can understand you, but I cannot speak freely.’

(48) Bashkir [Landmann 2015: 108]
byl bala ni alma ni gruša jaratmaj
this child neither apple nor pear like:NEG
‘This child likes neither apples nor pears.’

4.2.17 Bezhta

In her contribution to the first conference in Moscow (cf. above), Khalilova (2018) notes the presence of Russian i ‘and’ and no ‘but’ in her corpus of Bezhta texts, where the copulative i seems to be typical of the speech habits of the younger generations. These Russian loan conjunctions are complemented by borrowings from Arabic, namely va ‘and’ and amma ‘but’ (Comrie et al. 2015: 436), as well as those from Persian, namely ya ‘or’, ya...ya ‘either...or’, nagah ‘if, when’, which were probably transferred to Bezhta via Avar. Additional borrowings from Avar are the two Persian-Avar hybrid conjunctions yagi ‘or’ and yaluni ‘or’ (Comrie and Khalilov 2009: 426–428; Comrie et al. 2015: 436). Example (49) illustrates the use of the originally Persian disjunctive correlative.

(49) Bezhta [Testelec and Xalilov 2001: 310]
ya do goval, ya is ẽyal
or 1SG:ABS go or brother:ABS send
‘Either I will go or I will send the brother.’

4.2.18 Bohtan Neo-Aramaic

The information about this Afro-Asiatic language is scarce. As to the issue of loan conjunctions, we have only found Fox’s (2002: 110) claim that “[w]hen speakers do feel the need to link clauses they sometimes resort to Russian i ‘and’.”

4.2.19 Botlikh

The documentation of Botlikh (Nakh-Daghestanian) is as sparse as that of the replica language discussed in Section 4.2.18. Apart from Magomedbekova’s (2001a: 228) statement that the language boasts the loan conjunctions amma ‘but’ and va ‘and’ from Arabic and Khalilov’s (2015) word-list, which contains Persian nagah ‘if’ as well as the Persian-Avar hybrids jagi ‘or’ and jaɬuni ‘or’, nothing much can be said with certainty, not least because sentential examples are not given in the sources. Note, however, that there are publications which deal with Arabisms in Botlikh (e.g. Azaev 1973) to which we have no access. It is possible therefore that the list of loan conjunctions in Botlikh is not yet closed.

4.2.20 Budukh

The loan conjunctions in Budukh (Nakh-Daghestanian) number eight cases. Talibov (2007: 242–244) mentions va~ve~vä ‘and’, amma ‘but’, lakin ‘but’ as loans from Arabic, ya~ye ‘or’, agar~agam ‘if’, ki ‘that’ and ham...ham ‘both...and’ from Persian, and anǯaq ‘but’ from Azerbaijanian (Alekseev 1994: 290). Several of the Arabo-Persian conjunctions seem to have passed through Azerbaijanian before they reached Budukh. In (50), the two adversative loan conjunctions lakin ‘but’ and anǯaq ‘but’ can freely replace each other.

(50) Budukh [Talibov 2007: 243]
Yin kitab suxo-ǯi, lakin/anǯaq yezmi sü’ürdab.
we book read-PAST but writing do:1PL
‘We have read the book but haven’t written.’

4.2.21 Central Siberian Yupik

This member of the Eskimo-Aleut language family is special insofar as it gives evidence neither of Russian loan conjunctions nor (unsurprisingly) of those of Arabo-Persian origin. The donor language for Central Siberian Yupik is Chukchi (Chukotko-Kamchatkan). Kämpfe and Volodin (1995: 119) emphasize that among the 32 conjunctions of Chukchi, there is not a single borrowing from Russian. De Reuse (1994) has dedicated his prize-winning PhD thesis to the topic of Chukchi influence on Central Siberian Yupik. The author identifies numerous so-called particle loanwords which Central Siberian Yupik has borrowed from Chukchi (De Reuse 1994: 366–408). In this list, there are several conjunctions such as ama ‘and’21 < Chukchi əmə ‘also’, enraq ‘but’ < Chukchi ənraq ‘in turn’, esgi ‘when’ < Chukchi ecgi ‘as soon as’, etc. The borrowed copulative conjunction is illustrated in (51).

(51) Central Siberian Yupik [De Reuse 1994: 428]
ayvegh-agh-luteng enkaam ama
walrus-catch.N-AOP(3PL) then and
allanqik nanugh-agh-luteng=llu
additionally polar_bear-catch.N-AOP(3PL)=also
‘They had caught a walrus and additionally a polar bear.’

In connection with this example, De Reuse (1994: 428) emphasizes that Central Siberian Yupik tolerates the cooccurrence of several coordinating means which otherwise may also replace each other. In this case the loan conjunction ama ‘and’ combines with enkaam and =llu, both of which could also fulfil the function of coordinating connectives on their own. Thus, as in many other cases in our database, the loan conjunction ama ‘and’ does not fill a gap in the system of the replica language because it already had means to express coordination prior to contact with Chukchi.

|| 21 We cannot go into the intriguing question whether Central Siberian Yupik ama ‘and’ is in any way related to Aleut (Bering Island) (h)ama(s) ‘and’ mentioned in Section 4.2.6. Is it possible to link the Aleut (Bering Island) conjunction to its Chukchi equivalent?

4.2.22 Chamalal

The Nakh-Daghestanian language Chamalal is reported to borrow Arabic amma ‘but’ and va ‘and’ (Alekseev 2001b: 298). The copulative loan conjunction is part of example (52).

(52) Chamalal [Bokarev 1949: 124]
Innubeda č’amalaldu-be bak’ve ixv
REFL.GEN.PL Chamalal_person.M-PL be:PAST NEG
beg-zabi, xan-zabi va nucal ahul...
bek-PL khan-PL and prince folk
‘For the Chamalals themselves there were no beks, khans and prince [noble] folk...’

4.2.23 Chechen

The turnout of loan conjunctions is rather small in the case of Chechen since the only pieces of evidence for their existence we could find are amma ‘but’ from Arabic (Jeschull 2004: 260–261), illustrated in example (53), and Persian ja ‘or’ as well as ja...ja ‘either...or’ (Jeschull 2004: 257–258).

(53) Chechen [Nichols 1994a: 60]
Moguš vu swo amma k’ezzig azvella
healthy be.1SG 1SG but a_bit thin_became
‘I’m healthy, but I’ve lost a little weight.’

With reference to the Chechen case, Jeschull (2004: 263) assumes that the coordinating constructions of this Nakh-Daghestanian language resemble those of Russian very closely so that the notion of PAT-borrowing is invoked. The author especially emphasizes that the bisyndetic construction ja...ja ‘either...or’ mirrors Russian ili...ili ‘either...or’. To our mind, this parallel does not prove that Russian influence is responsible for the existence of this particular construction in Chechen. On account of many parallel examples from other replica languages in the same region and beyond, it makes more sense to assume that the donor language is Persian – both for single disjunctive ja ‘or’ and the correlative ja...ja ‘either...or’.


4.2.24 Chuvash

In contrast to other Turkic languages, Chuvash attests only to a very limited number of loan conjunctions of Persian origin. These are je ‘or’, je...je ‘either...or’, and exer ‘if, when’ (Landmann 2014b: 100). The latter loan conjunction is exemplified in (54). The pronoun of the 2nd person plural is the polite form which is coreferential with the 2nd person singular on the first finite verb.

(54) Chuvash [Landmann 2014b: 100]
exer esir vărmana kajsan epĕ te sirĕnpe pyratăp
if 2PL forest:DAT go:2SG 1SG too 2PL come:1SG
‘If you go into the forest, I go with you, too.’

4.2.25 Crimean Tatar

Crimean Tatar (Turkic) is characterized by a plethora of loan conjunctions from both Arabic and Persian. Arabic has contributed amma ‘but’, l‘akin ‘but’, faqat ‘but’, and ve ‘and’ (Memetov 2013: 530–535). In addition, (h)äm...(h)äm ‘both...and’, nä...nä dä ‘neither...nor’, ya...ya da ‘either...or’, ki ‘that’, čünki ‘because’, ägär ‘if’, (oylä …) ta ‘so that’ (Prokosch 2006: 273–277) are owed to Persian. As for the last of these, the Crimean Tatar-Russian dictionaries (e.g. Asanov et al. 1988; Useinov 2008) do not mention ta as a consecutive conjunction but register it as an intensifier particle. On account of this controversy, we refrain from admitting Crimean Tatar ta to our database. In (55), we provide an example of the causal loan conjunction čünki ‘because’.

(55) Crimean Tatar [Prokosch 2006: 278]
Šu-niñ ičün bu-nda saqt olmalı čünki
that-GEN because_of this-LOC careful be:NEC.3SG because
til mäsälä-si ġayät nazik inğä mäsälä.
language problem-POR.3SG enormously delicate fine problem
‘Therefore, one has to be careful because the language problem is an extremely delicate and difficult problem.’

4.2.26 Dargwa

In Dargwa (Nakh-Daghestanian), Arabic wa ‘and’ and am:a ‘but’ as well as Persian eger ‘if, when’, nagah ‘if’, yara ‘or’, and the correlative ya...yara ‘either...or’ are attested (Isaev 2004: 325). Abdullaev (1971: 34) classifies yara as a variant of ya. Ya and yara can combine to form a correlative construction ya...yara ‘either...or’. This is why the two conjunctions are lumped together in our account of loan conjunctions. The latter bisyndetic construction is illustrated in (56) with orthographic ä = /ja/.

(56) Dargwa [Abdullaev 1971: 34]
it-ini baribsi ä ħu-ni ma-birid, ä nu-ni
3SG-ERG do:PAST.3SG or 2SG-ERG NEG-do:IMP or 1SG-ERG
ħe-biris
NEG-do:FUT.1SG
‘What he did, neither you should do, nor I will do.’

4.2.27 Dolgan

The Turkic language Dolgan gives evidence of three Russian loan conjunctions, namely ā ‘and’, i ‘and’, and no ‘but’ (Li 2011: 185). Example (57) illustrates how ā ‘and’ is made use of in the replica language.

(57) Dolgan [Li 2011: 185]
sīder ïrïa-nï ïllï̄r e-te ā yūl’a
Sider song-ACC sing:PART be-PAST.3SG and Yuliya
üŋkǖlǖr
dance:PRS.3SG
‘Sider was singing and Yuliya was dancing.’

4.2.28 Enets (Forest)

In this Uralic language, several Russian loan conjunctions are attested, namely the subordinating čtoby ‘in order to’, jesli ‘if’, potomu čto ‘because’, kogda ‘when’, poka ‘while’ (Khomchenkova and Stoynova 2021: 90). We have not been able to determine whether the usual candidates for conjunction borrowing (copulative, disjunctive, and adversative conjunctions) are also taken from Russian. Example (58) contains the purposive loan conjunction čtoby ‘in order to’.

(58) Enets (Forest) [Khomchenkova and Stoynova 2021: 91]
pɔxi ʃtɔb sɔjza-ɔn kasu-nʲi-ʃ
dried_fish in_order_to good-PROL.SG dry_out:PERF-CONJ-3SG.PAST
‘In order that dried fish dry out well...’

4.2.29 Estonian

In the standard variety of the Uralic language Estonian, two Germanic loan conjunctions are registered. As to ent ‘but’, Laanest (1982: 292) speaks vaguely of a Germanic origin, most probably Scandinavian (Etymological dictionary of Estonian, consulted online on 8 April 2021), whereas Grant (2012: 348) assumes that ja ‘and’ goes back to Gothic. The latter copulative conjunction has made it into the grammatical systems of many members of the Finnic branch and has been handed down from there to Saamic. In their grammar of Gothic, Braune and Ebbinghaus (1973: 126) give jah ‘and, also’ as the primary copulative conjunction of this extinct Germanic language. The borrowing must have happened around the time of the Great Migrations. Example (59) involves the adversative conjunction ent ‘but’ in modern standard Estonian.

(59) Estonian [Tauli 1983: 284]
kõik pidi olema korras ent siiski
everything must:IMPERF be:INF in_order but still
pole korras
NEG.be in_order
‘Everything was said to be in order, nevertheless it is not in order.’

The Germanic loan conjunction ent has to compete with autochthonous synonyms – aga ‘but’ and kuid ‘but’ which are commonly used whereas the loan conjunction “occurs seldom” (Tauli 1983: 284). It is unclear whether there are stylistic or semantic nuances which determine which of the adversative conjunctions is chosen for a given utterance.

4.2.30 Evenki

In recent years, speakers of the Tungusic language Evenki have started to use conjunctions borrowed from Russian, such as i ‘and’ and sato ‘but’ (Bulatova and Grenoble 1999: 56). An example of the copulative loan conjunction is given in (60).

(60) Evenki [Bulatova and Grenoble 1999: 56]
gǝlǝktǝ-rǝ-n gǝlǝktǝ-rǝ-n i ba:-rǝ-n baka-mi:.
look-AOR-3SG look-AOR-3SG and unable-AOR-3SG find-PART.COND
‘He looked, looked, and couldn’t find [it].’

Grenoble (2000: 115) states that “even the most fluent speakers [of Evenki] make use of these [loan] conjunctions.”

4.2.31 Gagauz

Gagauz (Turkic) as spoken in Moldova was one of the westernmost alloglottic languages in the Soviet Union. Like most of its sister languages, Gagauz gives evidence of Arabo-Persian loan conjunctions such as the complementizer ki ‘that’, the copulative hem ‘and’, causal čünkü ‘because’, adversative ama ‘but’ (< Arabic), disjunctive ya ‘or’, and the bisyndetic ne...ne ‘neither...nor’ (Menz 2006: 142; Karanfil 2010: 88 and 116). Menz (2006: 142) is sceptical as to the existence of Slavic (Russian or Bulgarian) raz ‘when’ and už ‘as if’ postulated by Gajdarži (1981). Instead, Menz (2006: 142) assumes that the Slavic influence on Gagauz syntax comes mostly in the shape of PAT-borrowing. One of her own examples, however, proves that the adversative loan conjunction a ‘but’ also forms part of the Gagauz grammatical system as shown in (61).

(61) Gagauz [Menz 2006: 147]
Buradan stadyona kaa bän üüzeǰem
here:ABL stadium:DAT to 1SG swim:FUT.1SG
a gēri gelmem deil belli
but back come:NR.POR.1SG NEG clear
‘I can swim from here to the stadium, but it’s not certain that I will come back.’

4.2.32 German (Volga)

The Russification of everyday discourse in the varieties of German spoken in the former Soviet Union has been addressed repeatedly. Matras (2009: 140–141) discusses examples of this process without reference to loan conjunctions. However, loan conjunctions from Russian are abundantly attested in German (Volga). A cursory look at the anthology edited by Berend (2011) yields the following turnout: a ‘and, but’, i ‘and’, no ‘but’, ras ‘when’, chotj ‘although’, chotja-by ‘even if’, ili...ili ‘either...or’, lisch-by ‘if only’ (Berend 2011: 60–61, 78, 97–98, 131, 155). Berend (2011: 188) characterizes loan conjunctions and other “Kleinwörter” (‘little words’) as extremely frequent elements. In (62), it is shown that Russian liš’ by ‘if only’ is fully integrated as an inter-clausal conjunction in the German variety.22

(62) German (Volga) [Berend 2011: 146]
Sachn immer des braucht nich de
say:PL always it need:3SG NEG DEF.M
Reichtum sein lisch by das Leben jejt.
wealth be:INF if_only DEF.NT life go:3SG
‘They always say that it does not have to be wealth provided life is going well.’

|| 22 The example stems from a text which represents Wolynian German as spoken near the Siberian city of Omsk in the 1980s. Owing to the intricate migration history of the Germanophone groups in Russia and the Soviet Union, this group moved from an initial cis-Uralian settlement (western Ukraine) to Siberia and Kazakhstan in the aftermath of the German attack on the Soviet Union in the course of World War II.

4.2.33 Hinuq

It is instructive to read what Forker (2013: 451) has to say about loan conjunctions in the Nakh-Daghestanian language Hinuq:

Other conjunctions are the disjunctions ya, yagi, and yaɬuni ‘or, either’, the contrastive or adversative conjunction amma ‘but’, and the conditional conjunction nagaħ ‘if’. All these conjunctions are Avar loans. Note, however, that Hinuq uses converbs and participles for subordination; therefore, special subordinators or complementizers are not needed. The only true subordinator nagaħ is hardly ever used []. Younger speakers occasionally use the Russian conjunctions i ‘and’, ili ‘or’, no ‘but’, esli ‘if’, and the complementizer čto ‘that’, especially if they are asked to translate from Russian into Hinuq.

The author identifies two layers of loan conjunctions with two different donor languages. There is the older layer, which reflects language contact with Avar, whereas the more recent layer involves Russian loan conjunctions which are typical of the speech habits of the younger speakers. What is particularly interesting about this situation is that the older loan conjunctions are given an Avar origin. This is certainly correct if we look at the near donor. The distant donors are, however, Arabic for amma ‘but’ and Persian for ya ‘or’ and nagaħ ‘if’. Yaɬuni ‘or, either’ and yagi ‘or’ are Persian-Avar hybrids with the initial ya ‘or’ from Persian. It transpires from the dedicated literature that Avar has repeatedly functioned as a propagator of originally Arabo-Persian conjunctions and has thus contributed considerably to their wide distribution across the Caucasian region. Xalilov and Isakov (2001: 339) register the following loan conjunctions for Hinuq: yagi...yagi ‘either...or’, amma ‘but’, va ‘and’. Example (63) illustrates the use of the copulative conjunction wa ‘and’.

(63) Hinuq [Forker 2013: 544]
zeru-z haw b-ike-s wa hayɬuy haw b-aƛ’ir-a
fox-DAT that CL3-see-PAST and it.ERG that 3-betray-INF
b-uɬi-š
3-begin-PAST
‘The fox saw it and wanted to betray it.’

4.2.34 Hunzib

According to Isakov (2001: 317), Hunzib (Nakh-Daghestanian) attests to the loan conjunctions va ‘and’ and amma ‘but’ from Arabic and the correlative ja...ja ‘either...or’ from Persian. The existence of the latter is also acknowledged by van den Berg (1995: 134), who seems to have doubts as to the presence of wa ‘and’ in Hunzib (van den Berg 2004: 218). We illustrate the employment of the correlative ya...ya ‘either...or’ in example (64).

(64) Hunzib [van den Berg 1995: 134]
ya iyu-l ya αbu-l baba b-ox-át‘
or mother-ERG or father-ERG bread.CL4 CL4-buy.PRS-NEG
‘Neither mother nor father buys a loaf.’

4.2.35 Ingrian

Apart from the singular Gothic loan conjunction ja ‘and’, which is attested across most of the members of the Finnic branch of Uralic, there is evidence of several Russian loan conjunctions in Ingrian, namely a ‘but’, dai ‘and also’, i ‘and’, ili~iľi ‘or’, jes’li ‘if’, libo ‘or’, što ‘that’, štobi̮~štob(i)~štoB ‘in order to’ (Laanest 1982: 292). The disjunctive libo ‘or’ is illustrated in (65). (65)

Ingrian [Nirvi 1971: 265] oli kaks liBo kolt kiv̌vì(ä siDä be:3SG.IMPERF two or three stone:PTV.PL that:PTV küläz koGo pile keel:INESS ‘These were two or three stones all of them in the keel.’

4.2.36 Ingush

Like its sister language Chechen (Section 4.2.23), Ingush (Nakh-Daghestanian) yields a rather limited turnout of loan conjunctions. On account of our small empirical basis, we could confirm only the borrowing of amma ‘but’ from Arabic (Nichols 1994b: 126), for which we provide an example in (66).

(66) Ingush [Nichols 1994b: 127]
So var ciga amma šo-m dovnzar suona
1SG be.PAST.1SG there but 2SG-PTC see:NEG:PAST 1SG:DAT
‘I was there, but I didn’t see you.’

A dedicated in-depth search might well reveal further loan conjunctions for this language, too.

4.2.37 Itelmen

Like Chukchi, the Chukotko-Kamchatkan language Itelmen has a sizable set of autochthonous conjunctions. In contrast to its next of kin, however, Itelmen complements the inventory of conjunctions with loans from Russian. Georg and Volodin (1999: 202–206) mention a ‘and, but’, i ‘and’, kak ras ‘immediately when’, poka ‘while’, toļko ‘as soon as’. Whether this list exhausts the number of Russian loan conjunctions in Itelmen is a question we cannot answer in this study. Example (67) illustrates the use of the complex temporal conjunction kak ras ‘as soon as’.

(67) Itelmen [Georg and Volodin 1999: 207]
Xaļç laç izuse-qzu-wen sezza-qzu-wen n-sxezi-k
now sun rise-IMPERF-3SG flood-IMPERF-3SG 1PL-leave-1PL
Kamenskoj-anke kak ras sutneʔ n-ənx-kiçen ŋuʔn.
Kamenskoe-DAT as_soon_as steamer 1PL-find-1PL here
‘Well, the sun rose, the high tide came in, and we left for Kamenskoe, as soon as we found the steamer.’

4.2.38 Kabardian For the Abkhaz-Adyge language Kabardian, Colarusso (1989: 342) enumerates the coordinating conjunctions, several of which are reminiscent of Arabic and/or Persian equivalents. This is the case with disjunctive yə ‘or’ (but there is also homophonous copulative yə ‘and’) which invokes Persian ya ‘or’, and ħama ‘either...or’ which is certainly connected to Persian ham ‘or’. In addition, there is adversative awa ‘but’ (analyzed as bimorphemic a-wa in the source) whose connection to either Arabic wa ‘and’ or Arabic amma’ ‘but’ cannot be proved empirically. On the other hand, Kumaxov (2013: 271) mentions Persian ja…ja ‘either...or’ and a:mā ‘but’ which we assume to be related to Arabic amma’ ‘but’. The disjunctive loan conjunction ħama ‘or’ is featured in example (68). (68)

Kabardian [Colarusso 1992: 168]
ḥa-ha ḥama gyadəw-ha
dog-PL or cat-PL
‘dogs or cats’

4.2.39 Kalmyk The borrowing behavior of the Mongolian language Kalmyk is characterized as follows by Baranova (2021: 11) for the domain of conjunctions: В калмыцком языке практически отсутствуют случаи прямого заимствования русских союзов, по крайней мере, среди освоенных и вошедших в словарь заимствований, однако в устных текстах встречаются русские союзы. Во многих случаях не очевидно, следует ли трактовать эти союзы как переключения кодов или заимствования.23

Kalmyk thus seems to be relatively resistant to the impact of Russian in this domain of its grammar. However, Benzing (1985: 71) mentions that a ‘and, but’ has been borrowed from Russian. Baranova’s (2021: 12) own example of the Russian loan conjunction no ‘but’ forms part of a sentence which contains a sequence of three Russian words which must be considered an instance of codeswitching. We reproduce this example as (69) below. The codeswitched passage is marked out by underlining. (69)

Kalmyk [Baranova 2021: 12]
Oda cag-tə uga, oda cag-tə ter
now time-DAT NEG.COP now time-DAT that
v osnovnom korejcy ködəl-dəg bilä, no xaljmg-ud
in major Koreans work-PC.HAB be.REM but Kalmyk-PL
basə ködəl-dəg bilä
also work-PC.HAB be.REM
‘There is no [melon field] now, that time mostly Koreans worked there, but Kalmyks also worked.’

|| 23 Our translation: ‘In Kalmyk, there are practically no cases of directly borrowed Russian conjunctions, at least among the assimilated conjunctions which are mentioned in the vocabulary of loans, still Russian conjunctions are present in spoken language. In many cases it is not clear if these conjunctions should be treated as code switching or as loans.’

4.2.40 Karachay-Balkar According to Pritsak (1959b: 366), the Turkic language Karachay-Balkar has borrowed em ‘and’ and (w)ā~(w)a ‘and’ from Ossetic. It is clear, however, that Ossetic is the near donor whereas the latter loan conjunction is originally Arabic. This is also true of amma ‘but’ whose presence in the grammar of Karachay-Balkar is assumed by Aliev (1973: 277). Example (70) shows how originally Persian em ‘and’ is made use of in the replica language. (70)

Karachay-Balkar [Aliev 1973: 277]
ǯangur ǯauğanə toxtadə, em k’un tijdi
rain fall:PAST stop:PAST and sun rise:PAST
‘The rain stopped and the sun rose.’

4.2.41 Karaim Karaim (Turkic) is subdivided into several varieties scattered over different regions in the western half of the erstwhile USSR. In analogy to the case of German (Volga) in Section 4.2.32, we do not further differentiate but lump all varieties of Karaim together. Berta (1998: 314) claims that colloquial Karaim has conjunctions of Slavic origin without going into the details. Pritsak (1959a: 338–339) mentions a ‘but, and’, χot'(a) ‘unless’, i ‘and’, no ‘but still’ all of which are probably of Russian origin except no ‘but still’ for which borrowing from Polish is also an option. More cases can be found in the trilingual Karaim-Russian-Polish dictionary compiled by Baskakov et al. (1974). Table 2 presents the loan conjunctions in alphabetical order. For each item, the meaning is given as well as the donor language and the variety of Karaim in which the loan conjunction is attested according to our source.


Table 2: Loan conjunctions in Karaim according to Baskakov et al. (1974).

Conjunction           Meaning                   Donor                 Variety               Page
a                     ‘but’                     Russian               Trakai                37
ale                   ‘but’                     Polish (?), Arabic    Trakai                64
amma                  ‘but’                     Arabic                Krim                  67
i                     ‘and’                     Slavic                Galits                192
jezli                 ‘if’                      Russian               Galits                269
ki                    ‘because, in order to’    Persian               Trakai/Galits         316
l‘akin                ‘but’                     Arabic                Krim                  400
no                    ‘but’                     Polish/Russian        Trakai                420
xota, xot‘, xotei     ‘although’                Russian               Krim, Trakai, Galits  604
čunki~čünki           ‘because’                 Persian               Krim                  633
eger                  ‘if’                      Persian               Trakai/Galits         653

Baskakov et al. (1974: 168) also list daa ‘and, still’ as a borrowing from Arabic without, however, specifying the supposed Arabic etymon. Since there is no suitable candidate in Arabic as the source of the putative loan conjunction, we side with Pritsak (1959a: 338) who classifies daa as part of the Turkic heritage of Karaim. We thus exclude it from the list of loan conjunctions. Németh (2004: 113–114) convincingly argues that Karaim ale ‘but’ is not of Slavic origin alone, but Polish/Ukrainian ale and Persian vali(kan) ‘but’ from Arabic walakin ‘but’ have merged in the course of the contact history of Karaim. Example (71) illustrates the use of the concessive loan conjunction xotej ‘although’. (71)

Karaim [Pritsak 1959a: 339]
Baram saχar-χa χotäy yamχur kis-äd
go:1SG city-DAT although rain fall-3SG
‘I go into the city, although it is raining.’


4.2.42 Karakalpak In the Turkic language Karakalpak, we find several copulative loan conjunctions. Beside Arabic va~vɛ~ve ‘and’, there are also hɛm ‘and’ attributed to Persian (Wurm 1951: 593) as well as mu ‘and’ borrowed from Uyghur.24 Baskakov (1952: 516) observes differences in the distribution of the loan conjunctions. Those of Arabic origin are said to be mostly used to connect sentences or clauses with each other whereas the other loan conjunctions are used within clauses. Karakalpak also gives evidence of several adversative conjunctions, namely eli ‘but’ (< Arabic wali), lekin ‘but’ and amma ‘but’, with the latter two Arabisms having fallen out of use after the Russian Revolution as Baskakov (1952: 519) argues. In addition, there is not only the Arabo-Persian hybrid bɛlki ‘but’ but also Russian a ‘but’ (Wurm 1951: 594). The correlatives ne...ne ‘neither...nor’ (Baskakov 1952: 521) and ja...ja ‘either...or’ are loans from Persian as are the simple disjunctive ja~jaki ‘or’ (Wurm 1951: 594) and the conditional ɛger~ɛger de ‘if’ (Wurm 1951: 594).25 In (72), we show how one of the many copulative loan conjunctions is used in Karakalpak. (72)

Karakalpak [Wurm 1951: 593]
bałłar hɛm muɣallim keldi
child:PL and teacher come:PAST
‘The children and the teacher came.’

4.2.43 Karata In Karata (Nakh-Daghestanian), there are again loan conjunctions from Arabic, namely wa ‘and’, amma ‘but’. Magomedbekova (1971: 176) also mentions yal’uni ‘or’ taken from Avar which is a hybrid formation based on Persian ya ‘or’. Additionally, Rasulova (2013: 55 and 65) identifies the loan conjunctions ya ‘or’ (Persian), yagi ‘or’ (Persian-Avar hybrid), nagag’ ‘if’ (Persian) in Karata. As (73) shows there is also the Persian correlative ya...ya ‘either...or’.

|| 24 Our source assumes a Uyghur origin also for Karakalpak da ‘and’. We assume, however, that we are dealing with an inherited conjunction which is widely common across the members of the Turkic language family.
25 Baskakov (1952: 521) mentions the causal conjunction sebebi ‘because’ (lit. ‘its reason’) which reflects an original Arabic common noun. A parallel can be found in Turkmen (Landmann 2013b: 105). We are not certain as to the status of this case and thus exclude it from further discussion.


(73)

Karata [Rasulova 2013: 55]
men ya q’ajƛ’a wuɣi, ya dik’el saru woƛƛu
2SG or home stay:IMP or with_me together go:IMP
‘Either stay at home or get ready to come with me.’

4.2.44 Karelian Karelian (Uralic) is another case of a highly differentiated diatopic system whose internal subdivisions are glossed over in this study. Sarhimaa (1999: 185–188) presents sentential examples from Karelian-Russian language alternation in which conjunctions play an important role. Some of the instances of Russian conjunctions being used in a Karelian utterance seem to be bona fide loan conjunctions whereas in other cases it is next to impossible to decide whether borrowing or codeswitching applies. This difficulty is probably also characteristic of Karelian examples found in other sources. Laanest (1982: 292) assumes the same set of loan conjunctions for several members of the Finnic branch of Uralic among which we find Karelian. According to this source, there is ja ‘and’ from Gothic alongside the Russian loan conjunctions a ‘but’, dai ‘and also’, i ‘and’, ili~iľi ‘or’, jes’li ‘if’, libo ‘or’, što ‘that’, štobi̮~štob(i)~štoB ‘in order to’. As example (74) suggests, this list is not exhaustive because Laanest does not mention the adversative conjunction no ‘but’ from Russian. (74)

Karelian [Zajkov 2000: 33]
mie läks-isi-n kala-lla, no en haluo
1SG go-COND-1SG fish-ADSS but NEG:1SG want
‘I would go fishing but I don’t want to.’

It is only to be expected that an in-depth search of the available Karelian sources will reveal the existence of further Russian loan conjunctions.

4.2.45 Kazakh As to the category of conjunctions in Kazakh (Turkic), Kirchner (1998a: 327) remarks that [u]nlike more strongly Persianized Turkic languages such as Uzbek and Turkmen, Kazakh has a weakly developed system of conjunctions. The conjunction žæne ‘and’ usually only coordinates two clauses. It often occurs in constructions copied from Russian, being itself in some respects a semantic copy of Russian i ‘and’, whose use has been reinforced in the written language. Biraq ‘but’ is adversative, whereas yæ and yæki ‘or’, both copied from Persian, are disjunctive.

Muhamedowa (2009) looks at Russian-derived conjunctions in the speech of bilingual Kazakh speakers. She identifies potomu čto ‘because’, i ‘and’, a ‘but’, and ili ‘or’ as the most frequently employed Russian insertions in Kazakh (Muhamedowa 2009: 343). There is also evidence of the complementizer čto ‘that’ being used by bilinguals (Muhamedowa 2009: 348). It is not clear to us to what extent these cases are also characteristic of the speech of monolingual Kazakh speakers. Chances are that we are facing code-mixing in lieu of fully-blown borrowing. In her descriptive grammar of Kazakh, Muhamedowa (2016: 62–64) mentions only i ‘and’ and a ‘but’ as Russian loan conjunctions. According to Landmann (2012: 97–98), there are also the Persian loan conjunctions eger ‘if’, ja...ja ‘either...or’, and ne...ne ‘neither...nor’ which are fairly common across the members of the Turkic language family. The use of the Persian conditional loan conjunction is illustrated in (75). (75)

Kazakh [Landmann 2012: 98]
eger tauɢa barsandar men de baramin
if mountain:DAT go:2PL 1SG too go:1SG
‘If you go to the mountain, I will go, too.’

4.2.46 Ket A general survey of structural Russification of Ket (Yeniseian) is provided by Maksunova (2003). Werner (1997a: 318) identifies the loan conjunctions i ‘and’, nɔ ‘but’, iľi ‘or’, a ‘and’ from Russian whose existence in the replica language is confirmed by Nefedov (2015: 101). Moreover, Nefedov (2015: 105) also assumes a Russian origin for the correlative construction qōd...qōd ‘either...or’ which he connects to Russian xot’ ‘even (if)’. The use of the adversative conjunction a ‘but’ is illustrated in example (76). (76)

Ket [Nefedov 2015: 111]
dɨˀl baŋ-di-ŋta da-ses-ta a bəjbel-aŋ
child earth-N-ADESS 3F-place-be_in_position but braid-PL
əl-am
outside-N.PRED
‘The girl sits in the ground, whereas (her) braids are outside.’


4.2.47 Khakas Schönig (1998c: 414) states that in Khakas (Turkic) “some conjunctions are Russian loanwords, e.g. a ‘but’. In many cases, the way of using the conjunctions is copied from Russian.” There is thus evidence both of MAT-borrowing and PAT-borrowing. From Anderson’s (2005: 197–200 and 219–221) account of the Russification of Khakas syntax, we extract the following Russian loan conjunctions: što ‘that’, štop~štob ‘in order to’, a ‘and’, poka ‘while, until’, i ‘and’. It is very likely that this list covers only a small part of the Russian loan conjunctions used in contemporary Khakas. Example (77) involves the Russian-derived complementizer. (77)

Khakas [Anderson 2005: 197]
noɣa saɣ-n-čar što olar pu kɪzɪ-ler xorix-ča-lar
why think-PRS-2PL that 3PL this person-PL be_afraid-PRS-PL
‘Why do you think that they are afraid?’

4.2.48 Khanty For the Uralic language Khanty, several studies address the issue of Russian loan conjunctions. Nikolaeva (1999: 38) registers i ‘and’ and no ‘but’ to which Filchenko (2008: 34) adds ili ‘or’. With special reference to Eastern Khanty, Potanina and Filchenko (2016: 34) also mention that the Russian conditional construction with the conjunctions esli… to… (Russ[ian]: ‘if…, then…’) is replicated in Eastern Khanty, while the native conditional constructions are morphologically marked on the condition predicate.

Their example of this phenomenon is reproduced as (78) below. (78)

Khanty [Potanina and Filchenko 2016: 34]
esli sajm-ali əntim, to tʃoɣo jul-wən
if stream-DIM NEG then snow melt-PRS.2SG
‘If there is no stream, then you melt some snow.’

Borise and Kiss (2021: 11) argue that the appearance of Russian loan conjunctions in Khanty texts is a relatively recent development dating back to the 1930s when tangible proof of borrowed i ‘and’ and a ‘but’ emerged. The authors connect the Russification in the domain of clause linkage and NP-coordination to the growing Khanty-Russian bilingualism and increasing literacy (Borise and Kiss 2021: 35). Their evidence suggests that PAT-borrowing is the normal strategy.


4.2.49 Khinalug For the Nakh-Daghestanian language Khinalug, Alekseev (2001d: 469) takes note of the loan conjunctions anǯaq ‘but, only’ from Azerbaijanian, ja...ja ‘either...or’, nä...näm ‘neither...nor’, and ki ‘that’ from Persian as well as amma ‘but’ and va ‘and’ from Arabic. It is possible that most of these borrowings have been transferred to Khinalug via Azerbaijanian. The adversative loan conjunction amma ‘but’ is featured in example (79). (79)

Khinalug [Khvtisiashvili 2013: 159]
jir velejboli ansk-ir-z̆-šä-mä amma q̇ula gäššämä
1PL volleyball play-IMPERF-VCM.IRR-PAST-IND but rain came
‘We were going to play volleyball, but it rained.’

4.2.50 Khwarshi As to Khwarshi (Nakh-Daghestanian) and its stock of loan conjunctions, our sources distinguish between those which entered the language via Avar, namely ya ‘or’, yagi ‘otherwise’, yaɬuni ‘or’, yaɬunani ‘or’ (Karimova and Xalilov 2013: 68), and Arabic and Persian loan conjunctions. The Avar hypothesis is also defended by Khalilova (2009: 458) who argues that [i]n addition to the question particle used with each alternative, alternative questions can also use the Avar loan conjunction yagi / ya ‘or’, which is positioned between the two alternatives. Also the conjunction yagi...yagi / ya…ya ‘either...or’ can occur twice, once before each alternative.

In analogy to other cases of supposed Avar loan conjunctions, we assume that we are dealing with hybrid formations which contain a Persian and an Avar component. It is certainly correct to classify amma ‘but’ as an Arabism. However, Karimova and Xalilov (2013: 184) consider ya ‘or’ to be of Arabic origin although they also assume it to be borrowed from Avar. To our mind, ya ‘or’ is a Persian loan conjunction like nagah ‘if’ (Karimova and Xalilov 2013: 246). An example of the latter is given in (80). (80)

Khwarshi [Khalilova 2009: 413]
nagah žu ono b-eč-ło, žwarλ’ada-ya λɨn.
if that.ABS there CL3-be-COND move-IMP QUOT
‘If it (bear) is there, then move.’


4.2.51 Komi-Permyak Hausenberg (1998: 318) sets the scene when he claims that “Komi syntax shows many Uralic traits as well as innovations that are mostly due to Russian influence. Many aspects of the field still await systematic exploration.” Accordingly, Komi-Permyak (Uralic) gives evidence of a number of Russian loan conjunctions such as i ‘and’, da(j) ‘and’, no ‘but’, a ‘but’, köt’ ‘although’ (< Russian xot’), li ‘or’, and ni ‘neither’ (Avril 2006: 59). Beyond this inventory, Lytkin (1961: 82) also registers jezl’i ‘if’, l’ibö ‘or’, and al’i ‘or’ for the Yazva variety of Komi-Permyak. The concessive conjunction köt’ ‘although’ is illustrated in (81). (81)

Komi-Permyak [Avril 2006: 83]
Menym lokas köt’ lym us’ö
1SG:DAT come:3SG.FUT although snow fall:3SG.PRS
‘He will come to me although it is snowing.’

4.2.52 Komi-Zyrian According to Rédei (1978: 121), Komi-Zyrian (Uralic) has borrowed the following conjunctions from Russian: a ‘however, but, still’, da ‘and’, i ‘and, too’, aľi ‘or’, ľibe̮ ‘or’, ľibe̮...ľibe̮ ‘either...or’, ńi ‘neither’, ńi...ńi ‘neither...nor’, no ‘but’. In his investigation of the Russian impact on the Ižma variety of Komi, Leinonen (2009: 309) claims that the Russification of this variety is particularly strong because of more intensive contacts with the dominant language. The Russian impact comes especially to the fore in the domain of conjunctions. In addition to the above loan conjunctions identified by Rédei (1978), Leinonen (2009: 219) mentions potomu što ‘because’, tak što ‘so that’, and the Komi-Russian hybrid med by ‘in order to’ all of which are binary, i.e. complex formations. Conjunctional hybrids are relatively common also in other varieties of Komi (Leinonen 2009: 326). In (82), an example of what Leinonen (2009: 325) terms a “hybrid pleonastic conjunction” is given, i.e. the Russian conditional conjunction esl’i ‘if’ combines with the synonymous Komi conditional clause marker kö > ke ‘if’. (82)

Komi-Zyrian [Leinonen 2009: 325]
Esl’i ke mijan kyk tuj vyl-yn kut-am vors-ny
if if 1PL.GEN two road up-on begin-1PL.PRS play-INF
‘If our two roads are opened both down and up, then we shall play along with it.’


This combination seems to be relatively old since it was recorded already in 1910 (Leinonen 2009: 326).

4.2.53 Kryts Kryts is a member of the Nakh-Daghestanian language family. As to its loan conjunctions, the picture resembles that of many of its sister-languages. There is the adversative conjunction aman ‘but’ from Arabic side by side with the Persian loan conjunctions ki ‘that’, yä...yä ‘either...or’ (Saadiev 1994: 441–442) and na...na ‘neither...nor’ (Authier 2009: 384). The latter correlative is illustrated in (83). (83)

Kryts [Authier 2009: 384]
na ğant-ibe-zina ça-rt’-de-d-ni
neither wing-PL-INST PREV-strike-NEG.PRS-NT-PAST
na maʕan uxvats’-de-d-ni
nor sing-NEG.PRS-NT-PAST
‘It neither flapped its wings nor sang any more.’

4.2.54 Kumyk In the Turkic language Kumyk, we encounter the usual Arabo-Persian loan conjunctions, namely amma ‘but’ and va ‘and’ from Arabic as well as ya ‘or’, eger ‘if’, ne...ne ‘neither...nor’ from Persian (Doniyorova and Qahramonil 2004: 42). The adversative loan conjunction is featured in example (84). (84)

Kumyk [Doniyorova and Qahramonil 2004: 67]
Sizin jaxšy tjošjonemen amma cjojlemekka qyjynlyɢym
2PL:DAT well understand:1SG but speak:INF:ACC difficulty:1SG
‘I understand you well, but I have difficulties to express myself.’

4.2.55 Kyrgyz For Kyrgyz (Turkic), Kirchner (1998b: 352) argues that “[s]ince Iranian influence on Kirghiz has not been very intensive, the system of conjunctions is only weakly developed.” However, there is evidence of Arabisms like va ‘and’ and Persian loan conjunctions such as eger(de) ‘if’ (Kirchner 1998b: 354), že...že ‘either...or’ (Persian), ne...ne ‘neither...nor’ (Persian) (Dor 2004: 158–159). Landmann (2011: 97) also mentions simple že ‘or’ from Persian. Sentence (85) involves the conditional loan conjunction. (85)

Kyrgyz [Kirchner 1998b: 354]
eger maqul bolsoŋ aytam
if agreed become:COND.2SG tell:1SG
‘I will tell you, if you agree.’

4.2.56 Lak Schulze (2007: 13) gives the following loan conjunctions for the Nakh-Daghestanian replica language Lak: anma~amma ‘but’ and wa ‘and’ from Arabic, agar ‘if’ and ya ‘or’ from Persian. Žirkov (1955: 132) erroneously considers ya ‘or’ and agar ‘if’ to be borrowed from Arabic. He also mentions anǯaq ‘only’ from Turkic (presumably Azerbaijanian) whose status as conjunction is unclear to us and ǯaq...ǯaq ‘either...or; now...now’ from Persian. As example (86) shows, the copulative loan conjunction wa ‘and’ cliticizes to the first conjunct and thus is an instance of the pattern A-CO B (Haspelmath 2007: 6). (86)

Lak [Schulze 2007: 21]
na ina qin-nu-wa x:ari-nu t’ayla
1SG:ABS 2SG:ABS good-ADV-and happy-ADV directly
b-uk:-an-na ku-nu b-u-r.
CL3-let_go-INF-SAP:SG say:PAST-PAST CL3-be:PRS-nSAP
‘I will nicely and happily let you go directly.’

4.2.57 Latvian26 The Indo-European language Latvian (Baltic) attests to the borrowing of un ‘and’ from (Middle) Low German (cf. Curonian in (5a)). It is attested in Latvian texts as early as the 16th century when it still had the shape und~unt (Karulis 1992: 453). In contemporary Latvian, un ‘and’ serves as coordinator of NPs as well as of clauses. Example (87) shows how un ‘and’ is made use of in modern Latvian.

|| 26 As to Latgalian, an Eastern Baltic language which does not form part of our present sample, Nicole Nau (p.c.) informs us that there are Russian loan-conjunctions as well. We will take account of these cases in our follow-up studies on loan conjunctions.


(87)

Latvian [Nau 1998: 53]
no kurien-es un kas tie tād-i?
from where-GEN and what.NOM DEM:NOM.PL.M such-NOM.PL.M
‘Where are they from and what kind of people are they?’

4.2.58 Lezgian For Lezgian (Nakh-Daghestanian), Haspelmath (1993: 309) identifies the Turkic loan conjunction anžax ‘only, but’ from Azerbaijanian and the two widely common Arabisms amma ‘but’ and va ‘and’ (Haspelmath 1993: 327). There are also Persian nagah ‘if’ (Haspelmath 1993: 499), ja ‘or’ (Haspelmath 1993: 493), the Arabo-Persian hybrid wa ja ‘or’ (Haspelmath 1993: 510) which is perhaps better understood as borrowed Persian vaja ‘or’, and the originally Persian correlative ham...ham ‘both...and’ (Haspelmath 1993: 491). The Turkic loan conjunction is featured in (88) where it links two consecutive sentences so that it might better be analyzed as a discourse marker. (88)

Lezgian [Haspelmath 1993: 309]
rufun tux x̂a-ji juğ am patal xalis
stomach satisfied become-AOP day 3SG:ABS for real
suwar tir. Anžax ix̂tin suwar-ar ada-qh
holiday COP:PAST but such holiday-PL 3SG-POESS
lap t’imil že-zwa-j
very few be-IMPERF-PAST
‘A day when his stomach was full was a real holiday for him. But he had few such holidays.’

4.2.59 Livonian The moribund Uralic language Livonian has borrowed several conjunctions from Germanic languages, namely ja ‘and’ from Gothic, aļz ‘when’ from German, and un ‘and’ from (Middle) Low German via Latvian (Grant 2012: 332). There are also several loan conjunctions of Latvian origin which are mentioned by Laanest (1982: 292): bet ‘but’, ja ‘when’, jo...jo ‘the more...the’. The adversative conjunction bet ‘but’ is part of example (89).


(89)

Livonian [Winkler 1994: 314]
mina om kaval, bet täma om vēl koval-im
1SG be:PRS wise but 3SG be:PRS yet wise-CPV
‘I am wise, but he is yet wiser.’

4.2.60 Mansi The situation in Mansi (Uralic) is described by Keresztes (1998: 421) as follows: “There are native conjunctions (os ‘and, also’, man ‘or’), but the majority are of Russian origin (i ‘and’, a ‘but’).” Riese (2001: 54) adds no ‘but’ to the list of Russian loan conjunctions. The use of the adversative conjunction a ‘but’ can be gathered from example (90). (90)

Mansi [Riese 2001: 71]
χōtal χosat nōχ-nēγləs, a kon iŋ aśirmaγ
sun:NOM long appear-PAST.3SG but outside still cold:TRANSL
ōləs
be:PAST.3SG
‘The sun had been up for a long time, but outside it was still cold.’

4.2.61 Mari Eastern (= Meadow) The grammatical systems of the two Mari languages reflect foreign influence not only from Russian but also from Turkic, notably Tatar and Chuvash. Kangasmaa-Minn (1998: 242) claims that [a]part from relatively recent borrowings Mari has few conjunctions; coordination and subordination of simple finite sentences is accordingly rare. Instead, utterances are linked together with the help of non-finite verb forms.

This statement notwithstanding, there is a plethora of loan conjunctions not only in Mari (Meadow) but also in Mari (Hill). In the case of Mari (Meadow), Sibatrova (2016: 33–34) registers da/dä ‘and’, ni…ni ‘neither…nor’, i ‘and’, a ‘but’, no ‘but’, ito < Russian i to ‘and then’, zato ‘but’, to…to ‘now…now’, ili ‘or’, and li…li ‘either…or’ as coordinating loan conjunctions borrowed from Russian. As to the correlative ale…ale / äli (äl’)…äli (äl’) ‘either…or’, the author argues that it could either be of Russian origin or borrowed from Tatar or, as a third possibility, represent a contamination. Clear cases of Turkic loan conjunctions in the domain of coordination are ala…ala ‘either…or’ < Tatar әllә...әllә and ma…ma ‘either…or’ < Tatar -my /-me (Sibatrova 2016: 35). For Mari (Meadow) ja…ja ‘either…or’, either Tatar ja…ja or Bashkir jә…jә is assumed to be the source. Regardless of which Turkic language is the near donor, the distant donor remains the same, namely Persian. In the domain of subordination, Sibatrova (2016: 35–38) identifies the Russian loans pujto/vujta ‘as if’ < Russian budto, što (esli) ‘if’, esli ‘if’, koli ‘if’, raz ‘if’, dyk ‘if’ < Russian dak/dyk/tak, keč/xot’/xotja ‘although’, potomušto ‘because’ < Russian potomu čto, štovy/(y)štyvy ‘for’ < Russian čtoby, kak ‘as soon as’, poka ‘until’. There are also two subordinating loan conjunctions from Chuvash, both with concessive function, namely kerek ‘although’ < Chuvash kirek and tek ‘although; even if’ < Chuvash tek (Sibatrova 2016: 37). In example (91), the adversative conjunction a ‘but’ from Russian is featured. (91)

Mari Eastern [Sebeok and Ingemann 1961: 23]
mǝj tǝlajǝt ik šüdö teŋgǝm ogǝl, a
1SG to_you one hundred rouble:ACC NEG but
kum šüdö teŋgǝm puem oksam
three hundred rouble:ACC give:1SG money:ACC
‘I will give you not one hundred roubles, but three hundred.’

4.2.62 Mari Western (= Hill) The situation in Mari (Hill) closely resembles that reported for Mari (Meadow) in the foregoing section. Alhoniemi’s (1993: 199, 202, and 206) glossary contains the Russian loan conjunctions a ‘but; and’, ðä ‘and’, i ‘and also, too’. Khomchenkova and Stoynova (2021: 91) additionally mention jesli ‘if’, čtoby ‘in order to’, potomu čto ‘because’, poka + NEG ‘until’, and hotʲa ‘although’. In Sibatrova (2016: 35–38), we find no ‘but’, zato ‘but’, ale…ale/äli (äl’)…äli (äl’) ‘either…or’, to…to ‘now…now’, ili ‘or’, li…li ‘either…or’, potomušto ‘because’, nə…nə ‘neither…nor’, l’əvə…l’əvə ‘either…or’ < Russian libo…libo, neto…neto ‘either…or’, pujto/vujta ‘as if’ < Russian budto, koli ‘if’, raz ‘if’, dyk ‘if’ < Russian dak/dyk/tak, kak ‘as soon as’, poka ‘until’, the complementizer (ə)štə ‘that’ from Russian and kerek ‘although’ < Chuvash kirek, tek ‘although; even if’ < Chuvash tek, mä…mä ‘either…or’ from Chuvash me. In (92), the use of the conditional conjunction borrowed from Russian is illustrated. (92)

Mari Hill [Khomchenkova and Stoynova 2021: 91]
jeslʲi čə̈də̈-n ə̑l-at gə̈nʲə̈ ik igra vele prostə̑
if few-FULL be-NPAST.2SG if one game only simply
‘If there are few people, there is simply only one game.’


4.2.63 Mordvin Erzya As in the Mari cases presented above, both varieties of Mordvin (Uralic) give evidence of substantial Russification in the domain of conjunctions. For Erzya, Zaicz (1998: 211) observes that subordination is “often effected by means of conjunctions of Russian origin.” Accordingly, Keresztes (1989: 71) identifies the following Russian loan conjunctions in Erzya: i ‘and’, di ‘and’, a ‘but’, eslí ‘if’, što ‘that’, štobu ‘that’, χot’ ‘although’. Example (93) suggests that this list is far from being closed as there is evidence also of no ‘but’. Note also the presence of buto ‘as though’ which is a copy of Russian budto ‘as though’. We do not include it in the list of loan conjunctions because of its unclear status in the replica language’s system. (93)

Mordvin Erzya [Rueter 2010: 96]
seŕ-eze t́et́a-ška-nzo, no śe-d’e
height-POSS.3SG father-CPV-POSS.3SG but that-ABL
šumbra dɨ, keveŕ-ića šar buto, bojka.
healthy/stout and roll-PART.PRS.ABS ball as_though quick
‘He is tall like his father, but stouter and quick like a rolling ball.’

4.2.64 Mordvin Moksha Keresztes (1989: 71) assumes almost the same inventory of Russian loan conjunctions for both varieties of Mordvin. For Moksha, he mentions: i ‘and’, di ‘and’, a ‘but’, eslí ‘if’, što ‘that’, štobu ‘that’, χot’ ‘although’. According to Kehayov (2020: 25), who does not go into the issue of borrowing, there are also the conditional conjunctions koli~kuli ‘if’ and kəda ‘if’ which we interpret as the Moksha rendering of Russian kogda ‘when, if’. In (94), we provide an example of adversative a ‘but’. (94)

Mordvin Moksha [Kehayov 2020: 29]
T’ä ńiŋgä lac ašəź matədəv, a
this.NOM yet well NEG.PAST.1SG fall_asleep.CNG and
matədəvə-ńd́́äŕä-j, kiŕd́əst śt́enat́ńa
fall_asleep-COND-3SG hold_up:IND.PAST1.3PL wall:PL.DEF.NOM
‘He was not asleep yet, and even if he was, the walls were holding up.’


4.2.65 Nanai The Tungusic language Nanai gives evidence of several Russian loan conjunctions, namely a ‘and, but’, no ‘but’, ili ‘or’, i ‘and’, esli ‘if’, kogda ‘when’ (Oskol’skaja and Stojnova 2013: 386). The authors argue that no ‘but’ and ili ‘or’ fill gaps in the original syntactic frames of Nanai whereas the use of i ‘and’ often triggers the use of Russian syntactic patterns. For a ‘but’, esli ‘if’, and kogda ‘when’, however, Oskol’skaja and Stojnova (2013: 386–387) claim that they cooccur with their autochthonous Nanai equivalents. Furthermore, Oskol’skaja and Stojnova (2013: 387) state that coordinating loan conjunctions are used much more frequently than subordinating loan conjunctions. There are also differences between the Nanai varieties as to the frequency of either a ‘but’ or i ‘and’. The stock of loan conjunctions in Nanai looks different if we take account of the data provided by Khomchenkova and Stoynova (2021: 90) who register also čtoby ‘in order to’, poka ‘while’, potomu čto ‘because’, and poka + NEG ‘until’. Example (95) illustrates the use of the causal loan conjunction. (95)

Nanai [Khomchenkova and Stoynova 2021: 91]
acasi, patamušta piktə-gu-j bā-ri-du-ji
forbidden because child-DEST-REFL.SG find-PRS-DAT-REFL.SG
piktə-ni=təni tā-ri
child-3SG=COORD be_stuck-PRS
‘It is forbidden, because the baby can be stuck during the birth.’

4.2.66 Negidal Negidal (Tungusic) yields a surprisingly small turnout of Russisms in the domain of conjunctions. We could pinpoint only one uncontroversial case which is esli ‘if’ (Barbier 2021: 14). Its use is visible from the fragmentary conditional in (96). (96)

Negidal [Pakendorf and Aralova 2017: 106]
esli gun-ə-m malyševsk ŋənə-mi
if say-NFUT-1PL Malyshevsk go-SS.COND
‘If we go to Malyshevsk […]’


4.2.67 Nogai Csató and Karakoç (1998: 342) do not mention explicitly any loan conjunctions for Nogai (Turkic). In their grammatical sketch, we find evidence of the usual Persian correlative constructions ne...ne ‘neither...nor’ and ya...yade ‘either...or’. The chapter on conjunctions in the descriptive grammar of Nogai additionally features Persian-derived em ‘and’, Persian eger ‘if’, Russian-derived a ‘but’, and Arabic-derived ama ‘but’ (Kalmykova 1973: 291–295). The latter Arabism is featured in example (97).

(97) Nogai [Baskakov 1963: 562]
men kniga al-dïm, ama žurnal ala al-ma-dïm.
1SG book take-PAST but magazine be_able take-NEG-PAST
‘I took a book but I couldn’t take a magazine.’

4.2.68 Romani Kalderash In the Kalderash (Kotljary) variety of Romani, Oslon (2018: 356) identifies i ‘and’, i...i ‘both...and’, ili ‘or’, no ‘but’ as Russian loan conjunctions. The disjunctive conjunction ili ‘or’ is featured in example (98).

(98) Romani Kalderash [Oslon 2018: 402]
Tù źàs màn-ca ili aśès te dadè-sa?
2SG go:2SG 1SG-INST or stay:2SG at father-INST
‘Will you go with me or will you stay with your father?’

4.2.69 Romani Lithuanian The turnout of Russian loan conjunctions is bigger in the case of Lithuanian Romani. Tenser (2005: 58) registers the following cases: i ‘and’, a ‘and however’, ili ‘or’, no ‘but’, ili...ili ‘either...or’, ni...ni ‘neither...nor’, and esli (by) ‘if’. The conditional conjunction forms part of example (99).

(99) Romani Lithuanian [Tenser 2005: 49]
jesli mande te javen love, me davas tuke
if me.LOC SUBJ became.3PL money, 1SG give.1SG.REM you.DAT
‘If I had money, I would give it to you.’

On loan conjunctions | 331

4.2.70 Romani North Russian According to Matras (2002), Romani varieties can be expected to borrow extensively in the domain of conjunctions. As Sections 4.2.68–4.2.69 show, this is indeed the case also for those varieties which are spoken in the area under scrutiny. Wentzel (1980: 133) reports the existence of the Russian loan conjunctions i ‘and’, so ‘that’ < Russian čto, sóbï ‘so that’ < Russian čtoby for the North Russian variety of Romani. The copulative conjunction i ‘and’ forms part of example (100).

(100) Romani North Russian [Rusakov 2001: 316]
Joj na gyja i daj la na otdyja.
3SG.F NEG go:IMPERF and mother 3SG.F.ACC NEG give_away:IMPERF
‘She didn’t go and her mother didn’t give her [to them].’

4.2.71 Rutul Rutul belongs to the Nakh-Daghestanian language family. Its stock of loan conjunctions is in line with what is typical of many of its sister-languages. Alekseev (2001c: 418) presents amma ‘but’ from Arabic, ja...ja ‘both...and’ and ki ‘that’ from Persian to which Maxmudova (2001: 207 and 209) adds the Persian loan conjunctions ja ‘or’, ne...ne ‘neither...nor’, nagaq ‘if’, and eger ‘if’. According to van den Berg (2004: 215), there is also evidence of Arabic wa ‘and’. We illustrate the use of the copulative conjunction in (101).

(101) Rutul [van den Berg 2004: 212]
wä pɨlɨw-a ū sipil kašniš wa maddɨ
you pilaf-LOC on.top onion(ABS) coriander(ABS) and other
uq’-bɨr sɨʔɨ-r ulärä
herb-PL(ABS) throw-GER eat.PERF
‘Having put onion, coriander and other herbs on the pilaf, you ate it.’

4.2.72 Saami Akkala All four of the Saami languages of our sample attest to Russian loan conjunctions. In the case of Saami Akkala, we entirely rely on the small collection of sample texts included in Sammallahti’s (1998) survey of the Saami languages. We have identified three Russian loan conjunctions for Saami Akkala, namely i ‘and’, jees’li ‘if’, što ‘that’ (Sammallahti 1998: 146–147). Example (102) involves the Russian complementizer.

(102) Saami Akkala [Sammallahti 1998: 146]
Voajnn što aldd kuõ´ndi
see:3SG that reindeer_cow fawn:3SG.PRET
‘[] he sees that the cow has fawned []’

4.2.73 Saami Kildin On the basis of the inventory of Russian loan conjunctions in Saami Kildin, one might assume that the other varieties of Saami have also borrowed more conjunctions than those we have been able to identify in the above text anthology. In Saami Kildin, there is evidence of jesli ‘if’, patamúšte ‘because’ < Russian potomú čto, štobe~štop ‘so that’ < Russian čtob(y), što~šte ‘that’, i ‘and’, ili ‘or’, ne ‘but’ (Rießler 2007: 239), and a ‘and, but’ (Sammallahti 1998: 149). In (103), it is again the Russian-derived complementizer which represents the class of loan conjunctions.

(103) Saami Kildin [Rießler 2007: 239]
Munn tēdta šte tōnn puadak
1SG know:1SG that 2SG come:2SG
‘I know that you come.’

4.2.74 Saami Skolt Sammallahti’s (1998: 144) glossary contains Russian a ‘but’ and Gothic jä ‘and’ which has found its way into the grammar of Saami Skolt via Finnish (Feist 2010: 354). Feist (2010: 338) mentions the temporal conjunction poka ‘until’ as a loan from Russian. The typical loan conjunction ja ‘and’ is exemplified in (104).

(104) Saami Skolt [Feist 2010: 355]
čee’estõõlim kõskkrääʹjest ja e’pet
prepare.tea.PAST.1PL half.way.SG.LOC and again
jue’tǩim mää’tǩ
continue.PAST.1PL trip.SG.ACC
‘We prepared tea when we were half-way and again we continued the trip.’


4.2.75 Saami Ter In the case of Saami Ter, Sammallahti (1998: 152) registers the Russian loan conjunctions a ‘and, but’, i ‘and’, and poka ‘while’. According to Tereškin (2002: 123), no ‘but’ and jesl’i ‘if’ have also been borrowed from Russian. For the illustration of the use of these loan conjunctions in (105), we have chosen the temporal conjunction poka ‘while’.

(105) Saami Ter [Sammallahti 1998: 151]
kiit’k,ma̮s vaaļţţi pi̮jij poka̮ v́iiŕḿḿijes
little_cradle take:3SG.PRET put:3SG.PRET while net:ACC.PL
puna̮j
mend:3SG.PRET
‘She put the cradle there all right while she mended her nets [].’

4.2.76 Selkup Helimski (1998: 576) addresses the issue of conjunction borrowing in Selkup (Uralic) as follows:

Subordinating and co-ordinating conjunctions are partly borrowed from Russian, but partly result from secondary functions of native words (e.g., from employing interrogative pronouns in the role of relative ones); they occur in the present-day colloquial language more often than in traditional folklore texts, and still more often in non-Northern Selkup dialects, so sentence-combining by means of conjunctions must be attributed mainly to the influence of the Russian language.

In point of fact, Russian loan conjunctions abound in Selkup. Il’ina (1976) dedicates a separate study to this subject matter. Her inventory hosts the following cases: i ‘and’, ta/da ‘and’, i…i ‘both…and’, il’i ‘or’, al’i ‘or’, il’i…il’i ‘either…or’, al’i…al’i ‘either…or’, to l’i…to l’i ‘either…or’, l’iba…l’iba ‘either…or’, a ‘and, but’, no ‘but’, a to ‘otherwise’, annaka~atnaka ‘however’ < Russian odnako, ešl’i ‘if’ < Russian eželi, ėsl’i ‘if’ < Russian esli, kol’i ‘if’, kaby ‘if’, ras ‘once, as, because’ < Russian raz, paka ‘until’, štop~štobı ‘for’, xot’ ‘although’, and što ‘that’ (Il’ina 1976: 63–64). To this already impressive list, Kuznecova et al. (2002: 295) add the correlative ni...ni ‘neither...nor’. Example (106) illustrates the use of the copulative conjunction i ‘and’.

(106) Selkup [Kazakevič 2021: 50]
nɨmtɨ i mat podnjalsja wəʧʲʧʲɨʧʲɨsak buran mi-s-ak
then and 1SG raise raise:DETR:PAST:1SG.SBJ buran take-PAST-3SG
‘Then also I raised, raised, bought (lit. took) buran.’

4.2.77 Shor For the Turkic language Shor, Daniyarova et al. (2012: 47) claim that “[e]n shor il n’a pas de conjonctions outre паза [paza].”27 Anderson (2005: 236) demonstrates, however, that the Russian loan conjunction kogda ‘when’ exists as well, as we show in (107).

(107) Shor [Anderson 2005: 36]
kogda kel-ze-ŋ čoqta-ž-ar-ɨs
when come-COND-2 speak-RECIP-FUT-1PL
‘When you come, we will talk.’

We seriously doubt that borrowed kogda exhausts the list of Russian loan conjunctions in Shor.

4.2.78 Tabassaran As a member of the Nakh-Daghestanian language family, Tabassaran gives evidence of the expected Arabic loan conjunctions amma ‘but’ and va ‘and’ (Xanmagomedov 2001: 397). Alekseev and Šixalieva (2003: 75) add vaja ‘or’, ja...ja ‘either...or’, eger ‘if’, and nagah ‘if’ which are Persian loans. The usual adversative conjunction is involved in example (108).

(108) Tabassaran [Alekseev and Šixalieva 2003: 75]
Jiʕ riʕ alib, amma äqhüb vuji.
day sun be_at:PAST but cold be:PAST
‘The day was sunny but cold.’

4.2.79 Tajiki In the Iranian language Tajiki, there is evidence of Arabic va ‘and’ and Russian to ‘so that, in order to’ (Khojayori and Thompson 2009: 22 and 130). According to Ido (2005: 81), the Arabic loan conjunctions in Tajiki also include the synonymous adversative conjunctions vale, balki (an Arabo-Persian hybrid), ammo, lekin ‘but’ all of which are originally Arabic. We provide an example for the unique Russian loan in (109).

(109) Tajiki [Khojayori and Thompson 2009: 130]
Mo ba Misr meravem to ahromro vinem
1PL to Egypt go:1PL in_order_to pyramid:ACC see:1PL.SUBJ
‘We’re going to Egypt to see the pyramids.’

|| 27 Our translation: ‘in Shor there are no conjunctions except paza.’

4.2.80 Tat (Judeo-Tat) The two languages which go by the name of Tat and are differentiated according to the religious orientation of their speech communities are looked at separately in this study. As members of the Iranian phylum, both Tat languages belong to the Indo-European language family. With reference to the strategies of coordination in Judeo-Tat, Authier (2010: 92) argues that apart from asyndetic juxtaposition of conjuncts it is possible to place “entre eux une conjonction, soit ve empruntée à l’arabe, soit ne, emprunté aux langues lezgiques, qui est beaucoup plus fréquent.”28 The adversative conjunction ommo ‘but’ is an undisputable loan from Arabic (Authier 2010: 93). Moreover, the originally Persian causal conjunction çünki ‘because’ is characterized as a relatively recent borrowing from Azerbaijanian by Authier (2010: 266). In example (110), the conjunction ne ‘and’ is employed twice. Which Lezgian language is the donor for Judeo-Tat cannot be determined in this study.

(110) Tat (Judeo-Tat) [Authier 2010: 92]
Me ne tü=ni mi=zihim injo bebe ne
1SG and 2SG=COP FUT=live:1PL here father and
piser=e xuno
son=ATTR like
‘We will live here you and me like father and son.’

|| 28 Our translation: ‘a conjunction between them, either ve borrowed from Arabic or ne borrowed from Lezgian languages which is much more frequent.’


4.2.81 Tat (Muslim Tat) For Muslim Tat, Grjunberg (1963: 51–52) mentions the loan conjunctions anǰáq ‘but’ and yóxsa ‘or’ from Azerbaijanian, and hæm...hæm ‘both...and’ from Persian as well as amma ‘but’ from Arabic. In (111), it is shown that the two adversative loan conjunctions can replace one another freely.

(111) Tat (Muslim Tat) [Grjunberg 1963: 111]
gal zærám tÿræ, anǰáq/amma hay nædášti
call make:PAST:1SG 2SG:OBL but answer NEG:give:PERF:2SG
‘I called you, but you didn’t reply.’

4.2.82 Tatar The Turkic language Tatar has borrowed from three donor languages. Landmann (2014a: 105–106) mentions (a) ämma ‘but’ and läkin ‘but’ as Arabic loan conjunctions, (b) häm ‘and’, ja...ja ‘either...or’, ägär ‘if’, čönki ‘because’ as Persian loan conjunctions, and (c) ä ‘and’ (< Russian a) as well as ni...ni ‘neither...nor’ as conjunctions of Russian origin. Example (112) involves the Persian causal conjunction čönki ‘because’.

(112) Tatar [Landmann 2014a: 106]
Min eškä bara almadym čönki
1SG work:DAT go:CONV take:NEG:PAST:1SG because
avyrganmyn.
fall_ill:PERF:1SG
‘I could not go to work because I fell ill.’

Landmann (2014a: 106) emphasizes that the integration of this loan conjunction in particular has serious repercussions for the Turkic syntax of Tatar.

4.2.83 Tindi In Tindi (Nakh-Daghestanian), the stock of loan conjunctions comprises familiar cases, namely the Arabisms amma ‘but’, va ‘and’ (Magomedbekova 2001c: 284), the Persian loans ja ‘or’ and ja...ja ‘either...or’ as well as Persian-Avar hybrids jagi ‘or’ and jal’uni ‘or’ (Magomedova 2012: 205). Example (113) illustrates the use of the Persian correlative.

(113) Tindi [Magomedova 2012: 205]
Ja me, ja ila axarí baqviɬ’a baxva
or 2SG or mother home:LOC be_located must
‘Either you or the mother must stay at home.’

4.2.84 Tsakhur The Nakh-Daghestanian language Tsakhur behaves like the bulk of its sister-languages in the sense that it too borrows Arabic amma ‘but’ and va ‘and’, Persian va ja ‘or’ and ja ‘or’ (Talibov 2001: 427) as well as Persian ne...ne ‘neither...nor’, ja...ja ‘either...or’, and agar ‘if’ (Talibov 2004: 411–412). The copulative conjunction is featured in example (114).

(114) Tsakhur [van den Berg 2004: 210]
rasul qarɨ wa haj-na iš hawʔ-u
rasul I.come.PERF and this-ATTR work(ABS) CL3.do-PERF
‘Rasul came and did the work.’

4.2.85 Tsez In Tsez (Nakh-Daghestanian), we find amma ‘but’ (Arabic), jagi ‘or’ (Persian-Avar hybrid) (Imnajšvili 1963: 268), and ja...ja ‘either...or’ (Persian) (Xalilov 2001: 326). With reference to the latter bisyndetic construction, Comrie and Polinsky (2020: 32) claim that “[d]isjunction is expressed by the placement of -ja ‘or’ on each constituent.” As example (115) shows, the two coordinators are enclitics on their hosts.

(115) Tsez [Comrie and Polinsky 2020: 32]
k’et’u-ja ʁʔwaj-a
cat-or dog-or
‘a cat or a dog’

The pattern is A-CO B-CO (Haspelmath 2007: 6). We doubt that it has arisen from contact with Persian because we would expect the pattern CO-A CO-B to emerge. We are probably facing another false friend (see Section 4.1).


4.2.86 Tsova-Tush (= Batsbi) Whether the adversative conjunction ma~me ‘but’ (Črelašvili 2001: 200) in this Nakh-Daghestanian language is borrowed from Arabic is difficult to determine. Hauk (2020: 58), however, points to Georgian as the donor of several conjunctions of Tsova-Tush, claiming that

[o]ccasionally speakers use the Georgian conjunction და /da/ ‘and’ in Tsova-Tush. Contrastive coordinating conjunctions include magram ‘but’, originally Georgian, and ma ‘but/and,’ perhaps a clipped version of the former.

The Georgian adversative conjunction occurs twice in (116). The first instance is probably better analyzed as a discourse marker.

(116) Tsova-Tush [Hauk 2020: 311]
o st’ak’ magram, o dad, o msxal-a-n
yon man but yon father yon pear-PL-GEN
dad, ħič’ me, oquin msxal b-aq’-o-š
father look.IMPF COMP yon.one.GEN pear CM-eat.IMPF-PRS-CONV
qo pešk’ar d-uit’ magram, oquin vuntxeʔ-er
three child CM-go.IMPF but yon.one.DAT dunno-IMPF
me vux ambui j-e-r.
COMP what story CM-be-IMPF
‘That man, however, that dad, that pear farmer, sees three children going, eating his pears, but he didn’t know what happened.’

4.2.87 Turkmen According to Schönig (1998b: 269) “[t]he Turkmen conjunctions are mainly of Arabo-Persian origin.” It is no surprise therefore to find the Arabisms emma ‘but’, we ‘and’, weli ‘but’ alongside the Persian loan conjunctions çünki ‘because’, eger ‘if’, ne...ne ‘neither...nor’, hem ‘and also’ in the study by Blacher (2002: 96–97), who also registers Russian a ‘but’. Example (117) proves that there is also the correlative hem...hem ‘both...and’ borrowed from Persian.

(117) Turkmen [Landmann 2013b: 104]
Şu köçede hem dükanlar hem bazar bar.
this street:LOC and shop:PL and market EXI
‘In this street, there are both shops and a market.’


4.2.88 Tuvin Alphabetically the next member of the Turkic language family is Tuvin. In contrast to many other Turkic languages, there is no evidence of Arabo-Persian loan conjunctions. Instead, two loan conjunctions of Russian origin are mentioned in our sources, namely a ‘and, but’ (Landmann 2017: 97) and esli ‘if’ (Anderson 2005: 259). We provide an example of the latter in (118).

(118) Tuvin [Anderson 2005: 259]
esli bažɨŋ tɨp tur-ar
if house find.CONV AUX-PRS_FUT
‘…if (he) finds this house.’

4.2.89 Ubykh For the Abkhaz-Adyghe language Ubykh, Šagirov (1989: 127, 130, and 137) registers a number of conjunctions of foreign origin such as na...na ‘neither...nor’ < Turkic ne...ne < Persian ne...ne ‘neither...nor’, Persian ja ‘or’ (for which the source assumes a Turkic donor language), Persian xem ‘and, also’ (probably also transferred to Ubykh via a Turkic intermediary). Furthermore, Šagirov (1989: 127) mentions jaxut ‘or’ from Persian yaxud ‘or’ – most probably via Azerbaijanian yahud ‘or’. In addition, Fenwick (2011: 184) assumes a Turkic origin for the correlative jɜ...jɜ ‘either...or’ which, of course, is originally Persian. Example (119) features the latter bisyndetic construction.

(119) Ubykh [Fenwick 2011: 184]
kw’ɜnɨ jɜ kw-ɜw:t jɜ ɐ-ʑwɜ ʈʂ’ɜ-ʃ-ɜwːt
tomorrow or rain-FUT.II or the-sky good-become-FUT.II
‘Tomorrow, it will either rain or it will become fine’

4.2.90 Udi The Nakh-Daghestanian language Udi displays the loan conjunctions amma ‘but’ and va ‘and’ from Arabic. It is not entirely clear to us whether anǯak – borrowed from Azerbaijanian – is employed as a conjunction or only adverbially. Moreover, there are two borrowed complementizers, namely Persian ki ‘that’ and Armenian-derived te ‘that’ (< Armenian et’e) (Schulze-Fürhoff 1994: 499–500). The latter is used productively as basis for the formation of hybrid subordinators (Schulze-Fürhoff 1994: 487). The Armenian loan conjunction is also employed to introduce direct or indirect speech as in (120). It is optional in the case of direct speech.

(120) Udi [Schulze-Fürhoff 1994: 500]
ĝar-en ex̂-ne te taĝ-en ox̂al-a
son-ERG say-3SG that go-IMP.1PL hunt-DAT
‘The son says: “Let’s go hunting.”’

4.2.91 Udmurt The borrowing behavior of Udmurt (Uralic) has been the topic of several dedicated studies already. Kaysina’s (2013) study of Russian loan conjunctions reveals that there are considerable stylistic differences between the heavily Russianized colloquial register and literary Udmurt with the latter being subject to purism. Kaysina (2013: 136) mentions the loan conjunctions a ‘and, but’ and no ‘but’29 adding that

[l]ike in the model language combinations of coordinators with particles frequently occur, e.g. a ved’, a vot, no ved’, no tol’ko, which, in their turn, were also taken over from Russian and can be used separately.

Russian libo ‘or’ and ili ‘or’ (the latter being restricted to informal spoken Udmurt) compete with Persian ja ‘or’ and jake ‘or’ which have been transferred to Udmurt via Tatar (Kaysina 2013: 137). The latter Persism is also attested in the shape of the correlative ja...ja ‘either...or’. Russian bisyndetic conjunctions borrowed into Udmurt are kot’...kot’ ‘whether...or’, ne to...ne to ‘either...or’, to li...to li ‘either...or’, and to...to ‘sometimes...sometimes’ (Kaysina 2013: 137). Kaysina (2013: 139) also registers the complementizer č́to ‘that’ and consecutive č́tobi̮ ‘so that’ as well as the adverbial subordinators potomu č́to ‘because’, jesli ‘if’, raz ‘as’, kot́/xot́, xotja ‘although’. Both Kaysina (2013: 140) and Salánki (2015: 257–258) emphasize that Russian loan conjunctions often cooccur with their Udmurt functional equivalents in one and the same clause where they occupy different positions so that double marking of the relation emerges. As example (121) shows, the copulative Russian conjunction i ‘and’ is also attested among the borrowings. According to Kaysina (2015: 226) it is very common in spoken Udmurt.

(121) Udmurt [Kaysina 2015: 226]
Mi̬n-i̬m zvonítt-o i mi vańmi̬
1SG-DAT call-3PL.PRS and 1PL all
bíčáški-́ško-m i ́čálak mi̬ni-́ško-mi̬ ki̬t́či̬keno
get_together-PRS-1PL and quickly go-PRS-1PL somewhere
‘They call me and we all get together and we go somewhere.’

|| 29 Note, however, that the author assumes that “no can be considered a selective copy whose form goes back to the Proto-Permic particle, whereas the functions have been adopted from Russian” (Kaysina 2013: 135).

4.2.92 Uyghur For Uyghur (Turkic), Nadžip (1960: 102) assumes that all of the subordinating conjunctions and the majority of the coordinating conjunctions are borrowings from Persian like həm ‘and also’, həm...həm ‘and...and’. It is clear, however, that amma ‘but’, lekin ‘but, however’, pəkət ‘but’, wə ‘and’ reflect the distant donor Arabic whereas ya ‘or’, yaki ‘or’, əgər ‘if’, čunki ‘because’, ki ‘that’ (De Jong 2007: 205–206) as well as nə...nə ‘neither...nor’, əgərdə ‘if’, madəmki ‘if’ (Nadžip 1960: 102) go back to Persian. The adversative conjunction amma ‘but’ in (122) is already familiar to us from numerous other replica languages which borrow this element from Arabic.

(122) Uyghur [Nadžip 1960: 124]
Patimə xət jazmidi, amma Səmət uniŋa
Fatima letter write:NEG.PAST.3SG but Samet 3SG:DAT
išinatti
trust:FOCPRES.3SG
‘Fatima didn’t write letters, but Samet was confident in her.’

4.2.93 Uzbek For the Turkic language Uzbek, Boeschoten (1998: 374) mentions the existence of a number of coordinating conjunctions among which we find the usual suspects, in a manner of speaking. There are the Arabic loan conjunctions ammo ‘but’, lekin ‘but’ and va ‘and’ alongside the Persian cases ham(da) ‘and’, ham...ham ‘both...and’, yo ‘or’ (Doniyorova 2001: 76), agar ‘if’, chunki ‘because’, na...na ‘neither...nor’, yoki...yoki ‘either...or’ (Landmann 2010: 94–96). Russian loan conjunctions seem to be typical of urban Uzbek (especially of the variety of Tashkent). Baran (2000: 20–21) has identified a ‘but’, i ‘and’, and ili ‘or’ in this variety of Uzbek. The Arabic loan conjunction va ‘and’ forms part of example (123).

(123) Uzbek [Landmann 2010: 95]
Ikkita o’g’lim va bitta qizim bor.
two:NUM son:POR.1SG and one:NUM daughter:POR.1SG EXI
‘I have two sons and a daughter.’

4.2.94 Veps Like most of its Finnic sister-languages, Veps (Uralic) has borrowed numerous Russian conjunctions. Laanest (1982: 292) summarily lists a ‘but’, dai ‘and also’, i ‘and’, ili~iľi ‘or’, jes’li ‘if’, libo ‘or’, što ‘that’, štobi̮~štob(i)~štoB ‘in order to’. On the basis of Zajceva (1981: 295–296), we can add no ‘but’, da ‘and, but’, xot’ ‘although’, xot’ i ‘although’ to the inventory. Example (124) illustrates the use of one of the disjunctive Russian loan conjunctions.

(124) Veps [Zajceva 1981: 295]
tule͔-n tämbe͔i l’ibo homen
come-IMPERF.1SG today or tomorrow
‘I will come today or tomorrow.’

4.2.95 Votic Votic (Uralic) presents us with a plethora of Russian loan conjunctions. Ariste (1968: 111–112) lists i ‘and’, da ‘and’, dai ‘and’, i...i ‘both...and’, a ‘but’, xot’ ‘although’, što ‘that’, štobi̮~štoB ‘so that, in order that’, to...to ‘one moment...the other moment’. This list is complemented by Markus and Rožanskij (2017: 529 and 612) by way of adding to it il’i ‘or’, il’i...il’i ‘either...or’, l’ibo ‘or’, l’ibo...l’ibo ‘either...or’, jesl’i ‘if’, and no ‘but’. Example (125) hosts three tokens of Russian loan conjunctions which represent two different types.

(125) Votic [Markus and Rožanskij 2017: 51]
te̮jn bratko akardiona-ka pillitt-i a miä
other brother piano_accordion:GEN-COM play-IMPF:3SG but 1SG
i akardiona-ka i bajani-ka
and piano_accordion:GEN-COM and button_accordion:GEN-COM
‘... the other brother played piano accordion whereas I [played] both piano accordion and button accordion.’


4.2.96 Yiddish The Germanic language Yiddish has experienced influence from a variety of Slavic languages so that it is not always easy to tell potential Russisms and contributions of other Slavic donor languages apart. Jacobs (2005: 202–204) mentions the correlatives i...i ‘both...and’ (Russian), to...to ‘both...and’ (Russian/Ukrainian). There is also ci ‘or’ most probably related to Ukrainian čy or Polish czy ‘whether, or’. In (126), we give an example of the correlative i...i ‘both...and’.

(126) Yiddish [Jacobs 2005: 202]
i der lerər i di talmidəm hobn
and DEF.M teacher and DEF.PL pupil:PL have:PL
lib jontojvəm
love holiday:PL
‘Both the teacher and the pupils love holidays.’

4.2.97 Yugh The second representative of Yeniseian is the moribund language Yugh (cf. Ket in Section 4.2.43). Werner (1997b: 230–231) enumerates several Russian loan conjunctions, namely i ‘and’, a ‘and’, iľi ‘or’, nɔ ‘but’, butta ‘as if’ < Russian (kak) budto ‘as if’, što ‘that’, štɔbɨ ‘in order that’, pakuda ‘until’, jedba ‘as soon as’ < Russian edva, χɔt’ ‘although’. The final example (127) in this catalogue of loan conjunctions features the Russian complementizer.

(127) Yugh [Anderson 2005: 240]
U ladnə što k-is-o-ɣ-a-get
2SG ok that 2-fish-PV-MTS-PRS-2.ITER
‘It is ok that you fish.’

5 Evaluation The raw data as presented throughout Section 4.2 call for a systematization to see whether patterns emerge which can be used to derive generalizations which, in turn, might have a bearing on the ongoing discussion within the framework of language-contact studies. To reach this goal, we divide this section into two major parts. We start with the borrowing behavior of the donor and replica languages in Section 5.1. Section 5.2 looks at the loan conjunctions. In both sections, quantitative issues are prominently discussed. The qualitative aspects of the phenomena are given more space in Section 5.2.


5.1 Languages Figure 1 shows that there is a 71% majority of the sample languages which attest to loan conjunctions as opposed to a minority of 29% of the sample languages which have not yet been confirmed to borrow conjunctions. In this section, we check to what extent this uneven distribution is determined by factors such as genetic affiliation, political independence, geography, and/or language death. For a start, we test whether it makes a difference if a given language became extinct during the life-cycle of the Soviet Union. Starting with Figure 2, we distinguish borrowers (= replica languages which attest to loan conjunctions) from non-borrowers (= languages for which the borrowing of conjunctions could not be proved). In Figure 2, it is shown that the percentage of borrowers among those languages which suffered extinction in the 20th century is smaller by far than that of borrowers in the group of living languages.

[Bar chart, 0–100%; extinct: 3 borrowers, 4 non-borrowers; vital: 94 borrowers, 36 non-borrowers]

Figure 2: Shares of borrowers and non-borrowers according to vitality.
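The shares discussed for Figure 2 can be recomputed from the raw counts. What follows is a minimal Python sketch, not part of the original study. It assumes that the data labels of the figure read 3 borrowers and 4 non-borrowers for the extinct languages and 94 borrowers and 36 non-borrowers for the vital ones; this assignment of labels to bars is an assumption, and the function name `borrower_share` is purely illustrative.

```python
# Counts per vitality group as (borrowers, non_borrowers).
# Assumption: these values are read off the data labels of Figure 2.
counts = {
    "extinct": (3, 4),
    "vital": (94, 36),
}


def borrower_share(borrowers: int, non_borrowers: int) -> float:
    """Return the percentage of borrowers within one group."""
    total = borrowers + non_borrowers
    return 100 * borrowers / total


# Share of borrowers within each vitality group.
shares = {group: borrower_share(*pair) for group, pair in counts.items()}

# Share of borrowers in the pooled sample (both groups together).
all_borrowers = sum(b for b, _ in counts.values())
all_non_borrowers = sum(n for _, n in counts.values())
overall_share = borrower_share(all_borrowers, all_non_borrowers)

for group, share in shares.items():
    print(f"{group}: {share:.0f}% borrowers")
print(f"entire sample: {overall_share:.0f}% borrowers")
```

Under these assumptions the pooled share comes out at roughly 71%, which agrees with the majority reported for the sample as a whole, while the share for the extinct languages stays well below that of the vital ones.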

In contrast to vitality, the political factor independence does not seem to contribute to the statistical imbalance of borrowers and non-borrowers in the sample. Figure 3 clearly reveals that the sample languages behave identically no matter whether their speech community is located in contemporary independent states or in the Russian Federation. In both cases, the shares are the same as those calculated for the sample as such.

[Bar chart, 0–100%; independent states: 21 borrowers, 9 non-borrowers; Russia: 76 borrowers, 31 non-borrowers]

Figure 3: Shares of borrowers and non-borrowers according to (in)dependence.

From political status we proceed to geography. The gross bipartition of the sample in a majority of cis-Uralian languages and a minority of trans-Uralian languages was introduced in Section 3. According to Figure 4, cis-Uralian languages have a propensity to borrow conjunctions which exceeds the share of borrowers in the entire sample. In contrast, trans-Uralian languages are far less affected by conjunction borrowing than their cis-Uralian counterparts. The shares of borrowers and non-borrowers are almost equal for the trans-Uralian group.

[Bar chart, 0–100%; cis-Uralian: 73 borrowers, 19 non-borrowers; trans-Uralian: 24 borrowers, 21 non-borrowers]

Figure 4: Shares of borrowers and non-borrowers according to geography.
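The comparison between the two macro-areas and the sample as a whole can be made explicit as a deviation in percentage points. The Python sketch below is only an illustration under stated assumptions: the counts are taken to be the data labels of Figure 4 (73 and 19 for cis-Uralian, 24 and 21 for trans-Uralian), the reference value of 71% is the share of borrowers reported for the entire sample, and the helper names are hypothetical.

```python
# (borrowers, non_borrowers) per macro-area.
# Assumption: these values are read off the data labels of Figure 4.
areas = {
    "cis-Uralian": (73, 19),
    "trans-Uralian": (24, 21),
}

# Share of borrowers in the entire sample, as reported in the text.
SAMPLE_SHARE = 71.0


def share(borrowers: int, non_borrowers: int) -> float:
    """Percentage of borrowers within one macro-area."""
    return 100 * borrowers / (borrowers + non_borrowers)


def deviation(borrowers: int, non_borrowers: int) -> float:
    """Difference to the sample-wide share, in percentage points."""
    return share(borrowers, non_borrowers) - SAMPLE_SHARE


# Map each area to (share of borrowers, deviation from the sample share).
report = {name: (share(*c), deviation(*c)) for name, c in areas.items()}

for name, (pct, diff) in report.items():
    print(f"{name}: {pct:.0f}% borrowers ({diff:+.0f} points vs. sample)")
```

On this reading the cis-Uralian group lies several points above the sample-wide share, whereas the trans-Uralian group lies well below it and is close to an even split between borrowers and non-borrowers.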


Genetic affiliation is the next parameter to check. There are thirteen genetically defined groups including the heterogenous class of isolates as spelled out in (14). Figure 5 tells us that the language families behave differently from each other when it comes to borrowing conjunctions.

[Bar chart, 0–100%, borrowers vs. non-borrowers per group; bars from top to bottom: Kartvelian; Sino-Tibetan; Tungusic; Chukotko-Kamchatkan, Isolate; Afro-Asiatic, Mongolic; Indo-European; Eskimo-Aleut; Uralic; Turkic; Nakh-Daghestanian; Abkhaz-Adyge]

Figure 5: Shares of borrowers and non-borrowers according to affiliation.

Kartvelian and Sino-Tibetan yield no example of loan conjunctions. All other language families and the isolates (as a class) are involved in conjunction borrowing albeit to different extents. Particularly interesting are the top four in the ranking order because the families are sizable enough to exclude the chance factor. All five members of the Abkhaz-Adyge language family attest to loan conjunctions. With a coverage of 100% of its membership, this language family stands out from the bulk of the sample in terms of participation in borrowing. Interestingly, another language family from the Caucasian region – Nakh-Daghestanian – comes as close as one can get to the top result of Abkhaz-Adyge. With 28 borrowers out of 29 Nakh-Daghestanian languages, the borrowers claim a share of 97%. The sole case of a Nakh-Daghestanian non-borrower is Godoberi. This exceptional status might change if additional sources can be consulted. In the Turkic language family, the borrowers form an 88% majority with 21 out of 24 languages. We assume that the share of the borrowers will increase further as soon as the Chulym case can be re-opened. Three quarters of all Uralic languages of the sample are borrowers. In analogy to the Nakh-Daghestanian and Turkic cases, it is very probable that future research will reveal that the number of Uralic borrowers is higher than we were able to determine for the purpose of this study. The language families which occupy ranks 1–4 yield shares for the borrowers which are bigger – even considerably bigger – than the 71% calculated for the entire sample. In contrast to vitality and independence, geography and affiliation are the better criteria to make statements about the probability of a given language’s participation in conjunction borrowing. Of the latter two factors, affiliation yields the most reliable results.

Figure 6 discloses how many replica languages have borrowed loan conjunctions from a given donor language. Those donor languages which are involved only in one donor-replica pair form the heterogeneous category of others. In the cases of Turkic and Germanic, it is not always possible to identify an individual language as the donor of a given loan conjunction. The category of hybrids comprises those cases in which a given loan conjunction is complex in the sense that it consists of several components which stem from two different donors.

[Bar chart; x-axis: Russian, Arabic, Persian, hybrid, Turkic, Germanic, others; y-axis: number of borrowers, 0–60]

Figure 6: The number of borrowers per donor language.
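The family-level shares reported above for ranks 1–4 can be checked with a few lines of Python. This is a hedged sketch rather than the authors' procedure: the counts for Abkhaz-Adyge, Nakh-Daghestanian, and Turkic are those stated in the text, whereas the Uralic count of 21 out of 28 is an assumption (the text only says "three quarters"). The helper `round_half_up` is introduced here for illustration, because Python's built-in `round` rounds halves to the nearest even number.

```python
import math

# (borrowers, languages in the sample) for the four top-ranked families.
families = {
    "Abkhaz-Adyge": (5, 5),
    "Nakh-Daghestanian": (28, 29),
    "Turkic": (21, 24),
    # Assumption: 21 of 28; the text itself only says "three quarters".
    "Uralic": (21, 28),
}

# Share of borrowers in the entire sample, as reported in the text.
SAMPLE_SHARE = 71


def round_half_up(value: float) -> int:
    """Round to the nearest integer, with halves always rounded up."""
    return math.floor(value + 0.5)


# Rank the families by their percentage of borrowers, highest first.
ranking = sorted(
    ((name, round_half_up(100 * b / n)) for name, (b, n) in families.items()),
    key=lambda item: item[1],
    reverse=True,
)

for rank, (name, pct) in enumerate(ranking, start=1):
    relation = "above" if pct > SAMPLE_SHARE else "not above"
    print(f"{rank}. {name}: {pct}% ({relation} the sample share)")
```

With these inputs all four families lie above the 71% computed for the sample, which is the point made in the paragraph on the ranking order.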

What Figure 6 additionally shows is that, even if their turnouts are added up to twenty-nine cases, the four heterogeneous categories are outnumbered by each of the three major donor languages alone. Russian, Arabic, and Persian yield almost identical results because each of these three counts at least fifty replica languages which borrow loan conjunctions from them. It is therefore typical of a replica language in the macro-area of interest to take loan conjunctions from Russian, Arabic, and/or Persian. The probability that a replica language borrows from one of the big three is five times as high as that of it borrowing from a different source.


For the three major donor languages, we determine how many borrowers associated with them belong to which language family as shown in Figure 7.

[Bar chart; one group of bars each for Persian, Arabic, and Russian; legend: Abkhaz-Adyghe, Uralic, Turkic, Nakh-Daghestanian, Indo-European, others; axis: number of borrowers, 0–30]

Figure 7: Affiliation of borrowers per major donor.

Different members of the Nakh-Daghestanian, Turkic, Uralic, and Indo-European language families are reported to borrow from Russian, Arabic, or Persian. Nakh-Daghestanian languages have a clear preference for Arabic and Persian as donors whereas Russian is the favourite donor for Uralic languages and to a lesser degree also for Indo-European languages. Interestingly, Turkic replica languages do not seem to clearly prefer one of the major donors over the others. Russian does not function as donor language for Abkhaz-Adyge languages whereas there is evidence of members of this language family borrowing from Arabic or Persian. In contrast, Russian has the monopoly as donor language for Afro-Asiatic, Chukotko-Kamchatkan, Mongolic, and Tungusic languages as well as for all isolates in the former Soviet Union. Of the three major donors, Russian is also the only one to export loan conjunctions to Eskimo-Aleut languages. In Figure 8, we adopt the perspective of the replica languages insofar as we determine how many members of a given language family opt for which donor language.

On loan conjunctions | 349

[Bar chart: for each replica language family (others, Abkhaz-Adyghe, Indo-European, Uralic, Turkic, Nakh-Daghestanian), the number of members borrowing from Russian, Persian, Arabic, or another donor; scale 0–30.]

Figure 8: Donors per members of replica language families.

The discernible patterns in the domain of genetic affiliation of donors and replicas correlate relatively strongly with geography. Arabic and Persian are particularly successful as donors in those language families which are situated in the southern regions of the former Soviet Union such as Nakh-Daghestanian. Russian gains in importance the further north we move on the map (see Appendix). The competition between Arabic, Persian, and Russian is still largely undecided within the Turkic language family whose territory extends far beyond the Caucasian region. Russian surpasses Arabic and Persian as donor languages when we reach the northerly zones populated by Uralic replica languages. These facts meet our expectations. One-to-one relationships between a given donor and a given replica language family are not the rule. On the level of the individual replica languages, the situation looks different. According to Figure 9, 44% of all replica languages borrow exclusively from one donor language. Slightly more than a quarter of all borrowers display loan conjunctions from two different donors. Three different donors are involved with 22% of the borrowers. The percentage is much lower for replica languages which borrow from four different donors. The Turkic language Karakalpak is the only case of a replica language which displays loan conjunctions from five different origins.

[Pie chart: single donor 44%, two donors 29%, three donors 22%, four donors 4%, five donors 1%.]

Figure 9: Replica languages with a single donor or multiple donors.

The success rate of the donor languages can be measured not only in terms of the number of borrowers they are associated with. It is also possible to determine how many different types of loan conjunctions go to the credit of a given donor language. This is what Figure 10 shows.

[Bar chart: number of conjunction types (scale 0–50) for the donors Russian, Persian, other, Arabic, Turkic, hybrid, and Germanic.]

Figure 10: Number of conjunction types per donor.

With forty-four different types, Russian provides the most differentiated input in the language-contact processes. The Persian turnout is much smaller with only sixteen types, whereas that of Arabic equals only 14% of the Russian inventory. This huge gap between the major donor languages is in stark contrast to the almost equal number of borrowers they are associated with according to Figure 6. If Arabic feeds only a small number of types into the contact situations and at the same time is involved in almost as many contact situations as Russian, it can be concluded that many replica languages must borrow the same conjunctions from Arabic. Table 3 reflects the hierarchy of the replica languages in the order of the decreasing number of loan conjunctions.

Table 3: Ranking order of replicas according to number of loan conjunctions.

Rank    Loans  Replica language(s)
1       27     Mari Western (= Hill)
2       26     Mari Eastern (= Meadow)
3       20     Selkup
4       18     Udmurt
5       14     Votic
6–7     13     Uyghur, Uzbek
8–10    12     Hinuq, Karakalpak, Komi-Zyrian
11–12   11     Azerbaijanian, Karaim
13–18   10     Crimean Tatar, Karelian, Kazakh, Nanai, Veps, Yugh†
19–26   9      Aleut (Mednyj), Bashkir, Bezhta, Ingrian, Komi-Permyak, Mordvin Erzya, Mordvin Moksha, Turkmen
27–31   8      Budukh, German (Volga), Rutul, Saami Kildin, Tatar
32–35   7      Gagauz, Karata, Romani Lithuanian, Tsakhur
36–45   6      Avar, Dargwa, Khinalug, Lak, Lezgian, Livonian†, Nogai, Tabassaran, Tajiki, Tindi
46–59   5      Botlikh, Enets (Forest), Itelmen, Kabardian, Ket, Khakas, Khanty, Khwarshi, Kumyk, Kyrgyz, Saami Ter, Tat (Judeo), Ubykh†, Udi
60–65   4      Alutor, Bagvalal, Kryts, Romani Kalderash (Kotljary), Tat (Muslim), Tsova-Tush
66–81   3      Abaza, Aghul, Andi, Archi, Central Siberian Yupik (Yuit), Chechen, Chuvash, Dolgan, Hunzib, Karachay-Balkar, Mansi, Romani North Russian, Saami Akkala, Saami Skolt, Tsez, Yiddish
82–91   2      Abkhaz, Adyghe, Akhvakh, Aleut (Bering Island), Altay, Chamalal, Estonian, Evenki, Kalmyk, Tuvin
92–97   1      Armenian, Bohtan Neo-Aramaic, Ingush, Latvian, Negidal, Shor


Half a dozen replica languages give evidence of only a single loan conjunction. This means that 94% of the replica languages borrow two or more conjunctions. The five top-ranking borrowers belong to the Uralic language family. If we compare the data in Figure 10 with those in Table 3, we immediately see that even the most proliferous borrowers do not borrow the entire set of Russisms. There are thus more types on offer than a single replica language is reported to integrate into its system. This discrepancy between supply and demand calls for a closer look at the loan conjunctions.

5.2 Conjunctions

There are altogether ninety-two types of loan conjunctions of which nineteen (= 21%) are correlatives. These ninety-two types yield 618 instances of loan conjunctions (= tokens). These tokens distribute unevenly over the sample languages as shown in Figure 11. Note also that 468 tokens are reported in cis-Uralian replica languages (= 76%) whereas trans-Uralian languages yield 150 instances (= 24%). Uralic replica languages are responsible for over a third of all tokens (= 213 cases). Turkic and Nakh-Daghestanian each claim slightly less than a quarter of all tokens with 148 cases and 143 cases, respectively. No other language family comes even remotely close to the 10% mark.

[Pie chart: Uralic 34.5%; Turkic 24.0%; Nakh-Daghestanian 23.2%; Indo-European 6.8%; Abkhaz-Adyghe 2.8%; Isolates 2.4%; Eskimo-Aleut 2.3%; Tungusic 2.1%; Chukotko-Kamchatkan 1.5%; Mongolic 0.3%; Afro-Asiatic 0.2%.]

Figure 11: Shares of tokens (per language family).

The individual loan-conjunction types are identified in Table 4 and ordered according to the decreasing number of borrowers in which the loan conjunction is attested. Neither the donor language nor the meaning of the loan conjunction is revealed because this issue is discussed below. For ease of recognition, the loan-conjunction types are represented in a standardized shape which glosses over a wide range of different realizations in the replica languages.

Table 4: Ranking order of loan conjunctions according to number of borrowers.

Rank    Borrowers  Conjunction(s)
1       45         ˀammaa
2       39         a
3       37         i
4       34         wa-
5–6     28         yā, yā…yā
7       25         no
8       22         esli/eželi
9–10    20         eger, ili/ali
11–12   19         čto, ne…ne
13      17         čtob(y)
14–15   13         ham, xot’/xotja
16–18   11         da/da i, ke, nagah
19–20   10         čūn ke, laakinna
21–23   9          ham…ham, poka, potomu čto
24–25   8          ni…ni, yagi
26      7          libo
27–29   6          anǯaq, kogda, yaɬuni
30–34   5          bal-, ili…ili/ali…ali, jah, raz, to…to
35–38   4          (kak) budto, i…i, koli, libo…libo
39–40   3          faqat, zato
41–59   2          belki, dak/dyk/tak, kak, kirek, laˁalla…laˁalla, li…li, me…me, ne to…ne to, ni, poka ne, tek, to li…to li, tol’ko, und, vaa-yaa, xot’…xot’, yā ki, yā ki…yā ki, že…že
60–92   1          a to, als, andma, bet, ci/czy, da, ecgi, edva, əmə, ənraq, enti, esli…to, et’e, i to, ja, jo…jo, kaby, kak raz, liš by, magram, me, medemki, ne, odnako, pokuda, tak, tak čto, to, xotja by, yagi…yagi, yahut, yoxsa, že

The vast majority of the types (= 64%) are attested in two or more replica languages. With thirty-three cases, hapaxes cover slightly more than a third of all types. The first 23 positions in Table 4 are filled exclusively by loan conjunctions which go back to the major donors. There are eleven bisyndetic types on the lowest ranks. It strikes the eye that monosegmental and monosyllabic loan conjunctions such as i, a, wa- mostly occupy ranks in the upper half of the hierarchy without, however, claiming the top position. In contrast, segmentally more complex and polysyllabic loan conjunctions such as andma, odnako, xotja by cluster at the bottom of the same hierarchy. The above loan conjunctions are not meaningless segmental chains but fulfil certain tasks because they are put to work in the replica languages. The functions for which the loan conjunctions are employed in the replica languages are ranked according to tokens in Table 5.

Table 5: Functions of loan conjunctions (tokens).

Function          n of cases in sample
copulative        161
adversative       134
disjunctive       133
conditional       66
complementizer    31
temporal          28
causal            19
concessive        19
purposive         17
comparative       4
consecutive       2
limitative        2
varia             2
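The token counts in Table 5 can be tallied to check how large the combined share of the three coordinating functions is. The following is a minimal sketch that is not part of the original study; the names `tokens`, `COORDINATING`, and `coordinating_share` are introduced here purely for illustration, and the counts are copied from Table 5.

```python
# Token counts per function, copied from Table 5.
tokens = {
    "copulative": 161, "adversative": 134, "disjunctive": 133,
    "conditional": 66, "complementizer": 31, "temporal": 28,
    "causal": 19, "concessive": 19, "purposive": 17,
    "comparative": 4, "consecutive": 2, "limitative": 2, "varia": 2,
}

# The three functions of the BUT-OR-AND set (illustrative label).
COORDINATING = ("copulative", "adversative", "disjunctive")


def coordinating_share(counts):
    """Return (coordinating tokens, all tokens, share of coordinating tokens)."""
    total = sum(counts.values())
    coord = sum(counts[f] for f in COORDINATING)
    return coord, total, coord / total


coord, total, share = coordinating_share(tokens)
print(f"{coord} of {total} tokens = {share:.0%}")
```

The three coordinating functions account for 428 of the 618 tokens, i.e. roughly 69%.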

The equivalents of AND, BUT, and OR together outnumber the remaining ten categories by far since the former are responsible for 69% of all cases of loan conjunctions. This dominance of copulative, adversative, and disjunctive loan conjunctions justifies the prominence they have been given in the general discussion of language-contact phenomena as reviewed in Sections 1–2. There are five implicational patterns according to which the borrowing of a conjunction for function X requires the co-existence of a loan conjunction for function Y in a given replica language. All four languages which have borrowed a comparative conjunction also display loan conjunctions with adversative, concessive, copulative, and purposive function as well as a borrowed complementizer. This pattern applies in Mari Eastern (Meadow), Mari Western (Hill), Mordvin Erzya, and Yugh. Figure 12 is the formulaic rendering of this pattern.

COMPARATIVE → {ADVERSATIVE, COMPLEMENTIZER, CONCESSIVE, COPULATIVE, PURPOSIVE}

Figure 12: Implicational pattern I.

The presence of a limitative loan conjunction presupposes the borrowing of adversative, copulative, and temporal conjunctions as suggested by Figure 13. However, this pattern is infrequent since only Aleut (Mednyj) and Itelmen reflect it.

LIMITATIVE → {ADVERSATIVE, COPULATIVE, TEMPORAL}

Figure 13: Implicational pattern II.

Since we know from Table 5 that adversative and copulative conjunctions are typical candidates for borrowing, these patterns are not particularly informative. Superficially, this seems to be even less so with borrowed complementizers, concessive, and consecutive conjunctions each of which implies that there also is a copulative loan conjunction in the replica language. In the case of the consecutive loan conjunction the implication also involves adversative BUT. The implicational pattern III in Figure 14 is attested only twice, namely in Komi-Zyrian and Tajiki.

CONSECUTIVE → {ADVERSATIVE, COPULATIVE}

Figure 14: Implicational pattern III.


The implicational pattern IV, however, is reflected in thirteen different borrowers which speaks against classifying it as an incidental combination. We have found evidence of CONCESSIVE → COPULATIVE in German (Volga), Karaim, Komi-Permyak, Mari Eastern (= Meadow), Mari Western (= Hill), Mordvin Erzya, Mordvin Moksha, Selkup, Udmurt, Veps, Votic, Yiddish, and Yugh†, i.e. in nine Uralic replica languages, one Turkic replica language (Karaim), two Germanic replica languages, and an isolate (Yeniseian). As to the implicational pattern V, the turnout goes far beyond that of implicational pattern IV. There are thirty replica languages which attest COMPLEMENTIZER → COPULATIVE, namely Aleut (Mednyj), Alutor, Azerbaijanian, Bashkir, Budukh, Crimean Tatar, Gagauz, Hinuq, Ingrian, Karaim, Karelian, Kazakh, Khakas, Khinalug, Kryts, Mari Eastern (= Meadow), Mari Western (= Hill), Mordvin Erzya, Mordvin Moksha, Romani North Russian, Rutul, Saami Akkala, Saami Kildin, Selkup, Udi, Udmurt, Uyghur, Veps, Votic, and Yugh†. This time there are twelve Uralic replica languages alongside eight Turkic, six Nakh-Daghestanian, and another four replica languages from different families. Expecting to find strict implications among the loan conjunctions probably means that we are asking too much. The absence of further implicational patterns might have different explanations. One is certainly the still incomplete database, i.e., it can be assumed that a more thorough search will yield further instances of loan conjunctions to fill at least some of the incidental gaps in the matrix. For the time being, we are content with identifying preference patterns of parallel borrowing of conjunctions. The cells in Table 6 indicate how many languages which borrow a conjunction with the function given for a row in the leftmost cell also borrow a conjunction with the function specified for a column in the topmost row.
Accordingly, the combination CAUSAL-ADVERSATIVE = 95% means that 95% of all languages which borrow a causal conjunction also borrow an adversative conjunction. If we look at this combination from the point of view of those languages which borrow adversative conjunctions, the share of parallel borrowing of causal conjunctions is down to 23% (ADS-CAU = 23%). Table 6: Shares of parallel borrowing (languages).

      ADS   CAU   CPV   CMP   CNC   CND   CNS   COP   DIS   LIM   PUR   TMP
ADS         23%    5%   34%   15%   53%    3%   93%   70%    3%   18%   18%
CAU   95%         11%   63%   21%   84%    5%   95%   79%    5%   32%   32%
CPV  100%   50%        100%  100%   75%     –  100%   75%     –  100%   75%
CMP   90%   40%   13%         33%   70%     –  100%   73%    3%   50%   27%
CNC   92%   31%   31%   77%         77%     –  100%   77%     –   69%   46%
CND   86%   33%    6%   43%   20%          2%   92%   82%     –   27%   14%
CNS  100%   50%     –     –     –   50%        100%   50%     –     –     –
COP   85%   21%    5%   34%   15%   52%    2%         64%    2%   18%   17%
DIS   90%   24%    5%   35%   16%   65%    2%   90%          2%   18%   13%
LIM  100%   50%     –   50%     –     –     –  100%   50%           –  100%
PUR   82%   35%   24%   88%   53%   76%     –   94%   65%     –         53%
TMP   82%   35%   18%   47%   35%   41%     –   88%   47%   12%   53%
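The shares in Table 6 are directional: the value for a row-column pair is the proportion of the languages borrowing the row function which also borrow the column function, which is why CAU-ADS and ADS-CAU differ. As a minimal sketch of this calculation, assuming a mapping from each language to the set of functions it borrows, the shares could be computed as follows. The function name `parallel_shares` and the four toy languages L1 to L4 are hypothetical and do not come from the database of the study.

```python
from itertools import permutations


def parallel_shares(borrowed):
    """Map each ordered pair (row, col) of functions to the share of
    row-borrowers that also borrow col.

    borrowed: dict mapping a language name to the set of function labels
    for which it has at least one loan conjunction.
    """
    functions = sorted(set().union(*borrowed.values()))
    shares = {}
    for row, col in permutations(functions, 2):
        # Every function in `functions` is borrowed by at least one
        # language, so row_langs is never empty.
        row_langs = [lang for lang, fs in borrowed.items() if row in fs]
        both = sum(1 for lang in row_langs if col in borrowed[lang])
        shares[(row, col)] = both / len(row_langs)
    return shares


# Hypothetical toy data, for illustration only.
toy = {
    "L1": {"ADS", "COP", "CAU"},
    "L2": {"ADS", "COP"},
    "L3": {"ADS"},
    "L4": {"COP", "CAU"},
}

shares = parallel_shares(toy)
print(shares[("CAU", "ADS")])  # share of causal borrowers that also borrow adversative
print(shares[("ADS", "CAU")])  # share of adversative borrowers that also borrow causal
```

In the toy data, one of the two causal borrowers also borrows an adversative conjunction, whereas only one of the three adversative borrowers also borrows a causal one, so the two directions yield different shares.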

Given that the ternary set of BUT, OR, AND has a privileged position in the domain of function-word borrowing, the percentages given in Table 6 largely meet our expectations. Parallel borrowing of functionally different conjunctions with members of this set mostly yields sizable shares. Only three replica languages display loan conjunctions without also borrowing from the BUT-OR-AND set, namely
– Enets (Forest) which attests to causal, conditional, purposive, and temporal loan conjunctions,
– Negidal where a conditional loan conjunction has been registered, and
– Shor with a temporal loan conjunction.
With 94 out of 97 borrowers, about 97% of the replica languages borrow from the BUT-OR-AND set. This result is interesting if we look at the pattern observed by Grant (2012: 350) who assumes that

[i]f a language has borrowed a coordinating conjunction it will (almost certainly) have borrowed at least one subordinating conjunction or dependent clause marker as well.

Our own data contradict this generalization because there are thirty replica languages which attest to loan conjunctions which are taken exclusively from the BUT-OR-AND set, namely Abaza, Abkhaz, Adyghe, Aghul, Akhvakh, Aleut (Bering Island), Altay, Andi, Armenian, Bagvalal, Bohtan Neo-Aramaic, Chamalal, Chechen, Dolgan, Estonian, Evenki, Hunzib, Ingush, Kabardian, Kalmyk, Karachay-Balkar, Ket, Latvian, Mansi, Romani Kalderash (Kotljary), Tat (Muslim), Tindi, Tsez, Tsova-Tush, and Ubykh†. On account of this sizable group of languages which attest to coordinating loan conjunctions in the absence of borrowed adverbial subordinators (and complementizers as well), we suggest that the validity of Grant’s above statement needs to be checked on the basis of an empirically enlarged corpus. It is possible that there are areal preferences which deviate from the cross-linguistic tendency.


In the two subsequent sections we look at the loan conjunctions according to the following division. In Section 5.2.1, we focus on the set constituted by adversative, disjunctive, and copulative conjunctions. Section 5.2.2 is dedicated to the adverbial subordinators and the THAT-complementizer.

5.2.1 The BUT-OR-AND set

If we look at the parallel borrowing patterns in Table 6 which involve members of the set BUT-OR-AND on the one hand and conjunctions from outside this set on the other, the shares range from 88% to 100% for parallel borrowing with copulative and from 82% to 100% for parallel borrowing with adversative. The situation is remarkably different for parallel borrowing with disjunctive because in this case the shares can be as low as 47% and never go beyond 82%. From the perspective of copulative, adversative, and disjunctive, parallel borrowing of conjunctions from outside the ternary set is far less important. Adversative attests to shares of parallel borrowing of further conjunctions which range from minimally 3% to maximally 53%. For copulative, the range spans from 2% to 52%. In the case of disjunctive, the smallest share equals 2% whereas the share of 65% parallel borrowing with conditional conjunctions is the top result in this category. Conditional is also responsible for the maximum of parallel borrowing involving adversative and copulative. What about the parallel borrowing of the three members of the above set? First of all, BUT, OR, AND are not borrowed by the same number of replica languages. Figure 15 reveals that the number of OR-borrowers is considerably smaller than that of BUT-borrowers and AND-borrowers.

[Bar chart: number of borrowers (scale 0–100). Copulative 87, adversative 80, disjunctive 62.]

Figure 15: Borrowers of BUT, OR, AND.


About 90% of the borrowers attest to a copulative loan conjunction. Adversative loan conjunctions are found in 82% of the borrowers. Disjunctive loan conjunctions, however, only claim a share of 64%. This discrepancy is unexpected in the sense that it disagrees with the hierarchy BUT > OR > AND discussed in Section 2.1 above. Under these conditions, the question arises to what extent the three functions and their material representatives are interconnected in the process of borrowing. According to Table 6, the shares of parallel borrowing are diverse. The highest share has been calculated for ADS-COP (= 93%) whereas the lowest is given for COP-DIS (= 64%). Between these extremes, we find DIS-ADS and DIS-COP both with shares of 90%, COP-ADS (= 85%), and ADS-DIS (= 70%). These percentages characterize disjunctive as relatively less important for parallel borrowing. Figure 16 accounts for the number of languages in which the categories under scrutiny have been borrowed either jointly or separately. The three languages which borrow only subordinating conjunctions are excluded from the calculation.

Figure 16: Parallel and separate borrowing of BUT, OR, AND.

More than half of the replica languages (n = 51/94) attest to the parallel borrowing of all three categories. A quarter of the replica languages (n = 23/94) give evidence of the parallel borrowing of adversative and copulative conjunctions without disjunctive being involved too. The next biggest share (n = 8/94 ≈ 9%) goes to the borrowing of AND alone. The combinations BUT-OR and AND-OR each yield shares of 5% (n = 5/94) whereas the separate borrowing of adversative or disjunctive conjunctions is attested only once for each category. In point of fact, in 79 out of 80 replica languages, BUT is borrowed together with OR and/or AND. In this case, parallel borrowing amounts to 99%. The share of parallel borrowing is only slightly smaller for disjunctive. In 61 out of 62 replica languages with a disjunctive loan conjunction, OR is borrowed together with BUT and/or AND (= 98% of all OR-borrowers). Parallel borrowing is also the rule for copulative. However, only 79 of 87 AND-borrowers also display adversative and/or disjunctive loan conjunctions. The share of parallel borrowing is thus down to 91%. These facts are indicative of a certain degree of autonomy for AND albeit a relatively limited freedom. If we take the quantitative results at face value, they seem to speak in favour of a hierarchy AND > BUT > OR. Yet, to be sure that we are indeed entitled to change the order of the categories, we need additional confirmation on the basis of a considerably enlarged empirical foundation. As an intermediate result, we may note down that our findings for the borrowing behavior of the replica languages in the former Soviet Union do not fully corroborate the hypothesis of a universally valid borrowing hierarchy which puts adversative on top and copulative at the bottom. To complement the picture, we briefly check how borrowed adversative, disjunctive, and copulative conjunctions are distributed for the parameters of geography and affiliation. Figure 17 shows that disjunctive is responsible for 11% of all loan conjunctions (tokens) in the cis-Uralian group whereas its share is cut down to 7% in the trans-Uralian group.

[Bar chart: shares of tokens (scale 0–15%). cis-Uralian: adversative 13%, copulative 14%, disjunctive 11%; trans-Uralian: adversative 13%, copulative 14%, disjunctive 7%.]

Figure 17: Shares of BUT, OR, AND (geography).
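The counts reported above for Figure 16 partition the 94 borrowers of the BUT-OR-AND set into seven combinations. The following minimal sketch, which is not part of the original study, encodes these counts and derives the number of borrowers per category and the share of parallel borrowing; the names `regions`, `borrowers`, and `parallel` are introduced here for illustration only.

```python
# Number of languages per combination, as reported in the text for Figure 16.
regions = {
    frozenset({"BUT", "OR", "AND"}): 51,
    frozenset({"BUT", "AND"}): 23,
    frozenset({"AND"}): 8,
    frozenset({"BUT", "OR"}): 5,
    frozenset({"AND", "OR"}): 5,
    frozenset({"BUT"}): 1,
    frozenset({"OR"}): 1,
}


def borrowers(category):
    """Languages that borrow the category, alone or in combination."""
    return sum(n for combo, n in regions.items() if category in combo)


def parallel(category):
    """Languages that borrow the category together with at least one other."""
    return sum(n for combo, n in regions.items()
               if category in combo and len(combo) > 1)


total = sum(regions.values())
for cat in ("BUT", "OR", "AND"):
    share = round(100 * parallel(cat) / borrowers(cat))
    print(cat, borrowers(cat), parallel(cat), f"{share}%")
```

Summing the seven combinations recovers the 94 languages, the 80, 62, and 87 borrowers of BUT, OR, and AND, and the parallel-borrowing shares of 99%, 98%, and 91% cited in the text.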


In both geographical subdivisions, copulative boasts the biggest share whereas adversative is second best in both regions. If the location to the west or to the east of the Ural Mountains makes a difference as to the size of the shares of at least one of the three categories under review, one might want to know whether genetic affiliation has a similar effect. Figure 18 captures the wide range of variation which is characteristic of the language families on this parameter.

[Stacked bar chart (0–100%): shares of adversative, disjunctive, copulative, and other loan conjunctions for Chukotko-Kamchatkan, Abkhaz-Adyghe, Isolates, Uralic, Tungusic, Eskimo-Aleut, Nakh-Daghestanian, Turkic, Indo-European, Mongolic, and Afro-Asiatic.]

Figure 18: Shares of BUT, OR, AND (affiliation).

Copulative loan conjunctions are present in each of the language families whereas proof of the borrowing of disjunctive conjunctions is missing from three language families. No adversative loan conjunction is reported for Afro-Asiatic. Moreover, Figure 18 also tells us that borrowings from outside the BUT-OR-AND set are not attested in Abkhaz-Adyghe, Mongolic, and Afro-Asiatic, the latter two being too small in terms of the number of replica languages so that their value for the discussion remains doubtful. In contrast, loan conjunctions which do not belong to the above set of three form the majority in Chukotko-Kamchatkan and Tungusic. Copulative is the most important of the three functions in seven of the eleven language families represented in Figure 18. Note that OR claims bigger shares than BUT in four language families (BUT ousts OR in six cases). The parameter affiliation thus yields a variegated result which nevertheless supports the idea that copulative has a very strong position.


Given that the internal hierarchy in the BUT-OR-AND set is still debatable, it makes sense to have a look at the situation outside the domain of coordination. Is it possible to postulate a similar hierarchy also for adverbial subordinators and complementizers?

5.2.2 Beyond BUT-OR-AND

Khomchenkova and Stoynova (2021: 90) use frequencies to put two hypotheses to the test. In addition to his oft-cited BUT > OR > AND order, Matras (2007: 56) also assumes a borrowing hierarchy of adverbial subordinators beyond the ternary set. This hierarchy is reproduced in Figure 19.

{concessive, conditional, causal, purposive} > other subordinators

Figure 19: Borrowing hierarchy beyond BUT-OR-AND.

Matras (2007: 56) characterises this hierarchy as tentative. The class of “other subordinators” is not spelled out exhaustively but the author explicitly mentions that “the borrowing of temporal subordinators is often linked to that of conjunctions expressing purpose and cause.” Grant (2012: 350) claims that

[l]ess frequently used dependent clause markers seem to be borrowed more readily than forms such as IF and BECAUSE, which can be considered to be more basic inasmuch as these are ‘prototypical’ markers of their kinds of dependent clauses (conditional and causal respectively).

Thus, a high degree of semantic specificity limits the token frequency of a conjunction which in turn is not an obstacle to its borrowability. In contrast, the lack of semantic specificity together with high token frequency is pictured as not enhancing borrowing. In their comparative study of the borrowing behavior of Enets (Forest), Nanai, and Mari Western, Khomchenkova and Stoynova (2021: 90) refer to the two hypotheses by Matras and Grant when they say that

[t]hese generalizations predict the presence / absence of a borrowed conjunction in a language. Our data do not contradict to them notably. However, they do not explain frequency asymmetries between different subordinators in each language, as well as the difference between the languages under consideration.

We take up this issue by way of reviewing pertinent patterns from our database. In this way, we complement what we have said previously, especially in connection with Figures 1–14 and Table 6. First of all, we look at possible patterns in the realm of the borrowed adverbial subordinators. Table 7 ranks the patterns of parallel borrowing of adverbial subordinators and complementizer according to the number of borrowers. Table 7: Parallel borrowing of adverbial subordinators/complementizer.

Rank    Borrowers  Combinations of categories
1       21         conditional-complementizer
2       16         conditional-causal
3       15         purposive-complementizer
4       13         purposive-conditional
5       12         complementizer-causal
6–7     10         concessive-complementizer; concessive-conditional
8–9     9          purposive-concessive; purposive-temporal
10      8          temporal-complementizer
11      7          temporal-conditional
12–14   6          causal-purposive; causal-temporal; temporal-concessive
15–18   4          concessive-causal; comparative-purposive; comparative-complementizer; comparative-concessive
19–20   3          comparative-conditional; comparative-temporal
21–22   2          comparative-causal; temporal-limitative
23–26   1          causal-consecutive; causal-limitative; limitative-complementizer; consecutive-conditional

Shor – a doubtful case according to Section 4.2.77 – is the only language in our sample which attests to the borrowing of only one adverbial subordinator (temporal) whereas all other languages which borrow conjunctions from outside of the BUT-OR-AND set have at least two borrowed adverbial subordinators or combine a borrowed adverbial subordinator with a borrowed complementizer. Matras’s hierarchy presented in Figure 19 is largely corroborated for temporal conjunctions. There are 17 languages which display temporal loan conjunctions. Of these 17 languages, 16 also borrow other adverbial subordinators or the complementizer. It is also true that combinations of temporal with purposive or conditional are frequent. Table 7 is additionally indicative of certain preferences which, however, are much more difficult to capture in a two-dimensional figure. If we combine the information given in Tables 6–7, the following chain of preference relations (cf. Figure 20) emerges (with → symbolising a unilateral preference relation and not a strict implication).

comparative → temporal → purposive → {causal, concessive} → complementizer → conditional

Figure 20: Preference chain of borrowed adverbial subordinators.

According to Figure 20, the borrowing of comparative conjunctions is easier if a loan conjunction representing one of the categories to the right is already present. The same pattern holds for temporal and the categories to its right, etc. We emphasize that this chain is preliminary and might be subject to major modifications once our database has been sufficiently enlarged. What can be done at this point is a very simple quantitative analysis of loan conjunctions in the domain of adverbial subordinators and complementizer. We follow the pattern employed by Khomchenkova and Stoynova (2021) in order to determine whether frequency of borrowing of a given conjunction correlates with its token frequency in a donor language. The frequencies are compared only for Russian and Persian and loans from these two donors. Table 8 provides the reason for the exclusion of Arabic. Table 8: Contributions of major donors to loan categories.

          COP  DIS  ADS  CND  CMP  CAU  TMP  PUR  CNC  CPV  CNS  LIM
Arabic     34    3   62    –    –    –    –    –    –    –    –    –
Persian    41   66    –   32   11   10    –    –    –    –    –    –
Russian    74   45   59   34   19    9   25   17   15    4    2    2

In contrast to Persian and Russian, Arabic contributes exclusively loan conjunctions which belong to the BUT-OR-AND set. Adverbial subordinators and complementizers, however, stem either from Persian or Russian. It is therefore legitimate to limit the comparison of the frequencies to the latter two donor languages by way of accounting exclusively for the adverbial subordinators and complementizers.30 What we will say about quantities is based on the frequency dictionaries by Miller and Aghajanian-Stewart (2018) for Persian and by Sharoff et al. (2013) for Russian. Our approach is informal. It is also anachronistic again since the frequencies refer to the contemporary situation in the donor languages which is not automatically identical to that of the time when the conjunctions were borrowed. What we cannot determine in this way is the frequency of use of the loan conjunctions in the replica languages. Tables 9–10 account only for those elements which are classified as conjunctions both in the donor language and in the replica languages. Furthermore, we exclude correlative constructions from the counts because they are treated differently in our sources. In the leftmost column the conjunction is identified; in the two columns to its right the meaning and function the loan conjunction has in the replica languages are given; the next column reveals the rank the conjunction has in the frequency dictionary of the donor language. The ranking order among the loan conjunctions mentioned in a given table is presented according to two different principles. The column headed by ALL refers to the rank a given conjunction has in the entire inventory of loan conjunctions borrowed from a given donor language. The rightmost column looks at the rank position of the same loan conjunction within the class of non-coordinating conjunctions borrowed from the same donor language.

Table 9: Ranking of Persian conjunctions.

Conjunction  Meaning    Function        Rank in donor  Rank in loans (all)  Rank in loans (non-coordinating)
ke           ‘that’     complementizer  5              6–7                  2–3
eger         ‘if’       conditional     77             3                    1
čūn ke       ‘because’  causal          157            8                    4
nagah        ‘if’       conditional     3200           6–7                  2–3

30 It should not go unmentioned that the coordinating loan conjunctions also yield interesting results in terms of their token frequencies in the donor languages. For Persian, the Arabism va ‘and’ is the most frequently used word. Other Arabisms and Arabo-Persian hybrids occupy ranks in the upper section of the hierarchy too, namely amma ‘but’ (#31), vali ‘but’ (#131). Others like balki ‘but’ (#291) and likan ‘but’ (#4413) are not among the top-200 where we find Persian ham ‘and, also’ (#36) and yā ‘or’ (#41). In the Russian frequency list, copulative i ‘and’ is in the topmost position, adversative-copulative a ‘but, and’ follows on rank 10, adversative no ‘but’ comes on rank 15, whereas disjunctive ili/ali ‘or’ occupies rank 41. These frequently borrowed coordinating conjunctions form part of the top-100 list.

Table 10: Ranking of Russian conjunctions.

Conjunction  Meaning          Function        Rank in donor  Rank in loans (all)  Rank in loans (non-coordinating)
čto          ‘that’           complementizer  9              6                    2
kak          ‘as’             temporal        19             31                   8
esli         ‘if’             conditional     42             4                    1
kogda        ‘when’           temporal        52             14                   6
čtoby        ‘(in order) to’  purposive       57             7                    3
xotja        ‘although’       concessive      157            8                    4
odnako       ‘although’       concessive      175            32–44                9–10
poka         ‘while’          temporal        211            10–11                5
budto        ‘as if’          comparative     869            18–21                7
edva         ‘as soon as’     temporal        1132           32–44                9–10

As to the idea that less frequently used non-coordinating conjunctions of the donor languages are especially affected by borrowing, Tables 9–10 tell a different story. Three of the four Persian loan conjunctions occupy ranks in the top-200 of the donor language and are thus relatively frequent. In the case of the Russian loan conjunctions, seven of ten items are taken from the top-200 of the frequency dictionary. Admittedly, the threshold of 200 ranks is arbitrary. However, Grant (2012) does not provide any means to determine what counts as relatively frequent or infrequent. Moreover, most of the loan conjunctions, be they of Persian or Russian origin, seem to have very general meanings in the sense that they are “prototypical” markers of the relation, as Grant (2012) put it. Their prototypical nature holds for both donor and replica languages. This means that the second part of Grant’s hypothesis is not borne out by the data from the language-contact scenarios in the ex-USSR either. We repeat: the situation that emerges from the comparison of language-contact phenomena in the former Soviet Union does not square fully with the hypotheses put forward by Grant (2012). Under borrowing, there is no preference for infrequent conjunctions. Furthermore, semantic specificity likewise fails to facilitate the borrowing of a given conjunction. Future investigations will have to clarify whether this discrepancy between universally oriented hypotheses and the empirical findings is an areal trait of the sample languages employed for this study.
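The top-200 counts reported for Tables 9–10 can be re-derived mechanically from the donor-rank columns. The following Python sketch does so. The rank values are copied from the two tables, whereas the function names and the alternative thresholds of 100 and 500 are illustrative assumptions and not part of the procedure described in this section. The sketch only shows how the counts shift when the admittedly arbitrary threshold is moved.

```python
# Donor-language frequency ranks as listed in Tables 9 and 10
# (column "Rank in donor"). Nothing else is taken from the study.
PERSIAN_RANKS = {"ke": 5, "eger": 77, "čūn ke": 157, "nagah": 3200}
RUSSIAN_RANKS = {
    "čto": 9, "kak": 19, "esli": 42, "kogda": 52, "čtoby": 57,
    "xotja": 157, "odnako": 175, "poka": 211, "budto": 869, "edva": 1132,
}


def frequent_items(ranks, threshold=200):
    """Return the conjunctions whose donor rank lies within the threshold."""
    return [conj for conj, rank in ranks.items() if rank <= threshold]


def summarize(label, ranks, threshold=200):
    """Build a one-line summary of how many items fall within the threshold."""
    hits = frequent_items(ranks, threshold)
    return f"{label}: {len(hits)} of {len(ranks)} within the top {threshold}"


if __name__ == "__main__":
    # 200 is the threshold used in the text; 100 and 500 are hypothetical
    # alternatives added here only to illustrate the sensitivity of the count.
    for threshold in (100, 200, 500):
        print(summarize("Persian", PERSIAN_RANKS, threshold))
        print(summarize("Russian", RUSSIAN_RANKS, threshold))
```

With the threshold of 200, the sketch reports three of four Persian and seven of ten Russian items, in line with the figures given above. A threshold of 100 yields two of four and five of ten, a threshold of 500 three of four and eight of ten.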

5.2.3 Multiple borrowing

It would be premature to declare the case closed. We have several reasons to be cautious. Among other things, there remain uncertainties as to the correct semantic description of a number of loan conjunctions. This applies especially where the function of the loan conjunction in the replica language differs from the function the same conjunction has in the donor language. Functional disagreement of this kind is not frequent. An example is Russian disjunctive libo ‘or’, which is borrowed into six languages with the same function, but in one case the loan conjunction expresses the conditional. Similarly, we still have to find a solution for the problems posed by polysemy. Conjunctions can be multifunctional and thus semantically ambiguous both in the donor language and in the replica language. To handle this situation, we have adopted a purely practical method for the present purpose: we concentrate exclusively on the first meaning reported for a given conjunction, assuming that what comes first must be the major function. This is of course a gross simplification of the structural facts.

In the case of Russian a ‘but, and’, we are facing a conjunction which is ambiguous between an adversative and a copulative reading in the donor language. According to Table 4, Russian a is the second most frequent loan conjunction and has been borrowed 39 times in our sample. The loan conjunction a has been registered as adversative if the descriptive-linguistic source mentions this function as the first or the only function. In a number of cases, however, the sources invert the translation equivalents so that the copulative function is mentioned first, or they mention it as the only function of the loan conjunction. What makes things worse is that the meta-language used by the descriptive linguists is often Russian, which means that the loan conjunction a is translated as a into Russian. In these cases, it is not clear whether the loan conjunction has the same range of functions as its equivalent in the donor language. Similar problems arise with other conjunctions in our database, so that it is necessary to have another go at the semantic side of the loan conjunctions in a follow-up study.

The picture is further diversified if we account also for the phenomenon of multiple borrowing within a given category. Seven categories are affected by multiple borrowing. Multiple borrowing is particularly common with adversatives, since 36 replica languages have borrowed more than one conjunction with this function. The second most frequent category associated with multiple borrowing is the copulative, for which we have identified 27 multiple borrowers. The disjunctive counts 21 multiple borrowers, followed by the conditional with 13 and by the concessive, comparative, and temporal with three, two, and one multiple borrower(s), respectively. What makes this phenomenon especially interesting for us is the fact that multiple cannot simply be replaced by double: several replica languages are not content with two loan conjunctions with identical functions but add further borrowings to the same domain. In what follows we focus on those cases in which the multiple borrowers have three or more (supposedly) synonymous loan conjunctions. We start with adversatives in Table 11. Grey shading marks those cells which host loan conjunctions of Russian origin. The second color highlights elements which go back to donor languages other than Russian, Arabic, or Persian. For the latter two no specific color code is employed.

Table 11: Multiple adversative borrowers (more than two loan conjunctions).

Borrower | 1 | 2 | 3 | 4 | 5
Karakalpak | amma | lekin | eli | bɛlki | a
Karaim | amma | l’akin | ale | a | no
Tajiki | ammo | lekin | vale | balki |
Azerbaijanian | amma | läkin | fäkät | |
Crimean Tatar | amma | l’akin | faqat | |
Uyghur | amma | lekin | pəkət | |
Budukh | amma | lakin | anǯaq | |
Uzbek | ammo | lekin | a | |
Turkmen | emma | veli | a | |
Tsova-Tush | ma~me | magram | andma | |
Mari Eastern | no | a | zato | |
Mari Western | no | a | zato | |

With five adversative loan conjunctions each, the Turkic languages Karakalpak and Karaim offer the richest inventories, closely followed by Tajiki (Indo-European) with four borrowings in this domain. Six of the replica languages borrow from several donor languages. Note that Turkic and Georgian are involved as donor languages, too. On the basis of the grammars and dictionaries we have used for this study, it is impossible to determine whether, in these and further cases, multiple borrowing creates a set of full synonyms or an array of semantically, pragmatically, or stylistically differentiated partial synonyms. It cannot be ruled out that, at least in some cases, the functionally identical loan conjunctions have different diatopic profiles. As Table 12 shows, the sets of functionally similar loan conjunctions per replica language reported for the domain of the copulative relation are not as extensive as the top-ranking cases in Table 11. However, in each of the four sets of three loan conjunctions for AND, two or three donors are involved.

Table 12: Multiple copulative borrowers (more than two loan conjunctions).

Borrower | 1 | 2 | 3
Ingrian | i | dai | ja
Karelian | i | dai | ja
Uzbek | va | ham(da) | i
Karakalpak | va~vɛ~ve | hɛm | mu

The two Uralic languages combine Germanic ja ‘and’ with Russian copulative conjunctions. It is likely that dai is semantically slightly different from both i and ja. In Uzbek and Karakalpak, two loan conjunctions from Arabic and Persian are joined by a borrowing from a third language, namely Russian i and Uyghur mu, respectively. As in the case of dai, it is possible that a more fine-grained analysis will reveal that Uzbek ham(da) is only partially synonymous with va. Multiple borrowing of disjunctive conjunctions results in sets of three to four functionally interrelated loan conjunctions, as shown in Table 13. Nine of the eleven replica languages featured in Table 13 borrow from two or three different donor languages each.

Table 13: Multiple disjunctive borrowers (more than two loan conjunctions).

Borrower | 1 | 2 | 3 | 4
Hinuq | ya | yagi | yaɬuni | ili
Udmurt | ja | jake | ili | libo
Mari Eastern | ala…ala | li…li | ma…ma | ja…ja
Mari Western | ala…ala | li…li | mä…mä |
Selkup | il’i…il’i/al’i…al’i | l’iba…l’iba | to l’i…to l’i |
Komi-Zyrian | al’i | ľibe̮ | ľibe̮…ľibe̮ |
Bezhta | ya | yagi | yaluni |
Karata | ya | yagi | yal’uni |
Khwarshi | ya | yagi | jaɬuni~jaɬunani |
Tindi | ja | yagi | yal’uni |
Kazakh | yæ | yæki | i |

It is worth noting that Avar is responsible for the pairwise borrowing of yagi and yaluni in five languages (all of them belonging to Nakh-Daghestanian). Besides Persian and Russian, there is also Tatar as donor language for the two Mari varieties. It is striking that the Uralic languages Mari Eastern, Mari Western, and Selkup give evidence exclusively of borrowed correlative constructions in the domain of the disjunctive. Whether the co-existence of several disjunctive loan conjunctions correlates with some kind of semantic or other differentiation is at present too difficult to determine. Further in-depth research is required to settle this question. Multiple borrowing of conditional conjunctions is attested in the four Uralic languages in Table 14. In contrast to the previous cases, Russian is the sole donor.

Table 14: Multiple conditional borrowers (more than two loan conjunctions).

Borrower | 1 | 2 | 3 | 4
Mari Eastern | jesli | raz | koli | dyk dyk
Mari Western | jesli | raz | koli |
Mordvin Moksha | jesli | koli~kuli | kəda |
Selkup | kol’i | ešl’i/ėsl’i | kaby |

It remains to be investigated to what extent the conditional loan conjunctions reflect the principles of the donor language as to the semantic, pragmatic, and stylistic distinctions associated with the competing conjunctions. The final category which yields cases of multiple borrowing involving more than two loan conjunctions is the concessive relation. Table 15 hosts the two Mari varieties.


Table 15: Multiple concessive borrowers (more than two loan conjunctions).

Borrower | 1 | 2 | 3
Mari Eastern | kerek | tek | keč/xot’/xotja
Mari Western | kerek | tek | hotʲa

The two Uralic replica languages behave like twins in so far as they borrow the same conjunctions from the same two donor languages. There are two loan conjunctions which are originally Chuvash besides the Russian concessive conjunction xot’a ‘although’. In analogy to the previously discussed cases, we can only speculate that the multitude of loan conjunctions might reflect a range of not immediately discernible functional nuances. To get to the core of this problem, further inquiries have to be made.

For those cases in which two putative synonyms are reported in the literature, we can sketch the situation as follows. Of 24 pairs of adversative loan conjunctions, nine involve two different donor languages. There are seven cases of borrowings from two different donor languages in the domain of the copulative (which yields altogether 22 pairs of synonymous loan conjunctions). The disjunctive and the conditional each give evidence of two replica languages which borrow from two different donor languages. In the case of comparative and temporal, there is one replica language for each of the categories which borrows from two donors.

It is evident that multiple borrowing from different donor languages does not imply that all of the borrowings happened at the same time. On the contrary, we are facing the synchronic co-existence of several historical layers. As a rule of thumb, it can be assumed that we are confronted with the results of a chronological succession of language contacts. The Arabo-Persian loan conjunctions in many languages in the southern regions of the erstwhile Soviet Union represent an older layer to which the Russian loan conjunctions were added at a later stage as the more recent layer. It is interesting to see that the newcomers do not seem to replace the old established loan conjunctions, although a preference for the Russian loan conjunctions to the detriment of their Arabo-Persian equivalents has been observed for younger speakers of Uzbek, as mentioned in Section 4.2.93. However, things need not be organized that neatly. Van den Berg (2004: 218–219) attributes the borrowing of Arabic wa ‘and’ into several Nakh-Daghestanian languages to the rise of literacy in the 1920s. The introduction of the conjunction into the systems of the replica languages is thus a relatively recent process which started only in the 20th century. The author also ponders the idea that cultural affinity may have facilitated the parallel borrowing from Arabic as a common source of so-called intellectual vocabulary. At the same time, Azerbaijanian served as language of written communication in parts of Daghestan for a short period of time. This Turkic language too displays the loan conjunction wa ‘and’, so that it has probably strengthened the position of the loan conjunction in the Daghestanian replica languages. Van den Berg (2004: 219) concludes that “[e]ven today, the use of wa has remained characteristic of written texts,” meaning that the loan conjunction has not yet penetrated into the colloquial styles of the replica languages. Apart from Baskakov (1952) for Karakalpak, our sources do not address the issue of the vitality of the loan conjunctions in the sense that nothing is said about their synchronic status. It is possible that, at least in some cases, certain loan conjunctions have long fallen into disuse and are only found in historical documents.

6 Conclusions

In the introduction, we posed the question whether the parallel borrowing behavior of many replica languages can give rise to a linguistic area in the sense of the existence of shared properties which were acquired through contact with the same donor(s). The map in the Appendix provides a geographic synopsis of the donor-replica relations in the former Soviet Union. The map refines and partly corrects the findings in Stolz (in press). The donor-replica relations are represented by arrows which connect a given donor to a given replica language. Each donor language is given a distinct colour, namely red for Russian, blue for Persian, and green for Arabic. On the basis of the data in the catalogue (Section 4.2) and the evaluation in Section 5, it is possible to assume that arrows with identical colours more often than not represent sets of partially identical loan conjunctions. Provided that this generalization is largely correct, the map can be interpreted as hosting two major contact-borne areas which are linked to each other by a narrow zone of overlap. In the south, there is the sphere of Arabo-Persian influence. Since Persian has been shown to function very often as propagator of originally Arabic conjunctions, it is legitimate to subsume the donor languages Arabic and Persian under one category. In the north, there is the Russian sphere of influence. Where the two spheres collide, replica languages borrow from both Arabo-Persian sources and Russian sources. The north-south division of the territory of the ex-USSR is a simplification in so far as it glosses over the differences which exist between the Caucasian region and Central Asia. Russian influence in the former region is relatively limited whereas it is much stronger in Central Asia. Other donor languages are restricted to certain geographic niches where they often function as intermediaries (mainly for Arabic and/or Persian). This is the case with Armenian, Avar, Azerbaijanian, and Georgian in the Caucasian region, Tatar and/or Chuvash on the banks of the Volga, Uyghur in Turkestan, Germanic languages in the Baltic region, and Chukchi in the far east of the Russian Federation. The spheres of these minor donor languages are highlighted in orange.

Those replica languages which share the same donor(s) form geographical neighborhoods whose members have become similar (no matter how minimally) to each other because of the identical MAT-borrowings. They now have at least one more feature in common which was absent from their grammatical systems in the pre-contact period. In the south, languages of diverse genetic background have become similar to each other on account of the shared Arabisms and Persisms in the domain of conjunctions. Similarly, in the north, languages of diverse genetic affiliation have undergone Russification in the same domain, so that the material Russisms are now typical traits of languages which formerly lacked these properties.

The ex-Soviet Union is probably a particularly convincing showcase for the emergence of a language-contact area. Our investigation may serve as a model for similar inquiries which focus on other multilingual macro-areas. We expect that the processes of parallel and multiple borrowing described in Sections 4.2 and 5 are not the monopoly of the languages of the region we chose for our study. We further expect that the comparison of the results of several in-depth studies of the borrowing behavior in different macro-areas promises highly valuable new insights for the theory of language contacts.
On the one hand, loan conjunctions in the ex-USSR reflect the division into relatively unmarked coordinating loan conjunctions, which have made it into practically all of the replica languages, and the relatively marked non-coordinating conjunctions, the borrowing of which almost always requires the parallel or prior borrowing of a coordinating conjunction. We have shown that Grant’s (2012) counter-hypothesis does not hold for our sample languages. Our own data support neither Grant’s (2012) inverse correlation of low frequency and high borrowability of adverbial subordinators nor that of semantic specificity and high borrowability. Most of the replica languages borrow highly frequent adverbial subordinators with prototypical meanings. On the other hand, the replica languages from the former Soviet Union reviewed in this study yield results which do not fully conform to the well-established BUT > OR > AND hierarchy as propagated by Matras (1998). Our data speak in favour of moving the copulative loan conjunction from the bottom to the top of the markedness hierarchy.


The data cursorily discussed in Section 2.2, together with those from the cis-Uralian languages, strongly support our rebuttal of Mauri’s (2008) assumption of the scarcity of MAT-borrowing of conjunctions in Europe. What we have seen is that loan conjunctions come by the score also in Europe. This insight can be gained only if the sample is densely packed, i.e. if it hosts as many languages as possible from the same macro-area. Too sparsely populated samples are methodologically unhelpful if one intends to make areal-linguistic generalizations. This pilot study has confirmed Kortmann’s (1996, 1998) expectation that a larger sample will allow us to trace many more cases of MAT-borrowing in the domain of conjunctions. We are convinced that the same holds also for other macro-areas.

We do not want to give the false impression that we are arguing against cross-linguistic approaches to the issue of conjunction borrowing. On the contrary, the cross-linguistic perspective and the macro-area perspective complement each other. It is necessary that both ways of investigating loan conjunctions continue and interact with each other. It cannot be ruled out that macro-areas other than that of the ex-USSR yield a much higher degree of micro-variation. This potential heterogeneity in the borrowing behavior of the replica languages would be as interesting a topic to study as is the large-scale parallelism of the replica languages in our sample.

All its achievements notwithstanding, our study only yields preliminary results and leaves a plethora of questions unanswered. For a start, we repeat that the disagreement between Matras’s BUT > OR > AND order and our AND > BUT > OR order might turn out to be caused by the insufficient documentation of loan conjunctions in our sources. The continuation of our project requires us to create a far more reliable database to which only fully confirmed cases are admitted. To this end, it might be advisable to use other kinds of sources (corpora, for instance) in addition to the extant descriptive-linguistic material. To our minds, the most serious problem is posed by the semantics of the loan conjunctions. For this first attempt, we have oversimplified matters by blindly following the lead of our sources. In this way, we could pragmatically ignore polysemy and other complicating phenomena. This crude way of dealing with an intricate matter is hardly suitable for the spin-offs which will result from this study. Similarly, the role of the transmitter languages which act as near donors needs to be elaborated upon. For the purpose of the pilot study, it was sufficient to identify the distant donors. However, if we want to understand better how exactly loan conjunctions diffuse over macro-areas, the intermediary donors, which are replica languages at the same time, deserve to be studied more closely too. For obvious reasons, we have excluded the issue of PAT-borrowing from our investigation. This exclusion is in order for a first test run of a project of this kind. In the future, Gardani’s (2020) highly differentiated taxonomy of MAT-PAT-borrowing should be taken as the frame of reference.

We have not invested much time and energy in clarifying the socio-cultural factors which make conjunction borrowing easy or difficult. At several points, the sociolinguistic parameter of age was mentioned, especially when the younger generation of replica-language speakers was said to use certain loan conjunctions more readily than their elders. This generation gap is connected to the different degrees of bilingualism within a replica speech community. Growing bilingualism with the donor language facilitates borrowing from the donor language. Is the reverse also true, i.e. does diminishing bilingualism with a former donor language weaken the position of loan conjunctions taken from the erstwhile donor? Several authors emphasize the importance of literacy for the introduction of loan conjunctions, at least in the written register of a replica language. Urban centres also seem to play a role because big cities are multilingual meeting places with a comparatively high scholarization rate and literacy, especially for younger speakers. These aspects and many others related to them must be taken into account if we want to formulate a full-blown theory of the behavior of conjunctions under the conditions of language contact.

There is a limit to self-criticism. In spite of the many unresolved problems mentioned above, we have also achieved something. We have demonstrated that a macro-areal account of the borrowing behavior of a great number of languages is not only technically feasible but also revealing, since patterns emerge which call for explanation by experts on language-contact phenomena. Hopefully, our study encourages other linguists to take a macro-area of their choice and study how loan conjunctions fare there.

Acknowledgments: We are grateful to Martin Haspelmath and Paolo Ramat for providing us with much needed reading matter. Peter Arkadiev gave us permission to quote from his still unpublished grammar of Abaza. Katalin É. Kiss and Lena Borise unbureaucratically agreed to let us use their manuscript on Khanty. Greville G. Corbett and Marina Chumakina kindly shared their expertise on Archi with us. Nicole Nau and Alexandre Arkhipov provided us with information on Russian loan conjunctions in Latgalian and Kamas, respectively. Diana Forker supplied us with additional information. A special word of thanks goes to Werner Drossard, who took it upon himself to read and comment on the draft version of this paper. In spite of all these helping hands, the full responsibility for the content and form of this study remains exclusively ours.


Map: Donor-replica relations in the former Soviet Union.

Abbreviations

1/2/3 = 1st/2nd/3rd person
A = agent of transitive verbs
ABL = ablative
ABS = absolute (past)/absolutive
ACC = accusative
ADS = adversative
ADSS = adessive
ADV = adverbial
AOP = aorist participle
AOR = aorist
ATTR = attributive
AUX = auxiliary
CAU(S) = causal
CISL = cislocative
CL1 = class 1
CL3 = class 3
CL4 = class 4
CM = class marker
CNC = concessive
CNG = connegative
CNS = consecutive
COM = comitative
C(O)MP = complementizer
C(O)ND = conditional
CONJ = conjunctive
CONV = converb
COORD = coordinative
COP = copula(tive)
CPV = comparative
DA = logical and emphatic particle
DAT = dative
DCL = declarative
DEF = definite (article)
DEM = demonstrative
DEST = destinative
DETR = detransitivizer
DIM = diminutive
DIS = disjunctive
DISTR = distributive
DS = different subject
DU = dual
DYN = dynamic
E = epenthesis
ERG = ergative
EXI = existential
F = feminine
FIN = finite
FOCPRES = focal present
FULL = full form
FUT = future
GEN = genitive
GER = gerund
H = human
IMP(ER)F = imperfect(ive)
IMP(V) = imperative
INCH = inchoative
IND = indicative
INDF = indefinite
INESS = inessive
INF = infinitive
INST = instrumental
INTR = intransitive
IO = indirect object
IRR = irrealis
ITER = iterative
LAT = lative
LIM = limitative
LOC = locative
M = masculine
MASD = masdar
MOD = modal
MOM = momentative
MTS = morphological separator
NEG = negation
NEG.COP = negative copula
NFUT = non-future
NH = non-human
NOM = nominative
N(ON_)FIN = non-finite
NP/NP = noun phrase
NPAST = non-past
NR = nominalizer
nSAP = non-speech act participant
N(T) = neuter
NUM = numerative
OBL = oblique
P = patient of transitive verbs
PART = participle
PASS = passive
PAST = past tense
PC.HAB = habitual adverb
PERF = perfective
PL = plural
POESS = postessive
POR = possessor
POSS = possessive
PRE_CONV = preceding converb
PRED = predicate
PR(E)S = present
PRET = preterite
P(RE)V = preverb
PROL = prolative
PROG = progressive
PTC = particle
PTV = partitive
PUR = purposive
QUOT = quotative
REAL = realis
RECIP = reciprocal
REFL = reflexive
REL = relative
REM = remote past
S = agent of intransitive verb
SAP = speech act participant
SBJ = subjunctive
SG = singular
SOC = sociative
SS = same subject
SUBJ = subjunctive
TMP = temporal
TR = transitive
TRANSL = translative
VCM = verb class marker

References

Abdullaev, Z. G. 1971. Očerki po sintaksisu darginskogo jazyka [Sketch of the syntax of the Dargwa language]. Moskva: Izdatel’stvo Akademii Nauk SSSR.
Abu-Manga, Al-Amin. 1986. Fulfulde in the Sudan: Process of adaptation to Arabic. Berlin: Reimer.
Alavi, Bozorg & Manfred Lorenz. 1988. Lehrbuch der persischen Sprache. Leipzig: Enzyklopädie.
Alekseev, Mikhail E. 1994. Budukh. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages. The three Nakh languages and six minor Lezgian languages, 259–296. Delmar/NY: Caravan.
Alekseev, M[ixail] E. 2001a. Andijskij jazyk [Andi language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki [Languages of the world: Caucasian languages], 220–228. Moskva: Akademia.
Alekseev, M[ixail] E. 2001b. Čamalinskij jazyk [Chamalal language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 291–299. Moskva: Akademia.
Alekseev, M[ixail] E. 2001c. Rutul’skij jazyk [Rutul language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 409–420. Moskva: Akademia.
Alekseev, M[ixail] E. 2001d. Xinalugskij jazyk [Khinalug language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 460–469. Moskva: Akademia.


Alekseev, M[ixail] E. & N. D. Sulejmanov. 2001. Agul’skij jazyk [Aghul language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 398–408. Moskva: Akademia.
Alekseev, Mixail E. & Sabrina X. Šixalieva. 2003. Tabasaranskij jazyk [Tabassaran language]. Moskva: Nauka.
Alekseev, M. E. et al. 2012. Sovremennyj avarskij jazyk [Modern Avar language]. Maxačkala: Aleph.
Alhoniemi, Alho. 1993. Grammatik des Tscheremissischen (Mari). Hamburg: Buske.
Aliev, U. B. 1973. Sintaksis karačaevo-balkarskogo jazyka [Syntax of the Karachay-Balkar language]. Moskva: Nauka.
Anderson, Gregory D. S. 2005. Language contact in South Central Siberia. Wiesbaden: Harrassowitz.
Aquilina, Joseph. 1959. The structure of Maltese. Valletta: Progress Press.
Ariste, Paul. 1968. A grammar of the Votic language. Bloomington: Indiana University Publications.
Arkadiev, Peter. 2020. Abaza. A grammatical sketch. Preprint (September 2020). DOI: 10.13140/RG.2.2.13434.31683.
Asanov, Š. A., A. N. Garkavec & S. M. Useinov. 1988. Krymskotatarsko-russkij slovar’ [Crimean Tatar-Russian dictionary]. Kiev: Radjans’ka škola.
Authier, Gilles. 2009. Grammaire kryz (langue caucasique d’Azerbaïdjan, dialecte d’Alik). Leuven & Paris: Peeters.
Authier, Gilles. 2010. Le judéo-tat (langue iranienne des juifs du Caucase de l’est). Paris: Ecole Pratique des Hautes Études.
Avril, Yves. 2006. Parlons komi. Une langue finno-ougrienne de Russie. Paris: L’Harmattan.
Azaev, X. G. 1973. Arabskie zaimstvovanija v slovarnom sostave botlixskogo jazyka. Sbornik naučnyx soobščenij fakul’teta inostrannyx jazykov [Arabic loans in the vocabulary of the Botlikh language. Collection of scientific notes of the faculty of foreign languages], 120–138. Maxačkala: Dagučpedgiz.
Baran, Dominika. 2000. The role of Russian function words in urban colloquial Uzbek. Texas Linguistic Forum 44(1). 18–32.
Baranova, Vlada. 2021. Sojuznaja model’ složnogo predloženija v kalmyckom jazyke: kontaktno-obuslovlennoe javlenie ili nezavisimaja grammatikalizacija?/Complex sentences with conjunctions in Kalmyk as a contact-induced change or a language-internal grammaticalization. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 11–12. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS.
Barbier, Laurène. 2021. Russian influence on Upper Negidal adverbial clauses. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 13–15. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS.
Baskakov, Nikolaj A. 1952. Karakalpakskij jazyk. Tom II: Fonetika i morfologija [Karakalpak language. Vol. II: Phonetics and morphology]. Moskva: Nauka.
Baskakov, N[ikolaj] A. 1960. The Turkic languages of Central Asia. Problems of planned culture contact. The Turkic peoples of the USSR: The development of their languages and writing. Oxford: Central Asian Research Centre.


Baskakov, N[ikolaj]. A. 1963. Nogajsko-russkij slovar’ [Nogai-Russian dictionary]. Moskva: Gosudarstvennoe izdatel’stvo inostrannyx i nacional’nyx slovarej.
Baskakov [Baskakow], N[ikolaj]. A., A. Zajončkovski [Zajączkowski] & S. Š. Šapšal [Szapszał]. 1974. Karaimsko-russko-pol’skij slovar’ [Karaim-Russian-Polish dictionary]. Moskva: Russkij jazyk.
Benzing, Johannes. 1955. Lamutische Grammatik. Wiesbaden: Steiner.
Benzing, Johannes. 1985. Kalmückische Grammatik zum Nachschlagen. Wiesbaden: Harrassowitz.
Bereczki, Gábor. 2004. The Uralic language family. In György Nanovfszky (ed.), The Finno-Ugric world, 165–170. Budapest: Teleki László Foundation.
Berend, Nina. 2011. Russlanddeutsches Dialektbuch. Halle: Projekte.
Berg, Helma van den. 1995. A grammar of Hunzib. München & Newcastle: Lincom Europa.
Berg, Helma van den. 2004. Coordinating constructions in Daghestanian languages. In Martin Haspelmath (ed.), Coordinating constructions, 197–226. Amsterdam & Philadelphia: John Benjamins.
Berger, Hermann. 1974. Das Yasin-Burushaski (Werchikwar). Wiesbaden: Harrassowitz.
Berta, Árpád. 1998. West Kipchak languages. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 301–317. London & New York: Routledge.
Blacher, Philippe-Schmerka. 2002. Parlons turkmene. Paris: L’Harmattan.
Bläsing, Uwe. 2003. Kalmuck. In Juha Janhunen (ed.), The Mongolic languages, 229–247. London & New York: Routledge.
Boeschoten, Hendrik. 1998. Uzbek. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 357–378. London & New York: Routledge.
Bokarev, A. A. 1949. Očerk grammatiki čamalinskogo jazyka [Sketch of the grammar of the Chamalal language]. Moskva & Leningrad: Izdatel’stvo Akademii Nauk SSSR.
Boretzky, Norbert. 1975. Der türkische Einfluss auf das Albanische. Teil 1: Phonologie und Morphologie der albanischen Turzismen. Wiesbaden: Harrassowitz.
Borise, Lena & Katalin É. Kiss. 2021. The emergence of conjunctions and phrasal coordination in Khanty. Preprint (February 2021). DOI: 10.13140/RG.2.2.29071.10402.
Braune, Wilhelm & Ernst A. Ebbinghaus. 1973. Gotische Grammatik. Tübingen: Niemeyer.
Brauner, Siegfried & Irmtraud Herms. 1986. Lehrbuch des modernen Swahili. Leipzig: Enzyklopädie.
Breu, Walter & Giovanni Piccoli. 2000. Dizionario croato molisano di Acquaviva Collecroce. Campobasso: Naš Grad.
Brockelmann, Carl. 1908. Grundriss der vergleichenden Grammatik der semitischen Sprachen. I. Band: Laut- und Formenlehre. Berlin: Reuther & Reichard.
Bulatova, Nadezhda & Lenore Grenoble. 1999. Evenki. München: Lincom Europa.
Camaj, Martin. 1977. Die albanische Mundart von Falconara Albanese in der Provinz Cosenza. München: Trofenik.
Caragiu Marioţeanu, Matilda. 1975. Compendiu de dialectologie română (nord- şi sud-dunăreană) [Compendium of Romanian dialectology (northern and southern Romanian)]. Bucureşti: Editura ştiinţifică şi enciclopedică.
Chirikba, V[iacheslav] A. 1996. A dictionary of Common Abkhaz. Leiden: s.l.
Chirikba, Viacheslav A. 2003. Abkhaz. München: Lincom Europa.
Chumakina, Marina. 2009. Loanwords in Archi. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 430–446. Berlin & New York: De Gruyter.

382 | Thomas Stolz and Nataliya Levkovych

Colarusso, John. 1989. East Circassian (Kabardian dialect). In B. George Hewitt (ed.), The indigenous languages of the Caucasus. Volume 2: The North West Caucasian languages, 261–355. Delmar/NY: Caravan. Colarusso, John. 1992. A grammar of the Kabardian language. Canada: University of Calgary Press. Comrie, Bernard. 1981. The languages of the Soviet Union. Cambridge: Cambridge University Press. Comrie, Bernard & Madzhid Khalilov. 2009. Loanwords in Bezhta. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 414–429. Berlin & New York: De Gruyter. Comrie, Bernard, Madzhid Khalilov & Zaira Khalilova. 2015. Grammatika bežtinskogo jazyka [Grammar of the Bezhta language]. Leipzig & Makhachkala: MPIEVA. Comrie, Bernard & Maria Polinsky. 2020. Tsez. In Yuri Koryakov, Yury Lander & Timur Maisak (eds.), The Caucasian languages: An international handbook, 1–40. Berlin & Boston: De Gruyter Mouton. Corbett, Greville G. 2005. The canonical approach in typology. In Zygmunt Frajzyngier, Adam Hodges & David S. Rood (eds.), Linguistic diversity and language theories, 25–50. Amsterdam & Philadelphia: John Benjamins. Csató, Éva Ágnes & Birsel Karakoç. 1998. Noghay. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 333–343. London & New York: Routledge. Črelašvili, K. T. 2001. Bacbijskij jazyk [Batsbi language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 196–203. Moskva: Akademia. Daniyarova, Saodat, Shodiyor Daniyarov & Barchinoy Daniyarova. 2012. Parlons shor. Langue turcique de Sibérie. Paris: L’Harmattan. De Jong, Frederick. 2007. A grammar of modern Uyghur. Utrecht: Houtsma. De Reuse, Willem Joseph. 1994. Siberian Yupik Eskimo. The language and its contacts with Chukchi. Salt Lake City: University of Utah Press. Dimitrescu, Florica et al. 1978. Istoria limbii române [History of the Romanian language]. Bucureşti: Editura didactică şi pedagogică.
Doniyorova, Saodat. 2001. Parlons ouzbek. Paris: L’Harmattan. Doniyorova, Soadat & Toshtemirov Qahramonil. 2004. Parlons koumyk (Daghestan). Paris: L’Harmattan. Dor, Rémy. 2004. Parlons kirghiz. Manuel de langue, orature et littérature kirghizes. Paris: L’Harmattan. Dum-Tragut, Jasmine. 2009. Armenian. Amsterdam & Philadelphia: John Benjamins. El Mogharbel, Christliebe. 1993. Nehrungskurisch. Dokumentation einer moribunden Sprache. Frankfurt a.M.: Hector. Ersen-Rasch, Margarete I. 2012. Türkische Grammatik. Wiesbaden: Harrassowitz. Etymological dictionary of Estonian. Eesti Keele Instituut [Institute of the Estonian Language]: http://www.eki.ee/dict/ety/index.cgi?C06=en. Fähnrich, Heinz. 1986. Kurze Grammatik der georgischen Sprache. Leipzig: Enzyklopädie. Fähnrich, Heinz. 1994. Grammatik der altgeorgischen Sprache. Hamburg: Buske. Feist, Timothy. 2010. A grammar of Skolt Saami. University of Manchester. Doctoral dissertation. Fenwick, Rohan S. H. 2011. A grammar of Ubykh. München: LINCOM.

Filchenko, Andrei. 2008. The syntax and pragmatics of adverbial clauses in Eastern Khanty. In Edward J. Vajda (ed.), Subordination and coordination in North Asian languages, 31–46. Amsterdam & Philadelphia: John Benjamins. Forker, Diana. 2013. A grammar of Hinuq. Berlin & Boston: De Gruyter Mouton. Forker, Diana & Lenore Grenoble. 2021a. Some structural similarities in the outcomes of language contact with Russian. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 42–43. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS. Forker, Diana & Lenore Grenoble (eds.). 2021b. Language contact in the territory of the former Soviet Union. Amsterdam & Philadelphia: John Benjamins. Foulon-Hristova, Jordanka. 1998. Grammaire pratique du macédonien. Paris: Langues & Mondes. Fox, Samuel Ethan. 2002. A Neo-Aramaic dialect of Bohtan. In Werner Arnold & Hartmut Bobzin (eds.), „Sprich doch mit deinen Knechten aramäisch, wir verstehen es!” 60 Beiträge zur Semitistik: Festschrift für Otto Jastrow zum 60. Geburtstag, 165–180. Wiesbaden: Harrassowitz. Gajdarži, Gavril A. 1981. Gagauzskij sintaksis. Pridatočnye predloženija sojuznogo podčinenija [Syntax of Gagauz. Subordinate clauses of conjunctional subordination]. Kišinev: Štiinca. Gardani, Francesco. 2020. Borrowing matter and pattern in morphology. An overview. Morphology 30. 263–282. Georg, Stefan & Alexander P. Volodin. 1999. Die itelmenische Sprache. Grammatik und Texte. Wiesbaden: Harrassowitz. Golovko, Evgenij V. 2003. “Folk” linguistic engineering. In Yaron Matras & Peter Bakker (eds.), The mixed language debate. Theoretical and empirical advances, 177–207. Berlin & New York: De Gruyter. Golovko, Evgenij V. 2009.
Aleutskij jazyk v Rossijskoj Federacii (struktura, funkcionirovanie, kontaktnye javlenija) [Aleut language in the Russian Federation (structure, functions, contact phenomena)]. PhD-thesis. Sankt-Peterburg: Institut Lingvističeskix Issledovanij Rossijskoj Akademii Nauk. Grant, Anthony P. 2012. Contact, convergence, and conjunctions: A cross-linguistic study of borrowing correlations among certain kinds of discourse, phasal adverbial, and dependent clause markers. In Claudine Chamoreau & Isabelle Léglise (eds.), Dynamics of contact-induced language change, 311–358. Berlin & Boston: De Gruyter Mouton. Grenoble, Lenore A. 2000. Morphosyntactic change: The impact of Russian on Evenki. In D. G. Gilbers, J. Nerbonne & J. Schaeken (eds.), Languages in contact, 105–120. Amsterdam & Atlanta/GA: Rodopi. Grjunberg, Aleksandr L. 1963. Jazyk severoazerbajdžanskix tatov [Language of the North-Azerbaijanian Tats]. Moskva: Akademia Nauk SSR. Gruzdeva, Ekaterina. 1998. Nivkh. München & Newcastle: Lincom Europa. Haase, Martin. 1993. Sprachkontakt und Sprachwandel im Baskenland. Die Einflüsse des Gaskognischen und Französischen auf das Baskische. Hamburg: Buske. Haspelmath, Martin. 1993. A grammar of Lezgian. Berlin & New York: De Gruyter. Haspelmath, Martin. 1995. Contextual and specialized converbs in Lezgian. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective. Structure and meaning of adverbial verb forms – adverbial participles, gerunds, 415–440. Berlin & New York: De Gruyter.

Haspelmath, Martin. 2001. The European linguistic area: Standard Average European. In Martin Haspelmath et al. (eds.), Language typology and language universals. An international handbook, 1492–1510. Berlin & New York: De Gruyter. Haspelmath, Martin. 2007. Coordination. In Timothy Shopen (ed.), Language typology and linguistic description, 1–51. Cambridge: Cambridge University Press. Hauk, Bryn. 2020. Deixis and reference tracking in Tsova-Tush. University of Hawaiʻi at Mānoa. Doctoral dissertation. Hausenberg, Anu-Reet. 1998. Komi. In Daniel Abondolo (ed.), The Uralic languages, 305–326. London: Routledge. Heine, Bernd & Tania Kuteva. 2006. The changing languages of Europe. Oxford: Oxford University Press. Helimski, Eugene. 1998. Selkup. In Daniel Abondolo (ed.), The Uralic languages, 548–579. London & New York: Routledge. Hewitt, B. George. 1989. Abkhaz. In B. George Hewitt (ed.), The indigenous languages of the Caucasus. Volume 2: The North West Caucasian languages, 38–88. Delmar/NY: Caravan. Hober, Nicole. this volume. On the borrowing of the English adversative connector but. Holisky, Dee Ann & Rusudan Gagua. 1994. Tsova-Tush (Batsbi). In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages. The three Nakh languages and six minor Lezgian languages, 147–211. Delmar/NY: Caravan. Hurch, Bernhard. 1989. Hispanisierung im Baskischen. In Norbert Boretzky, Werner Enninger & Thomas Stolz (eds.), Vielfalt der Kontakte. 1. Band, 11–36. Bochum: Brockmeyer. Ido, Shinji. 2005. Tajik. München: LINCOM. Il’ina, L. A. 1976. Ob upotreblenii zaimstvovannyx russkix sojuzov v sel’kupskom jazyke [Towards the use of loan Russian conjunctions in the Selkup language]. In Ėrika Bekker (ed.), Jazyki i toponimija [Languages and toponymy], 52–55. Tomsk: Tomskij gosudarstvennyj pedagogičeskij institut. Imnajšvili, D. C. 1963.
Didojskij jazyk v sravnenii s ginuxskim i xvaršijskim jazykami [Didoic language in comparison with Hinuq and Khwarshi languages]. Tbilisi: Izdatel’stvo AN Gruzinskoj SSR. Isaev, M.-Š.A. 2004. Dargwa. In Michael Job (ed.), The indigenous languages of the Caucasus. Volume 3: The North East Caucasian languages. Part 1, 299–345. Ann Arbor: Caravan. Isakov, I. A. 2001. Gunzibskij jazyk [Hunzib language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 312–320. Moskva: Akademia. Jacobs, Neil G. 2005. Yiddish. A linguistic introduction. Cambridge: Cambridge University Press. Janhunen, Juha. 2003. Proto-Mongolic. In Juha Janhunen (ed.), The Mongolic languages, 1–28. London & New York: Routledge. Jeschull, Liane. 2004. Coordination in Chechen. In Martin Haspelmath (ed.), Coordinating constructions, 241–265. Amsterdam & Philadelphia: John Benjamins. Johanson, Lars. 1995. On Turkic converbs. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective. Structure and meaning of adverbial verb forms – adverbial participles, gerunds, 313–348. Berlin & New York: De Gruyter. Johanson, Lars. 1996. Kopierte Satzjunktoren im Türkischen. STUF/Language Typology and Universals 49(1). 39–49.

Johanson, Lars. 1997. Kopien russischer Konjunktionen in türkischen Sprachen. In Dieter Huber & Erika Worbs (eds.), Ars transferendi. Sprache, Übersetzung, Interkulturalität, 115–121. Frankfurt a.M.: Lang. Johanson, Lars. 1998. The structure of Turkic. In Lars Johanson & Éva Ágnes Csató (eds.), The Turkic languages, 30–66. London & New York: Routledge. Johanson, Lars. 2002. Structural factors in Turkic language contacts. Richmond: Curzon. Johanson, Lars. 2006. Turkic language contacts in a typology of code interaction. In Hendrik Boeschoten & Lars Johanson (eds.), Turkic languages in contact, 4–26. Wiesbaden: Harrassowitz. Juldašev, A. A. 1981. Grammatika sovremennogo baškirskogo jazyka [Grammar of the modern Bashkir language]. Moskva: Izdatel’stvo Akademii nauk SSSR. Kalmykova, S. A. 1973. Sojuzy i sojuznye slova [Conjunctions and linking words]. In N. A. Baskakov (ed.), Grammatika nogajskogo jazyka. Fonetika i morfologija [Grammar of the Nogai language. Phonetics and morphology], 290–297. Čerkessk: Karačaevo-čerkesskoe otdelenie Stavropol’skogo knižnogo izdatel’stva. Kämpfe, Hans-Rainer & Alexander P. Volodin. 1995. Abriss der tschuktschischen Sprache auf der Basis der Schriftsprache. Wiesbaden: Harrassowitz. Kangasmaa-Minn, Eeva. 1998. Mari. In Daniel Abondolo (ed.), The Uralic languages, 219–248. London & New York: Routledge. Karanfil, Güllü. 2010. Parlons gagaouze. Paris: L’Harmattan. Karimova, R. Š. & M. Š. Xalilov [Khalilov]. 2013. Zaimstvovannaja leksika v xvaršinskom jazyke [Loan vocabulary in the Khwarshi language]. Maxačkala: Alef. Karulis, Konstantīns. 1992. Latviešu etimoloģijas vārdnīca [Latvian etymological dictionary]. Vol. II, P–Ž. Rīga: Avots. Kaysina, Inna. 2013. The adoption of Russian conjunctions in Udmurt. Journal of Estonian and Finno-Ugric Linguistics 4(2). 131–144. Kaysina, Inna. 2015. Grammatical effects of Russian-Udmurt language contact. In Christel Stolz (ed.), Language empires in comparative perspective, 219–236.
Berlin & Boston: De Gruyter Mouton. Kazakevič [Kazakevich], Olga. 2021. Russkie glagoly v reči severnyx sel’kupov/Russian verbs in Northern Selkup speech. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 49–51. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS. Kehayov, Petar. 2020. Between facts and speech acts: The conditional and conditional-conjunctive in Moksha Mordvin. Linguistica Uralica 56(1). 18–44. Keresztes, László. 1989. Chrestomathia Morduinica. Budapest: Tankönyvkiadó. Keresztes, László. 1998. Mansi. In Daniel Abondolo (ed.), The Uralic languages, 387–427. London & New York: Routledge. Khalilov, Madzhid. 2015. Botlikh dictionary. In Mary Ritchie Key & Bernard Comrie (eds.), The intercontinental dictionary series. Leipzig: Max Planck Institute for Evolutionary Anthropology. (Available online at http://ids.clld.org/contributions/37, Accessed on 2021-03-01.) Khalilova, Zaira. 2009. A grammar of Khwarshi. Utrecht: LOT. Khalilova, Zaira. 2018. The impact of Russian on Bezhta and Khwarshi. Retrieved March 08, 2021 from https://drive.google.com/file/d/1WPdTGvyQQsJSYokl7vQ8npOrVFmjDRQ7/view.

Khojayori, Nasrullo & Mikael Thompson. 2009. Tajiki. A reference grammar for beginners. Washington/DC: Georgetown University Press. Khomchenkova, Irina & Natalia Stoynova. 2021. Adverbial’nye klauzy s russkimi sojuzami v trex raznostrukturnyx jazykax/Adverbial clauses with Russian conjunctions in three languages with different subordination strategies. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 87–92. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS. Khvtisiashvili, Tamrika. 2013. Principal aspects of Xinaliq phonology and morphosyntax. University of Utah. Doctoral dissertation. https://www.proquest.com/docview/1442804609. Kibrik, Alexander & Sergey Tatevosov. 2001. Bagvalinskij jazyk: grammatika, teksty, slovari [Bagvalal language: grammar, texts, dictionaries]. Moskva: IMLI RAN. Kirchner, Mark. 1998a. Kazakh and Karakalpak. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 318–332. London & New York: Routledge. Kirchner, Mark. 1998b. Kirghiz. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 344–356. London & New York: Routledge. Klumpp, Gerson. 2002. Konverbkonstruktionen im Kamassischen. Wiesbaden: Harrassowitz. Klyčev, R. N. & L. P. Čkadua. 2001. Abxazskij jazyk [Abkhaz language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 113–131. Moskva: Akademia. Kortmann, Bernd. 1996. Adverbial subordination. A typology and history of adverbial subordinators based on European languages. Berlin & New York: De Gruyter. Kortmann, Bernd. 1998. Adverbial subordinators in the languages of Europe. In Johan van der Auwera (ed.), Adverbial constructions in the languages of Europe, 457–462. Berlin & New York: De Gruyter. Kovačec, August. 1968.
Observations sur les influences croates dans la grammaire istroroumaine. La Linguistique 4(1). 79–115. Krier, Fernande. 1976. Le maltais au contact de l’italien. Hamburg: Buske. Kumaxov, M. A. 2013. Kabardino-čerkesskij jazyk [Kabardino-Cherkess language]. Moskva: Institut Jazykoznanija RAN. Kuznecova, A. I., O. A. Kazakevič, E. B. Gruškina & E. A. Xelimskij. 2002. Sel’kupskij jazyk [Selkup language]. Sankt-Peterburg: Prosveščenie. Laanest, Arvo. 1982. Einführung in die ostseefinnischen Sprachen. Hamburg: Buske. Landmann, Angelika. 2010. Usbekisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2011. Kirgisisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2012. Kasachisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2013a. Aserbaidschanisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2013b. Turkmenisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2014a. Tatarisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2014b. Tschuwaschisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2015. Baschkirisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Landmann, Angelika. 2017. Tyvanisch. Kurzgrammatik. Wiesbaden: Harrassowitz. Lazard, Gilbert. 1989. Le persan. In Rüdiger Schmitt (ed.), Compendium linguarum iranicarum, 263–292. Wiesbaden: Reichert. Leinonen, Marja. 2009. Russian influence on the Ižma Komi dialect. International Journal of Bilingualism 13(3). 309–329.

Li, Yong-Sŏng. 2011. A study of Dolgan. Seoul: Seoul National University Press. Lytkin, V. I. 1961. Komi-jaz’vinskij dialekt [Komi-Yazva dialect]. Moskva: Izdatel’stvo Akademii Nauk SSSR. Magomedbekova, Zagidat M. 1967. Axvaxskij jazyk [Akhvakh language]. Tbilisi: Mecniereba. Magomedbekova, Zagidat M. 1971. Karatinskij jazyk [Karata language]. Tbilisi: Izdatel’stvo AN Gruzinskoj SSR. Magomedbekova, Zagidat M. 2001a. Botlixskij jazyk [Botlikh language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 228–236. Moskva: Akademia. Magomedbekova, Zagidat M. 2001b. Axvaxskij jazyk [Akhvakh language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 252–261. Moskva: Akademia. Magomedbekova, Zagidat M. 2001c. Tindinskij jazyk [Tindi language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 283–291. Moskva: Akademia. Magomedova, P. T. 2012. Tindinskij jazyk [Tindi language]. Maxačkala: Institut jazyka, literatury i iskusstva DNC RAN. Majtinskaja, K. E. 1993. Finno-ugorskije jazyki [Finno-Ugric languages]. In Ju. S. Eliseev, K. E. Majtinskaja & O. I. Romanova (eds.), 20–31. Moskva: Nauka. Maksunova, Zoya. 2003. Russian structural influence on Ket. STUF/Language Typology and Universals 56(1–2). 123–132. Malchukov, Andrei L. 1995. Even. München & Newcastle: Lincom Europa. Markus, E. B. & F. I. Rožanskij. 2017. Sovremennyj vodskij jazyk [Modern Votic language]. Sankt-Peterburg: Nestor-Istorija. Marín Ramos, Ferran. 2014. Gramática básica de djudeo-espanyol. S.j.: MS Publishers. Maslova, Elena. 2003. A grammar of Kolyma Yukaghir. Berlin & Boston: De Gruyter Mouton. Matras, Yaron. 1998. Utterance modifiers and universals of grammatical borrowing. Linguistics 36. 281–331. Matras, Yaron. 2002. Romani. A linguistic introduction. Cambridge: Cambridge University Press. Matras, Yaron. 2007.
The borrowability of structural categories. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 31–74. Berlin & New York: De Gruyter. Matras, Yaron. 2009. Language contact. Cambridge: Cambridge University Press. Matras, Yaron & Jeanette Sakel (eds.). 2007. Grammatical borrowing in cross-linguistic perspective. Berlin & New York: De Gruyter. Mauri, Caterina. 2008. Coordination relations in the languages of Europe and beyond. Berlin & New York: De Gruyter. Maxmudova, Svetlana Musaevna. 2001. Morfologija rutul’skogo jazyka [Morphology of the Rutul language]. Moskva: Institut jazykoznanija RAN. Memetov, A. 2013. Krymskotatarskij jazyk [Crimean Tatar language]. Simferopol’: Krymučpedgiz. Menz, Astrid. 2006. On complex sentences in Gagauz. In Hendrik Boeschoten & Lars Johanson (eds.), Turkic languages in contact, 139–151. Wiesbaden: Harrassowitz. Mikailov, K. Š. 1967. Arčinskij jazyk: grammatičeskij očerk s tekstami i slovarem [Archi language: grammatical sketch with texts and dictionary]. Maxačkala: Dagestanskij filial AN SSSR.

Miller, Corey & Karineh Aghajanian-Stewart. 2018. A frequency dictionary of Persian. London & New York: Routledge. Muhamedowa, Raihan. 2009. The use of Russian conjunctions in the speech of bilingual Kazakhs. International Journal of Bilingualism 13(3). 331–356. Muhamedowa, Raihan. 2016. Kazakh. A comprehensive grammar. London & New York: Routledge. Nadžip, È. N. 1960. Sovremennyj ujgurskij jazyk [Modern Uyghur language]. Moskva: Izdatel’stvo Akademii Nauk SSSR. Nagayama, Yukari. 2003. Očerk grammatiki aljutorskogo jazyka [Grammatical sketch of the Alutor language]. Osaka: ELPR. Namdakova, Serzhema. 2021. Grammatičeskie osobennosti russkix glagolov-vključenij v ustnoj reči na burjatskom jazyke/Grammatical features of Russian verbs-inclusions in oral speech of Buryat language. In Egor Kashkin et al. (eds.), Indigenous languages of Russia in contact with Russian II. 11–13 February 2021. V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS, Moscow: Book of abstracts, 63–64. Moscow: V. V. Vinogradov Russian Language Institute RAS & Institute of Linguistics, RAS. Nau, Nicole. 1998. Latvian. München & Newcastle: Lincom Europa. Nedjalkov, Igor’ V. 1995. Converbs in Evenki. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective. Structure and meaning of adverbial verb forms – adverbial participles, gerunds, 441–464. Berlin & New York: De Gruyter. Nefedov, Andrey. 2015. Clause linkage in Ket. Utrecht: LOT. Németh, Michał. 2004. Some disputable Slavic etymologies in Crimean-Karaim. Studia Etymologica Cracoviensia 9. 111–118. Nichols, Johanna. 1994a. Chechen. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages. The three Nakh languages and six minor Lezgian languages, 1–78. Delmar/NY: Caravan. Nichols, Johanna. 1994b. Ingush. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages.
The three Nakh languages and six minor Lezgian languages, 79–146. Delmar/NY: Caravan. Nikolaeva, Irina. 1999. Ostyak. München: Lincom Europa. Nirvi, R. E. 1971. Inkeroismurteiden sanakirja [Dictionary of Ingrian dialects]. Helsinki: Suomalais-Ugrilainen Seura. Noonan, Michael. 2007. Complementation. In Timothy Shopen (ed.), Language typology and linguistic description, 52–150. Cambridge: Cambridge University Press. Oskol’skaja, S. A. & N. M. Stojnova. 2013. Russkie sojuzy v sovremennom nanajskom jazyke [Russian conjunctions in the modern Nanai language]. In N. N. Kazanskij (ed.), Acta Linguistica Petropolitana. Trudy Instituta lingvističeskix issledovanij RAN [Proceedings of the Institute for Linguistic Studies of RAS]. Vol. IX (3), 362–388. Sankt-Peterburg: Nauka. Oslon, M. V. 2018. Jazyk kotljarov-moldovaja. Grammatika kèldèrarskogo dialekta cyganskogo jazyka v russkojazyčnom okruženii [Language of the Kotlyary-Moldovaya. Grammar of Kalderash dialect of the Gipsy language in the Russian-speaking environment]. Moskva: JaSK. Pakendorf, Brigitte & Natalia Aralova. 2017. Documentation of Negidal, a nearly extinct Northern Tungusic language of the Lower Amur. London: SOAS. Paris, Catherine. 1989. West Circassian (Adyghe: Abzakh dialect). In B. George Hewitt (ed.), The indigenous languages of the Caucasus. Volume 2: The North West Caucasian languages, 154–260. Delmar/NY: Caravan.

Plisch, Uwe-Karsten. 1999. Einführung in die koptische Sprache (sahidischer Dialekt). Wiesbaden: Reichert. Potanina, Olga & Andrei Filchenko. 2016. Russian contact-induced innovations in Eastern Khanty. Tomsk Journal of Linguistics and Anthropology 2. 27–39. Pritsak, Omeljan. 1959a. Das Karaimische. In Jean Deny et al. (eds.), Philologiae Turcicae Fundamenta, 318–339. Wiesbaden: Steiner. Pritsak, Omeljan. 1959b. Das Karatschaische und Balkarische. In Jean Deny et al. (eds.), Philologiae Turcicae Fundamenta, 340–368. Wiesbaden: Steiner. Prokosch, Erich. 2006. Handbuch des Krimtatarischen unter Einschluss des Dobrudschatatarischen. Diachronische Grammatik mit kultur- und realkundlichem Hintergrund. Graz: Institut für Sprachwissenschaft. Radeva, Vassilka (ed.). 2003. Bulgarische Grammatik. Hamburg: Buske. Ramat, Paolo. 2020. Dal greco μακάριε al siciliano macari: storia di un percorso panromanzo (e balcanico). Archivio Glottologico Italiano 105(2). Rasulova, Baxu-Mesedu. 2013. Basandi: Karatinskie poslovicy, pogovorki i skorogovorki [Basandi: Karata proverbs, sayings, and tongue-twisters]. Maxačkala: Narody Dagestana. Rédei, Károly. 1978. Syrjänische Chrestomathie mit Grammatik und Glossar. Wien: VWGÖ. Rheinfelder, Hans. 1967. Altfranzösische Grammatik. 2. Teil: Formenlehre und Syntax. München: Hueber. Riese, Timothy. 1998. Permian. In Daniel Abondolo (ed.), The Uralic languages, 249–275. London & New York: Routledge. Riese, Timothy. 2001. Vogul. München: Lincom Europa. Rießler, Michael. 2007. Grammatical borrowing in Kildin Saami. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 229–244. Berlin & New York: De Gruyter. Rueter, Jack. 2010. Adnominal person in the morphological system of Erzya. Helsinki: Société Finno-Ougrienne. Rusakov, Aleksandr Yu. 2001. The North Russian Romani dialect: Interference and code switching.
In Östen Dahl & Maria Koptjevskaja-Tamm (eds.), The Circum-Baltic languages: Typology and contact, 313–337. Amsterdam & Philadelphia: John Benjamins. Ryding, Karin C. 2005. Modern Standard Arabic. Cambridge: Cambridge University Press. Saadiev, A. A. 1994. Kryts. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages. The three Nakh languages and six minor Lezgian languages, 407–445. Delmar/NY: Caravan. Salánki, Zsuzsa. 2015. The bilingualism of Finno-Ugric language speakers in the Volga Federal district. In Christel Stolz (ed.), Language empires in comparative perspective, 237–264. Berlin & Boston: De Gruyter Mouton. Salimov, X. S. 2010. Gagatlinskij govor andijskogo jazyka [Gagatl dialect of the Andi language]. Maxačkala: IYALI. Sammallahti, Pekka. 1998. The Saami languages. An introduction. Karasjok: Savvi Girji. Sarhimaa, Anneli. 1999. Syntactic transfer, contact-induced change, and the evolution of bilingual mixed codes. Focus on Karelian-Russian language alternation. Helsinki: Finnish Literature Society. Schönig, Claus. 1998a. Azerbaijanian. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 248–260. London & New York: Routledge. Schönig, Claus. 1998b. Turkmen. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 261–272. London & New York: Routledge.

Schönig, Claus. 1998c. South Siberian Turkic. In Lars Johanson & Éva Csató-Johanson (eds.), The Turkic languages, 403–416. London & New York: Routledge. Schulze, Wolfgang. 2007. The Lak language: A quick reference. Manuscript. Schulze-Fürhoff, Wolfgang. 1994. Udi. In Rieks Smeets (ed.), The indigenous languages of the Caucasus. Volume 4: North East Caucasian languages. The three Nakh languages and six minor Lezgian languages, 447–514. Delmar/NY: Caravan. Sebeok, Thomas A. & Francis J. Ingemann. 1961. An Eastern Cheremis manual. The Hague: Mouton de Gruyter. Sharoff, Serge, Elena Umanskaya & James Wilson. 2013. A frequency dictionary of Russian. London & New York: Routledge. Sibatrova, S. S. 2016. Russkie zaimstvovanija v sisteme sojuzov marijskogo jazyka [Russian loans in the system of conjunctions of the Mari language]. Ežegodnik finno-ugorskix issledovanij 10(3). 30–41. Skjærvø, Prods Oktor. 2009. Old Iranian. In Gernot Windfuhr (ed.), The Iranian languages, 43–195. London & New York: Routledge. Smeets, Ineke. 2008. A grammar of Mapuche. Berlin & New York: De Gruyter. Steenwijk, Han. 1992. The Slovene dialect of Resia: San Giorgio. Amsterdam & Atlanta: Rodopi. Stolz, Christel (ed.). 2015. Language empires in comparative perspective. Berlin & Boston: De Gruyter Mouton. Stolz, Christel & Thomas Stolz. 1996. Funktionswortentlehnung in Mesoamerika. Spanisch-amerindischer Sprachkontakt (Hispanoindiana II). STUF/Language Typology and Universals 49. 86–123. Stolz, Christel & Thomas Stolz. 1998. Universelle Hispanismen? Von Manila über Lima bis Mexiko und zurück: Muster bei der Entlehnung spanischer Funktionswörter in die indigenen Sprachen Amerikas und Austronesiens. Orbis 39(1). 1–77. Stolz, Thomas. 1997. Grammatical Hispanisms in Amerindian and Austronesian languages: The other kind of transpacific isoglosses. Amerindia 21. 137–160. Stolz, Thomas. 2002. General linguistic aspects of Spanish-Indigenous language contacts with special focus on Austronesia.
Bulletin of Hispanic Studies 79. 133–158. Stolz, Thomas. 2005. Italianisierung in den alloglotten Sprachen Italiens. In Ermenegildo Bidese, James R. Dow & Thomas Stolz (eds.), Das Zimbrische zwischen Germanisch und Romanisch, 43–68. Bochum: Brockmeyer. Stolz, Thomas. 2007. Allora: On the recurrence of function-word borrowing in contact situations with Italian as donor language. In Jochen Rehbein, Christiane Hohenstein & Lukas Pietsch (eds.), Connectivity in grammar and discourse, 75–100. Amsterdam & Philadelphia: John Benjamins. Stolz, Thomas. in press. Entlehntes ABER. Kontaktinduzierte Diffusion adversativer Konnektoren des konjunktionalen Typs. In Julia Nintemann & Cornelia Stroh (eds.), Über Widersprüche sprechen – Linguistische Beiträge zu Contradiction Studies. Wiesbaden: Springer VS. Stolz, Thomas, Deborah Arbes & Christel Stolz. 2021. Pero – Champion of Hispanization? On the challenges of documenting function word borrowing in Mesoamerican languages. In Danae Maria Perez & Eeva Sippola (eds.), Postcolonial language varieties in the Americas, 17–53. Berlin & Boston: De Gruyter Mouton. Stolz, Thomas, Sonja Hauser & Heiko Stamer. 2012. ω → σ → V: The first step towards the comparative grammar of monosyllables. In Thomas Stolz, Nicole Nau & Cornelia Stroh (eds.), Monosyllables. From phonology to typology, 197–237. Berlin: Akademie Verlag.

Stolz, Thomas & Nataliya Levkovych. this volume. On the (almost im)possible emergence of grammatical gender in language-contact situations. Surxaeva, Sijadat Abdulaevna. 2007. Služebnye časti reči v avarskom jazyke [Functional parts of speech in the Avar language]. Dissertation. Maxačkala: Institut jazyka, literatury i iskusstva im. Cadasy Dagestanskogo naučnogo centra Rossijskoj Akademii Nauk. Šagirov, A. K. 1989. Zaimstvovannaja leksika abxazo-adygskix jazykov [Loan vocabulary in Abkhaz-Adyghe languages]. Moskva: Nauka. Talibov, B[ukar]. B. 2001. Caxurskij jazyk [Tsakhur language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 420–428. Moskva: Akademia. Talibov, B[ukar]. B. 2004. Tsakhur. In Michael Job (ed.), The North East Caucasian languages: Part 1, 347–419. Ann Arbor: Caravan Book. Talibov, Bukar B. 2007. Buduxskij jazyk [Budukh language]. Moskva: Akademia. Tauli, Valter. 1983. Standard Estonian Grammar. Part II: Syntax. Uppsala: Uppsala Universitet. Tauscher, Elisabeth & Ernst-Georg Kirschbaum. 1983. Grammatik der russischen Sprache. Düsseldorf: Brücken-Verlag. Tenser, Anton. 2005. Lithuanian Romani. München: Lincom Europa. Tereškin, Sergej Nikolaevič. 2002. Jokan’gskij dialekt saamskogo jazyka [Yokanga dialect of the Saami language]. Rossijskij gosudarstvennyj pedagogičeskij universitet, Sankt-Peterburg. Doctoral dissertation. Testelec, Ja. G. & M. Š. Xalilov [Khalilov]. 2001. Bežtinskij jazyk [Bezhta language]. In M. E. Alekseev, G. A. Klimov, S. A. Starostin & Ja. G. Testelec (eds.), Jazyki mira: Kavkazskie jazyki, 303–311. Moskva: Akademia. Thomason, Sarah G. 2001. Language contact. An introduction. Washington/DC: Georgetown University Press. Thomason, Sarah G. & Terrence Kaufman. 1988. Language contact, creolization, and genetic linguistics. Berkeley, Los Angeles & London: University of California Press. Thompson, Sandra A., Robert E. Longacre & Shin Ja J. Hwang. 2007. Adverbial clauses.
In Timothy Shopen (ed.), Language typology and linguistic description, 237–301. Cambridge: Cambridge University Press. Tiffou, Étienne & Jurgen Pesot. 1989. Contes du Yasin. Paris: Peeters. Tyroller, Hans. 2003. Grammatische Beschreibung des Zimbrischen von Lusern. Stuttgart: Steiner. Useinov, Sejran M. 2008. Krymskotatarsko-russko-ukrainskij slovar’ [Crimean Tatar-RussianUkrainian dictionary]. Simferopol’: Tezis. Wälchli, Bernhard. 2005. Co-compounds and natural coordination. Oxford: Oxford University Press. Weiss, Daniel. 1995. Russian converbs: A typological outline. In Martin Haspelmath & Ekkehard König (eds.), Converbs in cross-linguistic perspective. Structure and meaning of adverbial verb forms – adverbial participles, gerunds, 239–282. Berlin & New York: De Gruyter. Wentzel, Tatjana W. 1980. Die Zigeunersprache (Nordrussischer Dialekt). Leipzig: Enzyklopädie. Werner, Heinrich. 1997a. Die ketische Sprache. Wiesbaden: Harrassowitz. Werner, Heinrich. 1997b. Das Jugische (Sym-Ketische). Wiesbaden: Harrassowitz. Winkler, Eberhard. 1994. Salis-Livische Sprachmaterialien. München: Veröffentlichungen des Finnisch-Ugrischen Seminars an der Universität München. Wintschalek, Walter. 1993. Die Areallinguistik am Beispiel syntaktischer Übereinstimmungen im Wolga-Kama-Areal. Wiesbaden: Harrassowitz.



Thomas Stolz and Nataliya Levkovych

Parallel Romancization: Chamorro and Tetun Dili – two heavy borrowers compared

Abstract: The study focuses on two cases of (Ibero-)Romancization of Austronesian languages. The Hispanization of Chamorro and the Lusitanization of Tetun Dili are shown to yield largely identical results. It is argued that this parallel borrowing behavior is causally connected to the two languages’ status as heavy borrowers. This hypothesis receives support from the investigation in three domains which shed light on similarities and dissimilarities in loan phonology, fossilized plurals, and gender agreement. The facts are indicative of the necessity to study further cases of Romancization-cum-heavy-borrowing in comparative perspective to the benefit of the theory of language contacts.

Keywords: adjectival agreement; fossilized plurals; heavy borrowing; loan phonology; Romance-Austronesian language contact

1 Introduction

Inspired by Matras’ (2007: 64–65) views on the different frequency with which certain kinds of contact-induced effects are attested cross-linguistically, we start from the idea that what can happen in language contact is not entirely random, although the range of admissible phenomena is relatively wide. Given the diversity of contact-induced phenomena, it is striking that historically unrelated situations of language contact yield almost identical results in terms of the borrowing of matter and pattern (Sakel 2007) in the replica languages (= RL), aka “recipient” or “copier” languages, i.e. the languages into which something is borrowed, as opposed to the “donor” language (= DL), i.e. the language from which something is borrowed (Matras and Sakel 2007: 1). This parallel behavior of the replica languages can be studied to the benefit of the theory of language contact. To this end, we address the hitherto understudied case of the parallel Ibero-Romancization of two Austronesian languages: Chamorro (= CH) and Tetun Dili (= TD), aka Tetum Praça/Prasa, which bear signs of Hispanization and Lusitanization, respectively. The parallel examples (1)–(2) from the translations of Le Petit Prince into Chamorro and Tetun Dili show that entire sentences in the two languages may consist (almost) exclusively of syntactic words with an Ibero-Romance background. Spanish (= SP) or Portuguese (= PT) borrowings appear in boldface (including the morpheme glosses and the English translation¹).

(1) Chamorro [LPP CH]²
    Todu  i       tiempo  ma       nesesita  esplikasion.
    all   ART.CN  time    3PL.ERG  need      explanation
    ‘They always need explanations.’

(2) Tetun Dili [LPP TD, 8]
    Sira  sempre  presiza  esplikasaun.
    3PL   always  need     explanation
    ‘They always need explanations.’

|| Thomas Stolz: University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany. E-Mail: [email protected]
Nataliya Levkovych: University of Bremen, FB 10: Linguistics/Language Sciences, Universitäts-Boulevard 13, 28359 Bremen, Germany. E-Mail: [email protected]
https://doi.org/10.1515/9783110785517-007

In both sentences, the Ibero-Romance words outnumber the Austronesian words. In (1), only the specific article for common nouns i and the 3rd person plural pronoun of the ergative set ma ‘they’ are Austronesian, as opposed to the four words todu ‘all’, tiempo ‘time’, nesesita ‘need’, and esplikasion ‘explanation’, which are Chamorro borrowings from Spanish (namely: todo, tiempo, necesitar, and explicación). Similarly, in (2), only the pronoun sira ‘they’ is of Austronesian origin, whereas sempre ‘always’, presiza ‘need’, and esplikasaun ‘explanation’ stem from Portuguese (namely: sempre, precisar, and explicação). As to their meaning, the borrowings closely reflect the original semantics the words have in the donor languages. Given that the examples (1)–(2) are representative of the contemporary written registers of the replica languages, we ask ourselves what it might mean for a replica language structurally to be a heavy borrower. If we know that language A belongs to the class of heavy borrowers and shows the effects X, Y, and Z, does it follow that language B, another member of the same class, displays the very same effects in the domain of grammar? To answer questions of this kind (Matras 2000), Chamorro and Tetun Dili come in handy because their shared genetic background and their centuries-long exposure to two donor languages which are also relatives of each other offer the unique opportunity to study the behavior of languages under largely similar conditions of language contact. If we narrow down the scope of the investigation to the structural repercussions resulting (in)directly from the transfer of sizable amounts of lexemes, we create almost ideal conditions for a language-contact laboratory experiment.

|| 1 Unless otherwise stated, the morpheme glosses and translations are ours. Boldface marks those syntactic words which are focused upon in the ensuing discussion. We apply morpheme hyphenation only very sparingly and only where it is relevant to the issue at hand. We refrain from homogenizing the different orthographies of our object languages and reproduce them faithfully as given in our sources.
2 By kind permission of the translator Eric Forbes, we extract this sentence from the still ongoing translation of the text into Chamorro.

This paper is a first test-run of the experiment. It is therefore empirically restricted to the written registers of the replica languages, the analysis of which serves as preparation for dedicated follow-up studies which will focus on the spoken register. We acknowledge that what we describe here captures only part of the story. Nevertheless, investigating the written registers of Chamorro and Tetun Dili provides a suitable basis for the formulation of hypotheses to be tested in the interaction with native speakers on a different occasion. The guiding question is whether there is evidence that parallel massive borrowing in the lexical domain causes parallel developments in structural domains outside the lexicon.

The paper is organized as follows. In Section 2, we summarize the contact history and structural properties of the replica languages and familiarize the reader with the extant hypotheses as to the replica languages’ classification in the taxonomy of language contact categories. There is also information about our own approach. Section 3 addresses phonological issues. Section 4 sketches a variety of cases which count among the better known contact-induced phenomena in morpho-syntax. Sections 5–6 are dedicated to two phenomena – fossilized plurals and gender agreement – which hitherto have not been given sufficient attention in language contact studies. Section 5 focuses on the role of fossilized plurals in the replica languages whereas the topic of Section 6 is the potential genesis of adjectival gender agreement. The conclusions are drawn in Section 7 where we argue for studying as many heavy borrowers as possible in comparative perspective.

2 Background

2.1 General information

Chamorro, an internal isolate within the West Malayo-Polynesian branch of Austronesian in the Mariana Islands (Micronesia), was in contact with Spanish at first sporadically from 1521/1565 and then intensively from 1668 onwards until 1898/1899, when Guam was ceded to the United States and the northerly islands were sold to Germany. After the withdrawal of Spain from her former Micronesian possessions, Spanish ceased to play a role as a factor in the language contact scenario in the Marianas. Owing to the division of the Chamorro speech-community into two separate political entities – the unincorporated US territory Guam and the Commonwealth of the Northern Marianas – differences which already existed between the varieties spoken on Saipan, Rota, and Guam have not been leveled out yet (Topping and Dungca 1973: 9–10). Except for recent (symbolic) attempts at re-Austronesianizing Chamorro, there is no variety of the language which lacks Spanish influence altogether. Nowadays Chamorro is the co-official language with English in Guam and with English and Carolinian in the Commonwealth of the Northern Marianas. The number of native speakers is presently estimated not to exceed 45,000, all of whom are bilingual with English. The language is moderately endangered (Pagel 2010).

Tetun Dili belongs to the Timor-Flores branch of West Malayo-Polynesian and is thus distantly related to Chamorro. Direct contact with Portuguese began as early as 1515 and gained momentum in the mid-18th century, lasting until 1974 when the Portuguese colonial regime was abolished. Since the subsequent Indonesian occupation of the former Portuguese possession ended in 2002, Portuguese has re-emerged on the linguistic scene in the now fully independent Timor-Leste. For the number of native speakers of Tetun Dili, the 2015 census reports 361,027 speakers who use Tetun Dili at home, which corresponds to 31% of Timor-Leste’s population. There are over half a million second-language speakers (~ 56% of the entire population) (Direcção Geral de Estátistica Timor-Leste 2016). The speech-community seems to be growing. There is also a considerable population of non-native speakers of the language. Tetun Dili is sociolinguistically stratified into a particularly Lusitanized acrolect, a mesolect with less evidence of Lusitanization, and a basilect which is only marginally Lusitanized (Hull and Eccles 2005: xvi). Tetun Dili is officially recognized alongside Portuguese as state language of Timor-Leste (Thomaz 2006).

Chamorro is a moderately synthetic language with VSO as basic word order and traits of a morphological ergative system, whereas Tetun Dili is a predominantly analytic SVO language of the nominative-accusative type. The two languages differ further structurally insofar as, in Chamorro, there is a system of distinct articles for common nouns, person names, and place names formally sensitive to categories which are analyzed either in terms of a focus vs. non-focus system (Topping and Dungca 1973: 243–254) or as straightforward instances of morphological case (Chung 1998: 50–53). Nothing similar is reported for Tetun Dili. On the other hand, Tetun Dili makes ample use of serial verb constructions, which are practically unknown in Chamorro. Hajek (2006b: 252–253) argues that the renewed Portuguese influence on Tetun Dili might ultimately contribute to the deserialisation of the verb system.


2.2 Romancization

Topping and Dungca (1973: 67) state that “Chamorro is not a dialect of Spanish. Nor is it a creolized version of Spanish”. They insist further that borrowing from Spanish – though massive – “was linguistically superficial. The bones of the Chamorro language remained intact; a little Spanish flesh was added through vocabulary borrowing, but Chamorro remained basically Chamorro” (Topping and Dungca 1973: 6). According to the same authors, Spanish borrowings “were forced to conform to the rules of Chamorro grammar” and Spanish “had virtually no effect on Chamorro grammar” (Topping and Dungca 1973: 7). Similarly, Borja et al. (2006: 118) argue that there has been no wholesale replacement of Chamorro grammar with Spanish grammar. The sketch of the Spanish elements in Chamorro provided by Quilis (1992: 196–199), on the other hand, is suggestive of the ubiquity of Hispanisms on all levels of the replica language’s system. In contrast to the assumption that Spanish influence has failed largely to affect the inner structure of Chamorro, experts on Tetun Dili acknowledge that the impact of Portuguese on this Timorese language seems to be increasing so that major contact-induced effects might be expected to occur in the future (Williams-van Klinken et al. 2002: 7). For the time being, however, Hajek (2006a: 173–174) summarizes his survey of contact phenomena in Tetun Dili as follows:

“Overall, while TD [= Tetun Dili] is fully open to the influence of Portuguese in phonology and lexicon, it is more resistant to borrowing from Portuguese in other areas, especially morphology. Morphological influence, although system changing, is very constrained and restricted almost exclusively to borrowed Portuguese lexicon. Syntactic influence is sometimes system altering, but mostly system reinforcing, increasing the frequency of existing structures that were less used in the past. On the other hand, there is little direct evidence, with the partial exception of SVCs [= serial verb constructions], that contact specifically with Portuguese has led to grammatical simplification.”

The above quotes reveal a kind of agreement among the specialists. In their opinion, the Ibero-Romancization of the languages under scrutiny is largely a surface phenomenon which relies almost completely on the high type and token frequency of lexical borrowings. On account of Williams-van Klinken and Hajek’s (2018a) findings as to the emergence of an especially Lusitanized journalistic register in Tetun Dili, we seize the opportunity to put this “belittling” of the Ibero-Romance influence on the replica languages’ grammar to the test.

According to Rodríguez-Ponga (1995: 91), Spanish lexical borrowings in Chamorro are estimated to account for some 55% of the dictionary entries in Topping et al. (1975); but cf. Pagel (2010: 145). Hajek (2006a: 169) calculates the Portuguese share to account for 10–30% of the words in spoken Tetun Dili and 60–80% of the words in the written official register.

We are well aware of the problems which arise if one assumes that there are neatly distinct types of contact languages. Pagel (2015, 2018) argues convincingly for the continuum-like character of the typology of contact languages. For the practical purposes of this study, however, we set this possibility aside in order to establish a common frame of reference for Chamorro and Tetun Dili. On the basis of the above percentages for the written registers, we consider both replica languages to fulfill the prerequisites of massive borrowing according to Bakker and Mous (1994: 5). In fact, Chamorro and Tetun Dili display shares of borrowed vocabulary which exceed the admittedly artificial 45%-mark imposed by Bakker and Mous (1994) as upper limit for massive borrowing and at the same time do not reach the 90%-mark postulated by the same authors as threshold for the status of Mixed Language. Chamorro and Tetun Dili are located in the intermediate zone termed “the gap” (Stolz 2003: 290). For many of the lexical Hispanisms in Chamorro, there exist autochthonous synonyms (Salas Palomo and Stolz 2008) so that it does not make sense to speak sweepingly of the filling of lexical gaps in the replica languages. The above percentages suggest that the replica languages can be classified as instances of (extremely) heavy borrowing (Bakker and Matras 2013: 11–12).³ However, they probably fail to pass the test as Mixed Languages technically because it is still possible to locate them on the Austronesian family tree (Bakker 2003: 121). An indicator of this possibility is that their basic vocabulary has not been as massively infiltrated by Hispanisms and Lusitanisms as the periphery of the lexicon. According to Grant (2012: 329), less than 5% of the words included in Tetun Dili’s version of the extended Swadesh List (with 207 items) are borrowings.
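The threshold reasoning above can be made explicit in a few lines. The following Python sketch is purely illustrative and not part of the cited authors' method: the function name `contact_zone` and the zone labels are hypothetical, the 45% and 90% marks are simply the figures attributed above to Bakker and Mous (1994), and the shares are the percentages quoted in this section.

```python
def contact_zone(borrowed_share: float) -> str:
    """Place a share of borrowed vocabulary (in percent) relative to the
    45% and 90% marks attributed above to Bakker and Mous (1994).

    The returned labels are illustrative shorthand, not established terms.
    """
    if not 0 <= borrowed_share <= 100:
        raise ValueError("share must be a percentage between 0 and 100")
    if borrowed_share >= 90:
        return "mixed-language threshold reached"
    if borrowed_share > 45:
        return "gap"
    return "within the borrowing range"


# Shares quoted in this section (percent of words or dictionary entries).
shares = {
    "Chamorro, dictionary entries": 55,
    "Tetun Dili, written register (lower bound)": 60,
    "Tetun Dili, written register (upper bound)": 80,
    "Tetun Dili, spoken register (upper bound)": 30,
}

for label, share in shares.items():
    print(f"{label}: {share}% -> {contact_zone(share)}")
```

On these figures, only the written-register shares fall between the two marks, which is why the classification adopted here is tied explicitly to the written registers.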
The share is significantly higher in the case of Chamorro, for which Grant (2012: 329) calculates that almost a quarter of all words on the extended Swadesh List are of foreign origin. Nevertheless, in both replica languages, the borrowings form a clear minority in the core vocabulary. One might ask whether Chamorro and Tetun Dili are Mixed Languages of the kind which “exhibit only lexical borrowing” (Meakins 2013: 189). Chamorro and Tetun Dili give evidence of “the presence of structural features from both [donor and replica] languages” (Meakins 2013: 205) as well, yet without meeting the criteria of a genuine Mixed Language according to the principles laid down by Bakker (1997: 202–203). Bakker (1997: 203) assumes that for a language to fall into the class of Mixed Languages there should be language intertwining, i.e. “they combine the grammatical system […] of one language with the lexicon of another”. This is the case neither with Chamorro (Bakker 1997: 185) nor with Tetun Dili, whose lexicon and grammar contain both Austronesian and Romance elements.

|| 3 Rodríguez-Ponga (2009: 15) is still convinced that Chamorro is “lengua única en el mundo, por la manera en que refleja la fusion de elementos lingüísticos” [our translation: ‘a unique language in the world because of the manner in which it reflects the fusion of linguistic elements’]. Our study clearly shows that Chamorro is by no means a loner in terms of its borrowing behavior.

How difficult it is to find an appropriate category for Chamorro in the typology of language contact categories can be gathered from Rodríguez-Ponga’s (2009: 19–23) and Pagel’s (2010: 134–147) literature reviews on this subject. The hypotheses range from considering Chamorro a pidgin (Fischer 1961), a(n atypical) creole (Munteanu 1997: 962)⁴, a semi-creole (Rodríguez-Ponga 2009: 19), (the reverse of) an anti-creole (Couto 1996: 89), or a mixed language (Rodríguez-Ponga 2009: 196) to treating it as an instance of heavy borrowing, as defended here. The situation is very much the same in the case of Tetun Dili. In an article on Tetun Terik, Van Engelhoven and Williams-van Klinken (2005: 735) claim in passing that Tetun Dili “is a creole with large-scale borrowing from Portuguese”. In contrast, Hajek (2006a: 163) strongly argues against previous classifications of Tetun Dili as a pidgin (Hagège 2000: 356–357) or a creole (Grimes 1992: 654–655; Hull 1999: 317).⁵ What is clear nevertheless is the function of Tetun Dili as a lingua franca in those parts of Timor-Leste where it is not spoken natively. Williams-van Klinken and Williams (2015) show that Tetun Dili started out as a lingua franca and has attained significant numbers of native speakers only in the last few decades.

|| 4 Rodríguez-Ponga (2009: 20–21) mentions four further authors who classify Chamorro as a(n Atlantic) creole (de Granda 1978: 389; Alvar 1995: 215; Malherbe 1995: 716; Stewart 1999: 8). On account of the recurrence of this classification pattern, it can be assumed that at least among those scholars who have not studied the language in-depth, the creole option seems to be the preferred choice for Chamorro.
5 The idea that Chamorro and Tetun Dili count as creoles is explicable with reference to purely superficial similarities that exist between the two Austronesian languages and genuine Ibero-Romance-based creoles of the same geographical macro-region. Consider example (i):
(i) Chabacano (Ternate) (glosses and translation adapted) [Sippola 2011: 144]
    Kwándu-kel  el   manga  baguntaw     tasé       harána    kon  kel  manga  dalága.
    when-DEM    ART  PL     youngster.M  IPFV:make  serenade  OBJ  ART  PL     youngster.F
    ‘In the past the adolescent boys performed serenades for the adolescent girls.’
Boldface is used to identify Tagalog words in this utterance in the Spanish-based Creole of Ternate. Besides the lexical Tagalisms, there is also the pre-nominal plural marker manga (< Tagalog mga) which occurs twice with animate nouns. Sippola (2011: 113–117) describes the use of this plural marker as optional but wide-spread. As in examples (1)–(2), the Chabacano sentence contains items which stem from a language which is not the main lexifier.

After about 370 years of direct exposure to Spanish influence in the case of Chamorro and some 460 years of contacts between Tetun and Portuguese, the high number of borrowings can hardly surprise us. It would be a mistake, however, to consider the borrowing behavior of the Austronesian replica languages unworthy of study. In fact, their case is especially interesting for several reasons. First, the donor languages Spanish and Portuguese are not only genetically close relatives and areal neighbors, but they are also typological look-alikes of each other, in a manner of speaking. At the same time, the two replica languages are also genetically related, albeit more distantly. They are geographically separated from each other. What is more, they do not always correspond to each other as to the typological properties they display (cf. Section 2.1). If it can be shown that parallels in the replica languages’ borrowing behavior abound in the lexicon and the grammatical system, it needs to be tested whether these similarities are attributable to
– the structural similarities of the genetically closely related donor languages or to
– those of the genetically distantly related replica languages or to
– general principles of grammars in contact or to
– coincidence or to
– a combination of several of the above.
Similarly, if the phenomena of grammatical borrowing do not yield sufficient evidence of parallel behavior of Tetun Dili and Chamorro, one might ask whether this failed parallelism can be correlated to any specific properties of the languages involved in the language-contact situations, to independent forces or to the chance factor.

The Hispanization of Chamorro is the topic of several dedicated studies. For reasons of space, we mention only a small selection thereof.
Rodríguez-Ponga’s (1995) pioneering PhD thesis makes an abundance of lexical Hispanisms accessible to interested scholars and demonstrates that Chamorro deserves to be investigated in-depth and systematically for the benefit of language-contact studies in general. His overview (Rodríguez-Ponga 1995: 13–101) still serves as a point of reference in the on-going discussion. Stolz (2003) argues that Chamorro – together with (Italianized) Maltese – fails to fit the description of any of the extant categories of contact languages and thus proves that the contact typology does not involve an array of neatly delineated distinct categories but has the shape of a continuum. Rodríguez-Ponga (2009), among other things, advocates the idea that in the course of the 19th century Chamorro underwent drastic structural changes under the impact of Spanish so that Chamorro displayed signs of creolization at the end of the Spanish rule over the Marianas. Pagel (2010) provides the most comprehensive account to date of the Hispanic properties of modern Chamorro in all areas of lexicon, phonology, and grammar. He draws the conclusion that Chamorro is a case of heavy borrowing both lexically and structurally. Moreover, the author compares his findings to other cases of Hispanization, viz. Rapanui and Cebuano. Pagel’s PhD thesis in particular suggests the possibility of conducting systematic comparative studies in the domain of Romancization.

As to the impact of Portuguese on Tetun Dili, Hajek (2006a) must be mentioned. This article gives an admittedly incomplete sketch of the range of contact-induced phenomena on all levels of grammar and beyond. The most recent study in this domain is Greksakova (2018). The borrowing of the bound derivational morpheme -dor from Portuguese that has become productive in Tetun Dili is discussed in Hajek and Williams-van Klinken (2003). In Williams-van Klinken and Hajek (2020) it is shown that dór is a free morpheme in the replica language. The specificities of the Lusitanized Tetun Dili press register are described in Williams-van Klinken and Hajek (2018a). The co-existence of different numeral systems in Tetun Dili is addressed in Williams-van Klinken and Hajek (2018b). A study which encompasses the Romancization processes in both replica languages viewed from a comparative perspective is still wanting. This paper fills this gap to the extent that more focused follow-up studies can start from where we have stopped.

2.3 Our approach

This study forms part of the research program Romancization worldwide as outlined in Stolz (2008). This research program aims at describing and evaluating all language contact situations in which a Romance language functions as donor language. The purpose is to determine to what extent structurally similar donor languages trigger similar contact-induced processes in replica languages in order to create a broad empirical basis for generalisations about language contact. In Figure 1 we schematically show the four logically possible combinations that are interesting in this context.

                         similar       different
                         RL1 = RL2     RL1 ≠ RL2
similar     DL1 = DL2    A             C
different   DL1 ≠ DL2    B             D

Figure 1: Four logically possible combinations.

The figure combines fictitious binary situations of language contact. Either the donor languages involved in different contacts are structurally similar (A and C) or they differ (B and D). The same applies to the replica languages, which resemble each other structurally (A and B) or fail to do so (C and D). It is interesting to investigate what happens in two different replica languages when the input of two donor languages is almost the same. Most of the data presented in the empirical sections of this study correspond to A, because Spanish and Portuguese are similar and Chamorro and Tetun Dili also share many properties. There are also some additional cases of B and C.

This approach is indebted to the theory expounded in Matras (2009: 146–274). This theory assumes that it is possible to compare different language contact situations and classify them because it is predictable to a certain degree how languages interact structurally in language contact. Methodologically we follow the lead of Matras (2007). This means that we compare Chamorro and Tetun Dili by way of ticking off (ideally) each level of grammar and the Romancization processes attested in them. The data stem from a variety of different written contemporary sources published since World War II. For the investigation of the lexicons (Sections 3, 5, and 6), we strictly limit our data-base to the reference dictionaries by Topping et al. (1975) and Hull (1999) for Chamorro and Tetun Dili, respectively. There are other dictionaries such as Flores and Bordallo Aguon (2009) for Chamorro and Williams-van Klinken (2011) for Tetun Dili against which the findings have to be checked in a separate study. Morphosyntactic phenomena (Sections 4, 5, and 6) are illustrated with examples from the extant descriptive linguistic literature, language courses, and original or translated literature, most of which is designed for use in schools. What we conclude on this basis is valid only for the written register of the replica languages. Especially for Tetun Dili, it cannot be ruled out that some of the data reflect Tetun Dili as a second language. In many cases we have to take the data at face value without trying to delve too deeply into the issues they raise.
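The four-way scheme of Figure 1 amounts to a lookup on two yes/no questions. As a minimal sketch, assuming only the cell labels A to D given in the figure (the function name `figure1_cell` and its parameter names are hypothetical), it can be written as follows.

```python
def figure1_cell(donors_similar: bool, replicas_similar: bool) -> str:
    """Return the cell of Figure 1 for a pair of contact situations.

    donors_similar:   True if DL1 and DL2 are structurally similar.
    replicas_similar: True if RL1 and RL2 are structurally similar.
    """
    if donors_similar:
        # Similar donor languages: A (similar replicas) or C (different replicas).
        return "A" if replicas_similar else "C"
    # Different donor languages: B (similar replicas) or D (different replicas).
    return "B" if replicas_similar else "D"


# Spanish/Portuguese are treated as similar donors and Chamorro/Tetun Dili
# as similar replica languages for most of the data, which yields cell A.
print(figure1_cell(donors_similar=True, replicas_similar=True))
```

The sketch treats similarity as a binary choice, exactly as the figure does; a graded notion of similarity would require a different representation.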

2.4 Idiosyncrasies

Before we look into the parallels of the replica languages, we mention cases of Ibero-Romancization which are attested only in one of the replica languages. This is the case with several grammatical Hispanisms in Chamorro for which no parallel Lusitanisms exist in Tetun Dili, namely cases like
– existential and past habitual marker eståba ‘used to be/exist’ < SP estaba ‘s/he was’ ← estar ‘to be (situated/temporarily)’ (Rodríguez-Ponga 2009: 211–215; Pagel 2010: 95–97);
– existential, presentative, and (recent) perfective marker esta(gue’) ‘there is (already)’ < SP está ‘s/he is’ ← estar ‘to be (situated/temporarily)’ + CH 3SG.ABS gue’ (Rodríguez-Ponga 2009: 205–211; Pagel 2010: 97–98)⁶;
– proximal demonstrative este ‘this’ < SP este ‘this’ (Pagel 2010: 79);
– indefinite-specific article un < SP un ‘a(n)’ (Stolz 2012b).

The existentials fall into Matras’ (2009: 208–209) so-called “rest class”, which contains examples of crosslinguistically relatively infrequent borrowings. Matras (2009: 203) explicitly states that Chamorro is exceptional because the borrowing of demonstratives is a rarum too. It seems that Chamorro behaves differently from Tetun Dili exactly in those areas of grammatical borrowing in which Chamorro is generally special as compared to the bulk of the contact languages world-wide.

3 Towards loan phonology

A cursory look at Topping et al. (1975: 61–63) and Hull (1999: 69–73) reveals that their lexicons host many cognates which are originally Ibero-Romance. Since the donor languages are closely related, the lexical borrowings involve segmental chains which resemble each other phonologically and semantically. Table 1 gives an impression of how pervasive the contact-induced lexical convergence is in the replica languages. Grey shading identifies borrowings which yield identical phonological strings in the replica languages. Segments which differ between Chamorro and Tetun Dili are represented by capital letters.

|| 6 Matras (2009: 209) mentions CH está but mixes it up with eståba.

Table 1: Parallel lexical borrowings from Ibero-Romance in Chamorro and Tetun Dili.

#

Chamorro

Spanish

Tetun Dili

Portuguese

Meaning

#1

eskapa

escapar

eskapa

escaper

‘flee’

#2

eskasu

escaso

eskasu

escasso

‘infrequent’

#3

eskoBa

escoba

eskoVa

escôva

‘broom’

#4

eskUEla

escuela

eskOla

escola

‘school’

#5

eskuSa

escusa

eskuZa

escusa

‘excuse’

#6

eskritura

escritura

eskritura

escritura

‘(holy) scripture’

#7

esperansa

esperanza

esperansa

esperança

‘hope’

#8

espesiáT

especial

espesiáL

especial

‘special’

#9

espia

espiar

espia

espiar

‘search for’

#10

estadu

estado

estadu

estado

‘state’

#11

estima

estima

estima

estima

‘esteem’

#12

estómagu

estómago

estómagu

estómago

‘stomach’
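The capital-letter convention of Table 1 can be imitated mechanically. The following Python sketch is an illustration only: it takes the four pairs that Table 1 marks as different (#3, #4, #5, #8), written in lower case, and reports the segments in which the Chamorro and Tetun Dili forms diverge. The helper name `differing_segments` is hypothetical, and the orthographic strings are treated as a rough proxy for the phonological ones.

```python
from difflib import SequenceMatcher

# Chamorro / Tetun Dili pairs which Table 1 marks as different (lower-cased).
PAIRS = {
    3: ("eskoba", "eskova"),
    4: ("eskuela", "eskola"),
    5: ("eskusa", "eskuza"),
    8: ("espesiát", "espesiál"),
}

def differing_segments(ch, td):
    """Return (Chamorro, Tetun Dili) substrings in which the two forms differ."""
    matcher = SequenceMatcher(None, ch, td, autojunk=False)
    return [
        (ch[i1:i2], td[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

for number, (ch, td) in PAIRS.items():
    print(number, differing_segments(ch, td))
```

The string alignment only approximates a phonological comparison; a serious implementation would operate on phonemic transcriptions rather than on spellings.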

With reference to the parallel lexical borrowings in Table 1, several observations can be made. First, one notes that different word-classes of the donor languages are borrowed from, namely nouns (= #3–7 and #10–12), verbs (= #1 and #9), and adjectives (= #2 and #8). With #3–4 and #12, only three of the borrowed lexemes in Table 1 correspond to an item on the Loanword Typology meaning list (Haspelmath and Tadmor 2009: 22–34). The semantic domains to which the borrowed words belong range from body-part terms (= #12) via basic actions (= #9) and warfare/hunting (= #1) to quantified time (= #2). Several phonological issues are interesting in connection with our topic. We start with the segmental phonology of the lexemes featured in Table 1. Both Chamorro and Tetun Dili display a high back vowel /u/ in word-final position of the borrowings where Castilian Spanish employs the mid-high back vowel /o/. Thus, for both of the replica languages, the quality of the final vowel in #2, #10, and #12 resembles that of the Portuguese cognates since in this language, the

Parallel Romancization: Chamorro and Tetun Dili – two heavy borrowers compared | 405

back vowel contrast /o/ ≠ /u/ is neutralized as [u] in post-stress position (Cunha and Cintra 1984: 39–40). Tetun Dili observes this rule in Portuguese loans (Williams-van Klinken et al. 2002: 12) whereas no such rule seems to apply to the Austronesian component of the lexicon of this language (Hull 1999: xviii–xx) (cf. leko ‘honor, revere’).7 However, the resemblance is incidental in the case of Chamorro because the phonotactics of this language favor the raising of unstressed /o/ to [u] independent of the origin of the word in question (Topping and Dungca 1973: 21). Many Hispanisms, however, fail to obey this rule (Topping and Dungca 1973: 57), whereas words of Austronesian stock additionally follow vowel-harmonic rules which exclude the occurrence of high vowels in words which contain a medial consonant cluster as in totche ‘dip’ vs. lísto ‘quick’ (< SP listo ‘ready’) (Topping and Dungca 1973: 57). Secondly, three of the phonological differences identified in Table 1 are “inherited” in the sense that they reflect features which have been present already in the donor languages. The first two are CH/SP /b/ ~ TD/PT /v/ in #3 and CH/SP /s/ ~ TD/PT /z/ in #5. The distinction of the voiced bilabial plosive vs. voiced labio-dental fricative and that of the alveolar sibilants /s/ vs. /z/ in intervocalic position are traditional topics in Ibero-Romance dialectology which assumes the neutralization and subsequent abolishment of these contrasts for Castilian Spanish as opposed to the preservation of the voice distinction in Portuguese (Vicente 1974: 140–146). Both /v/ and /z/ are phonemic innovations in Tetun Dili which have been copied from Portuguese (Hajek 2006a: 168–169). 
In Chamorro, however, neither /v/ nor /z/ forms part of the phoneme inventory (Topping and Dungca 1973: 27)8, meaning: like Tetun Dili, Chamorro lacked these phonemes in pre-contact times, but in contrast to the Portuguese impact on the phonology of Tetun Dili, Spanish could not trigger the introduction of similar contrasts in Chamorro. The third difference is that of CH/SP /we/ ~ TD/PT /ɔ/ in #4. The diphthongization of inherited stressed /o/ is a typical feature of Castilian Spanish and many of its regional varieties which is not shared by Portuguese (Vicente 1974: 89–95). It is therefore not surprising that Portuguese borrowings in Tetun Dili involve the monophthong /o/. As to Chamorro, the sequence /we/ is not counted among the diphthongs by Topping and Dungca (1973: 24) who argue that the initial part of the Spanish diphthong has been reinterpreted either as a consonantal segment /w/ or as the labialization of the immediately preceding consonant to add to an already established series of labialized Cw phonemes (Topping and Dungca 1973: 25 and 34).

|| 7 This issue needs to be looked into because there are several sets of alloforms of lexemes which seem to reflect the variation of final /o/ ~ /u/ also in Austronesian items such as la’o ~ la’u ‘walk’, leno ~ lenu ~ leo ‘illuminate’, etc. (Hull 1999: 196, 198).
8 For some speakers of Chamorro, [z] is an allophone of /ʣ/ (written ⟨y⟩) (Topping and Dungca 1973: 25) which, in Spanish borrowings, stems from two sources, namely the palatal lateral /ʎ/ and the palatal approximant /j/. This means that there is no connection to the sibilants in the first place.

There are two further instances of segmental differences resulting from the application of internal sound laws of Chamorro. The first is CH /t/ ~ TD/PT/SP /l/ in #8. The donor languages and Tetun Dili tolerate liquids in syllable-final position. Chamorro, however, disallows the occurrence of /r/ or /l/ in this position. In lexical Hispanisms, the liquids are not only neutralized but also merge with the unvoiced dental plosive /t/ which represents all three phonemes in the syllable coda (Stolz 2013: 208–214). The second case is CH /h/ ~ TD/PT/SP Ø in #9. The glottal fricative /h/ is absent generally from the phoneme charts of Portuguese and Castilian Spanish. In contrast, both Tetun Dili and Chamorro boast a phonemic /h/. In Tetun Dili, this fricative is said to be “optionally deleted in intervocalic position” (Williams-van Klinken et al. 2002: 11) – a facultative rule which applies exclusively to Austronesian words, whereas the reverse happens in Chamorro. In Chamorro, sequences of heterosyllabic vowels are disfavored. Wherever they occur in Spanish borrowings, the two vowels may be separated optionally by an excrescent (non-phonemic) consonant which is either the glottal stop [ʔ] or [h] (e.g. ma’estru ~ mahestru ~ maestru ‘teacher’ < SP maestro). CH /h/ may also be the regular reflex of the voiceless velar fricative /x/ (= ⟨j⟩) of SP as in husto ‘just, upright’ < SP justo (Pagel 2010: 63). When we look beyond the selection of cases in Table 1, we realize that vowels and consonants are affected by contact-induced phenomena differently.
This observation is fully in line with the following statement by Matras (2009: 228):

In terms of occurrence frequency, contact-related change is more likely to affect consonants than vowels […]. The reason behind this hierarchy is the fact that consonant inventories are generally larger and so the potential for lack of correspondence between consonant systems in contact is higher, resulting in greater pressure to adjust the consonant system.
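The Chamorro ban on syllable-final liquids mentioned above lends itself to a small illustration. The sketch below is a simplification under stated assumptions: it works on spellings rather than phonemes, it treats any liquid followed by a consonant letter or by the word boundary as a coda liquid, and the input forms are assumed intermediate spellings in which the other adaptations have already applied. The function name `neutralize_coda_liquids` is hypothetical.

```python
import re

# Vowel letters used in the examples; every other letter counts as a consonant.
VOWELS = "aeiouáéíóúå"

# A liquid counts as a coda liquid if a consonant letter or the word end follows.
CODA_LIQUID = re.compile(rf"[lr](?=[^{VOWELS}]|$)")

def neutralize_coda_liquids(word):
    """Replace syllable-final l or r by t (simplified, spelling-based)."""
    return CODA_LIQUID.sub("t", word)

# Assumed input spellings; liquids in onset clusters (klaru) must stay intact.
for source in ["espesiál", "primer", "kuarto", "terseru", "klaru"]:
    print(source, "->", neutralize_coda_liquids(source))
```

Under these assumptions the outputs correspond to Chamorro forms cited in this chapter (espesiát in Table 1; primet, kuatto, and tetseru among the borrowed ordinals), while the onset cluster of klaru is left untouched.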

As to the vowel phonemes of the replica languages, the Ibero-Romance influence seems to be largely restricted to allophones. In Chamorro, Spanish borrowings have probably contributed to the phonologization of the erstwhile purely allophonic contrasts of high and mid-high vowels (Pagel 2010: 64). In Tetun Dili, automatic (i.e. non-phonemic) nasalization of vowels applies in the immediate vicinity of nasal consonants (Williams-van Klinken et al. 2002: 12) so that the (phonemic) Portuguese nasal vowels pose no problem for the integration of


borrowings into the language since they are generally reinterpreted as sequences of VNASAL + CNASAL (e.g. TD entaun ‘then, so’ < PT então). Hull and Eccles (2005: 237) consider the nasalization to be relatively weak. Chamorro consonants do not give much evidence of Hispanization although it is commonly assumed that the erstwhile allophonic contrasts of the liquids have been phonologized under Spanish influence to yield the opposition lateral /l/ vs. rhotic /r/ (Pagel 2010: 63–64). The situation is different in Tetun Dili. Of the 22 consonantal phonemes, eleven are said to have a Portuguese background (Hajek 2006a: 168; cf. also Matras 2009: 228). The following phonological classes have entered the system of the replica language via language contact with Portuguese:
– all voiced fricatives (= /v/, /z/, /ʒ/)
– all approximants (= /w/9, /j/)
– all alveo-palatal consonants (= /ʃ/, /ʒ/)
– all palatal consonants (= /ɲ/, /ʎ/, /j/)
– the plosives /p/ and /g/
– the rhotic /r/10

Hull and Eccles (2005: 235) claim that except /ʃ/ none of the loan phonemes occurs in word-final position in Tetun Dili. This phonotactic restriction is not remarkable since most of the consonants under review are excluded from the word-final slot in Portuguese too. In contrast to the above addition to the phoneme chart, Tetun Dili has lost the glottal stop /ʔ/ (which is still represented orthographically though) (Hull 1999: xix–xx). Whether this loss can be attributed in any way to the impact exerted by Portuguese or perhaps to contact with Mambae (Hajek 2006a) is a question to which we do not have a ready answer. In connection to this problem, Hull and Eccles (2005: 240) add an instructive observation as to the geolinguistic diffusion of the glottal stop. According to them, it is generally absent from the variety spoken in the capital city, i.e. exactly where Portuguese influence has always been strongest. The glottal stop survives, however, in many varieties spoken outside of Dili.
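Since the phonological classes listed above overlap, the figure of eleven loan phonemes is most easily checked by taking the union of the classes. The following Python sketch does just that; the dictionary keys are informal labels chosen for this illustration, and /ʎ/ is used for the palatal lateral.

```python
# Loan phonemes of Tetun Dili, grouped by the classes named in the text.
LOAN_CLASSES = {
    "voiced fricatives": {"v", "z", "ʒ"},
    "approximants": {"w", "j"},
    "alveo-palatals": {"ʃ", "ʒ"},
    "palatals": {"ɲ", "ʎ", "j"},
    "plosives": {"p", "g"},
    "rhotic": {"r"},
}

# The classes overlap (ʒ and j are listed twice), so the union is counted.
loan_phonemes = set().union(*LOAN_CLASSES.values())
print(sorted(loan_phonemes), len(loan_phonemes))
```

The union contains eleven members, which agrees with the number reported for the Portuguese-derived part of the inventory.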

|| 9 According to Hajek (2006a), /w/ is only partly a product of Lusitanization. 10 Williams-van Klinken et al. (2002: 11) state that the contrast of tap /ɾ/ and trill /r/ is “extremely marginal, and possible only when one contrasts a small number of Portuguese loans with original trill in intervocalic position […] with loans and native words with intervocalic tap […]. However, most speakers appear not to make a distinction.”


Table 2 provides a synopsis of the consonantal phonemes of the two replica languages in one and the same chart. A color code distinguishes three categories of phonemes. Cells are shaded grey if the consonant is attested exclusively in Chamorro. Yellow is indicative of those phonemes which are exclusive to Tetun Dili. Blue is used for phonemes which are shared by both Chamorro and Tetun Dili but are contact-induced innovations to the system of the latter only. Extra strong lines surround phonemes belonging to the systems of both languages.

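Table 2: Synopsis of consonantal phonemes in Chamorro and Tetun Dili.

Places of articulation (columns): bilabial, labiodental, coronal, alveopalatal, palatal, velar, glottal.

| Manner (rows)      | Phonemes   |
|--------------------|------------|
| plosive [–voice]   | p, t, k, ʔ |
| plosive [+voice]   | b, d, g    |
| affricate [–voice] | ʦ          |
| affricate [+voice] | ʣ          |
| fricative [–voice] | f, s, ʃ, h |
| fricative [+voice] | v, z, ʒ    |
| nasal              | m, n, ɲ, ŋ |
| lateral            | l, ʎ       |
| trill              | r          |
| tap                | ɾ          |
| approximant        | w, j       |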

As shown in Table 2, the majority of contact-induced additions to the consonant inventory of Tetun Dili have contributed to the phonological divergence of the replica languages. Only in a minority of cases has the reverse happened, i.e. the addition of /p/, /g/, and /ɲ/ to the system of Tetun Dili renders the latter more similar to that of Chamorro. None of the Chamorro phonemes which are absent from Tetun Dili have arisen on account of contact with Spanish.

Phonotactics and word-level prosody are two further domains in which intriguing phenomena can be observed. Prior to contact with Spanish, Chamorro did not allow for complex syllable heads, i.e. only single consonants were permitted in this position. The massive borrowing of Spanish lexical material, however, has introduced dozens of words which host binary consonant clusters in syllable-initial position (Pagel 2010: 63) such as those in (A).

(A) Spanish borrowings with initial clusters in Chamorro
presta ‘borrow’ (< SP prestar), triste ‘sad’ (< SP triste), klaru ‘clear’ (< SP claro), blusa ‘blouse’ (< SP blusa), dragón ‘dragon’ (< SP dragón), gradu ‘grade’ (< SP grado), flauta ‘flute’ (< SP flauta), etc.

The development from the pre-contact state to that of Hispanized Chamorro can be gathered from the succession of the two formulae in Figure 2.

pre-contact:  σ_MAX → C V C
Hispanized:   σ_MAX → C_[OBSTRUENT & –GLOTTAL] C_[LIQUID] V C

Figure 2: Pre-contact and Hispanized patterns of maximally complex syllables.
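The two templates of Figure 2 can be read as a check on word-initial syllable heads. The Python sketch below is a minimal illustration under explicit assumptions: it operates on spellings, its letter classes are crude stand-ins for the phonological classes (Chamorro digraphs such as ch or ng are not handled), and the function names `onset` and `licensed` are hypothetical.

```python
# Simplified, spelling-based classes (an assumption of this sketch).
OBSTRUENTS = set("ptkbdgfs")   # non-glottal obstruents
LIQUIDS = set("lr")
VOWELS = set("aeiouáéíóúå")

def onset(word):
    """Return the consonant letters which precede the first vowel."""
    for index, letter in enumerate(word):
        if letter in VOWELS:
            return word[:index]
    return word

def licensed(word, hispanized):
    """Check the word-initial syllable head against the templates of Figure 2."""
    head = onset(word)
    if len(head) <= 1:
        return True      # zero or one consonant is admitted at both stages
    if not hispanized:
        return False     # pre-contact: no complex syllable heads
    return len(head) == 2 and head[0] in OBSTRUENTS and head[1] in LIQUIDS

LIST_A = ["presta", "triste", "klaru", "blusa", "dragón", "gradu", "flauta"]
for word in LIST_A:
    print(word, licensed(word, hispanized=False), licensed(word, hispanized=True))
```

Every item of list (A) is rejected by the pre-contact template and accepted by the Hispanized one, which is the point the two formulae are meant to make.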

As to Tetun Dili, binary clusters were allowed as syllable heads in Austronesian words already provided that the onset was filled by the unvoiced velar plosive /k/ as e.g. word-initially in kbiit ‘power’, kdook ‘distance’, kfo’er ‘dirt’, klees ‘open’, kmolak ‘empty’, knasuk ‘body odor’, kraik ‘low’, ksolok ‘joy’, ktaak ‘layer’, kwaa ‘dew’. The majority of the #[kC]HEAD combinations are inadmissible in Portuguese. However, via contact with Portuguese, Tetun Dili acquired a plethora of words whose segmental chains involve syllable-initial clusters with onsets other than /k/ (Williams-van Klinken et al. 2002: 9). The possibility that consonant clusters in inherited Austronesian words can be optionally split by an anaptyctic vowel (Williams-van Klinken et al. 2002: 9) is mentioned as a basilectal feature by Hull and Eccles (2005: 240). The following list (B) contains examples of words with initial clusters borrowed from Portuguese all of which have Spanish cognates in (A).

(B) Portuguese borrowings with initial clusters in Tetun Dili
presta ‘be suitable’ (< PT prestar-se), triste ‘sad’ (< PT triste), klaru ‘clear’ (< PT claro), bluza ‘blouse’ (< PT blusa), dragaun ‘dragon’ (< PT dragão), grau ‘degree’ (< PT grau), flauta ‘flute’ (< PT flauta), etc.

Figure 3 is indicative of the abolishment of the erstwhile restriction which limited the extent of combinations of consonants in the syllable head in pre-contact times. For reasons of space, the formula for the contemporary structures glosses over the necessary further subdivisions regulating the compatibility of onset consonants and obstruents in the slope.


pre-contact:  σ_MAX → /k/ C_[OBSTRUENT] V C
nowadays:     σ_MAX → C_[OBSTRUENT & –GLOTTAL] C_[OBSTRUENT] V C

Figure 3: Pre-contact and Lusitanized patterns of maximally complex syllables.
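For Tetun Dili, the pre-contact restriction amounts to the requirement that a complex syllable head begins with /k/. The following Python sketch tests this requirement against the Austronesian examples and the Portuguese loans in (B). It is spelling-based and deliberately ignores the class of the second consonant; the function names are hypothetical.

```python
VOWELS = set("aeiouáéíóú")

def initial_cluster(word):
    """Return the word-initial consonant letters up to the first vowel."""
    for index, letter in enumerate(word):
        if letter in VOWELS:
            return word[:index]
    return word

def precontact_ok(word):
    """Pre-contact template: a complex syllable head must begin with k."""
    head = initial_cluster(word)
    return len(head) <= 1 or (len(head) == 2 and head[0] == "k")

NATIVE = ["kbiit", "kdook", "kfo’er", "klees", "kmolak",
          "knasuk", "kraik", "ksolok", "ktaak", "kwaa"]
LOANS = ["presta", "triste", "klaru", "bluza", "dragaun", "grau", "flauta"]

# Loans whose initial cluster violates the pre-contact template.
print([word for word in LOANS if not precontact_ok(word)])
```

Of the loans in (B), only klaru happens to fit the older pattern because its cluster begins with /k/; all the others presuppose the extended template.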

The phonotactics of Chamorro and Tetun Dili are not identical as to the qualities of the consonants which may combine to yield complex syllable heads. Nevertheless, the phonotactic rules of the replica languages have become more similar because of the languages’ exposure to the influence of the donor languages since Spanish and Portuguese display largely the same properties in this domain. In polysyllabic Chamorro words of Austronesian origin, stress may fall on any syllable except the final. Spanish borrowings have altered the situation such that the ban on ultimate stress does not apply to Hispanisms as is shown by prosodic minimal pairs such as ultimate stress mohón ‘landmark’ (< SP mojón) vs. penultimate stress mohon ‘wish feeling’. Pace Pagel (2010: 65) who assumes that “[n]achhaltiger Einfluss des Spanischen ist […] allgemein in der Prosodie nicht zu verzeichnen”11, we consider the introduction of ultimate stress via language contact a phenomenon which is worthwhile mentioning. With the possibility of placing stress on the ultimate syllable, Chamorro has become more similar to Tetun Dili which allowed for ultimate stress already before Portuguese appeared on the scene on Timor (cf. Austronesian words like manán ‘win’, sasán ‘things’, etc.). In contrast to Chamorro, Tetun Dili permitted only two positions of stress, namely the penultimate and the ultimate (Williams-van Klinken et al. 2002: 9). Note that Williams-van Klinken (2015: xv) assumes that “[o]n native Tetun words, stress is always on the second-last syllable”. Owing to the introduction of many Portuguese borrowings with antepenultimate stress site, the prosodic options have increased (cf. ókulu ‘spectacles’ < PT óculo, krédulu ‘gullible’ < PT crédulo, etc.) (Hull and Eccles 2005: 236). In this way, Tetun Dili now has more in common than originally with Chamorro in the domain of prosody. 
This albeit limited convergence is an effect of the replica languages’ contact with the Ibero-Romance donor languages whose prosodic systems are largely identical. At the same time, in the domain of the lexical phonology of borrowings, Tetun Dili and Chamorro may differ for two reasons. Either their donor languages already display systematic segmental differences which arose during the historical development of the Ibero-Romance languages or the two replica languages obey different individual sets of phonological rules which presumably date back to the pre-contact period.

|| 11 Our translation: ‘in prosody, sustainable Spanish influence is generally not attested.’

The contact-induced lexical correspondences between the two replica languages are numerous. Table 1 captures only a few examples of lexemes of Ibero-Romance origin which belong to the lexicon of Chamorro and Tetun Dili, respectively. Tentative conclusions based on these examples are:
– in the realm of phonemics, Ibero-Romancization tends to cause divergence among the replica languages because of already existing differences in the donor languages whereas
– on the levels of phonotactics and prosody, the replica languages converge under the pressure of the donor languages and
– through shared loans, lexical Ibero-Romancization too has the effect of increasing the similarity between Chamorro and Tetun Dili.

In order to determine to what extent the parallel borrowing behavior of the replica languages applies outside lexicon and phonology, not only the morphosyntax of Tetun Dili and Chamorro should be scrutinized but it also needs to be determined in which sectors language contact triggers convergent development of the replica languages as opposed to those areas in which language contact causes divergence.

4 Almost trivial cases

This section touches upon some very common contact-induced phenomena across language-contact situations world-wide. Since we are dealing with relatively well-studied phenomena (albeit mostly in connection with different replica languages) we neither discuss the data in every detail nor do we provide a comprehensive inventory of all pertinent cases. What this section is meant to show is that Chamorro and Tetun Dili do not necessarily behave differently from other contact languages in every domain.

4.1 Ordinal numerals

Numerals are especially prone to being borrowed. Matras (2009: 201–203) puts forward the hypothesis that cardinals and ordinals differ insofar as borrowability is strongest with higher cardinals whereas with ordinals, especially the lower and lowest values are affected by language contact. Chamorro has lost its Austronesian cardinals and (except FIRST) also its original ordinals


(Rodríguez-Ponga 2009: 167–196; Pagel 2010: 123). The situation is more complex in the case of Tetun Dili (Hull and Eccles 2005: 59–79; Williams-van Klinken and Hajek 2018b) because

Tetun Dili speakers are familiar with numerals in Tetun, Portuguese and Indonesian. […] Tetun numerals tend to be used mainly for small numbers, such as the number of children in a family. Dates, prices and arithmetic are more commonly presented in Portuguese and Indonesian, while time is given in any of the three languages. When used in a phrase, Portuguese numerals are used with Portuguese nouns […] in contrast to Tetun [numerals being used with Tetun nouns] (Williams-van Klinken et al. 2002: 21)

Ordinals in Tetun Dili likewise come in two sets, namely an Austronesian set and that borrowed from Portuguese which seems to be getting the upper hand in the competition between autochthonous and imported numerals (Manhitu 2016: 100). Williams-van Klinken et al. (2002) do not register Austronesian ordinals for Tetun Dili whereas Manhitu (2016: 100) has forms like daruak ‘second’ also recorded as darua in Hull (1999: 45) derived from the cardinal rua ‘two’ via prefixation of da- and suffixation of -k. In Table 3, we reproduce ordinal numerals which bear evidence of contact influence in the replica languages (together with the corresponding numerals in the donor languages). Grey shading identifies the only instance of a purely Austronesian ordinal in Chamorro. Boldface marks out the Chamorro prefix mina’- which is employed to derive ordinals from cardinals (Rodríguez-Ponga 2009: 190–194).

Table 3: Borrowed ordinal numerals in Chamorro and Tetun Dili and their basis in the donor languages.

| Rank | Spanish cardinal | Spanish ordinal | Chamorro ordinal (Option 1) | Chamorro ordinal (Option 2) | Portuguese ordinal | Tetun Dili ordinal |
|------|------------------|-----------------|-----------------------------|-----------------------------|--------------------|--------------------|
| 1st  | uno              | primer          | fine’na                     | primet                      | primeiro           | primeiru           |
| 2nd  | dos              | segundo         | mina’dos                    | sigundo                     | segundo            | segundu            |
| 3rd  | tres             | tercer          | mina’tres                   | tetseru                     | terceiro           | terseiru           |
| 4th  | cuatro           | cuarto          | mina’kuatro                 | kuatto                      | quarto             | kuartu             |
| 5th  | cinco            | quinto          | mina’sinko                  | kinto                       | quinto             | kintu              |
| 6th  | seis             | sesto           | mina’sais                   | sesto                       | sexto              | sestu              |
| 7th  | siete            | séptimo         | mina’siete                  | séptimo                     | sétimo             | setimu             |
| 8th  | ocho             | octavo          | mina’ocho                   | oktabo                      | oitavo             | oitavu             |
| 9th  | nueve            | noveno          | mina’nuebi                  | nobeno                      | nono               | nonu               |
| 10th | diez             | décimo          | mina’dies                   | désimo                      | décimo             | désimu             |


Chamorro and Tetun Dili share one set of almost identical ordinals borrowed from the donor languages which have inherited this system from their common ancestor Latin. Accordingly, the Chamorro and Tetun Dili ordinals of this set resemble each other phonologically. Besides, Chamorro has a second option which is a case of regular morphological derivation of the ordinals from their corresponding cardinals with the latter being direct borrowings from Spanish. Thus, the construction [mina’-CARDINAL]ORDINAL is an instance of a hybrid formation involving Austronesian and Spanish components. Chamorro stands out further because of the retention of an Austronesian expression for FIRST, fine’na (original meaning: ‘foremost’), which is in conflict with the idea that FIRST and SECOND are the most likely candidates for being replaced with borrowed ordinals (Stolz and Robbers 2016). What can be said in connection to the data in Table 3 is a repetition of our conclusions in Section 3, namely that the replica languages have become more similar in a sector of their lexico-grammatical system because of the Ibero-Romance impact. This conclusion is corroborated by further instances of parallel behavior in the domain of function-word borrowing.
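The hybrid construction [mina’-CARDINAL]ORDINAL is regular enough to be stated as a one-line rule. The Python sketch below is an illustration only: the cardinals are those which can be read off the Option 1 ordinals in Table 3, FIRST is treated as the single suppletive exception, and the function name `ordinal` is hypothetical.

```python
PREFIX = "mina’"

# Chamorro cardinals of Spanish origin as they appear inside the
# Option 1 ordinals of Table 3.
CARDINALS = {2: "dos", 3: "tres", 4: "kuatro", 5: "sinko", 6: "sais",
             7: "siete", 8: "ocho", 9: "nuebi", 10: "dies"}

def ordinal(rank):
    """Return the Option 1 ordinal: suppletive for FIRST, derived otherwise."""
    if rank == 1:
        return "fine’na"   # retained Austronesian form
    return PREFIX + CARDINALS[rank]

print([ordinal(rank) for rank in range(1, 11)])
```

The sketch covers only the ranks documented in Table 3 and makes no claim about higher ordinals.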

4.2 Function words

Function words of many kinds are prone to entering the system of the replica languages (Matras 1998). The processes are common enough to be counted among the best studied contact-induced phenomena. It is therefore not surprising that there is ample evidence of function-word borrowing also in Chamorro and Tetun Dili (Matras 2009: 194–196).

4.2.1 Discourse particles

Among the many discourse particles of Ibero-Romance origin, we find the cognates CH entonses = TD entaun ‘then’ from SP entonces and PT então, respectively (Pagel 2010: 128; Manhitu 2016: 48). Very often they occupy the utterance-initial position as illustrated in (3)–(4).

(3) CH [Topping and Dungca 1973: 153]12
    Entonses  na      munga   hao?
    then      LINKER  not_do  2SG.ABS
    ‘Then, you don’t want it?’

|| 12 English translation original.


(4) TD [Manhitu 2016: 64]
    Entaun,  ha’u  tenke  bolu  tekniku     atu  mai   lailais.
    then     1SG   OBLIG  call  technician  to   come  quick
    ‘Then, I have to call the technician to come quickly.’

4.2.2 Conjunctions

The clause-initial position is also characteristic of loan conjunctions which abound in the replica languages (Hajek 2006a: 172; Pagel 2010: 125). In (5)–(6), the cognate conjunctions Chamorro maskesea = TD maski ‘although’13 are used. These conjunctions go back to functional equivalents in the donor languages which seem to have fallen into disuse in the standards meanwhile, namely SP por más que sea (Rodríguez-Ponga 1995: 458) and PT mais que (Mattos e Silva 1984: 656).

(5) CH [Topping and Dungca 1973: 152]14
    Maskesea  ti   bunita    lao  ya-hu.
    although  NEG  pretty:F  but  like-1SG
    ‘Although she isn’t pretty, I like her.’

(6) TD [Williams-van Klinken et al. 2002: 45]15
    Nia  sai     hanesan  ema     bót  maski     sei    klosan
    3SG  become  like     person  big  although  still  youth
    ‘He has become like an important person, even though he is still young.’

4.2.3 Prepositions

Chamorro and Tetun Dili have adopted numerous prepositions of Ibero-Romance origin. Rodríguez-Ponga (2009: 119) claims that not only are most of the prepositions of modern Chamorro borrowed from Spanish but he also assumes that most of the prepositions of the donor language have been integrated into Chamorro. According to the list of prepositions in Tetun Dili provided by

|| 13 Hull and Eccles (2005: 212 fn. 89) assume a Malay etymology for TD maski (Malay meski(pun) ‘although’) which in turn had been borrowed from Portuguese (por mais que) into Malay first. 14 English translation original. 15 Glosses and English translation original.


Manhitu (2016: 56), about a quarter of the inventory has a Portuguese history. It is therefore only to be expected that the replica languages make use of cognate prepositions such as entre ‘between’ (< SP/PT entre) in (7)–(8) (Pagel 2010: 121).

(7) CH [Underwood 1998: 21]
    Desde  ayu      na      tiempo  ha       tungo’  si
    since  DEM.DIS  LINKER  time    3SG.ERG  know    ART.PN
    Rosario  yan  si      Roberto  na      dånkolo  i
    Rosario  and  ART.PN  Roberto  LINKER  big      ART.CN
    difirensia  entre    i       kado’kado’  yan  i       magåhet.
    difference  between  ART.CN  pretence    and  ART.CN  honesty
    ‘Since that time Rosario and Roberto knew that the difference between pretence and honesty is huge.’

(8) TD [Manhitu 2016: 55]
    Relasaun  entre    kolega     ne’e  importante.
    relation  between  colleague  this  important
    ‘The relation between the colleagues is important.’

4.2.4 Obligation

Modality is also affected by Ibero-Romance influence. To express obligation, the replica languages employ markers which are materially copied from Spanish and Portuguese. In (9)–(10), it is shown that Chamorro makes use of debidi ‘have to, must’ from SP debe(r) de (Pagel 2010: 95) whereas Tetun Dili uses tenke ‘have to, must’ from PT tem que (← ter que) (Williams-van Klinken et al. 2002: 38). The borrowings reflect the 3rd person singular of the inflected modal verb.

(9) CH [Barcinas 1973: 21]
    Debidi  u    fama’atkos      todu  i       dos.
    OBLIG   FUT  make_like:arch  all   ART.CN  two
    ‘You have to arrange both of them in the shape of an arch.’

(10) TD [Manhitu 2016: 65]16
    Alberto  tenke  ramata  estuda  iha  tinan  ida-ne’e      nia  laran.
    Alberto  OBLIG  finish  study   in   year   one-DEM.PROX  3SG  in
    ‘Albert has to finish his studies in this year.’

|| 16 English translation original.


4.2.5 Negation

Negation is also affected by language contact although Matras (2009: 208) states that “[n]ot many examples of direct borrowing of word-forms can be found”. The replica languages have borrowed cognate (temporal) negations from the Ibero-Romance donor languages, namely SP = PT nunca ‘never’ (Williams-van Klinken et al. 2002: 35; Pagel 2010: 124), as shown in (11)–(12).17

(11) CH [Topping and Dungca 1973: 269]18
    Nunka  si      Rosa  ni   fatta   gi  eskuela.
    never  ART.PN  Rosa  NEG  absent  in  school
    ‘Rosa is never absent from school.’

(12) TD [LPP TD, 35]
    Oinsá  nia  bele     koñese  ha’u
    how    3SG  be_able  know    1SG
    se  nia  nunka  haree  ha’u  antes?
    if  3SG  never  see    1SG   before
    ‘How can he know me when he never saw me before?’

4.2.6 Comparatives

To close this incomplete illustration of wide-spread contact-induced phenomena, we present examples of partially Ibero-Romancized comparative constructions in the replica languages in (13)–(14).

(13) CH [Taimanglo et al. 1999: 26]
    Man-maolek-ña  ham           ki    hamyo.
    PL-good-COMP   1PL.EXCL.ABS  than  2PL.EMPH
    ‘We are better than you.’

(14) TD [LPP TD, 16]
    Ninia    planeta  ladún  boot  liu   duke  uma    ida!
    POR.3SG  planet   NEG    big   pass  than  house  one
    ‘His planet was not bigger than a house.’

|| 17 According to Matras’ (2009: 198) classification, nunca forms part of the class of indefinites which seem to be likelier candidates for borrowing than proper negations are. 18 English translation original.


These comparative constructions are conflations of Austronesian and Ibero-Romance components which we calibrate in the footnotes according to the typology of Event Schemas as proposed by Heine (1997). Chamorro and Tetun Dili have borrowed the relator (Stolz and Stolz 2001: 36) from Spanish and Portuguese, respectively. SP que19 > CH ki combines with the bound comparative morpheme -ña (= the degree marker) which is attached to the quality expression (in this case: maolek ‘good’) (Pagel 2010: 82–87). In Tetun Dili, the complex Portuguese relator do que20 > TD duké is integrated into a construction with the verb liu ‘(sur)pass, exceed’.21 This verb functions as degree marker (Williams-van Klinken et al. 2002: 39). In contrast to most cases of Hispanized comparative constructions in the Americas and Austronesia (Stolz and Stolz 2001; Matras 2009: 190–191), Chamorro and Tetun Dili normally avoid the Ibero-Romance degree marker más/mais ‘more’ although its use is not entirely unknown at least in Chamorro (Pagel 2010: 83–84). For Tetun Dili, Hull and Eccles (2005: 153–155) state that comparative constructions which are based on liu alone are fully grammatical but the use of duke is considered “bastante vulgar”.

Based on the presented data we can conclude that Chamorro and Tetun Dili are well-behaved in the sense that they meet many of the expectations linguists have of contact languages in general. Massive borrowing and the creation of replica languages conform to the general patterns of what can happen in language contact situations. Since Chamorro and Tetun Dili are in line with the common picture of contact languages, it is only logical that the number of parallels between them increases with each widely attested contact-induced phenomenon. To counter the impression that the parallels are exclusively the effect of universal tendencies in language contact, we reserve Sections 5–6 for two as yet understudied issues.

5 An exploration of fossilized plurals

In this section, we focus on the borrowing of Ibero-Romance nouns in their plural forms and its repercussions in the replica languages. In Chamorro and Tetun Dili, there are many borrowed nouns which host the Ibero-Romance plural suffix -s already in their citation form. The evidence is so numerous that it is

|| 19 The Spanish comparative is a representative of the Similarity Schema (Heine 1997: 118–119).
20 The ablative preposition de ‘from’ (+ ART.M o → do) characterizes the Portuguese construction as an instance of the Source Schema (Heine 1997: 115–116).
21 This is an example of Heine’s (1997: 112–114) Action Schema.


legitimate to ask whether the replica languages’ number system is affected. What we say in the subsequent paragraphs is purely descriptive. Matras (2009: 212–213) discusses instances of the employment of donor language number inflections on borrowed nouns in replica languages to show that the retention of foreign plural inflections is relatively common in language contact situations at least temporarily. Holm (2008: 300–305) gives an account of the survival of plural markers on nouns in several Portuguese-based creoles where the markers keep their number-marking function albeit only to a limited extent.

5.1 Plural marking under quantification

Whether the Spanish plural marker is used productively in Chamorro is debatable. Rodríguez-Ponga (1995: 79–81) claims that the pluralization of borrowed nouns according to the Spanish pattern is possible (but far from being compulsory) with a small number of Hispanisms in Chamorro, notably with measurements of time and ethnonyms. Since the expressions of measurement are likely to co-occur with (cardinal) numerals, all of which have been borrowed from Spanish, it is doubtful whether the Spanish plural is productive in Chamorro. A case in point is CH añu ‘year’ (< SP año) → CH dos/tres/kuatro/sinko años ‘two/three/four/five years’ with dos/tres/kuatro/sinko < SP dos/tres/cuatro/cinco, etc. Pluralization of measurement terms is also possible in constructions which involve the linker na as in (15).

(15) CH [Onedera 1994: 47]
Kada  såkkan  gi  kuatro  na      åños     na
each  year    in  four    LINKER  year:PL  LINKER
tiempo  na      ha       cho’gue  este
time    LINKER  3SG.ERG  do       DEM.PROX
‘He did this for four years consecutively […].’

If the noun to be quantified is not a measurement expression (irrespective of its etymology), the noun remains in the singular when it combines with a numeral (n ≥ 2), and the linker particle na is used, i.e. a typically Austronesian construction pattern. The Hispanism osu ‘bear’ (< SP oso ‘bear’) is a case in point as shown in (16).

(16) CH [Perez 1975: 38]
I       tres   na      osu   mañasaga
ART.CN  three  LINKER  bear  PL:RED~live
gi  un         dikike’  na      guma’.
in  ART.INDEF  small    LINKER  house
‘The three bears were living in a small house.’

Rodríguez-Ponga claims further that the employment of the suffix -s from Spanish is limited and subject to variation. Nevertheless, he insists that the Spanish plural marker forms part of the grammatical system of Chamorro. Furthermore, Rodríguez-Ponga (1995: 80) assumes that the Spanish suffix has become productive recently via the influence of English. Pagel (2010: 72–73) rebuts this hypothesis because the outwardly similar English pluralizer -s occurs only on borrowings from English or in English-Chamorro codeswitching and thus cannot be classified as an inheritance from Spanish. For Tetun Dili, Hajek (2006a: 171) notes that

    Portuguese loans may also be marked for number, although such marking on nouns and adjectives (involving final -s) is regular only in borrowed phrases […]. Otherwise it is optional, and not particularly common in spoken language. It does, however, appear to be becoming more frequent in written registers.

Williams-van Klinken et al. (2002: 20) make similar observations when they say that borrowed Portuguese nouns may be inflected for plural according to the patterns of the donor language, specifically in formal contexts. They add that the co-occurrence of Portuguese and Austronesian plural markers on one and the same noun is attested though disapproved of by many speakers of Tetun Dili. Hajek (2006a: 171–172) mentions that a) there is free variation of single (i.e. exclusively Austronesian) and double (i.e. combined Portuguese and Austronesian) plural marking in NPs involving borrowed Portuguese nouns like (single) pasiente lepra sira ~ (double) pasientes lepra sira ‘the leprosy patients’, and b) contrary to the Austronesian rules, Portuguese borrowings may also be marked for plurality when combined with a numeral provided the normally intervening classifier is absent: pasiente (na’in_CLASSIFIER) rua_NUMERAL ~ pasientes rua_NUMERAL ‘two patients’. Since (b) is particularly at odds with the Austronesian grammar of the precontact epoch, we consider it a major contact-induced adjustment of the replica language’s morpho-syntax.


5.2 Number marking and definiteness

The Austronesian number systems into which the Ibero-Romance plural markers have to be integrated are outlined in Table 4. The nouns CH guma’ ‘house’ and TD uma ‘house’ are cognates.[22] The only borrowed element in the Chamorro column is the indefinite-specific article un (< SP un ‘a(n)’) (Stolz 2010).

Table 4: Austronesian system of definiteness and number marking in the replica languages.

Definiteness           Number        Chamorro      Tetun Dili          Meaning
indefinite-unspecific  transnumeral  guma’         uma                 ‘a house ~ houses’
indefinite-specific    singular      un guma’      uma ida             ‘a house’
definite               singular      i gima’       uma (ida-ne’e(ba))  ‘the house’
definite               plural        i gima’ siha  uma sira            ‘the houses’

[22] The change from back vowel /u/ (guma’) to front vowel /i/ (gima’) is a regular instance of vowel fronting in Chamorro – in this case triggered by the definite common article i (Klein 2000: 84).

To some extent, overt number marking is optional in both Chamorro and Tetun Dili. It is frequent with definite NPs, especially (but not exclusively) if the noun bears the feature [+human]/[+animate]. It is significant that in Tetun Dili “[u]nlike sira, the Portuguese plural does not indicate definiteness” (Williams-van Klinken et al. 2002: 20), i.e. the introduction of the Portuguese patterns of number marking has affected the architecture of the above system in the sense that plural and definiteness are no longer as strongly associated with each other as before. The presence of un in Chamorro restricts the domain of the bare noun and makes it possible to differentiate specific from unspecific indefinite NPs.

Table 4 does not capture the Chamorro facts entirely (Topping and Dungca 1973: 234–236). Besides the pluralization by the post-nominal free morpheme siha, there is also the possibility of using the (basically verbal) prefix man-, especially when the noun functions as predicate as in Amerikanu ‘American_MALE’ → NP i Amerikanu siha ‘the Americans’ / predicate noun manAmerikanu hamyo ‘you_PL are Americans’. Independently of the predicative function, man-plurals are also attested in an apparently limited (but still to be determined) number of cases, all of which represent human nouns, as SG ga’chong ‘friend’ → PL mangga’chong ‘friends’ with man- → mang- / _C_VELAR. Three human nouns display so-called irregular plurals, namely SG lahi ‘man’ → PL lalahi ‘men’; SG palao’an ‘woman’ → PL famalao’an ‘women’; SG patgon ‘child’ → PL famagu’on ‘children’. The rules according to which the different strategies of pluralization function and the conditions under which pluralization is favored or not still need to be determined.

5.3 Retention of the plural marker

In this section, we scrutinize the properties of those Ibero-Romance borrowings which are identical in form to plurals in the donor language. The borrowing of plural word-forms with a singular or transnumeral meaning is a relatively widespread process; it has been observed not only in creoles (Holm 1988: 97–98) but also for numerous Welsh nouns borrowed from English (Stolz 2001: 68–69). Both Chamorro and Tetun Dili provide ample evidence of this phenomenon. In the sub-section on fossilized plural forms, Rodríguez-Ponga (1995: 81) states that

    [h]ay muchos ejemplos de palabras españolas que pasan al Chamorro con la -s final del plural y sin embargo no tienen significado plural. Suelen ser, por lo general, objetos que no aparecen aislados, sino formando conjuntos […]. En todos casos, son singular en chamorro […].[23]

[23] Our translation: ‘There are many examples of Spanish words which are transferred to Chamorro with the final -s of the plural without, however, displaying plural meaning. In general, they are objects which do not occur as individuals but form groups. […] In all cases, they are singulars in Chamorro.’

For this study, we take stock of fossilized plurals in the replica languages provisionally. On the basis of the above reference dictionaries, we account only for borrowings which host the erstwhile Ibero-Romance plural suffix -s and are classified as “nouns” in the replica languages. In the case of Chamorro, this means that only certain members of Class II (Topping and Dungca 1973: 78–80) are admitted to the sample. There are dozens of Austronesian nouns which also host a final /s/ such as CH ma’gas ‘boss, master’ or TD kabaas ‘shoulder’. We assume that their presence in the lexicon of the replica languages has a supporting effect on the phenomenon we are about to describe in this section. Yet, we exclude these nouns from our inventory because they fall outside the category of contact-induced cases. In Appendix I we provide a list of 114 cases of nouns with fossilized Spanish plurals in Chamorro. With only twenty cases, Appendix II shows that the turnout is much smaller in the case of Tetun Dili. The differences in size between Appendixes I and II notwithstanding, one recognizes that there are many parallels. Certain concepts are represented in both inventories although the nouns themselves do not always correspond to each other etymologically. The following eight concepts are attested not only in Appendix I but also in Appendix II (Roman numerals refer to the appropriate Appendix):

– EAR-RING: I #2 = II #4
– SLIPPER: I #18 = II #20
– GLOVE: I #29 = II #10
– BUCKLE: I #31 = II #5
– STOCKING: I #46 = II #11
– NAPPY: I #54 = II #6
– PEARL: I #59 = II #9
– SANDAL: I #73 = II #16

Most of the fossilized plurals corroborate Rodríguez-Ponga’s (1995: 81) hypothesis according to which these borrowings refer to entities usually occurring in pairs or larger numbers. To solve the problem of the marked category (= plural) displaying a token frequency that exceeds that of the unmarked category (= singular), the formal plural has to be analyzed as representing the singular. This is done throughout the Appendixes I–II. In the case of Chamorro, recent borrowings from English seem to follow the same path as the following lexicon entries suggest:

– krakas ‘cracker, biscuit’ < English crackers (Topping et al. 1975: 114)
– maches ‘match ~ matches’ < English matches (Topping et al. 1975: 128)
– yardas ‘(measurement) yard ~ yards’ < English yards (Topping et al. 1975: 214)

Besides these cases we also find evidence for different solutions. The English plurale tantum premises is given a Spanish look in Chamorro where the borrowed item is attested as premisias ‘premises’ (Topping et al. 1975: 172). The borrowed terms are transnumeral, i.e. they are ambiguous as to number. There is a distinct form neither for the singular nor for the plural. A cracker or a match only seldom comes as a singleton, in a manner of speaking, since they are usually part of a number of items kept in the same container.
Cases of this kind have to be distinguished from a different type of borrowed nouns, viz. those which host the Ibero-Romance plural suffix and invite an interpretation as plural but lack a corresponding singular form in the replica language. To get a firm grip on these nouns, we take a detour first.


First, we give evidence that for those items which are presented in Appendix I–II the Ibero-Romance plural suffix is no longer functional in the replica languages. Consider the Chamorro noun duhendes ‘goblin’ in (17)–(18).

(17) CH [Underwood 1974: 10]
Ayu      nai   hu       tungo’  hayi  ginen
DEM.DIS  when  1SG.ERG  know    who   from
tumattitiyi  yo‘.     Un         duhendes!
RED~follow   1SG.ABS  ART.INDEF  goblin
Estaba    i       duhendes  gi  tatten       un         tronko.
EXI.PERF  ART.CN  goblin    in  back:LINKER  ART.INDEF  tree
‘That was when I came to know who was following me. A goblin! The goblin was behind a tree.’

(18) CH [Onedera 1994: 25]
Ma    hongge   ni’  Mañamoru     na
PASS  believe  by   PL:Chamorro  LINKER
espiriton      famagu’on  i       duhendes  siha.
spirit:LINKER  PL.child   ART.CN  goblin    PL
‘It is believed by Chamorros that the goblins are spirits of children.’

The Chamorro noun reflects the plural duendes of SP duende ‘dwarf’. The Spanish singular form is not attested in the replica language (Rodríguez-Ponga 1995: 254). Although etymologically plural, CH duhendes combines with the indefinite article un to yield an indefinite-specific NP in the singular un duhendes ‘a goblin’ in (17). In the same example, the noun is taken up again preceded by the definite common article in the DP i duhendes ‘the goblin’ – again with singular reading. The plural is made explicit, however, in (18) by way of combining the noun with the post-nominal pluralizer siha in the DP i duhendes siha ‘the goblins’. Note that (19) is not a counter-example to this claim since plural marking is generally optional in Chamorro.

(19) CH [Onedera 1994: 25]
mang-gaige  i       duhendes  gi  lihende   ni‘  metgot  gi
PL-EXI      ART.CN  goblin    in  folklore  REL  strong  in
hale‘  i       kottura.
root   ART.CN  culture
‘[…] the goblins are many in the folklore which is strongly rooted in the culture.’

The DP i duhendes has plural reading in this context because the sentence-initial existential hosts the prefix man- agreeing (semantically) with a plural subject. This means that the Spanish plural marker -s has lost its morpheme status in Chamorro where it is but a segment in the phonological chain of the lexeme.

The situation is similar in Tetun Dili. The noun oras ‘time, hour’ reflects the plural horas ‘hours’ of the Portuguese noun hora ‘hour’. The singular form of the Portuguese noun has not made it into the lexicon of Tetun Dili. Examples (20)–(22) illustrate that oras has no inherent plural meaning in the replica language.

(20) TD [Manhitu 2016: 12][24]
Minutu  neenulu  halo  oras  ida,  loos   ka  lae?
minute  sixty    make  hour  one   right  or  no
‘Sixty minutes make an hour, don’t they?’

(21) TD [Manhitu 2016: 17]
I    ó    sei  estuda  oras  hira      iha-ne’ebá?
and  2SG  FUT  study   hour  how_many  there
Ha’u  sempre  estuda  oras  haat.
1SG   always  study   hour  four
‘For how many hours will you study there? I always study for four hours.’

(22) TD [LPP TD, 68]
oras  ida  la   hanesan   oras  seluk  sira
hour  one  NEG  be_equal  hour  other  PL
‘[…] one hour is not like the other hours.’

In (20), the noun oras combines with the numeral ida ‘one’ (identical with the indefinite article). This combination would be barred if oras bore the feature [+plural]. In (21) the same noun occurs in contexts which give rise to a plural reading – but only because oras is followed by quantifiers (interrogative and numeral), meaning: the construction [oras QUANTIFIER_n>1]_QP is invested with plural meaning, not the noun oras taken in isolation. Example (22) ultimately demonstrates that oras itself may be overtly pluralized via the post-nominal sira. Thus, the above examples are indicative of the de-morphologization of the Ibero-Romance plural suffix -s in borrowed nouns like those in Appendix I–II. Their citation forms are primarily interpreted as transnumeral. If plural needs to be made explicit, overt pluralization by means of siha / sira is required. However, this is not the end of the story.
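The division of labor described in examples (17)–(22) can be restated as a small decision procedure. The following Python lines are only an illustrative sketch: the class NounPhrase, the function number_reading and the three word sets are hypothetical names introduced here, and the grouping of function words is our simplification of the examples, not a full account of either grammar. The point shown is that the number reading comes from the construction, while the borrowed noun in -s contributes no number value of its own.

```python
from dataclasses import dataclass
from typing import Optional

# Function words taken from examples (17)-(22); grouping them into sets
# is an illustrative assumption, not part of the authors' analysis.
PLURALIZERS = frozenset({"siha", "sira"})   # CH siha, TD sira
SINGULAR_MARKERS = frozenset({"un"})        # CH indefinite article un
PLURAL_QUANTIFIERS = frozenset({"hira"})    # TD hira 'how many'


@dataclass(frozen=True)
class NounPhrase:
    noun: str                                # citation form, e.g. "oras"
    numeral: Optional[int] = None            # value of a numeral in the NP
    function_words: frozenset = frozenset()  # other markers in the NP


def number_reading(np: NounPhrase) -> str:
    """Return the number reading that the construction supplies."""
    if np.function_words & PLURALIZERS:
        return "plural"
    if np.function_words & PLURAL_QUANTIFIERS:
        return "plural"
    if np.numeral is not None:
        return "plural" if np.numeral > 1 else "singular"
    if np.function_words & SINGULAR_MARKERS:
        return "singular"
    # The final -s of the noun is never consulted: the bare citation
    # form stays transnumeral and is resolved by the wider context.
    return "transnumeral"


print(number_reading(NounPhrase("oras", numeral=1)))   # oras ida
print(number_reading(NounPhrase("oras", numeral=4)))   # oras haat
print(number_reading(NounPhrase("duhendes", function_words=frozenset({"siha"}))))
print(number_reading(NounPhrase("duhendes")))
```

Under these assumptions the four calls print singular, plural, plural and transnumeral. The last value corresponds to cases such as i duhendes in (17) and (19), where only the wider context (for example the prefix man- on the existential) decides between a singular and a plural reading.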

[24] English translation original.


5.4 (Pseudo-)collectives

Besides the nouns in Appendix I–II, there are further borrowings in the replica languages which reflect formal plurals. In contrast to the above instances of fossilizations, however, the additional cases do not necessarily invite a transnumeral interpretation. To make our point clear, we look at examples from Chamorro. In Topping et al. (1975), we find separate lexicon entries for the following formal plurals and their corresponding singulars, if any:

– PL ángheles ‘angels’ / SG anghet ‘angel’ (Topping et al. 1975: 15) < SP PL ángeles/SG ángel (cf. #7 in Appendix III). In (23), the plural Añgheles refers to the angels heading for Bethlehem to worship Jesus.

(23) CH [De Vera 1941: 14]
i       Añgheles  yan  i       pastot    siha  manmato
ART.CN  angel:PL  and  ART.CN  shepherd  PL    PL:arrive
giya    Belen
in:PLN  Bethlehem
‘…the angels and the shepherds arrived in Bethlehem…’

When reference is made to the replicas of angels such as those of the Christmas manger, pluralization is achieved via the pluralizer siha as in (24).

(24) CH [Taimanglo et al. 1999: 4]
Manmåtto  i       pastores  yan  i       anghet  siha.
PL:come   ART.CN  shepherd  and  ART.CN  angel   PL
‘The shepherds and the angels have come!’

The pluralizer siha has scope over the preceding two coordinated NPs. The word-form pastores (< SP pastores ‘shepherds’) is not registered in the dictionary where, in accordance with (23), only pastót ‘herder, shepherd’ (< SP pastor ‘shepherd’) is mentioned (Topping et al. 1975: 165). We assume that CH pastores is yet another example of a fossilized plural with collective function in the context of religion.

– apóstoles ‘the apostles, the disciples, follower, scholar’ < SP PL apóstoles ‘apostles’/SG apóstol; no singular form is given (Topping et al. 1975: 17). The Chamorro noun is ambiguous in terms of grammatical number.
– hóbenes ‘youngsters’ < SP PL jóvenes ‘youngsters’/SG jóven; the dictionary has hoben ‘young, immature’ but apparently only with adjectival meanings whereas the same source refers to the alternative plural manhoben ‘youngsters’ with the Austronesian plural prefix man- (Topping et al. 1975: 134). As singulars, only the gendered nouns hobensitu ‘male teen-ager’ and hobensita ‘female teen-ager’ are given (Topping et al. 1975: 91).
– PL konfesores ‘confessors’/SG konfesót ‘confessor’ (Topping et al. 1975: 112) < SP PL confesores/SG confesor;
– PL peskadores ‘fishermen, hunters’/SG peskadót ‘fisherman, hunter’ (Topping et al. 1975: 168) < SP PL pescadores ‘fishermen’/SG pescador;
– PL potgadas ‘inches’/SG potgada ‘inch’ (Topping et al. 1975: 171) < SP PL pulgadas ‘inches’/SG pulgada;
– PL reyes ‘kings’/SG rai ‘king’ (Topping et al. 1975: 176–177) < SP PL reyes ‘kings’/SG rey;
– PL santos ‘saints’/SG santo ‘(male) saint’ (Topping et al. 1975: 182) < SP PL santos ‘saints’/SG santo

Given that the above singular-plural pairs are indeed functional in terms of number distinctions, it remains to be explained why exactly these eight nouns distinguish singular and plural according to the requirements of Spanish grammar. One of the cases belongs to the class of measurements (= potgada(s) ‘inch(es)’) mentioned by Rodríguez-Ponga (1995: 81) as one of the niches in which Spanish plurals may survive. As to hóbenes ‘youngsters’, the presence of an alternative Austronesian plural is indicative of the unstable position of the Spanish word-form in the system of the replica language. The majority of the above cases, however, are more or less directly connected to the Christian religion. This is clear in the cases of ángheles, apóstoles, konfesores, and santos. However, a religious context can also be assumed for peskadores ‘fishermen, hunters’ and reyes ‘kings’. For the latter term, Topping et al. (1975: 177) explicitly narrow the reference down to “kings – as in Bible, three wise men”. We assume that the Spanish plural has been lexicalized in Chamorro to refer to these protagonists in the story of Christ’s nativity. The word-form reyes is typically used in collocations with the cardinal numeral tres as in (25).

(25) CH [Underwood 1998: 37]
Este      i       Tres Reyes      man-rai  gi  chago’
DEM.PROX  ART.CN  three_wise_men  PL-king  in  far
siha  na      lugåt.
PL    LINKER  place
‘These Three Wise Men were kings in distant countries.’

The Biblical reference is less evident with peskadores. However, it is not too far-fetched to associate this plural with the original profession of several of the apostles. In mundane contexts, the plural of this noun is usually made explicit by way of using the post-nominal pluralizer siha as in (26).


(26) CH [Taimanglo et al. 1999: 25]
Desde ki  ayu      na      tiempu,  i       peskadot   siha
since     DEM.DIS  LINKER  time     ART.CN  fisherman  PL
ma       a’ata        este      na      åcho
3PL.ERG  RED~look_at  DEM.PROX  LINKER  stone
‘Since that time the fishermen watched out for this stone […].’

There is thus a slight tendency for these so-called plurals to refer to a collectivity of people in the context of Christianity. They might be interpreted as evidence of the existence of a hagiolectal register of Chamorro. Williams-van Klinken (2002) claims that a liturgical register also exists for Tetun Dili without, however, basing herself on pluralization. A case which supports this hypothesis is CH kantores ‘choir, chorus’ from SP PL cantores ‘singers’/SG cantor. This borrowing is etymologically related to two other borrowings, namely kantót ‘male singer’ (< SP cantor) and kantora ‘female singer’ (< SP cantora). In the donor language, there is a straightforward number distinction SG ≠ PL. In the replica language, however, the erstwhile Spanish plural has been lexicalized as a collective noun referring to a (probably Church) choir and not to any number of individual singers.

On account of these and similar cases, it makes sense to test the hypothesis that the borrowing of Ibero-Romance nouns in their plural form has created favorable conditions for the (still unaccomplished and largely hypothetical) genesis of a class of collective nouns. The members of this class of collectives-to-be are listed in Appendixes III–IV. The items we consider to be candidates for this class in the replica languages have in common that

– they host the Ibero-Romance plural suffix,
– they refer to either an assemblage of entities or an abstract indivisible concept,
– they cannot refer to an individual member of this assemblage, and
– there is no distinct singular form of the borrowed noun in the replica language (always according to the two dictionaries we refer to in this study).

Accordingly, the would-be collectives have the following relation to their equivalents in the donor languages:

– There are direct borrowings of pluralia tantum which exist also in the donor languages. A case in point is the following pair of identical lexical borrowings: CH aras ~ TD arras ‘bride’s money, dowry’ from SP arras ~ PT arras ‘bride’s money, dowry’. The Ibero-Romance nouns are formal plurals of *arra – a singular which, however, is not attested.
– The borrowings reflect semantically specified formal plurals of co-existing singulars in the donor language. CH potbos ‘(talcum) powder’ (< SP PL polvos ‘powder’/SG polvo ‘dust’) is a case in point. Similarly, TD aspas ‘inverted commas’ goes back to PT aspas ‘inverted commas’ which is formally the plural of aspa ‘crossbeam’.
– The borrowings reflect regular, semantically transparent plurals of Ibero-Romance nouns whose singular has not been borrowed as e.g. CH botas ‘boots’ (< SP PL botas ‘boots’/SG bota) and TD rolamentus ‘ball-bearings’ (< PT PL rolamentos ‘ball-bearings’/SG rolamento).
– The replica languages borrow both the singular and the plural of a given noun but with different semantics as e.g. CH estasión ‘station (house), stopoff’/estasiones ‘station(s) of the cross’ from SP SG estación ‘station’/PL estaciones and TD bola ‘ball’/bolas ‘cartridge belt’ from PT SG bola ‘ball’/PL bolas.
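The four membership criteria listed above for candidate collectives amount to a conjunction of lexical features, which can be sketched as a simple filter. The Python lines below are a minimal illustration under stated assumptions: the record type LexEntry, its field names and the function is_collective_candidate are hypothetical, and the feature values for the two sample entries are our own reading of the discussion of aras and duhendes, not data taken from the appendixes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexEntry:
    """Hypothetical dictionary record; all field names are illustrative."""
    form: str
    hosts_ibero_romance_plural_s: bool   # criterion 1
    denotes_assemblage_or_abstract: bool # criterion 2
    can_denote_single_member: bool       # criterion 3 (must be False)
    singular_listed_in_dictionary: bool  # criterion 4 (must be False)


def is_collective_candidate(entry: LexEntry) -> bool:
    """All four criteria must hold at the same time."""
    return (
        entry.hosts_ibero_romance_plural_s
        and entry.denotes_assemblage_or_abstract
        and not entry.can_denote_single_member
        and not entry.singular_listed_in_dictionary
    )


# Illustrative codings based on the prose: CH aras 'dowry' continues a
# plurale tantum, whereas CH duhendes can denote one goblin (un duhendes).
ARAS = LexEntry("aras", True, True, False, False)
DUHENDES = LexEntry("duhendes", True, False, True, False)

candidates = [e.form for e in (ARAS, DUHENDES) if is_collective_candidate(e)]
print(candidates)
```

With these assumed feature values only aras passes the filter. A noun such as duhendes fails on the third criterion and therefore stays in the group of transnumeral fossilized plurals discussed in Section 5.3.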

With 36 and 45 nouns in Appendixes III and IV, respectively, the hypothetical class of collectives does not seem to be exceedingly large in either of the replica languages. This picture changes if we take into consideration that there are additional lexico-semantic fields which might feed further candidates into this class but are not accounted for in the Appendixes III–IV. In the case of Chamorro, there are numerous names of plant types which host a final -s such as abuchuelas ‘phaseolus vulgaris’ (< SP PL habichuelas ‘(white) beans’/SG habichuela) which are listed in the reference dictionary exclusively in this form without any indication of the parallel existence of a singular (Topping et al. 1975: 4). The same holds for the names of certain types of fish such as palos ‘strongylura gigantea’ (< SP PL palos ‘sticks’/SG palo) (Topping et al. 1975: 162). Strikingly, there are also several instances of types of fish and fowl which bear names with a clear Hispanic flair without, however, having an identifiable Spanish etymon as e.g. aletses ‘round herrings’ and balaskes ‘type of small chicken’ (Topping et al. 1975: 11, 26). Kántanes ‘type of bantam-like chicken’ (Topping et al. 1975: 103) is interesting because it goes back to SP (pollo) cantonés ‘(chicken) from Canton’ (Rodríguez-Ponga 1995: 362–363). The stress site has changed from ultimate to antepenultimate so that the word-form now resembles other items from the same semantic field which reflect a genuine Spanish plural. The evidence from Chamorro is sufficiently numerous to call for closer inspection in a dedicated study.

In the case of Tetun Dili, there are many uncertainties as to the origin of many nouns of the same semantic fields. Moreover, /s/ is generally admitted as stem-final consonant in Tetun Dili. Nevertheless, it is striking that there are numerous names of biological species which host a final -s such as baboras ‘species of crabs’, bikas ‘species of fruit tree’, kahoris ‘species of bush with medical qualities’, kakas ‘species of water snake’, lobas ‘Timor chestnut tree’, lonus ‘species of seaside plant’, olas ‘species of fruit tree’, tulas ‘species of tree’, xokus ‘cuttlefish’, etc. Independent of the number of Lusitanisms in this segment of the lexicon, it is clear that reference to animal and plant species is often achieved via nouns which are equipped with a final s. This is another property the two replica languages share.

Fossilized plurals are also attested for both replica languages with the four suits of card games:

– CH bastos ‘clubs’ (< SP PL bastos/SG basto ‘saddle, clubs’) = TD paus ‘clubs’ (< PT PL paus ‘clubs’/SG pau ‘wood, stick’),
– CH = TD espadas ‘spades’ (< SP/PT PL espadas ‘spades’/SG espada ‘sword’),
– CH = TD kopas ‘hearts’ (< SP/PT PL copas ‘hearts’/SG copa ‘beaker; sideboard’),
– CH oros ‘diamonds’ (< SP PL oros ‘diamonds’/SG oro ‘gold’) = TD ourus ‘diamonds’ (< PT PL ouros ‘diamonds’/SG ouro ‘gold’).

We admit that the evidence is not conclusive as to the class-character of the above supposed collectives. Moreover, the Chamorro data are more numerous than those from Tetun Dili. This quantitative discrepancy is probably an effect of our choice to restrict the data-base to the findings in the dictionaries. Even if the hypothetical existence of a class of collectives can be disproved empirically, the replica languages have nonetheless become indirectly more similar by way of borrowing many words in their Ibero-Romance plural forms and sometimes also with their plural semantics so that there are now many entries in the replica languages’ lexicons with lexemes displaying a basic plural meaning. Whether their number suffices to claim that the lexico-typological properties of Chamorro and Tetun Dili have been seriously affected is a question that can only be discussed once the phenomenon of fossilized plurals has been studied in depth.

6 Gender agreement

The massive lexical borrowing from Ibero-Romance means that many nouns have entered the replica languages as hosts of derivational morphology. A case which has gained some notoriety in the domain of contact morphology (Gardani et al. 2015: 9) is the Portuguese agent noun marker -dor. Matras (2009: 210) refers to this case when he argues that “[q]uite prominent on the list of attested borrowing of nominal derivation markers are markers of agentivity”. It is therefore only to be expected that Chamorro as well provides evidence of the borrowing of the cognate Spanish agent noun marker -dor. What is remarkable about these cases is that the agent noun markers are gendered and thus come in pairs of masculine-feminine forms – in this case M -dor/F -dora as in SP M hablador/F habladora = PT M dizedor/F dizedora ‘vain talker, chatterbox’. These derived nouns can also be used as adjectives. There are further gendered derivational morphemes in the donor languages which have been integrated into the lexicon of the replica languages. The linguistically most exciting aspect of these borrowings is that the replica languages have acquired gendered nouns together with adjectives which have to agree with their head nouns in the donor languages. Since neither Chamorro nor Tetun Dili had grammatical gender before the contact with Ibero-Romance, it is interesting to see how the replica languages react to the pressure exerted by their partners in contact in this domain. Chamorro and Tetun Dili share a number of cognate pairs of gendered borrowings. We present a selection of twelve examples in (C) but emphasize that there are many more.

(C) Cognate gendered borrowings in Chamorro and Tetun Dili
#1 CH M abogado/F abogada = TD M advogadu/F advogada ‘lawyer’,
#2 CH M afrikano/F afrikana = TD M afrikanu/F afrikana ‘African’,
#3 CH M amigo/F amiga = TD M amigu/F amiga ‘friend’,
#4 CH M bankeru/F bankera = TD M bankeiru/F bankeira ‘banker’,
#5 CH M biatu/F biata = TD M beatu/F beata ‘blessed’,
#6 CH M brutu/F bruta = TD M brutu/F bruta ‘rude’,
#7 CH M dibotu/F dibota = TD M devotu/F devota ‘devout’,
#8 CH M kuñadu/F kuñada = TD M kuñadu/F kuñada ‘brother/sister-in-law’,
#9 CH M oktabo/F oktaba = TD M oitavu/F oitava ‘eighth’,
#10 CH M primu/F prima = TD M primu/F prima ‘cousin’,
#11 CH M suspechosu/F suspechosa = TD M suspeitozu/F suspeitoza ‘distrustful, suspicious’,
#12 CH M tihu/F tiha = TD M tiu(n)/F tia(n) ‘uncle/aunt’

That there is a considerable overlap of the replica languages in this domain is hardly surprising. Stolz (2012a: 136–140) has identified 300 pairs of gendered nouns and adjectives in Chamorro. The turnout for Tetun Dili is 1,684 pairs of gendered nouns and adjectives (cf. Appendix V).[25] The mere presence of these gendered pairs in the replica languages is a further factor in their parallel borrowing behavior. Moreover, borrowings of the above kind provide favorable conditions for the genesis of gender agreement since gender-sensitive adjectives and nouns borrowed from Ibero-Romance can be involved in the same construction. It does not seem to make much sense to borrow a set of adjectives along with their masculine-feminine distinction if this distinction is not activated anywhere in the domain of morpho-syntax.

[25] Appendix V is based on the reference dictionary (Hull 1999). Stolz’s (2012a) study makes use of several sources in addition to Topping et al. (1975). Thus, the results are not absolutely compatible. The strikingly higher numbers of gendered word-pairs in Tetun Dili can be explained with the recent increase of Portuguese influence in the domain of neology.

Matras (2009: 189–190) discusses some instances of gender systems in language contact. Gillon and Rosen (2018: 75–106) study gender in the French-based NP of Michif. The paper by Stolz (2015: 286–295) on the fate of agreement in language contact contains a section dedicated to the genesis of agreement patterns. Choupina (2011) is a comparison of grammatical gender in Portuguese and Tetun Dili. Hajek and Williams-van Klinken (2019) focus on contact-induced gender in Tetun Dili with some comparative data from Tagalog and Chamorro. Stolz (2012a: 97–104) adduces evidence for gender-copy from a variety of cases of Romancization world-wide which attest to the transfer of the masculine-feminine distinction into a number of replica languages (Tagalog, Quechua, etc.) which prior to contact lacked grammatical gender.

The structural facts are, however, not as straightforward as one might wish them to be – and this judgment applies not only to Chamorro but also to Tetun Dili. Gender is there if agreement applies (Corbett 1991: 105). We have to differentiate semantic gender from formal gender. Semantic gender applies if word-forms vary in accordance with the natural sex of the referents (Corbett 1991: 9, 30–32). Formal gender is independent of natural sex because it is expressed also for non-sexed referents (Corbett 1991: 34, 51). Spanish and Portuguese display largely identical binary gender systems with a basic masculine-feminine distinction (with a sex-based core but additional formal criteria). Contact-induced gender in the replica languages differs insofar as Chamorro displays characteristics of semantic gender whereas Tetun Dili seems to go beyond purely sex-based gender assignment (Hajek and Williams-van Klinken 2019).

6.1 Gender in Chamorro

For Chamorro, the basic facts are summarized already by Rodríguez-Ponga (1995: 77–79). First, the formal distinction of gender is optional in the language.


Furthermore, it is restricted to constructions which involve a noun with the feature [+animate] (preferably [+human]). More importantly, the distinction has been extended to also cover cases in which Spanish has only one word-form used for referents of both sexes like SP hipócrita ‘hypocrite’ as opposed to CH hipókrito ‘male hypocrite’ ≠ hipókrita ‘female hypocrite’ (Rodríguez-Ponga 1995: 78). Pagel (2010: 75) is cautious not to exaggerate the importance of gender-copy in Chamorro because of its facultative nature and the small number of adjectives which show gender-sensitivity. Owing to these factors, gender agreement is only rarely found in Chamorro texts although perhaps not as infrequently as Pagel’s skepticism might suggest. Chung (2020: 115–116) argues that the borrowing of gendered nouns from Spanish does not automatically imply the existence of a grammatical gender system in Chamorro. We restrict the report on the gender-issue to the very basic facts. In (27)–(28) we provide evidence of the gender distinction by contrasting two sentences which host NPs with differently gendered forms of the same adjective.

(27) CH [Taimanglo et al. 1999: 20]
Yanggen binenu i hanom este na
when poison ART.CN water DEM.PROX LINKER
tinekcha’ ya ha puno’ yu’, na’danña’ ham
fruit and 3SG.ERG kill 1SG.ABS CAUS:unite 1PL.EXCL.ABS
gi naftan yan [i difunt-a hågu-hu].
in grave and [ART.CN deceased-F daughter-POR.1SG]
‘If this juice is poisonous and it kills me, unite me with [my deceased daughter] in the grave.’

(28) CH [Taimanglo et al. 1999: 170]
Låo yanggen guaha siha fina’achåki, siempre
but when EXI PL pretend:trouble always
si Tan Chong u finatoigue nu
ART.PN Tan Chong FUT pretend:visit by
[difunt-o Tun Pepi].
[deceased-M Tun Pepi]
‘But when they were in trouble, Tan Chong would always be visited by [the deceased Tun Pepi].’

The Spanish adjective/noun M difunto/F difunta ‘deceased’ is the source of the first constituent of the bracketed Chamorro appositions in (27)–(28). The feminine form difunta is employed in (27) because the possessed Austronesian noun hågu ‘daughter’ has a female referent. In contrast, in (28), the proper name Tun Pepi has a male referent and thus the preceding gender-sensitive element has to take the masculine form difunto. The most frequently encountered case is that of M bunitu/F bunita ‘pretty’ (< SP M bonito/F bonita) as in (29)–(30).

(29) CH [Perez 1975: 30]
Un dia i dikike’ yan kulot apu na
INDEF day ART.CN little and color ash LINKER
Cha’kan Lancho binisita ni’ primu-ña
rat farm visit by cousin:M-POR.3SG
[un bunit-u na Cha’kan Siuda].
[INDEF pretty-M LINKER rat city]
‘One day the small and ash-colored farm rat was visited by his cousin, [a pretty city rat].’

(30) CH [Taimanglo et al. 1999: 49]
Gi un familia guaha
in INDEF family EXI
[un bunita~t-a na sotterit-a]
[INDEF pretty~RED-F LINKER adolescent-F]
na’ån-ña Elena.
name-POR.3SG Elena
‘In a family, there was [a very pretty adolescent girl] by the name of Elena.’

In contrast to the examples (27)–(28), which can be considered to be appositions, the two NPs in (29)–(30) are structured entirely in line with the morphosyntactic requirements of Austronesian-style adjectival attribution. The adjective precedes the head noun to which it is connected via the intervening linker morpheme na. This means that the juxtaposition of adjective and noun as known from Spanish syntax does not apply – and nevertheless there is semantic gender agreement. It is interesting that semantic gender agreement is possible independently of the origin of the head noun provided it meets the criterion of being [+animate]. The adjective of course has to be a Hispanism. In Chamorro, gender-sensitive adjectives partake also in agreement patterns if they are used predicatively as in (31) or do not form part of the same NP as the noun they modify as in (32).

(31) CH [Taimanglo et al. 1999: 8]
Banidos-a mampos i nana-n hilitai ni i lahi-ña.
pompous-F very ART.CN mother-LINKER lizard with ART.CN man-POR.3SG
‘The lizard’s mother was very arrogant with her man.’

The adjective M banidosu/F banidosa ‘pompous, showy’ (< SP M vanidoso/F vanidosa) occupies the leftmost slot in this sentence, which is the position in which predicative nouns and adjectives are placed. The adjective takes the feminine form banidosa because it agrees with the subject noun nana ‘mother’ which refers to a female being.

(32) CH [Taimanglo et al. 1999: 21]
Åntes na tiempo giya Hagåtña, guaha
before LINKER time in:ART.PLN Hagåtña EXI
un sotterit-a ni’ gof bunit-a
INDEF adolescent-F REL very pretty-F
‘Long ago in Hagåtña, there was an adolescent girl who was very pretty […].’

In (32), the adjective is used predicatively in a relative clause modifying the head noun sotterita. The gender agreement applies although noun and adjective belong to two different clauses. Figure 4 summarizes the properties of gender agreement in Chamorro. The double slashes symbolize syntactic boundaries.

ADJ_SP → M≠F / [ __ (na) N_HUMAN ]_NP
         M≠F / (N_HUMAN) // __PREDICATIVE // (N_HUMAN)
         M=F / N_NONHUMAN

Figure 4: Rules of optional gender agreement in Chamorro.
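The rule in Figure 4 can be read as a small decision procedure. The following Python lines are only a minimal sketch of that reading, not an analysis of Chamorro grammar: the class and function names are hypothetical, HUMAN is interpreted loosely as the [+animate] criterion named in the text, and the returned form is the one that agreement would yield if a speaker chooses to apply the optional rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoanAdjective:
    """A gender-sensitive adjective with its two borrowed forms (hypothetical helper)."""
    masculine: str
    feminine: str
    from_spanish: bool = True  # Figure 4 only covers Hispanisms (ADJ_SP)


def agreeing_form(adj: LoanAdjective, noun_is_animate: bool,
                  referent_sex: Optional[str]) -> Optional[str]:
    """Return the agreeing form where Figure 4 licenses M≠F, else None.

    None stands for M=F: the rule predicts no gender contrast. The sketch
    makes no claim about which single form speakers use in that case.
    """
    if not adj.from_spanish or not noun_is_animate:
        return None
    if referent_sex == "F":
        return adj.feminine
    if referent_sex == "M":
        return adj.masculine
    return None  # sex of the referent unknown: no basis for semantic agreement


DIFUNTU = LoanAdjective("difunto", "difunta")
BUNITU = LoanAdjective("bunitu", "bunita")

print(agreeing_form(DIFUNTU, True, "F"))   # controller hågu 'daughter', as in (27)
print(agreeing_form(DIFUNTU, True, "M"))   # controller Tun Pepi, as in (28)
print(agreeing_form(BUNITU, False, None))  # hypothetical inanimate controller
```

Since attributive and predicative uses pattern alike in Figure 4, the sketch does not need a parameter for syntactic position; only the origin of the adjective and the animacy of the controller matter.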

6.2 Gender in Tetun Dili

Hajek (2006a: 171) identifies M bonitu/F bonita ‘handsome/pretty’ as one of the very few Portuguese loan-adjectives which are “obligatorily marked for gender by all speakers”. This adjective also agrees in gender when used predicatively. It is striking that the replica languages are again in agreement. Their prime candidates for gender agreement are cognates. The Tetun Dili Bible yields examples like (33)–(34).


(33) TD [Bible TD, Hahuu/Jénesis 12:14]
Kuandu Abrão sira too ona iha Ejitu,
when Abraham PL arrive now in Egypt
ema sira haree [feto nee bonit-a lahalimar].
person PL see [woman DEM.PROX pretty-F really]
‘When Abraham and his people had arrived in Egypt, the people saw [a really pretty woman].’

(34) TD [Bible TD, Hahuu/Jénesis 39:5]
José nee [mane bonit-u ida].
José DEM.PROX [man pretty-M one]
‘This José was [a handsome man].’

The picture resembles the one painted for Chamorro above. The head-nouns feto ‘woman’ and mane ‘man’ are Austronesian whereas the attributive adjective is of Portuguese extraction. Their different origin does not preclude gender agreement. The adjective comes as bonita in (33) because feto refers to a woman. The adjective displays masculine agreement in (34) because mane refers to a male person. The feature [+human] is crucial (Choupina 2011), i.e. semantic agreement applies. Manhitu (2016: 41) states, however, that “[m]any adjectives are borrowed from Portuguese […]. Those with masculine and feminine forms should be applied in accordance with the noun gender. Therefore, it is important to know the gender of both nouns and adjectives.”

This invokes syntactic gender since the feature [+human] is not mentioned as essential for agreement. Hull and Eccles (2005: 156–157) depict gender agreement as a social marker of the acrolect whereas it is frequently violated in mesolectal and basilectal varieties. Moreover, gender agreement does not apply – in the acrolect as well – if adjective and head noun fail to be direct neighbors. Hajek (2006a: 171) further explains without reference to sociolinguistic variables that “[g]ender agreement always occurs on Portuguese adjectives that precede a Portuguese noun […]. Feminine gender agreement is variable – and avoided by many speakers – in postposed adjectives after Portuguese feminine nouns, in collocations which have not been borrowed as fixed phrases […]. In fixed phrases, post-nominal gender agreement always occurs […].”

This quote and the other statements mentioned above give the impression that the position of gender agreement in Tetun Dili is comparatively unstable and very limited. However, Hajek and Williams-van Klinken (2019) analyze fresh data which speak in favor of a possible expansion of the domain of contact-induced gender in Tetun Dili. The optional agreement of postnominal adjectives with Portuguese head nouns is mostly possible with nonhuman nouns. Figure 5 summarizes the above restrictions in the format of an optional rule. X is a placeholder for any kind of intercalated element.

ADJ_PT → M≠F   / [ __ N_PT ]_NP
         M≠F   / [ N_HUMAN __ ]_NP
         M≠F   / bonit-PREDICATIVE
         M≠/=F / [ N_NONHUMAN&PT __ ]_NP
         M=F   / [ N_NONHUMAN __ ]_NP
         M=F   / [ N X __ ]_NP

Figure 5: Rules of optional gender agreement in Tetun Dili.
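The environments listed for Tetun Dili (prenominal position before a Portuguese noun, postnominal position after a human noun, predicative bonit-, postnominal position after nonhuman nouns, and non-adjacency of noun and adjective) can be spelled out as a lookup. The Python lines below are a minimal sketch under stated assumptions: all names are hypothetical, the precedence of the non-adjacency condition over the [+human] condition is our assumption, and environments that the restrictions above do not mention return None instead of a guess.

```python
from enum import Enum
from typing import Optional


class Marking(Enum):
    CONTRAST = "M≠F"    # masculine and feminine forms are distinguished
    VARIABLE = "M≠/=F"  # agreement varies across speakers
    NEUTRAL = "M=F"     # no gender contrast


def tetun_dili_marking(position: str, *, noun_human: bool = False,
                       noun_portuguese: bool = False, adjacent: bool = True,
                       stem: str = "") -> Optional[Marking]:
    """Map an environment of a Portuguese-derived adjective to its marking.

    Returns None for environments the summarized restrictions do not cover.
    """
    if position == "predicative":
        # Only bonit- is reported to agree predicatively.
        return Marking.CONTRAST if stem == "bonit" else None
    if position == "prenominal":
        # Agreement is reported before Portuguese nouns only.
        return Marking.CONTRAST if noun_portuguese else None
    if position == "postnominal":
        if not adjacent:
            # Intercalated element X blocks agreement (assumed to take precedence).
            return Marking.NEUTRAL
        if noun_human:
            return Marking.CONTRAST
        return Marking.VARIABLE if noun_portuguese else Marking.NEUTRAL
    raise ValueError(f"unknown position: {position!r}")


# mane bonit-u ida 'a handsome man', as in (34): postnominal after a human noun
print(tetun_dili_marking("postnominal", noun_human=True))
```

The three-valued result keeps the variable cases apart from the categorical ones, which matters because the agreement of postposed adjectives after Portuguese nouns is described as avoided by many speakers and not as absent.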

The rules of gender agreement of the replica languages do not fully match. However, the structural differences that separate Tetun Dili and Chamorro in this domain cannot obscure the fact that the replica languages display parallel borrowing behavior. Gender is now an issue in the replica languages. We do not assume this phenomenon to be pervasive. Gender is not a fully established category in either of the replica languages. Grammatical gender is, however, an option which speakers may not use all too frequently, but they choose to distinguish masculine from feminine far too often to allow us to ignore the phenomenon altogether. No matter how weak the position of grammatical gender is in Chamorro and Tetun Dili, for the purpose of this paper, it suffices to see that gender-copy is one of the corner-stones of the parallel Ibero-Romancization of the replica languages.

7 Conclusions

The parallels presented in the foregoing sections are clearly suggestive of a kind of long-distance convergence of the replica languages on account of their perennial contacts with two Ibero-Romance donor languages whose structural properties more often than not are similar. Chamorro and Tetun Dili have come to resemble each other closely exactly in those domains in which Spanish and Portuguese look like one another. Where contact has led to different results in the replica languages, the differences can be correlated mostly to differences which separate Spanish from Portuguese already. This means that parallel borrowing may on the one hand trigger convergence whereas it does not preclude divergent developments at the same time. Moreover, parallel borrowing behavior does not require the results to be identical. The phenomenology reviewed in Sections 3–6 ranges from phonology via morphology to gender agreement. Each of the above phenomena can ultimately be connected to massive lexical borrowing, i.e. the intrusion of a considerable number of foreign lexemes into the replica language’s lexicon paves the way for the subsequent developments. Massive borrowing creates the necessary conditions for the phonological, morphological, and morpho-syntactic phenomena reviewed in this paper. This is not to assume that the causal chain is inevitable. It should be investigated whether massive borrowing is possible also without further structural entailments. In the introduction, we have posed several questions as to which factors are crucial for the parallel behavior of the languages under scrutiny. As yet, we cannot put forward any conclusive answers to these questions. The high turnout of borrowings in the domain of the fossilized plurals, for instance, is largely outside the scope of predictability. In this case, the chance factor comes into play. Token frequency is certainly essential but its importance not only in the language-contact situations discussed here still needs to be demonstrated empirically. Furthermore, we have seen that – as expected – there is variation in the replica languages when it comes to adopting patterns of the donor languages. The differentiation along the acrolect-mesolect-basilect continuum is such that the acrolect speaker’s greater familiarity with the donor language is a favorable condition for the acceptance of grammatical borrowings from the donor language.
With decreasing bilingualism on the mesolectal and basilectal levels, the readiness to integrate donor language structures into the replica language diminishes considerably. This means that the acrolects may come rather close to Meakins’ (2013: 216) summary of what is common to all Mixed Languages, namely “the common socio-historical cradle” whereas the basilects are characterized by the limited extent or complete absence of bilingualism with the donor language. At the extremes of the continuum, acrolects and basilects therefore may belong to different categories in the language contact taxonomy. There are also dissimilarities between Chamorro and Tetun Dili which may result from the fact that Spanish has long since ceased to function as an option for bilingualism in the Marianas whereas, in Timor Leste, Portuguese has been gaining ground in this domain recently. A case in point is the (still weak) tendency of Tetun Dili to extend the domain of gender agreement from purely semantic patterns to syntactic agreement. The comparative look at heavy borrowers promises interesting results for language contact studies. We have not exhausted the topic for the languages of our choice yet. There are still many open questions most of which can be answered only on a much larger empirical basis. It has to be borne in mind that, independent of the factor language contact, the structural properties of Chamorro and Tetun Dili have not been described comprehensively in the extant literature on these languages. To understand fully what repercussions language contact has had on the replica languages, it is often necessary to provide an adequate description of a given grammatical domain first before judging how much thereof is contact-induced. The next step to take in the framework of our project is to check our data (especially of Sections 5–6) against the information provided in e.g. the dictionaries compiled by Flores and Bordallo Aguon (2009) for Chamorro and Costa (2000) for Tetun Dili. Helpful for further research on fossilized plurals in Chamorro will be the consultation of Safford’s (1905) treatise on the plants of Guam and its Timorese equivalent by Hull (2006) although the latter is on Tetun Terik. To answer the questions which arise from this study, the empirical basis has to be enlarged considerably. It is mandatory to take account of the spoken register as well. Corpus data and grammaticality tests as well as psycholinguistic experiments are indispensable. Native speakers of the replica languages should be asked whether they accept combinations of the indefinite articles (CH un, TD ida) with nouns hosting the erstwhile Ibero-Romance pluralizer -s.
The experimental design could require the native-speaker participants to judge whether certain nonce words bearing a final -s refer to singular entities or groups thereof to determine whether we are dealing with a class of nouns which is conceptually defined or whose members only incidentally share a formal property, i.e. the final -s. Section 5 has revealed that the Ibero-Romance pluralizer has undergone de-morphologisation. It remains to be seen whether the final -s has also been reinterpreted in some cases as marker of the (hypothetical) collective. Similarly, native speaker judgements are needed for the adequate evaluation of the status of gender agreement in Chamorro and Tetun Dili. To this end, one could use the data in Appendix IV and Stolz (2012a) to test whether native speakers are familiar with the gendered word pairs. For those word pairs which pass this test it is then advisable to investigate how native speakers use them in constructions which potentially trigger agreement. Chamorro and Tetun Dili give ample evidence of Romancization. As shown above, the borrowing behavior of the replica languages is characterized by many parallels. On account of these parallels the two cases are particularly important for the research program Romancization worldwide (cf. Section 2.3) because the similarities might ultimately reflect patterns which are also common for other cases of Romancization. Beyond Romancization, Chamorro and Tetun Dili are also very interesting objects of study for theoreticians of language contact because the contact-induced processes in the replica languages suggest that massive lexical borrowing is a potential trigger of grammatical borrowing. Therefore, research on Chamorro and Tetun Dili and similar cases should be intensified in the future.

Acknowledgments: This paper has benefitted from the discussion of thematically related talks delivered at the universities of Augsburg, Bremen, Heidelberg, Milan, Oldenburg, and Potsdam in 2015–2017. We are especially grateful to Sybille Große, Péter Maitz, Maria Mazzoli, Jörg Peters, Andrea Scala, Christoph Schröder, Eeva Sippola, and Heike Wiese for their thought-provoking comments upon these proto-versions of our study. Sandra Chung, Barbara Dewein, Eric Forbes, John Hajek, Alexander Loch, Rosa Salas Palomo, Steve Pagel, and Catharina Williams-van Klinken kindly provided us with reading matter in and on Chamorro and Tetun Dili, respectively. Maja Robbers was helpful in bibliographical matters. Beke Seefried was instrumental in creating Appendix V for this study. The sole responsibility for what is said in this paper remains, however, entirely ours.

Abbreviations

1/2/3    1st/2nd/3rd person
σ        syllable
ABS      absolutive
ADJ      adjective
ART      article
C        consonant
CAUS     causative
CH       Chamorro
CN       common noun
COMP     comparative
DEM      demonstrative
DIS      distal
DL       donor language
DP       determiner phrase
DU       dual
EMPH     emphatic
ERG      ergative
EXCL     exclusive
EXI      existential
F        feminine
FUT      future
INDEF    indefinite
IPFV     imperfective
LINKER   linker particle
M        masculine
MAX      maximal
NEG      negation
NP/NP    noun phrase
OBJ      object
OBLIG    obligation
PASS     passive
PERF     perfective
PL       plural
PLN      place name
PN       proper name
POR      possessor
PROX     proximal
PT       Portuguese
P.T.     plurale tantum
QP       quantifier phrase
RED      reduplication
REL      relative marker
RL       replica language
SG       singular
SP       Spanish
TD       Tetun Dili
V        vowel

Primary Sources

Barcinas, Jesus C. 1973. Umepanglao [Crab fishing]. Agana, Guam: Government of Guam, Department of Education.
Bible – TD Bible. Genesis e o Novo Testamento em língua tétum Dili. Wycliffe Bible translations. https://bible.cloud/ebooks/pdf/TDTUBB/TDTUBB.pdf (accessed 11 June 2018).
De Vera, Roman Maria. 1941. Nobenan I Sagrada Familia. Yona: St. Francis Church.
LPP CH = Le petit prince (Chamorro) – Antoine de Saint-Exupéry. 2018. I dikkiki' na prinsipi. (translated by Eric Forbes). Unpublished draft version.
LPP TD = Le petit prince (Tetun Dili) – Antoine de Saint-Exupéry. 2010. Liurai-oan ki’ik. (translated by João Paulo Esperança, Triana Corte-Real de Oliveira & Emília Almeida de Araújo) S.l.: Timor Aid ho SUL-Associação de Cooperação para o Desenvolvimento.
Onedera, Peter. 1994. Fafa’ña’gue yan hinengge siha [Horrors and beliefs]. Agana, Guam: St. Anthony School.
Perez, Remedios L. G. 1975. Estera si rai [Once upon a time]. Agana, Guam: Government of Guam, Department of Education.
Taimanglo, Roland L.G., Aline Yamashita & Maria A. T. Rivera (eds.). 1999. Mandidok yan mamfabulas na hemplon Guåhan [Profundity and fables of Guamese stories]. Hagåtña, Guam: Government of Guam, Department of Education.
Underwood, Robert A. 1974. I duhendes yan i denggat [The goblin and the luminous mushroom]. Agana, Guam: Government of Guam, Department of Education.
Underwood, Robert A. 1998. Si Rosario yan si Roberto. In Roland L. G. Taimanglo et al. (eds.), Mångge yan håyi? [How many and who?]. Hagåtña, Guam: Government of Guam, Department of Education. [separate pagination].

References

Alvar, Manuel. 1995. Por los caminos de nuestra lengua. Alcalá de Henares: Universidad de Alcalá.
Bakker, Peter. 1997. A language of our own. The genesis of Michif, the mixed Cree-French language of the Canadian Métis. Oxford: Oxford University Press.
Bakker, Peter. 2003. Mixed languages as autonomous systems. In Yaron Matras & Peter Bakker (eds.), The mixed language debate. Theoretical and empirical advances, 107–150. Berlin & New York: De Gruyter.
Bakker, Peter & Yaron Matras. 2013. Introduction. In Peter Bakker & Yaron Matras (eds.), Contact languages. A comprehensive guide, 1–14. Berlin & Boston: De Gruyter Mouton.
Bakker, Peter & Maarten Mous. 1994. Introduction. In Peter Bakker & Maarten Mous (eds.), Mixed languages. 15 case studies in language intertwining, 1–11. Amsterdam: IFOTT.
Borja, Joaquin Flores, Manuel Flores Borja & Sandra Chung. 2006. Estreyas Marianas: Chamorro. Saipan, MP: Estreyas Marianas Publications.
Choupina, Celda Morgado. 2011. Reflexões sobre o género em português europeu e em tétum. ELINGUP: Revista electronica de linguistica dos estudiantes da Universidade do Porto 3(1). http://ojs.letras.up.pt/index.php/elingUP/article/view/2523 (accessed 29 May 2020).
Corbett, Greville. 1991. Gender. Cambridge: Cambridge University Press.
Costa, Luís. 2000. Dicionário de Tétum-Português. Lisboa: Colibri.
Couto, Hildo H. do. 1996. Introdução ao estudo das línguas crioulas e pidgins. Brasilia: Universidade de Brasilia.
Chung, Sandra. 1998. The design of agreement. Evidence from Chamorro. Chicago & London: The University of Chicago Press.
Chung, Sandra. 2020. Chamorro grammar. Santa Cruz/CA: University of California. [Permalink https://escholarship.org/uc/item/2sx7w4h5].
Cunha, Celso & Lindley Cintra. 1984. Nova gramática do português contemporâneo. Lisboa: João Sá da Costa.
Direcção Geral de Estátistica Timor-Leste. 2016. Timor-Leste population and housing census 2015. Population distribution by administrative area. Vol. 2: Language. Dili: República Democrática de Timor-Leste.


Fischer, John L. 1961. The retention rate of Chamorro basic vocabulary. Lingua 10. 255–266.
Flores, Sylvia M. & Katherine Bordallo Aguon (eds.). 2009. The official Chamorro-English dictionary. Hagåtña, Guam: The Department of Chamorro Affairs.
Gardani, Francesco, Peter Arkadiev & Nino Amiridze. 2015. Borrowed morphology: An overview. In Francesco Gardani, Peter Arkadiev & Nino Amiridze (eds.), Borrowed morphology, 1–23. Berlin & Boston: De Gruyter Mouton.
Gillon, Carrie & Nicole Rosen. 2018. Nominal contact in Michif. Oxford: Oxford University Press.
Granda, Germán de. 1978. Estudios lingüísticos hispánicos, afrohispánicos y criollos. Madrid: Gredos.
Grant, Anthony P. 2012. Contact, convergence, and conjunctions: A cross-linguistic study of borrowing correlations among certain kinds of discourse, phasal adverbial, and dependent clause markers. In Claudine Chamoreau & Isabelle Léglise (eds.), Dynamics of contact-induced language change, 311–358. Berlin & Boston: De Gruyter Mouton.
Greksakova, Zuzana. 2018. Tetun in Timor-Leste: The role of language contact in its development. Coimbra: Universidade de Coimbra, Faculdade de Letras. Unpublished PhD-thesis.
Grimes, Barbara F. (ed.). 1992. Ethnologue. Languages of the world. Dallas, TX: Summer Institute of Linguistics.
Hagège, Claude. 2000. Halte à la mort des langues. Paris: Odile Jacob.
Hajek, John. 2006a. Language contact and convergence in East Timor: The case of Tetun Dili. In Alexandra Aikhenvald & R. M. W. Dixon (eds.), Grammars in contact. A cross-linguistic typology, 163–178. Oxford: Oxford University Press.
Hajek, John. 2006b. Serial verbs in Tetun Dili. In Alexandra Aikhenvald & R. M. W. Dixon (eds.), Serial verb constructions. A cross-linguistic typology, 239–253. Oxford: Oxford University Press.
Hajek, John & Catharina Williams-van Klinken. 2003. Um sufixo românico numa língua austronésia. Revue Linguistique Romane 66(265/266). 55–65.
Hajek, John & Catharina Williams-van Klinken. 2019. Language contact and gender in Tetun Dili: What happens when Austronesian meets Romance? Oceanic Linguistics 58(1). 59–91.
Haspelmath, Martin & Uri Tadmor. 2009. The Loanword Typology project and the World Loanword Database. In Martin Haspelmath & Uri Tadmor (eds.), Loanwords in the world’s languages. A comparative handbook, 1–34. Berlin & New York: De Gruyter.
Heine, Bernd. 1997. Cognitive foundations of grammar. Oxford & New York: Oxford University Press.
Holm, John. 1988. Pidgins and Creoles. Vol. I: Theory and structure. Cambridge: Cambridge University Press.
Holm, John. 2008. Creolization and the fate of inflections. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 299–324. Berlin & New York: De Gruyter.
Hull, Geoffrey. 1999. Standard Tetum-English dictionary. Sydney: Unwin.
Hull, Geoffrey. 2006. Timorese plant names and their origins. Dili, Timor Leste: Instituto Nacional de Linguística Universidade Nacional de Timor Lorosa'e (The National Linguistic Institute of the National University of East Timor).
Hull, Geoffrey & Lance Eccles. 2005. Gramática da língua Tétum. Lisboa, Porto & Coimbra: Lidel.
Klein, Thomas B. 2000. >Umlaut< in Optimality Theory. A comparative analysis of German and Chamorro. Tübingen: Niemeyer.


Malherbe, Michel. 1995. Les langages de l’humanité. Paris: Laffont.
Manhitu, Yohanes. 2016. Tetum. A language for everyone. Tetun. Lian ida ba ema hotu-hotu. New York: Mondial.
Matras, Yaron. 1998. Utterance modifiers and universals of grammatical borrowing. Linguistics 36. 281–331.
Matras, Yaron. 2000. How predictable is contact-induced change in grammar? In Colin Renfrew, April McMahon & Larry Trask (eds.), Time depth in historical linguistics (Vol. 2), 563–583. Cambridge: McDonald Institute for Archeological Research.
Matras, Yaron. 2007. The borrowability of structural categories. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 31–74. Berlin & New York: De Gruyter.
Matras, Yaron. 2009. Language contact. Cambridge: Cambridge University Press.
Matras, Yaron & Jeanette Sakel. 2007. Introduction. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 1–13. Berlin & New York: De Gruyter.
Mattos e Silva, Rosa Virginia. 1984. Estruturas trecentistas. Elementos para uma gramática do Português Arcaico. Salvador de Bahia: Imprensa Nacional – Casa da Moeda.
Meakins, Felicity. 2013. Mixed languages. In Peter Bakker & Yaron Matras (eds.), Contact languages. A comprehensive guide, 159–228. Berlin & Boston: De Gruyter Mouton.
Munteanu, Dan. 1997. Notas sobre el léxico de orígen español en chamorro. Anuario de Lingüística Hispánica 12. 959–974.
Pagel, Steve. 2010. Spanisch in Asien und Ozeanien. Frankfurt a.M.: Lang.
Pagel, Steve. 2015. Beyond the category: Towards a continuous model of contact-induced change. Journal of Language Contact 8. 146–179.
Pagel, Steve. 2018. The opposite of an anti-creole? Why Modern Chamorro is not a new language. In Ralph Ludwig, Peter Mühlhäusler & Steve Pagel (eds.), Linguistic ecology and language contact, 264–294. Cambridge: Cambridge University Press.
Quilis, Antonio. 1992. La lengua española en cuatro mundos. Madrid: MAPFRE.
Rodríguez-Ponga, Rafael. 1995. El elemento español en la lengua chamorra (Islas Marianas). Madrid: Universidad Complutense. Unpublished PhD-thesis.
Rodríguez-Ponga, Rafael. 2009. Del español al chamorro. Lenguas en contacto en el Pacífico. Madrid: Gondo.
Safford, William Edwin. 1905. The useful plants of the island of Guam. Washington, DC: Government Printing Office.
Sakel, Jeanette. 2007. Types of loan: Matter and pattern. In Yaron Matras & Jeanette Sakel (eds.), Grammatical borrowing in cross-linguistic perspective, 15–30. Berlin & New York: De Gruyter.
Salas Palomo, Rosa & Thomas Stolz. 2008. Pro or contra Hispanisms: Attitudes of speakers of modern Spanish. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Hispanisation. The impact of Spanish on the lexicon and grammar of the indigenous languages of Austronesia and the Americas, 237–267. Berlin & New York: De Gruyter.
Sippola, Eeva. 2011. Una gramática descriptiva del chabacano de Ternate. Helsinki: University of Helsinki.
Stewart, Miranda. 1999. The Spanish language today. London & New York: Routledge.
Stolz, Christel & Thomas Stolz. 2001. Hispanicised comparative constructions in indigenous languages of Austronesia and the Americas. In Klaus Zimmermann & Thomas Stolz (eds.), Lo propio y lo ajeno en las lenguas austronésicas y amerindias. Procesos interculturales en el contacto de lenguas indígenas con el español en el Pacífico e Hispanoamérica, 35–56. Frankfurt a.M.: Vervuert.
Stolz, Thomas. 2001. Singulative-collective: Natural Morphology and stable classes in Welsh number inflexion on nouns. STUF/Language Typology and Universals 54(1). 52–76.
Stolz, Thomas. 2003. Not quite the right mixture. Chamorro and Malti as candidates for the status of mixed language. In Yaron Matras & Peter Bakker (eds.), The Mixed Language debate. Theoretical and empirical advances, 271–315. Berlin & New York: De Gruyter.
Stolz, Thomas. 2008. Romancisation world-wide. In Thomas Stolz, Dik Bakker & Rosa Salas Palomo (eds.), Aspects of language contact. New theoretical, methodological and empirical findings with special focus on Romancisation processes, 1–42. Berlin & New York: De Gruyter.
Stolz, Thomas. 2010. Don’t mess with ergatives! How the borrowing of the Spanish indefinite article affects the split-ergative system of Chamorro. STUF/Language Typology and Universals 63(1). 79–95.
Stolz, Thomas. 2012a. Survival in a niche. On gender-copy in Chamorro (and sundry languages). In Martine Vanhove et al. (eds.), Morphologies in contact, 93–140. Berlin: Akademie Verlag.
Stolz, Thomas. 2012b. The attraction of indefinite articles: On the borrowing of Spanish un in Chamorro. In Claudine Chamoreau & Isabelle Léglise (eds.), Dynamics of contact-induced language change, 167–196. Berlin & Boston: De Gruyter Mouton.
Stolz, Thomas. 2013. Liquids where there shouldn’t be any. What hides behind the orthographic post-vocalic tautosyllabic and in early texts in and on Chamorro. In Steven Roger Fischer (ed.), Oceanic voices – European quills. The early documents on and in Chamorro and Rapanui, 201–233. Berlin: Akademie Verlag.
Stolz, Thomas. 2015. Adjective-noun agreement in language contact: Loss, realignment and innovation. In Francesco Gardani, Peter Arkadiev & Nino Amiridze (eds.), Borrowed morphology, 269–301. Berlin & Boston: De Gruyter Mouton.
Stolz, Thomas & Maja Robbers. 2016. Unorderly Ordinals. On suppletion and related issues of ordinals in Europe and Mesoamerica. STUF/Language Typology and Universals 69(4). 565–594.
Thomaz, Luís Filipe F. R. 2006. Timor. In Maria De Jesus dos Mártires Lopes (ed.), O império oriental 1660–1820 (Vol. 2), 392–431. Lisboa: Estampa.
Topping, Donald & Bernadita C. Dungca. 1973. Chamorro reference grammar. Honolulu: University of Hawaii Press.
Topping, Donald, Pedro M. Ogo & Bernadita C. Dungca. 1975. Chamorro-English dictionary. Honolulu: University of Hawaii Press.
Van Engelenhoven, Aone & Catharina Williams-van Klinken. 2005. Tetun and Leti. In Alexander Adelaar & Nikolaus Himmelmann (eds.), The Austronesian languages of Asia and Madagascar, 735–768. London & New York: Routledge.
Vicente, Alonso Zamora. 1974. Dialectología Española. Madrid: Gredos.
Williams-van Klinken, Catharina. 2002. High registers of Tetun Dili: Portuguese press and purist priests. Paper presented at the 2001 Conference of the Australian Linguistic Society, Canberra.
Williams-van Klinken, Catharina. 2011. Interactive Tetun-English dictionary. Computer program. Dili: Dili Institute of Technology.
Williams-van Klinken, Catharina. 2015. Peace Corps East Timor: Tetun language course. Dili: Peace Corps East Timor.

Parallel Romancization: Chamorro and Tetun Dili – two heavy borrowers compared | 445

Williams-van Klinken, Catharina & John Hajek. 2018a. Language contact and functional expansion in Tetun Dili: The evolution of a new press register. Multilingua 37(6). 613–648.
Williams-van Klinken, Catharina & John Hajek. 2018b. “Akua rua, satu dolar.” Mixing numeral systems in Timor-Leste. In A. Schapper (ed.), Contact and substrate in the languages of Wallacea, Part 2. [Special issue of NUSA 64]. 63–93.
Williams-van Klinken, Catharina & John Hajek. 2020. Double agent, double cross? Or how suffix changes nature in an isolating language: dór in Tetun Dili. In David Gil & A. Schapper (eds.), Austronesian undressed. How and why languages become isolating, 369–390. Amsterdam & Philadelphia: John Benjamins.
Williams-van Klinken, Catharina, John Hajek & Rachel Nordlinger. 2002. A short grammar of Tetun Dili. München: LINCOM Europa.
Williams-van Klinken, Catharina & Robert Williams. 2015. Mapping the mother tongue in Timor-Leste: Who spoke what where in 2010? Dili: Dili Institute of Technology.

446 | Thomas Stolz and Nataliya Levkovych

Appendix I: Fossilized plural forms with nouns borrowed from Spanish (Chamorro)

#1 agonias ‘agony’ (< PL agonías ‘agonies’/SG agonía), #2 ajos ‘garlic’ (< PL ajos/SG ajo ‘garlic’), #3 alentos ‘energy’ + #4 alientos ‘breath’ (< PL alientos ‘breaths’/SG aliento), #5 alitos ‘ear-ring; ear-lobe’ (< PL aretes ‘ear-rings’/SG arete), #6 ánimas ‘soul, spirit, ghost’ (< PL ánimas ‘souls’/SG ánima), #7 ánsias ‘longing, eagerness’ (< PL ansias ‘nausea’/SG ansia ‘fear’), #8 (a)tachuelas ‘(thumb) tack’ (< PL tachuelas ‘tacks’/SG tachuela), #9 atkos ‘arch, bow’ (< PL arcos ‘arches, bows’/SG arco), #10 atmas ‘weapon’ (< PL armas ‘arms’/SG arma), #11 babarias ‘foolishness, stupidity, folly’ (< PL boberías ‘stupidities’/SG bobería), #12 balas ‘bean pole, riding whip, stick, twig’ (< PL balas ‘bullets’/SG bala), #13 baliles ‘barrel’ (< PL barriles ‘barrels’/SG barril), #14 batatas ‘potato’ (< PL batatas ‘sweet potatoes’/SG batata), #15 batbas ‘beard’ (< PL barbas ‘beards’/SG barba), #16 batunes ‘button’ (< PL botones ‘buttons’/SG botón), #17 bendas ‘blindfold’ (< PL vendas ‘blindfolds’/SG venda), #18 bíburas ‘angry person’ (< PL víboras ‘vipers’/SG víbora), #19 boñelos ‘doughnut’ (< PL buñuelos ‘pastry’/SG buñuelo), #20 botlas ‘sarcasm’ (< PL burlas ‘jokes’/SG burla), #21 broas ‘sponge cake’ (< PL broas ‘biscuits’/SG broa), #22 brochas ‘duster, brush’ (< PL brochas ‘brushes’/SG brocha), #23 chankletas ‘sandal, slipper’ (< PL chancletas ‘slippers’/SG chancleta), #24 churisos ‘sausage’ (< PL chorizos ‘sausages’/SG chorizo), #25 duendes ‘goblin’ (< PL duendes ‘dwarves’/SG duende), #26 embarasos ‘obstacle’ (< PL embarazos ‘obstacles’/SG embarazo), #27 enredos ‘slander’ (< PL enredos ‘intrigues’/SG enredo), espehos ‘mirrors’ (< PL espejos ‘mirrors’/SG espejo), #28 estiyas ‘splinter’ (< PL astillas ‘splinters’/SG astilla), #29 estreyas ‘star’ (< PL estrellas ‘stars’/SG estrella), #30 fábulas ‘lie, fable’ (< PL fábulas ‘lies, fables’/SG fábula), #31 flores ‘flower’ (< PL flores
‘flowers’/SG flor), #32 frinkas ‘real estate’ (< PL fincas/SG finca ‘real estate’), #33 fueyes ‘furnace’ (< PL fuelles ‘bellows’/SG fuelle), #34 garapatas ‘tick’ (< PL garrapatas ‘ticks’/SG garrapata), #35 griyos ‘cricket’ (< PL grillos ‘crickets’/SG grillo), #36 guantes ‘glove’ (< PL guantes ‘gloves’/SG guante), #37 Hesuitas ‘Jesuit’ (< PL Jesuitas ‘Jesuits’/SG Jesuita), #38 hibiyas ‘buckle’ (< PL hebillas ‘buckles’/SG hebilla), #39 Hudios ‘Jew’ (< PL Judíos ‘Jews’/SG Judío), #40 kaderas ‘hip’ (< PL caderas ‘hips’/SG cadera), #41 kakalotes ‘corn cob’ (< (Mexican) PL cacalotes ‘ravens’/SG cacalote), #42 kápsulas ‘pill’ (< PL cápsulas ‘capsules’/SG cápsula), #43 kasadules ‘chaser’ (< PL cazadores ‘hunters’/SG cazador), #44 kihadas ‘jaw’ (< PL quijadas ‘chins’/SG quijada), #45 korehas ‘belt’ (< PL correas ‘belts’/SG correa), #46 kosas ‘thing’ (< PL cosas ‘things’/SG cosa), #47 kotchetes ‘snap’ (< PL

corchetes ‘snaps’/SG corchete), #48 kotniyos ‘canine tooth’ (< PL colmillos ‘(canine) teeth’/SG colmillo), #49 kuetdas ‘mainspring’ (< PL cuerdas ‘strings’/SG cuerda), #50 kuetes ‘fire-cracker’ (< PL cohetes ‘rockets’/SG cohete), #51 kukunitos ‘beetle’ (< PL cuconitos ‘little maggots’/SG cuconito), #52 kustiyas ‘rib’ (< PL costillas ‘ribs’/SG costilla), #53 labios ‘lip(s)’ (< PL labios ‘lips’/SG labio), #54 ligas ‘garter’ (< PL ligas ‘garters’/SG liga), #55 litanias ‘litany’ (< PL litanías ‘litanies’/SG litanía), #56 manggas ‘sleeve’ (< PL mangas ‘sleeves’/SG manga), #57 memorias ‘remembrance’ (< PL memorias ‘memories’/SG memoria), #58 meyas ‘sock, stocking’ (< PL medias ‘stockings’/SG media), #59 miyas ‘(nautical) mile’ (< PL millas ‘miles’/SG milla), #60 moskas ‘carpenter’s fitting’ (< PL muescas ‘notches’/SG muesca), #61 motas ‘defect in sewing’ (< PL motas ‘specks’/SG mota), #62 motsiyas ‘stuffed neck of chicken’ (< PL morcillas ‘sausages’/SG morcilla), #63 nigritos ‘black race, negrito, negro’ (< PL negritos ‘negritos’/SG negrito), #64 ñatas ‘scum’ (< PL natas/SG nata ‘froth’), #65 obehas ‘sheep’ (< PL ovejas ‘sheep’/SG oveja), #66 ohales ‘buttonhole’ (< PL ojales ‘buttonholes’/SG ojal), #67 ohas ‘leaf’ (< PL hojas ‘leaves’/SG hoja), #68 pañales ‘nappy’ (< PL pañales ‘nappies’/SG pañal), #69 pares ‘pair’ (< PL pares ‘pairs’/SG par), #70 parientes ‘relative’ (< PL parientes ‘relatives’/SG pariente), #71 patnitos ‘heart (coconut palm)’ (< PL palmitos ‘small palm-trees’/SG palmito), #72 patas ‘(animal) foot, leg’ (< PL patas ‘animal foot, leg’/SG pata), #73 pekas ‘freckle, mildew’ (< PL pecas ‘freckles’/SG peca), #74 peras ‘pear’ (< PL peras ‘pears’/SG pera), #75 pétduras ‘pill’ (< PL píldoras ‘pills’/SG píldora), #76 petlas ‘pearl’ (< PL perlas ‘pearls’/SG perla), #77 pimentos ‘sweet pepper’ (< PL pimientos/SG pimiento ‘pepper’), #78 pipitas ‘seed’ (< PL pepitas ‘seeds’/SG pepita), #79 planas ‘plains’ (< PL planas/SG plana ‘plains’), #80 plantiyas ‘mold, template’
(< PL plantillas ‘templates’/SG plantilla), #81 pleges ‘pleat’ (< PL pliegues ‘pleats, folds’/SG pliegue), #82 polainas ‘legging’ (< PL polainas ‘spats’/SG polaina), #83 potgas ‘mite’ (< PL pulgas ‘fleas’/SG pulga), #84 potseras ‘bracelet’ (< PL pulseras ‘bracelets’/SG pulsera), #85 prófesias ‘prophecy’ (< PL profecías ‘prophecies’/SG profecía), #86 puyitos ‘baby chicken’ (< PL pollitos ‘baby chickens’/SG pollito), #87 rabanos ‘turnip’ (< PL rábanos ‘radish plants’/SG rábano), #88 ramas ‘branch, twig’ (< PL ramas ‘branches’/SG rama), #89 rehas ‘railing’ (< PL rejas ‘grids’/SG reja), #90 ritasos ‘shred, fragment’ (< PL retazos ‘pieces, fragments’/SG retazo), #91 roskas ‘crisp bread’ (< PL roscas ‘pretzels’/SG rosca), #92 sábanas ‘blanket’ (< PL sábanas ‘blankets’/SG sábana), #93 salinas ‘saltern’ (< PL salinas ‘salterns’/SG salina), #94 sandalias ‘sandal’ (< PL sandalias ‘sandals’/SG sandalia), #95 satbayones ‘athlete’s foot’ (< PL sabañones ‘chilblains’/SG sabañón), #96 satdinas ‘sardine’ (< PL sardinas ‘sardines’/SG sardina), #97 sédulas ‘I.D. card’ (< PL cédulas ‘documents’/SG cédula), #98 sehas ‘eyebrow’ (< PL cejas ‘eyebrows’/SG ceja), #99

séntimos ‘cent’ (< PL céntimos ‘cents’/SG céntimo), #100 seremonias ‘ceremony’ (< PL ceremonias ‘ceremonies’/SG ceremonia), #101 siboyas ‘onion’ (< PL cebollas ‘onions’/SG cebolla), #102 sikos ‘snout’ (< PL hocicos ‘snouts’/SG hocico), #103 sintas ‘(wooden) strip’ (< PL cintas ‘belts’/SG cinta), #104 suekos ‘wooden Japanese slipper’ (< PL zuecos ‘clogs’/SG zueco), #105 suelas ‘shoe sole’ (< PL suelas ‘shoe soles’/SG suela), #106 suleras ‘beam’ (< PL soleras ‘beams’/SG solera), #107 talapos ‘rag’ (< PL trapos ‘rags’/SG trapo), #108 tasahos ‘chunk’ (< PL tasajos ‘slices of dried meat’/SG tasajo), #109 trampas ‘trick’ (< PL trampas ‘traps’/SG trampa), #110 tratos ‘bargain’ (< PL tratos ‘contracts’/SG trato), #111 tumates ‘tomatoes’ (< PL tomates ‘tomatoes’/SG tomate), #112 tuninos ‘dolphin’ (< PL toninas ‘tuna fish’/SG tonina), #113 ubas ‘grape’ (< PL uvas ‘grapes’/SG uva), #114 yantas ‘iron rim’ (< PL llantas ‘wheel rims’/SG llanta)

Appendix II: Fossilized plural forms with nouns borrowed from Portuguese (Tetun Dili)

#1 azulejus ‘glazed (decorative) tile(s)’ (< PL azulejos ‘tiles’/SG azulejo), #2 botas ‘boot(s)’ (< PL botas ‘boots’/SG bota), #3 botinas ‘ankle-boot(s)’ (< PL botinas ‘baby-shoes’/SG botina), #4 brinkus ‘ear-ring(s)’ (< PL brincos ‘ear-rings’/SG brinco ‘toy’), #5 fielas ‘buckle’ (< PL fivelas ‘buckles’/SG fivela), #6 fraldas ‘nappy, shirt tail’ (< PL fraldas ‘nappies, shirt tails’/SG fralda), #7 froñas ‘pillow-slip’ (< PL fronhas ‘pillow-slips’/SG fronha), #8 goiabas ‘guava’ (< PL goiabas ‘guavas’/SG goiaba), #9 kontas ‘bead(s)’ (< PL contas ‘(rosary) beads’/SG conta), #10 luvas ‘glove’ (< PL luvas ‘gloves’/SG luva), #11 meias ‘sock(s)’ (< PL meias ‘socks, stockings’/SG meia), #12 minas ‘land mine’ (< PL minas ‘mines’/SG mina), #13 nervus ‘nervousness’ (< PL nervos ‘nerves; nervousness’/SG nervo), #14 oras ‘hour, time’ (< PL horas ‘hours’/SG hora), #15 rins ‘kidney(s)’ (< PL rins ‘kidneys, lower back’/SG rim), #16 sandálias ‘sandal’ (< PL sandálias ‘sandals’/SG sandália), #17 saudades ‘longing, yearning, homesickness’ (< PL saudades ‘melancholic feelings’/SG saudade), #18 selus ‘(postage) stamp’ (< PL selos ‘stamps’/SG selo), #19 tiras ‘ribbon, strip of cloth’ (< PL tiras ‘strips of cloth’/SG tira), #20 xinelus ‘slipper’ (< PL chinelos ‘slippers’/SG chinelo)

Appendix III: Fossilized plural forms employed as (potential) collectives (Chamorro)

#1 alahas ‘jewelry, jewel’ (< PL alhajas ‘valuables’/SG alhaja ‘jewelry, jewel’), #2 alikates ‘pliers’ (< P.T. alicates ‘pliers’), #3 andas ‘barrow, stretcher’ (< P.T. andas ‘stretcher’), #4 aparatos ‘equipment’ (< PL aparatos ‘machines’/SG aparato), #5 arañas ‘candelabra’ (< PL arañas ‘candelabra’/SG araña), #6 arias ‘(measurement) 100 sqm’ (< PL áreas/SG área ‘(measurement) 100 sqm’), #7 atkángheles ‘(arch)angels’ (< PL arcángeles ‘archangels’/SG arcángel), #8 atkayadas ‘gill’ (< PL agallas ‘gills’/SG agalla), #9 betguelas ‘(small)pox’ (< PL viruelas ‘pox’/SG viruela), #10 botas ‘boots’ (< PL botas ‘boots’/SG bota), #11 chícharos ‘peas’ (< (Mexican) PL chícharos ‘peas’/SG chícharo), #12 dientes ‘teeth (comb)’ (< PL dientes ‘teeth’/SG diente), #13 espuelas ‘spurs’ (