Linguistic Complexity: Second Language Acquisition, Indigenization, Contact 9783110229226, 9783110229219

Linguistic complexity is one of the currently most hotly debated notions in linguistics. The essays in this volume refle


English Pages 270 [272] Year 2012

Table of contents :
Preface: A closer look
Introduction: Linguistic complexity – Second Language Acquisition, indigenization, contact
Accounting for analyticity in creoles
Nothing will come of nothing
Deletions, antideletions and complexity theory, with special reference to Black South African and Singaporean Englishes
The complexity of the personal and possessive pronoun system of Norf’k
Interlanguage complexity: A construct in search of theoretical renewal
Complexity as a function of iconicity: The case of complement clause constructions in New Englishes
Acquisitional complexity: What defies complete acquisition in Second Language Acquisition
Syntactic and variational complexity in British and Ghanaian English. Relative clause formation in the written parts of the International Corpus of English
Complexity hotspot: The copula in Saramaccan and its implications



Author / Uploaded
Bernd Kortmann (editor)
Benedikt Szmrecsanyi (editor)


Linguistic Complexity linguae & litterae

13

linguae & litterae Publications of the School of Language & Literature Freiburg Institute for Advanced Studies

Edited by
Peter Auer · Gesa von Essen · Werner Frick

Editorial Board
Michel Espagne (Paris) · Marino Freschi (Rom) · Erika Greber (Erlangen) · Ekkehard König (Berlin) · Per Linell (Linköping) · Angelika Linke (Zürich) · Christine Maillard (Strasbourg) · Pieter Muysken (Nijmegen) · Wolfgang Raible (Freiburg)

De Gruyter

Linguistic Complexity: Second Language Acquisition, Indigenization, Contact

Edited by Bernd Kortmann and Benedikt Szmrecsanyi


ISBN 978-3-11-022921-9
e-ISBN 978-3-11-022922-6
ISSN 1869-7054

Library of Congress Cataloging-in-Publication Data
A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek
The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.

© 2012 Walter de Gruyter GmbH & Co. KG, Berlin/Boston
Printing: Hubert & Co. GmbH & Co. KG, Göttingen
Printed on acid-free paper
Printed in Germany
www.degruyter.com


Table of Contents

Diane Larsen-Freeman
Preface: A closer look

Benedikt Szmrecsanyi and Bernd Kortmann
Introduction: Linguistic complexity – Second Language Acquisition, indigenization, contact

Jeff Siegel
Accounting for analyticity in creoles

Terence Odlin
Nothing will come of nothing

Rajend Mesthrie
Deletions, antideletions and complexity theory, with special reference to Black South African and Singaporean Englishes

Peter Mühlhäusler
The complexity of the personal and possessive pronoun system of Norf’k

Lourdes Ortega
Interlanguage complexity: A construct in search of theoretical renewal

Maria Steger and Edgar W. Schneider
Complexity as a function of iconicity: The case of complement clause constructions in New Englishes

ZhaoHong Han and Wai Man Lew
Acquisitional complexity: What defies complete acquisition in Second Language Acquisition

Magnus Huber
Syntactic and variational complexity in British and Ghanaian English. Relative clause formation in the written parts of the International Corpus of English

John McWhorter
Complexity hotspot: The copula in Saramaccan and its implications


Diane Larsen-Freeman

Preface: A closer look

The issue of linguistic complexity is, well, complex. It is certainly perceived as more so today than when the structural linguists first declared that, at least at the level of langue, all languages are equally complex. I can think of few such scholarly pronouncements about language that have withstood the test of time as equi-complexity has. However, the nature of the academy being what it is, such pronouncements exist only until someone decides to take a closer look. Although it is not the first to do so, this volume impels that closer look. Readers will not be disappointed. The book is replete with linguistic analyses, argumentation, and a comprehensive introduction with which to make sense of some of the complications and controversies concerning linguistic complexity. Such a level of interest makes one wonder why such a seemingly modest topic has attracted such attention. The answer is simple (at least something is!). Linguistic complexity is important for both theoretical and practical reasons. To give one example of the former, McWhorter has used the linguistic complexity of creoles to challenge the equi-complexity stance, arguing that creoles are less grammatically complex than older languages. On the practical side, Freiburg researchers seek to understand comparatively simple complexity variance in varieties of English before approaching the comparatively complicated complexity variation cross-linguistically (Kortmann and Szmrecsanyi). To give another example on the practical side, Second Language Acquisition (SLA) researchers have long harbored aspirations of developing developmental indices, which at least in part would be based on the growing complexity of interlanguage (Ortega). Sometimes we can understand something in isolation, when our attention is narrowly focused. Other times, we can better understand something when we pour more into the mix and look for (cross-disciplinary) patterns. 
Indeed, the latter was suggested recently by Sprouse (2010) in his call to restore the relationship that once existed between pidginists/creolists and SLA researchers. In this same innovative “interdisciplinary spirit”, the editors have chosen to bring together scholars who study complexity in pidgins/creoles, SLA, and indigenized L2 varieties with the goal of integrating what we
know about native Englishes, non-native Englishes (including English-based Pidgins and Creoles), and learner Englishes. And this choice has paid off. For example, Mesthrie’s contribution highlights the contrast between native varieties and indigenized L2 varieties: L1 varieties exhibit above-average grammaticity, and indigenized L2 varieties exhibit below-average grammaticity. Then, too, Han and Lew’s discussion of second language learner endstate grammars should benefit scholars interested in complexity variance among indigenized L2 varieties.

It would be uncharacteristic of me if I didn’t try to relate my reading to my understanding of complexity theory (Larsen-Freeman 1997; Larsen-Freeman and Cameron 2008), which brings a transdisciplinary perspective to bear on complexity (Larsen-Freeman 2011), particularly computational complexity (the sort that Odlin’s descriptive complexity here adopts) (see, for instance, Gell-Mann 1994). It seems to me that this collection is consistent in spirit with seeing language as a complex adaptive system or CAS (Ellis and Larsen-Freeman 2009), although admittedly the characteristics of such a system are not encapsulated in any one chapter, but rather distributed across them. Here are four:

1. Boundary crossing: Complexity theory thinking discourages dichotomous thinking. While the dichotomy between absolute and relative complexity persists, one of the interesting features of the definitions of complexity in this book is how much the authors cross boundaries. Indeed, a number of authors create hybrids between absolute and relative complexity. For example, Odlin’s particular focus is on the theoretical value of the interface between descriptive complexity, which is absolute, and relative complexity as manifest in the transfer effects for SLA research.
Then, too, Siegel’s approach to complexity combines absolute componential complexity (more elements makes a language more complex) with relative structural complexity (the degree to which a grammar is hard to understand or analyze). Huber also draws on a hybrid complexity notion that is partly absolute and partly relative. The absolute component consists of defining complexity in some linguistic domain as being proportional to the number of elements and, particularly, communicative redundancies in that domain; the relative part highlights those communicative redundancies that are uncommon either cross-linguistically or in the relative adstrate languages, for these are supposedly hard to acquire for adult language users.

2. Interactivity: Complexity in complex systems arises from the interaction of the components which make it up. In a complex system, it is highly unlikely that a single cause will give rise to a complex event. In keeping with
this position, Han and Lew make the point that complexity is as much a linguistic construct as a psycholinguistic one. We see this characteristic also played out in Steger and Schneider’s contribution to this volume, where their analysis of indigenized L2 varieties of English shows an interaction of linguistic form and cognitive function, specifically where linguistic complexity is considered inversely correlated with the iconicity of signs.

3. Change is central: Change in complex systems can be brought about in two ways. The first occurs as the system interacts with its context. As Cilliers (2001: 142) put it, “because everything is always interacting with others and the environment organically the notions of ‘inside’ a system and ‘outside’ a system are never simple or uncontested”. Illustrating this point, Mühlhäusler shows how important it is to take the context into account, and how pronominal complexity in Pitkern Norf’k is perfectly suited to the complex social setting in which it has evolved. Change can also be internal to the system. In a complex system, it takes place through self-organization. This can be seen in McWhorter’s contribution, in which he argues that not all pidgin complexity is due to environmental influences, but rather can be attributed to the same internal pressures to change that other languages experience. In particular, McWhorter’s usage-based orientation to Saramaccan complexity is entirely compatible with language as a CAS, in which the complexity of the copula construction in Saramaccan creole is emergent.

4.
The gradient and variable nature of system components: This characteristic of complex systems can be seen in the discussion of the nativization of English in Ghana by Huber, where GhE nativization is not always characterized by the addition of completely new features in an indigenized L2 variety and the deletion of others in the L1 variety, but rather by a subtle reinterpretation of existing systems that are gradual and varied, not categorical, and that can only be represented statistically.

The physicist Seth Lloyd proposed in a 2001 paper three different dimensions along which to measure the complexity of an object or process:

– How hard is it to describe?
– How hard is it to create?
– What is its degree of organization?

Lloyd then listed about forty measures of complexity that addressed one or more of these three questions (Mitchell 2009). Although Lloyd was speaking more of the complexity of dynamical systems, I believe that contributors to this book have wrestled with similar questions. In the end, it is probably not
a matter of arguing among ourselves about which definition or treatment of complexity is right, however. It is a matter of choosing from among those which are available those that are right for a given purpose. For instance, Ortega points out that SLA researchers have used interlanguage complexity with at least three main purposes in mind: (a) to predict proficiency, (b) to describe performance, and (c) to benchmark development. She adds that it may be that certain complexity measures are more useful for tracking the development of dynamic styles (low formality; typically oral language) than others for tracing development in synoptic styles (high formality; typically written). To complicate matters further, certain measures may actually be inappropriate for measuring interlanguage complexity of learners with advanced proficiency.

Perhaps it is obvious why I started this preface out by observing that complexity is complex. Accompanying this realization is the humility that comes with seeing that nothing is as simple as it first seems. And yet, due to the importance of the issue of linguistic complexity, we must persevere, aided by the scholarship presented in this book, but always remembering the provisionality of our findings. Cilliers (2010: 41) says it this way:

When we have to solve a specific problem in the real world we cannot involve life, the universe and everything. We have to frame the problem in a specific way and use specific tools and methods. This process is restricted by necessity. Nevertheless, this does not lead to a final, complete and objective understanding of the complex issues at hand. Our solutions and our understanding remains provisional. We will only take this provisionality seriously if we constantly return to the critical reflection necessitated by a “general” understanding of the complex world we live in.

Ultimately, then, a commitment to critical reflection on linguistic complexity, with a respect for the provisionality of findings, is what I believe that this book accomplishes.

References

Cilliers, Paul 2001 Boundaries, hierarchies and networks in complex systems. International Journal of Innovation Management 5: 135–147.
Cilliers, Paul 2010 The value of complexity: A response to Elizabeth Mowat & Brent Davis. Complicity: An International Journal of Complexity and Education 7: 39–42.
Ellis, Nick C., and Diane Larsen-Freeman (eds.) 2009 Language as a complex adaptive system. Language Learning 59, Supplement 1.
Gell-Mann, Murray 1994 The Quark and the Jaguar. Adventures in the Simple and the Complex. New York: W.H. Freeman and Company.
Larsen-Freeman, Diane 1997 Chaos/complexity science and second language acquisition. Applied Linguistics 18: 141–65.
Larsen-Freeman, Diane 2011 Complex, dynamic systems: A new transdisciplinary theme for applied linguistics? Language Teaching. Available on CJO 2011. doi:10.1017/S0261444811000061
Larsen-Freeman, Diane, and Lynne Cameron 2008 Complex Systems and Applied Linguistics. Oxford: Oxford University Press.
Lloyd, Seth 2001 Measures of complexity: a non-exhaustive list. IEEE Control Systems Magazine, August.
Mitchell, Melanie 2009 Complexity: A Guided Tour. Oxford: Oxford University Press.
Sprouse, Rex 2010 Review article: The invisibility of SLA theory in mainstream creole linguistics. Second Language Research 26(2): 261–277.

Benedikt Szmrecsanyi and Bernd Kortmann

Introduction: Linguistic complexity – Second Language Acquisition, indigenization, contact

The present volume endeavors to summarize a workshop held at the Freiburg Institute for Advanced Studies (FRIAS) in May 2009 about “Linguistic complexity in interlanguage varieties, L2 varieties, and contact languages”. The presentations at the workshop, and the essays in this volume, reflect the intricacies of thinking about complexity in three major contact-related fields of (and schools in) linguistics: creolistics, indigenization and nativization studies (i.e. in the realm of English linguistics, the “World Englishes” community), and Second Language Acquisition (SLA) research. The authors in this volume are all leading linguists in different areas of specialization, and they were asked to elaborate on those facets of linguistic complexity which are most relevant in their area of specialization, and/or which strike them as being most intriguing. The result is a collection of papers that, when viewed collectively, may raise more questions than it answers – but such is the way good science advances. And the fact of the matter is that linguistic complexity is complicated.

This general introduction is structured as follows. In Section 1, we review the literature on linguistic complexity to provide a backdrop against which the contributions in this volume are to be read. In Section 2, we sketch the objectives of this volume. Section 3 provides summaries of the contributions in this book, and situates them within the volume as a whole.

1. Linguistic complexity: a review of the research literature

In contrast to SLA researchers, who have always considered linguistic complexity a gauge for learners’ proficiency in the target language, a descriptor for performance, and an index to benchmark development, and thus an intrinsically variable concept (see, for example, Ortega, this volume and Larsen-Freeman 1978), for most of the twentieth century most theoretical linguists have agreed that on the level of langue, all natural languages must be equally complex. Hence, linguistic complexity was supposed to be invariant,
such that there were no ‘simple’ or ‘complex’ languages. This consensus – which was an article of faith rather than an insight backed up by empirical evidence – has been dubbed the ALEC statement (“All Languages are Equally Complex”) (Deutscher 2009: 243), or the linguistic equi-complexity dogma (Kusters 2003: 5). For a detailed and illuminating review of the history of thought on linguistic complexity and the equi-complexity dogma, we refer the reader to Sampson (2009). Suffice it to say here that the basic idea behind the equi-complexity dogma is that the total complexity of a language is fixed because sub-complexities in linguistic sub-systems trade off. Accordingly, simplicity in some domain A must be compensated by complexity in domain B, and vice versa. This view is nicely articulated in a 1958 textbook by descriptivist Charles Hockett (quoted in, e.g., Sampson 2009):

Objective measurement is difficult, but impressionistically it would seem that the total grammatical complexity of any language, counting both morphology and syntax, is about the same as that of any other. This is not surprising, since all languages have about equally complex jobs to do, and what is not done morphologically has to be done syntactically. Fox, with a more complex morphology than English, thus ought to have a somewhat simpler syntax; and this is the case. (Hockett 1958: 180–181)

The popularity of the equi-complexity dogma during the twentieth century may be seen against the backdrop of the nineteenth century, when linguists such as Wilhelm von Humboldt put forward somewhat unfortunate claims to the effect that differences (in terms of complexity or otherwise) between languages can be traced back to the differential mental capacities of their speakers (Humboldt 1836: 37). One reason for the dominance of the equi-complexity dogma in the twentieth century, then, was that it meshed well with more modern and egalitarian perspectives, and specifically with the idea that all human speakers are endowed with the same mental, cultural, and biological capacities. But be that as it may, the beginning of the twenty-first century has witnessed a number of critical reviews of the idea that all languages are, under all circumstances, equally complex (see Karlsson, Miestamo, and Sinnemäki 2008 for a detailed survey). This iconoclasm has no doubt been in the air for a while, but one of the primers that really got the debate going was a special issue of the journal Linguistic Typology (2001, volume 5, issue 2/3). In the issue’s lead article, John McWhorter suggested, somewhat provocatively, that “the world’s simplest grammars are creole grammars”:

Creole languages are unique in having emerged under conditions which occasioned the especial circumstance of stripping away virtually all of a language’s complexity (as defined in this paper), such that the complexity emerging in a creole is arising essentially from ground zero, rather than alongside the results of tens of thousands of years of other accretions. As such, creoles tend strongly to encompass a lesser degree of complexity than any older grammar. (McWhorter 2001: 155)

In the same special issue, a range of luminaries responded to this claim – some concurring, some strongly objecting. In retrospect, it seems to us that many of the most vigorous dissenters were actually beholden to the Humboldtian notion that ‘complex is beautiful, simple is retarded’, an equation that few modern complexity analysts would subscribe to (on the contrary!). In this connection, we would like to emphasize that present-day challengers of the equi-complexity dogma do not, of course, resort to nineteenth century racism to posit or explain complexity variance. Instead, many modern complexity analysts seek to model variable complexity as a function of language-external variables (see Section 1.3), which all essentially boil down to ‘cultural’ factors (Deutscher 2010) in the broadest sense.

In subsequent years the discourse on linguistic complexity and the equi-complexity dogma has been nourished by several dedicated workshops and conferences (for example, the 2007 workshop on “Language Complexity as an Evolving Variable”, hosted by the Max Planck Institute for Evolutionary Anthropology in Leipzig), a number of pertinent journal papers (e.g. Shosted 2006; Trudgill 2004), monographs (e.g. Dahl 2004; Kusters 2003), and edited volumes (e.g. Miestamo, Sinnemäki, and Karlsson 2008; Sampson, Gil, and Trudgill 2009). A succinct review of this literature will be presented in what follows. This review will be selective in that we shall focus on synchronic complexity (in)variance on some level of aggregation (language, dialects, or speaker communities) rather than, say, differences between individuals (e.g. Chipere 2009), or the evolutionary genesis of complexity (e.g. Givón 2009).

1.1. Local complexity versus global complexity

There is a sense in much of the literature that we need to distinguish carefully between, in the parlance of Miestamo (2008), (1) global linguistic complexity (that is, complexity of a language, dialect, etc. as such) and (2) local linguistic complexity (i.e. domain-specific linguistic complexity). While assessing a language’s global complexity is a very ambitious and indeed probably hopeless endeavor (akin to a “wild goose chase” [Deutscher 2009, 2010]), measuring local complexities in linguistic subdomains is seen as a more doable task. Previous scholarship has been concerned with measures of local complexity in the following linguistic subdomains:
– Phonological complexity (e.g. Nichols 2009; Shosted 2006). Aspects investigated include the size of the phoneme inventory, the incidence of marked phonemes, tonal distinctions, suprasegmental phonology, phonotactic restrictions, and the maximum complexity of consonant clusters.
– Morphological complexity (e.g. Dammel and Kürschner 2008; Kusters 2003). What is the scope of inflectional morphology in a given language or language variety? Specifically, what is the extent to which we are dealing with allomorphies and morphophonemic processes as classical complexity-inducing “nuisance factors” (Braunmüller 1990: 627) not only in inflectional morphology but also in phonology?
– Syntactic complexity (e.g. Givón 2009; Karlsson 2009). How many rules are mandated by the syntax of a language (motto: the more, the more complex), and what is the level of clausal embedding and recursion mandated and/or permitted by a language? Observe that the degree of embedding is incidentally also by far the most popular metric of complexity in SLA research (see Ortega 2003).
– Semantic and lexical complexity (e.g. Fenk-Oczlon and Fenk 2008; Nichols 2009). How many homonymous and polysemous lexemes or phrases are we dealing with? Do personal pronouns have an inclusive/exclusive distinction? We note in this connection that establishing type-token-ratios as a customary SLA tool to assess lexical variation in corpora (for example, Kuiken and Vedder 2008) is essentially also a lexical complexity metric.
– Pragmatic complexity, also known as “hidden complexity” (Bisang 2009). What is the degree of pragmatic inferencing on the part of hearers mandated by a given language?

Some studies have suggested that high levels of local complexity in some domain of a given language do not necessarily entail low local complexity in some other domain of that same language, as the equi-complexity dogma would predict.
Shosted (2006), for example, investigates morphological and phonological complexity in a sample of 34 languages and finds that there is no significant correlation, and no trade-off, between morphological and phonological complexity scores; Nichols (2009) likewise explores local complexities in a wide range of languages and fails to obtain a trade-off. Fenk-Oczlon and Fenk (2008: 63) do diagnose “balancing effects” between local complexities, but not to the extent that the equi-complexity dogma could be assumed to hold true under all circumstances. Gil (2008) argues that isolating
languages do not necessarily compensate for simpler morphology with more complexity in other domains.
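The logic of such a trade-off test can be sketched in a few lines of code. The following is a minimal illustration, assuming that each language in a sample has already been assigned one morphological and one phonological complexity score. The scores, the function names, and the choice of Spearman's rank correlation are assumptions made for illustration; they are not taken from Shosted (2006) or Nichols (2009). If local complexities traded off as the equi-complexity dogma predicts, the coefficient should be clearly negative.

```python
from statistics import mean
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical local complexity scores for six languages (illustrative only).
morphological = [12.0, 30.0, 7.0, 22.0, 18.0, 25.0]
phonological = [40.0, 35.0, 28.0, 44.0, 31.0, 38.0]

rho = spearman(morphological, phonological)
print(f"Spearman rho between the two local complexities: {rho:.2f}")
```

A coefficient near -1 would indicate a systematic trade-off, and a coefficient near 0 would indicate none. A real study would also need a significance test and a typologically balanced sample, which this sketch leaves out.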

1.2. Complexity notions and measures

Expectably, there is a daunting variety of ways in which analysts have sought to assess linguistic complexity. In most general terms, these measures fall into two categories (see, e.g., Dahl 2009; Lindström 2008; Miestamo 2008): measures of absolute complexity (theory-oriented and thus ‘objective’) versus measures of relative complexity (user-oriented and thus ‘subjective’, about ‘cost’ and ‘difficulty’). Miestamo defines the difference as follows:

The absolute approach defines complexity in objective terms as the number of parts in a system, of connections between different parts, etc. […] The relative approach to complexity defines complexity in relation to language users: what is costly or difficult to language users (speakers, hearers, language learners) is seen as complex. Complexity is thus identified with cost and difficulty of processing and learning. (Miestamo 2009: 81)

Notice that the absolute approach is popular especially in the cross-linguistic typology camp, while most sociolinguistically and psycholinguistically inclined complexity analysts have a knack for the relative approach. Note also that some analysts (for example, P. Mühlhäusler 1974, 1992) have wondered whether there are viable language-neutral definitions of ‘complexity’ and ‘simplicity’ (or ‘complexification’ and ‘simplification’) at all. We move on to a discussion of four more specific complexity notions and measures that have served as the workhorses in extant complexity scholarship: (1) absolute-quantitative complexity, (2) redundancy-induced complexity, (3) irregularity-induced complexity, and (4) L2 acquisition complexity.¹

¹ Other complexity/simplicity measures that have been proposed but not explored extensively include processing complexity (J. A. Hawkins 2009; Seuren and Wekker 1986), morphological naturalness (Mayerthaler 1981), and information-theoretic complexity (Juola 2008; Sadeniemi et al. 2008).

1.2.1. Absolute-quantitative complexity

Many studies marshal a genuinely absolute-quantitative measure of complexity which restricts attention to what McWhorter (for example, in this volume) refers to as ‘structural elaboration’. Arends has dubbed this notion the ‘more is more complex’ approach:

A grammar is judged to be more complex if it has more (marked) phonemes, more tones, more syntactic rules, more grammatically expressed semantic and/or pragmatic distinctions, more morphophonemic rules, more cases of suppletion, allomorphy, agreement. Qualitative aspects of complexity, such as the internal complexity of the rules themselves, are not taken into account. (Arends 2001: 180)
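The ‘more is more complex’ notion can be made concrete as a simple tally. The sketch below is only an illustration: the two inventories and their counts are hypothetical, and adding the counts together without weights is an assumption that none of the cited studies prescribes. In keeping with the quotation, only the number of elements is counted, and nothing about the internal complexity of individual rules enters the score.

```python
from typing import Dict

# Hypothetical feature inventories for two grammars (illustrative counts only).
GRAMMAR_A: Dict[str, int] = {
    "phonemes": 24,
    "tones": 0,
    "syntactic_rules": 40,
    "suppletive_forms": 6,
}
GRAMMAR_B: Dict[str, int] = {
    "phonemes": 35,
    "tones": 3,
    "syntactic_rules": 55,
    "suppletive_forms": 12,
}


def absolute_quantitative_score(inventory: Dict[str, int]) -> int:
    """Unweighted sum of all counted elements ('more is more complex')."""
    return sum(inventory.values())


def more_complex(name_a: str, a: Dict[str, int], name_b: str, b: Dict[str, int]) -> str:
    """Name the grammar with the higher score, or 'tie' if the scores are equal."""
    score_a = absolute_quantitative_score(a)
    score_b = absolute_quantitative_score(b)
    if score_a == score_b:
        return "tie"
    return name_a if score_a > score_b else name_b


print(more_complex("A", GRAMMAR_A, "B", GRAMMAR_B))
```

An unweighted sum treats one extra phoneme and one extra syntactic rule as equally costly. Any real application would have to justify how counts from different subdomains are made commensurable.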

Students of absolute-quantitative complexity have been interested in, e.g., the number of grammatical categories in a language (Shosted 2006), the number of phonemic contrasts (McWhorter 2001), or the length of the minimal description of a linguistic system (Dahl 2004).2 1.2.2. Redundancy-induced complexity This notion is a hybrid between relative and absolute complexity. What takes center stage here is not the absolute number of linguistic elements in a system, but the extent of “overspecification” (McWhorter, this volume): the prevalence of redundant features, in the sense of communicatively dispensable elements that are “incidental to basic communication” (McWhorter 2001: 126). Such material has been called, somewhat polemically, “historical baggage” (Trudgill 1999: 149), “ornamental elaboration” (McWhorter 2001: 132), or “baroque accretion[s]” (McWhorter 2001: 126). In short, redundancy-induced complexity notions are concerned with linguistic elements that are in the language for no apparent reason other than historical accident. Such features may include ergativity, grammaticalized evidential marking, inalienable possessive marking, switch-reference marking, inverse marking, obviative marking, “dummy” verbs, syntactic asymmetries between matrix and subordinate clauses, grammaticalized subjunctive marking, verb-second, clitic movement, any pragmatically neutral word order but SVO, noun class or grammatical gender marking (analytic or affixal), or lexically contrastive or morphosyntactic tone […] (McWhorter 2001: 163)

Note that redundancy-induced complexity is an absolute notion because theory dictates what should count as redundant and what not. At the same time, however, redundancy-induced complexity is a relative concept because it is arguably language users, not the analyst, who bear the cost of this redundancy.

2 In this connection, it is interesting to note that the design of Basic English, an international auxiliary language (B. Mühlhäusler and P. Mühlhäusler 2005; Ogden 1934), appears to be primarily indebted to the ‘more is more complex’ notion – its most prominent, supposedly simplifying design feature is its limited lexicon.

Benedikt Szmrecsanyi and Bernd Kortmann

1.2.3. Irregularity-induced complexity

Irregularity-induced complexity (a.k.a. ‘non-transparency’; see Nichols forthcoming) is likewise a hybrid concept that to some extent combines absolute and relative complexity notions. The idea is to classify as complex the outcome of irregular inflectional and derivational processes, which are opaque and non-transparent (see also McWhorter, this volume; P. Mühlhäusler 1974; Trudgill 2004):

as often as not inflection leads to the development of morphophonological processes, which constitute an added component of a grammar to be learned […] Meanwhile, suppletion also complexifies an area of grammar according to our metric. The various suppletive forms of be in English (am, are, is, was, were, been, and similarly in most Indo-European languages) render these languages more complex in this area than languages where the copula is invariable across person and number in the present (McWhorter 2001: 137)

Irregularity-induced complexity is an absolute notion in that the definition of what should count as ‘irregular’ is theory-driven, but once again the cost associated with such irregularity is incurred by language users.

1.2.4. L2 acquisition complexity

As a genuinely relative complexity measure, L2 acquisition complexity – a metric which is particularly popular in the sociolinguistic community – is concerned with the degree to which a language or language variety (or some aspect of a language or language variety) is difficult to acquire for adult language learners. Thus, Kusters (2003) (see also Kusters 2008) defines complexity as

the amount of effort an outsider has to make to become acquainted with the language in question […] an outsider is someone who learns the language in question at a later age, and is not a native speaker. Therefore, phenomena that are relatively difficult for a second language learner in comparison with a first language learner are more complex […] Phenomena that are easy to acquire for a second language learner but difficult for a first language learner are less complex. (Kusters 2003: 6)

In exactly the same spirit, Trudgill (2001: 371) defines linguistic complexity as being commensurate with “difficulty of learning for adults”.

1.3. Language-external conditioning factors

Much of the recent discourse on linguistic complexity (in)variance is oriented towards the question of whether and to what extent linguistic complexity is a function of certain language-external factors. On the most general level, Deutscher (2010) has argued that culture is partly to blame for complexity variance – for example, complex, literate cultures need complex, sizable vocabularies (Deutscher 2010: 110). More narrowly conceptualized proposals include McWhorter’s claim that the age of a language can affect its complexity (see also Parkvall 2008). In this perspective,

the world’s simplest languages are creole languages because they were born as pidgins, and thus stripped of almost all features unnecessary to communication, and since then have not existed as natural languages for a long enough time for diachronic drift to create the weight of “ornament” that encrusts older languages. (McWhorter 2001: 125)

Other researchers have suggested that while older languages may indeed be more complex than younger languages (all other things being equal), a history of language contact and concomitant adult SLA may render a per se old language simpler than it otherwise would have been (see also McWhorter 2008; Thomason 2001). The reason is that, for one thing, speakers often “simplify their native language for specific social purposes in contact situations” (Thomason and Kaufman 1988: 177). Secondly, adult language learners regularly adopt simplifying strategies (Selinker 1972) to avoid irregularities and to increase transparency in the sense of Seuren and Wekker (1986) (cf. also Klein and Perdue 1997; Muysken and Smith 1995). Trudgill succinctly sums up this argument as follows:

Adult language contact means adult language learning; and adult language learning means simplification, most obviously manifested in a loss of redundancy and irregularity and an increase in transparency. (Trudgill 2001: 372)

A third line of research has endeavored to show that linguistic complexity is sensitive to certain sociological parameters. Sinnemäki (2008, 2009), for example, demonstrates that in a typological sample of fifty languages, population size correlates with complexity in core argument marking. In much the same vein, Trudgill shows that “small, isolated, low-contact communities with tight social network structures” (2004: 306) tend to have more complex languages than high-contact communities. Lastly, Nichols (forthcoming) suggests that linguistic complexity is to some extent predicted by geographic altitude, a first-rate proxy for the degree of isolation of a speaker community.

1.4. Data sources

It is fair to say that much of the recent literature on linguistic complexity is based on rather unsystematic and intuition-based, if not anecdotal, evidence. To the extent that claims are based on real data, we find that most researchers interested in cross-linguistic variation have tapped reference grammars in conjunction with typological sampling techniques (for example, Miestamo 2009; Sinnemäki 2009). Parkvall (2008) accesses a major typological database, the World Atlas of Language Structures (Haspelmath et al. 2005). Analysts interested in language-internal variation typically rely on fieldwork and standard dialectological data sources such as dialect atlases (for example, Trudgill 2009). Few researchers have based claims on major, balanced naturalistic text corpora (but see, e.g., Karlsson 2009). Using an approach that is corpus-linguistic in spirit, Juvonen (2008) studies Chinook Jargon, a pidgin language, on the basis of a fictional text. Some analysts have also explored Bible translations (e.g. Dahl 2008). Gil (2008) is a rare example of a complexity study that uses an experimental design.

1.5. The Freiburg perspective

The Freiburg program endeavors to marry methodologies and interpretational approaches familiar from the study of large-scale cross-linguistic variation to the analysis of language-internal variation (see Kortmann 2004 for a collection of papers in this spirit). It is precisely in this tradition that recent Freiburg-based work has sought to explore linguistic complexity variance in varieties of English, for the sake of understanding the comparatively simple (i.e. language-internal complexity variation) before approaching the comparatively complicated (i.e. cross-linguistic complexity variation). Our research agenda has had two specific objectives: first, to elucidate large-scale patterns of (complexity) variance in varieties of English and in English-based pidgin and creole languages; and second, to develop the necessary metrics for passing judgment on degrees of complexity and claims concerning processes of simplification and complexification of grammars.

A first line of research has tapped into the morphosyntactic survey of the multimedia reference tool that accompanies the Handbook of Varieties of English (Kortmann et al. 2004). Kortmann and Szmrecsanyi (2004) describe the survey design in ample detail; suffice it to say here that the survey describes 46 varieties of English (specifically, 20 L1 varieties, 11 indigenized L2 varieties of English, and 15 English-based pidgin and creole languages) in terms of the presence or absence of 76 non-standard morphosyntactic features, which fall into eleven broad areas of morphosyntax: pronouns, the noun phrase, tense and aspect, modal verbs, verb morphology, adverbs, negation, agreement, relativization, complementation, and discourse organization and word order. What can this database tell us about complexity variance in varieties of English?
To start with, when applying satellite-view exploratory statistical analysis techniques to the dataset, it turns out that varieties of English can be arranged along two major dimensions. In Figure 1, for example, we find a plot that uses Principal Component Analysis to assign each variety of English a coordinate in two-dimensional principal component space. The picture that emerges is that varieties of English cluster very neatly according to variety type: English-based pidgin and creole languages cluster in the top-left quadrant while L1 varieties cluster in the bottom-right quadrant; indigenized L2 varieties of English cover the middle ground. The meaning of the axes can be interpreted as follows. Component 2 (the vertical axis in Figure 1) plots a given variety’s degree of analyticity, a notion that is defined as bringing about a greater number of features that are autonomous, invariable and periphrastic in nature. Component 1 (the horizontal axis in Figure 1) is taken to indicate increased morphosyntactic complexity, which is broadly defined in the ‘more is more complex’ spirit following McWhorter (2001). According to this interpretation, then, English-based pidgin and creole languages are least complex, L1 varieties are most complex, and indigenized L2 varieties of English exhibit intermediate complexity.

Figure 1. Locating varieties of English in a two-dimensional plane (adapted from Szmrecsanyi and Kortmann 2009a: Fig. 4)

Figure 2. Rule simplicity by L2 acquisition simplicity in varieties of English (adapted from Kortmann and Szmrecsanyi 2009: Figure 1)

Figure 3. Analyticity by syntheticity in varieties of English (adapted from Szmrecsanyi and Kortmann 2009c: Fig. 5.1)

Figure 4. Analyticity by syntheticity: indigenized L2 varieties versus learner varieties (adapted from Szmrecsanyi and Kortmann 2011: Figure 1)

Digging deeper into complexity variance in the morphosyntax survey, Kortmann and Szmrecsanyi define three more compartmentalized complexity measures:
Ornamental rule complexity (Kortmann and Szmrecsanyi 2009; Szmrecsanyi and Kortmann 2009c): the relative incidence in a variety’s inventory of features that add contrasts, distinctions, or asymmetries without providing a communicative or functional bonus (a definition that is, of course, reminiscent of McWhorter 2001’s approach). Prime examples of such features include she/her used for inanimate referents, or be as a perfect auxiliary.
Rule simplicity (Kortmann and Szmrecsanyi 2009): as the mirror image of ornamental rule complexity, this measure is concerned with the relative incidence in a variety’s inventory of features that simplify the system, vis-à-vis the standard system. An example of a simplifying feature would be regularized reflexives-paradigms (where oppositions in the Standard English system are leveled), or the loosening of the sequence of tenses rule (where a rule in the Standard English system is dispensed with).
L2 acquisition simplicity (Kortmann and Szmrecsanyi 2009; Szmrecsanyi and Kortmann 2009c): as a relative complexity metric, this measure establishes the relative incidence in a variety’s inventory of features that SLA research has shown to recur in interlanguage varieties. Examples include invariant don’t in the present tense (learners tend to avoid inflection; see, e.g., Klein and Perdue 1997), and regularization of irregular verb paradigms (learners tend to overgeneralize [e.g. Towell and R. Hawkins 1994]).

The results in terms of ornamental rule complexity are fairly clear-cut. While traditional low-contact L1 vernaculars (e.g. traditional dialects in the British Isles) attest between two and three ornamentally complex features on average, high-contact L1 varieties, indigenized L2 varieties, and English-based pidgin and creole languages typically have only about one ornamentally complex feature in their inventory. Thus, ornamental rule complexity appears to be a function of the degree of contact (and of a history of L2 acquisition among adults), which is in line with, e.g., Trudgill (2001: 372). As for rule simplicity and L2 acquisition simplicity, consider Figure 2, which plots the two simplicity indices against each other. As with Figure 1, we find variety clusters such that English-based pidgin and creole languages attest many simple features, while L1 varieties and, curiously, indigenized L2 varieties of English seem to attest few simplifying features as well as few L2-simple features.

A second line of research pursued in Freiburg has based claims about language-internal complexity variance in English not on survey data but – drawing on a method in quantitative morphological typology originally suggested by typologist Joseph Greenberg (1960) – on frequency vectors extracted from naturalistic corpus data. In this research, four frequency-based complexity indices are distinguished (Kortmann and Szmrecsanyi 2011; Szmrecsanyi 2009; Szmrecsanyi and Kortmann 2009c):
An overt grammatical analyticity index, which measures the text frequency (normalized to 1,000 words of running text) of free grammatical markers (also known as function words).
An overt grammatical syntheticity index, which measures the normalized text frequency of bound grammatical markers, i.e. inflections (recall that synthetic marking is supposedly particularly complex, thanks to inevitable allomorphies and the fact that learners tend to avoid inflectional marking).
An irregularity index, which establishes the proportion of bound grammatical markers that are irregular and lexically conditioned (e.g. catch > caught, as opposed to catch > catched).
An overt grammaticity index, which is concerned with the normalized text frequency of any grammatical markers, analytic or synthetic.
The findings can be summarized as follows.
Consider Figure 3, which plots syntheticity levels of a range of varieties of English for which naturalistic corpus data are available against their analyticity levels. We find that traditional, low-contact British English dialects (e.g. East Anglian English) tend to be both fairly analytic and synthetic, while some ‘deleting’ (Mesthrie 2006) indigenized L2 varieties of English (e.g. Hong Kong English) tend to be neither particularly synthetic nor particularly analytic. Overall, the generalization that emerges from Figure 3 is that there is a theoretically interesting positive correlation between analyticity and syntheticity levels, instead of a trade-off. This is another way of saying that varieties of English differ primarily in terms of their grammaticity (that is, the text frequency of grammatical marking, analytic or synthetic): thus, L1 varieties exhibit above-average grammaticity, while indigenized L2 varieties exhibit below-average grammaticity (and do not selectively avoid synthetic marking only). As for the irregularity index, it turns out that in indigenized L2 corpus material, 82 per cent of all bound grammatical allomorphs are regular, a percentage that goes down to 71 per cent in the case of high-contact L1 varieties and to 65 per cent in traditional L1 vernaculars. In other words, indigenized L2 varieties are least complex and most transparent morphologically whereas traditional, low-contact L1 vernaculars are most complex and least transparent – again, in line with what one would expect from perusing the literature.

Szmrecsanyi and Kortmann (2011) advance this approach3 to cover differences between written registers in indigenized L2 varieties as sampled in the International Corpus of English (ICE; Greenbaum 1996) and the written material sampled in the Louvain International Corpus of Learner English (ICLE; Granger 2003). In short, the name of the game is comparing essays written by, say, Hong Kong English writers to essays written by e.g. advanced Spanish learners of English. Figure 4 thus arranges the data points in a customary two-dimensional analyticity-syntheticity plane. The fact of the matter is that in this perspective, learner varieties and indigenized L2 varieties are clearly different beasts in that learner Englishes are significantly more analytic than indigenized L2 varieties of English. There is also a notable tendency for indigenized L2 varieties to be more synthetic than learner Englishes.
In addition, Szmrecsanyi and Kortmann (2011) show that the complexity profiles of learner Englishes (for example, German learner English) cannot necessarily be predicted by the complexity profile of learners’ native languages (e.g. German). In other words, there do not appear to be reliable substrate effects.
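The four frequency-based indices described in this section can be illustrated with a small worked example. The Python sketch below is purely illustrative and is not the procedure used in the studies cited: the function name, the parameter names and all counts are hypothetical, the sketch presupposes that grammatical markers have already been identified and counted (for instance in part-of-speech-tagged text), and it assumes that no token is counted as both a free and a bound marker, so that grammaticity is simply the sum of the analyticity and syntheticity indices.

```python
def complexity_indices(total_words, free_markers, bound_markers, irregular_bound):
    """Compute four frequency-based indices from marker counts.

    All arguments are token counts for one text sample (hypothetical names):
    total_words     -- running words in the sample
    free_markers    -- free grammatical markers (function words)
    bound_markers   -- bound grammatical markers (inflections)
    irregular_bound -- bound markers that are irregular / lexically conditioned
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")

    # Text frequencies normalized to 1,000 words of running text.
    analyticity = 1000.0 * free_markers / total_words
    syntheticity = 1000.0 * bound_markers / total_words

    # Assumption: free and bound markers are disjoint sets of tokens,
    # so the frequency of "any grammatical marker" is their sum.
    grammaticity = analyticity + syntheticity

    # Proportion of bound markers that are irregular (0.0 if there are none).
    irregularity = irregular_bound / bound_markers if bound_markers else 0.0

    return {
        "analyticity": analyticity,
        "syntheticity": syntheticity,
        "grammaticity": grammaticity,
        "irregularity": irregularity,
    }


# Made-up counts for a 2,000-word sample.
indices = complexity_indices(
    total_words=2000, free_markers=900, bound_markers=300, irregular_bound=60
)
print(indices)
```

With these made-up counts, the sample contains 450 free and 150 bound markers per 1,000 words, and one in five bound markers is irregular. Plotting the first two values for several samples against each other would give the kind of analyticity-syntheticity plane shown in Figures 3 and 4.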

2. The present volume

There is a new angle from which this volume tries to shed light on the multifaceted and highly controversial concept of linguistic complexity. This angle is determined by three independent, but clearly related, language-external factors conditioning linguistic complexity, namely language contact (including dialect contact), adult language acquisition, and indigenization. The present volume thus meets the increasingly voiced need for approaches integrating what we know about native Englishes, non-native Englishes, English-based pidgin and creole languages, and learner Englishes with what we know about the processes involved in SLA, indigenization and creolization, in general (for studies in the same spirit compare, for example, Davydova 2011, to appear). In doing so, it brings together for the first time leading representatives of three fields of research in each of which, as we have shown in the literature review, the notion of linguistic complexity has figured in recent debates.

3 For further extensions, see Ehret (2008) and Szmrecsanyi (2009) on text type variability, and Szmrecsanyi (2012) on long-term historical variability.

More specifically, the authors were asked to address one or more of the following three questions in their papers:
1. How can we adequately assess complexity in interlanguages (e.g. Finnish learner English), indigenized L2 varieties (e.g. Hong Kong English), and pidgin and creole languages (e.g. Jamaican Creole)?
2. Should we be interested in absolute complexity or rather relative complexity? (One may spontaneously assume that most authors would go for a relative complexity measure, given the nature of the languages under consideration in this volume.)
3. What is the extent to which language contact and/or (adult) language learning might lead to morphosyntactic simplification? In which contact situations does it lead to morphosyntactic simplification at all? (As presented in Section 1.5 above, simplification in grammar clearly correlates with a high degree of language and dialect contact for varieties of English. However, from these results it must not be concluded that intensive language contact necessarily results in simplification processes. Sarah Thomason has repeatedly stated [p.c.] that complexification processes are equally observable among contact languages and contact varieties.)
All authors were asked to provide explicit statements on what exactly they understand by ‘complexity’, how to measure complexity, and which role complexity plays in their line of research or their research framework for the varieties and languages that they are interested in. It so happens that all papers in this volume primarily address varieties of English and English-based pidgin and creole languages. We believe, however, that this does not impinge on the validity of the general points concerning many of the crucial issues sketched above. Ultimately, this volume offers a unique opportunity to explore both the different views on, but also the degree of common ground among three research communities, each of which is concerned with linguistic complexity in, broadly speaking, (current or past) conditions of overcoming the language barrier.

3. Summaries of the contributions in this volume

A perfect representative of the present volume’s inter-disciplinary spirit, Jeff Siegel’s “Accounting for Analyticity in Creoles” draws on evidence and observations deriving from research on SLA, interlanguage varieties, indigenized L2 varieties, restricted pidgins, and expanded pidgin and creole languages. The contribution is a meticulously argued piece of detective work illuminating why creole languages are, as a rule, pervasively analytic and therefore, by Siegel’s reckoning, non-complex. To fix terminology, Siegel distinguishes between two types of complexity, componential complexity (an absolute-quantitative notion: more components will render a language more complex) and structural complexity (the degree to which a grammar is hard to understand or analyze – hence, a relative notion); the contribution is concerned with morpho-grammatical (and thus, local) structural complexity. Siegel proceeds from the widely accepted assumption that all other things being equal, analytic morphemes are simpler than synthetic morphemes thanks to the perceptual salience and concomitant acquisitional ease that comes with analytic marking (see McWhorter’s contribution for a critical stance towards this view). But what is the exact role that SLA plays in the genesis of creole analyticity? In principle, analytic simplicity in creole languages may be the consequence of (i) a process of reduction (in Siegel’s parlance, reductive simplification with lexifier speaker agentivity), or (ii) a lack of development (developmental simplicity, with substrate speaker agentivity). Siegel points out that (ii) is the more likely scenario. 
However, if normal adult SLA were implicated in the process, it would follow that any analytic creole morphology is simply the product of the expansion of an earlier interlanguage, in which case creole languages should exhibit certain traits they share with interlanguages and indigenized L2 varieties – a corollary that, according to Siegel, does not always mesh with the facts. The interim conclusion is, therefore, that adult SLA cannot entirely account for the analytic nature of creole languages. Instead, Siegel offers that what may better explain pervasive creole analyticity is the notion of functional transfer, defined as a process where agents apply “the properties of a grammatical morpheme of one language to a syntactically congruent lexical item of another language”. Unlike in the case of indigenized L2 varieties, in the genesis of which agents may have drawn on material (possibly synthetic) in the target language for expanding an interlanguage, speakers of restricted pidgins may not have that sort of access to the lexifier. Instead, speakers may tap into knowledge of their first language(s) via functional transfer, thus mapping grammatical information on lexical items drawn from the lexifier. The outcome is pervasive creole analyticity.

Like Jeff Siegel, Terence Odlin in “Nothing will come of nothing” is also interested in transfer effects, though Odlin’s particular focus is on the theoretical value of the interface between transfer effects and linguistic complexity for SLA research. The specific complexity notion that Odlin explores is descriptive complexity (see also Mühlhäusler’s contribution), an absolute-quantitative notion that defines complexity as being proportional to the length of the minimal adequate description of the system being described (cf. Dahl 2004; Rescher 1998). Observe here that Odlin’s idea of complexity also partially overlaps with Kusters’ (2008) idea of outsider complexity, the crucial difference being that while Kusters abstracts away from transfer effects, Odlin is explicitly interested in them. The contribution investigates preposition and article usage in written learner English by native speakers of Finnish and Swedish in Finland, two learner groups with native languages that are typologically quite different (Swedish being related to English, and Finnish not at all), but with minimal cultural differences. The paper relies on a cross-sectional dataset collected by Scott Jarvis that documents task-oriented written learner English (movie retells, to be precise) by different learner groups (Finnish-speaking Finns versus Swedish-speaking Finns) with varying numbers of years of formal instruction. It turns out that in this dataset, zero prepositions (as in Chaplin go a restaurant) and zero articles (as in girl run away but police[man] take girl) make all the difference: Finnish-speaking Finns tend to omit prepositions and (even more) articles that are normally obligatory in the specific syntactic environments, while Swedish-speaking Finns are better at providing prepositions and articles where appropriate. How can we account for this difference?
Both Finnish-speaking and Swedish-speaking Finns have native languages whose spatial reference system with overt formal elements aids eventual acquisition of correct English prepositional usage, so both learner groups seem to eventually manage to come to terms with English prepositional usage. By contrast, Finnish offers little help, via positive transfer, with the English article system (note that Finnish does not have real articles), whereas Swedish-speaking Finns have an advantage because Swedish does have an article system that is broadly similar to the English system. Odlin observes that, unlike English preposition usage, English article usage is descriptively fairly complex: while the article system does not have many formal elements, the determinants of the choice between zero and explicit articles are descriptively complex and thus present a tall order to learners whose first language does not have a comparable article system. In any case, the combination of the absence of positive transfer effects and a high degree of complexity in the target language goes a long way towards explaining why Finnish-speaking learners of English have particular difficulties with the English article system.

Zero and non-zero take center stage in Odlin’s contribution, and so they do in “Deletions, antideletions and complexity theory, with special reference to Black South African and Singaporean Englishes” by Rajend Mesthrie. Focusing on the strikingly different morphosyntactic behavior – economy versus explicitness – of two indigenized L2 varieties of English, Mesthrie draws on previous work (Mesthrie 2006) to define a ‘deletion’ parameter that yields a fairly reliable typology of indigenized L2 varieties of English. In the explicitness camp, we find varieties, such as Black South African English, that are ‘anti-deleters’. These obey the following principles:

1. If a grammatical feature can be deleted in Standard English, it can be undeleted in the anti-deleting variety.
2. If a grammatical feature can’t be deleted in Standard English, it almost always can’t be deleted in the anti-deleting variety.
3. If X is a grammatical feature of the anti-deleting variety that is not covered by Principles 1 and 2, then X almost always involves the presence of a morpheme not found in Standard English.

Across the aisle in the economy cabin, we are dealing with indigenized L2 varieties of English (for example, Singaporean English) which are ‘deleters’. Their morphosyntax profile is governed by the following principles:

4. If a grammatical feature can be deleted in Standard English, it can be deleted in the deleting variety.
5. If a grammatical feature can’t be deleted in Standard English, it almost always can be (variably) deleted in the deleting variety.

Mesthrie points out that the deleting/anti-deleting polarity between varieties such as Black South African English and Singaporean English can be interpreted in terms of complexity variance.
Hence, the most straightforward interpretation of the behavior of Black South African English is one along the lines of an inverted absolute-quantitative complexity notion: due to its supposedly user-friendly explicitness, Mesthrie considers Black South African English comparatively simple (motto: ‘more is [cognitively] less complex’). However, by virtue of logic this would mean that a deleting variety such as Singaporean English should count as complexifying, a conclusion that is hardly palatable to many in the World Englishes community (which is beholden to the view that deletion must be simplifying). In an attempt to resolve this issue, Mesthrie emphasizes that substrate languages (for example, isolating languages in the case of Singaporean English) need to be taken into account, as these can explain away a good deal of what might look at first blush like complexity variance (we note that Huber’s contribution is more critical about this line of explanation). Mesthrie’s study thus highlights the fact that comparing complexity levels across contact languages with different substrate profiles, as is customary in the literature, can be problematic.

In Peter Mühlhäusler’s “The complexity of the personal and possessive pronoun system of Norf’k”, we are likewise reminded of a number of shortcomings in conventional thinking about linguistic simplicity and complexity. Mühlhäusler illustrates these drawing on Pitkern Norf’k as a case study. Pitkern Norf’k is a creole language spoken on Pitcairn and Norfolk Island in the Pacific Ocean by the descendants of the Bounty mutineers, and the language operates in a very specific social ecology. For example, from the beginning there has been a pressing need to distinguish between insiders and outsiders, and this distinction is not only expressed lexically but also grammaticalized. Mühlhäusler’s overall point – besides drawing attention to the deplorably sketchy nature of published accounts on Pitkern Norf’k – is that this setting must be taken into account when evaluating Pitkern Norf’k’s linguistic simplicity (or, rather, complexity). Mühlhäusler focuses on local complexity in the pronoun system and uses, or rather critiques, an absolute-quantitative complexity measure according to which more contrasts and distinctions yield more “descriptive” complexity. It turns out that, overall, the Pitkern Norf’k pronominal system is staggeringly complex according to customary definitions: the inventory is large, and the forms are irregular. For instance, there is plenty of synthesis and fusion, which violates the one-form-one-meaning principle; there are pronouns that must be used when referring to insiders; there is a gender contrast in the third person singular and dual; and so on. All in all, these are complexities that exceed the pronominal complexity of Pitkern Norf’k’s source languages and probably all English-based pidgin and creole languages.
However, Mühlhäusler highlights the fact that most of Pitkern Norf ’k’s pronominal complexity resides in the system of deictic reference (whose function it is to carve up “people space”), and not in the system of anaphoric reference. Why is this? Mühlhäusler presents an extended argument that Pitkern Norf ’k’s complex social setting has promoted a complex system of deictic reference. Thus, what looks like linguistic complexity at first glance is actually just about adequate given the communicative and social-indexical job Pitkern Norf ’k has to do: to carve up people space given the islanders’ needs. Mühlhäusler thus demonstrates that failing to take into account communicative function results in linguistic complexity ratings that are maybe descriptively accurate, but that do not tell the whole story – akin to asking ‘What is the simplest tool’ rather than asking the better question ‘What is the simplest tool for a certain job?’. Mühlhäusler’s idea of toolbox complexity is echoed, to some extent, by Lourdes Ortega’s plea for complexity metrics in SLA research that take

24

Benedikt Szmrecsanyi and Bernd Kortmann

language function seriously. Thus, “Interlanguage complexity: A construct in search of theoretical renewal” reviews previous SLA approaches to linguistic (specifically, interlanguage) complexity, and sketches a research agenda to guide future SLA research in this spirit. Ortega’s point of departure is that while extant SLA research has sought to measure interlanguage complexity for a variety of purposes, the literature is dominated by three notions of linguistic complexity, all of which come into the remit of ‘more is more complex’ absolute-local complexity: (1) length of selected linguistic units (e.g. increased mean T-unit length is supposed to index increased complexity), (2) density of subordination (e.g. more subordination is more complex), and (3) frequency of complex forms (higher frequencies of complex forms equate with increased complexity). Ortega argues that all of these notions are somewhat simplistic, selective, and do not consistently do justice to the intricacies of interlanguage complexity, which is why they suffer from what Ortega calls “construct reductionism”. To address these shortcomings, Ortega advocates drawing on ideas and concepts originally developed in Systemic Functional Linguistics (SFL) (e.g. Halliday and Matthiessen 1999), a framework that usefully distinguishes between dynamic styles (low formality, typically oral) and synoptic styles (high formality, typically written). It turns out that while subordination – which, to reiterate, is one of the workhorse complexity diagnostics in extant SLA research – is crucial for the development of dynamic styles, it is less relevant to the development of synoptic styles, where processes such as nominalization and grammatical metaphor (e.g. to feel [verb] → feeling [noun]) take center stage. 
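As a rough illustration of the three ‘more is more complex’ measures listed above, the following Python sketch computes them for a toy sample. It assumes that the learner text has already been segmented by hand into T-units and clauses. The names TUnit, mean_t_unit_length, clauses_per_t_unit and complex_form_rate, the sample sentences, and the choice of subordinators as the ‘complex forms’ to be counted are all hypothetical and are not taken from Ortega’s chapter.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass
class TUnit:
    """One T-unit: a main clause plus any subordinate clauses attached to it.

    Hypothetical container; segmentation into clauses is assumed to be done by hand.
    """
    clauses: list[list[str]]  # each clause is a list of word tokens


def mean_t_unit_length(t_units: list[TUnit]) -> float:
    """Measure (1): average number of words per T-unit."""
    total_words = sum(len(clause) for t in t_units for clause in t.clauses)
    return total_words / len(t_units)


def clauses_per_t_unit(t_units: list[TUnit]) -> float:
    """Measure (2): density of subordination, here as clauses per T-unit."""
    total_clauses = sum(len(t.clauses) for t in t_units)
    return total_clauses / len(t_units)


def complex_form_rate(t_units: list[TUnit], complex_forms: set[str]) -> float:
    """Measure (3): occurrences of listed 'complex forms' per 100 words."""
    words = [w.lower() for t in t_units for clause in t.clauses for w in clause]
    hits = sum(1 for w in words if w in complex_forms)
    return 100 * hits / len(words)


# Invented two-sentence sample: one simple T-unit, one with a subordinate clause.
sample = [
    TUnit(clauses=[["the", "learner", "wrote", "an", "essay"]]),
    TUnit(clauses=[["she", "revised", "it"],
                   ["because", "the", "teacher", "asked"]]),
]

# Assumption for illustration only: subordinators stand in for 'complex forms'.
subordinators = {"because", "although", "while"}

if __name__ == "__main__":
    print(mean_t_unit_length(sample))
    print(clauses_per_t_unit(sample))
    print(round(complex_form_rate(sample, subordinators), 1))
```

On this toy sample the three values are 6.0 words per T-unit, 1.5 clauses per T-unit, and roughly 8.3 complex forms per 100 words. All three scores can only rise as more material of the counted kind is added, which is the ‘more is more complex’ logic that Ortega criticizes.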
Ortega cites preliminary evidence that measures of subordination may not be appropriate to measure interlanguage complexity under all circumstances, and that they may be actually inappropriate when dealing with advanced levels of proficiency. This is another way of saying that what is needed is a set of (at least) two complexity metrics: one that measures complexities in dynamic styles, typically at early levels of proficiency (where semantic content is mapped onto grammatical categories via, e.g., subordination), and one that gauges complexity in synoptic styles, more often than not at advanced levels of proficiency (where we find less prototypical form-meaning mappings, more nominalizations, and more grammatical metaphors). Ortega’s proposal highlights the empirical potential of nominalization and grammatical metaphor as complexity indicators, phenomena that are virtually unexplored in the existing complexity literature. Also, the added bonus of rejecting one-size-fits-all complexity measures is that the analyst is encouraged to take into account register and modality differences. Such an endeavor dovetails nicely with recent usage-based approaches to language (see, for example, Ellis and Larsen-Freeman 2006) and in so doing opens up research interfaces with corpus-linguistic research programs concerned with the dynamics of contact languages, such as indigenized L2 languages, at different stages of development.

A corpus-based perspective on indigenized L2 varieties, then, is precisely what underpins Maria Steger and Edgar W. Schneider’s approach in “Complexity as a function of iconicity: The case of complement clause constructions in New Englishes”. Drawing on a bedrock principle in the functionalist literature – iconicity – the contribution probes complexity variance in a number of indigenized L2 varieties of English (specifically: Singapore English, Indian English, Hong Kong English, and East African English) vis-à-vis Standard British English for benchmarking purposes (note that the study is entirely based on the publicly available International Corpus of English). Steger and Schneider utilize a complexity notion that has hitherto received less attention than it deserves: they define linguistic complexity as being inversely proportional to linguistic iconicity of various types, which all boil down to the degree to which form-meaning pairings are motivated and transparent (in this sense, the contribution ties in with Mesthrie’s paper, which considers anti-deletion simple because explicitness is maintained). The view that iconicity may translate into linguistic simplicity, and un-iconicity into complexity, is essentially a hybrid absolute-relative notion: cognitive-functionalist theory dictates which structures are considered iconic and which are not, but the key assumption is relative in that iconic constructions are posited to be prioritized in settings of multilingualism/multidialectism and concomitant adult SLA (thanks to speaker-oriented processing optimization).
It is precisely because of this that Steger and Schneider expect that iconicity should be more pervasive in New Englishes than in, e.g., Standard British English. The study explores local complexity in the verbal complexity domain, operationalizing iconicity-based complexity such that, for example, non-finite complementation patterns (as in John expects Mary to come) count as less iconic and hence more complex than finite complementation patterns (for example, John expects that Mary comes). In exactly this spirit, Steger and Schneider spell out a number of structural hypotheses that should be borne out in the data if iconicity were indeed prioritized in the genesis of indigenized L2 varieties of English, as Steger and Schneider conjecture. Thus, indigenized L2 varieties of English should exhibit (1) a preference for finite (rather than non-finite) complementation patterns; (2) a preference for patterns with an overt (as opposed to zero) complementizer; (3) a preference for explicit expressions of modality; (4) a preference for explicit structural marking of double conceptual functions; and (5) a dispreference for raising
constructions. Steger and Schneider subsequently present a meticulous qualitative and quantitative analysis to put these hypotheses to the empirical test, and they demonstrate that, indeed, New Englishes – and especially so Hong Kong English and East African English – tend towards the iconic pole compared to Standard British English. In the context of the present volume, the contribution highlights the fact that the corpus-based complexity literature would profit from more studies along the lines of Steger and Schneider’s that go beyond, e.g., mere counting of elements or some such and that instead rely on key notions in functionalist theory to conceptualize the notion of linguistic complexity. In “Acquisitional complexity: What defies complete acquisition in Second Language Acquisition”, ZhaoHong Han and Wai Man Lew center – much as Steger and Schneider as well as, to some extent, Ortega do – on the interplay between form and function. The paper specifically discusses a novel take on conceptualizing complexity in SLA research and beyond. In most general terms, Han and Lew view complexity as a linguistic as well as psycholinguistic notion – more precisely, as an “intricate relationship, relative rather than absolute, between linguistic elements, with both a clinal and a temporal dimension”. Han and Lew differentiate between two pertinent types of complexity, developmental and acquisitional; the focus of the contribution is on the latter type. Traditional complexity metrics in the SLA literature, such as, e.g., number of T-units per sentence, come within the remit of developmental complexity, which is all about what is acquirable at a given time. Against this backdrop, the authors define acquisitional complexity as what is ultimately non-acquirable by learners. 
Acquisitional complexity is mostly idiosyncratic and static; it focuses on form-meaning pairings; it is a function of stabilized interactions between exogenous and endogenous contingencies; and, despite its static nature, empirically it can be captured on the basis of longitudinal data only. We note, along these lines, that acquisitional complexity in the spirit of Han and Lew is one of the most radically relative complexity notions discussed in the present volume. How exactly do we find out about acquisitional complexity, then? Han and Lew offer that the notion of fossilization (i.e. cessation of learning) and, specifically, Han’s Selective Fossilization Hypothesis (Han 2009) are à propos. The Selective Fossilization Hypothesis identifies L1 markedness and L2 input robustness as crucial determinants of likely fossilization candidates and, hence, of acquisitional complexity; L1 markedness and L2 input robustness, in turn, are a function of formal frequency and formal-functional variability. Han and Lew’s argument, which they illustrate on the basis of two case studies, boils down to an argument that linguistic features that are marked (i.e. infrequent and/or variable) in a learner’s L1 and robust (i.e. frequent and invariable) in L2 input are unlikely to fossilize and thus should count as acquisitionally simple. By contrast, features that are unmarked in the L1 and non-robust in L2 input are acquisitionally complex. A crucial element in this nexus is Slobin’s (1987) Thinking for Speaking Hypothesis. In all, Han and Lew’s contribution is an inspiring piece that combines ideas about first language influence, fossilization, markedness, formal frequency, and formal-functional (in)variability with cross-disciplinarily relevant reasoning about linguistic complexity. The proposal’s focus on L2-endstate grammars in particular should provide inspiration to scholars interested in complexity variance among indigenized L2 varieties.

By comparison to Han and Lew (and also Odlin), Magnus Huber, in “Syntactic and variational complexity in British and Ghanaian English: Relative clause formation in the written parts of the International Corpus of English”, is relatively skeptical about the explanatory potency of transfer effects when it comes to explaining the grammatical blueprint of indigenized L2 varieties. Huber’s contribution is a careful variationist investigation, based on the International Corpus of English, into what may happen to a particular grammatical domain (such as restrictive relative clause formation) when users of an indigenized L2 variety (such as Ghanaian English) at a particular evolutionary stage (in the case of Ghanaian English, nativization and endonormative stabilization in the parlance of Schneider 2007) are faced with certain complexities in the norm-providing input variety (in the Ghanaian scenario, Standard British English). Interpretationally, Huber draws on a hybrid complexity notion that is partly absolute and partly relative.
The absolute component consists of defining complexity in some linguistic domain as being proportional to the number of elements and, particularly, communicative redundancies in that domain. The relative component highlights those communicative redundancies that are uncommon either cross-linguistically and/or in the relative adstrate languages, for these are supposedly hard to acquire for adult language users. Consequently, ‘uncommon’ redundancies should be disadvantaged in the genesis of indigenized L2 varieties. In addition, Huber considers inherent variability as complexifying (as ‘variationally’ complex, that is). Against the backdrop of this particular complexity definition, the (Standard) British English relativization domain seems fairly complex, since (i) it exhibits several relativization strategies (pronouns, particles, and zero) with functional overlap; (ii) there is inherent variability between these relativization strategies, which is governed by somewhat exotic factors such as antecedent animacy; and (iii) relative pronouns in particular are cross-linguistically relatively uncommon. On the empirical plane, Huber analyzes a dataset in which relative clause occurrences in the corpus database, which covers Ghanaian English as well as British English, were annotated for several contextual variables (syntactic function of the relativizer, voice of the relative clause, animacy of the antecedent). Huber goes on to present an extended empirical argument that relative clause formation in Ghanaian English can be seen as being less complex than in (Standard) British English: for example, Ghanaian English has a stronger preference – one that is in line with the cross-linguistic picture – for the invariant relative particle that. Crucially, Huber shows that the Ghanaian “reinterpretation” of the (Standard) British English relativization system cannot be entirely explained away by properties of Ghanaian adstrate languages. In sum, the study merits particular attention thanks to its focus on gradience and variation – as Huber points out, Ghanaian relative clauses are typically not ungrammatical in (Standard) British English, but in the big picture the underlying variationist constraints appear to be subtly different.

John McWhorter’s “Complexity hotspot: The copula in Saramaccan and its implications” also takes a critical stance toward the explanatory power of adstrate or substrate effects. The study sets its sights on local grammatical complexities in Saramaccan creole, an English-lexified contact language spoken in Suriname that is related to Sranan creole but also deeply influenced by Fongbe, an African language spoken by its creators.
McWhorter marshals a complexity metric that, in the tradition of McWhorter (2001), is absolute-quantitative in nature and consists of the following three components:

– overspecification: complexity is a function of the number of overt and obligatory marked distinctions
– structural elaboration: complexity is a function of the number of rules required to generate well-formed grammatical output
– irregularity: complexity is a function of the amount of irregularity and suppletion

The grammatical domain in Saramaccan that McWhorter inspects with regard to the above metric is that of copula constructions, which turn out to be unusually complex – against the backdrop of overall grammatical complexity in Saramaccan, but also in the wider context of creole languages in general. The Saramaccan copula exhibits overspecification in that constructions vary as a function of the semantics of the predicate. We also find plenty of structural elaboration in the form of, e.g., allomorphies, and the Saramaccan copula features a substantial amount of irregularity and suppletion. McWhorter discusses diachronic and synchronic evidence suggesting that this sort of local complexity must have developed gradually over time in Saramaccan, and that there is no good reason to believe that these complexities were transferred from one of Saramaccan’s substrate languages (especially since, as McWhorter points out, copula constructions are typically omitted, rather than preserved, in creole genesis and SLA). In short, the complexity of the Saramaccan copula constructions is somewhat puzzling. Where does it come from? McWhorter offers an essentially usage-based explanation. What probably happened in Saramaccan was that initially, a deictic element was subject to reanalysis as a copula construction, as is customary cross-linguistically. Next, phonetic erosion created allomorphies which – thanks to high discourse frequency – were fairly resistant to simplification. In addition, we find in Saramaccan recurrent reanalysis of topic-comment constructions into subject-predicate constructions, a process that additionally creates complexities. According to McWhorter, then, the complexity of the Saramaccan copula, which comes into the remit of a typologically analytic domain, highlights the fact that complexity need not necessarily reside in synthetic grammatical marking or clausal embedding, which are the usual suspects in the complexity literature. What is more, the Saramaccan copula scenario demonstrates that linguistic complexity does not necessarily result from language-external forces (such as language contact), for in the Saramaccan case we find that complex copulas are the result of language-internal processes along the lines of, e.g. grammaticalization theory. The argument presented in “Complexity hotspot” thus overlaps substantially with Edward Sapir’s notion of ‘drift’, and in so doing comes back full circle to structuralist lines of analysis that, ironically, triggered the emergence of the equi-complexity axiom in the early twentieth century.

References

Arends, Jacques 2001 Simple grammars, complex languages. Linguistic Typology 5 (2/3): 180–182.
Bisang, Walter 2009 On the evolution of complexity: sometimes less is more in East and mainland Southeast Asia. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 34–49. Oxford: Oxford University Press.
Braunmüller, Kurt 1990 Komplexe Flexionssysteme – (k)ein Problem für die Natürlichkeitstheorie? Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung 43: 625–635.
Chipere, Ngoni 2009 Individual differences in processing complex grammatical structures. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 178–191. Oxford: Oxford University Press.
Dahl, Östen 2004 The growth and maintenance of linguistic complexity. Amsterdam, Philadelphia: Benjamins.
Dahl, Östen 2008 Grammatical resources and linguistic complexity: Sirionó as a language without NP coordination. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 153–164. Amsterdam, Philadelphia: Benjamins.
Dahl, Östen 2009 Testing the assumption of complexity invariance: the case of Elfdalian and Swedish. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 50–63. Oxford: Oxford University Press.
Dammel, Antje, and Sebastian Kürschner 2008 Complexity in nominal plural allomorphy. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 243–262. Amsterdam, Philadelphia: Benjamins.
Davydova, Julia to appear English as a second language vs. English as a foreign language: Bridging the gap. World Englishes.
Davydova, Julia 2011 The Present Perfect in non-native Englishes. Berlin, New York: de Gruyter Mouton.
Deutscher, Guy 2009 “Overall complexity”: a wild goose chase? In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 243–251. Oxford: Oxford University Press.
Deutscher, Guy 2010 Through the language glass: why the world looks different in other languages. New York: Metropolitan Books.
Ehret, Katharina Luisa 2008 Analyticity and syntheticity in East African English and British English: a register comparison. URN: urn:nbn:de:bsz:25-opus-58047, URL: http://www.freidok.uni-freiburg.de/volltexte/5804/. Freiburg: University of Freiburg.
Ellis, Nick C., and Diane Larsen-Freeman 2006 Language emergence: Implications for applied linguistics – Introduction to the special issue. Applied Linguistics 27 (4): 558–589.
Fenk-Oczlon, Gertraud, and August Fenk 2008 Complexity trade-offs between the subsystems of language. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 43–65. Amsterdam, Philadelphia: Benjamins.
Gil, David 2008 How complex are isolating languages? In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 109–131. Amsterdam, Philadelphia: Benjamins.
Givón, T. 2009 The genesis of syntactic complexity: diachrony, ontogeny, neuro-cognition, evolution. Amsterdam, Philadelphia: Benjamins.
Granger, Sylviane 2003 The International Corpus of Learner English: A New Resource for Foreign Language Learning and Teaching and Second Language Acquisition Research. TESOL Quarterly 37 (3): 538–546.
Greenbaum, Sidney 1996 Comparing English worldwide: the International Corpus of English. Oxford, New York: Clarendon Press/Oxford University Press.
Greenberg, Joseph H. 1960 A quantitative approach to the morphological typology of language. International Journal of American Linguistics 26 (3): 178–194.
Halliday, M. A. K., and Christian M. I. M. Matthiessen 1999 Construing experience through meaning: A language-based approach to cognition. London: Cassell.
Han, Zhaohong 2009 Interlanguage and fossilization: towards an analytic model. In: Vivian Cook and Li Wei (eds.), Contemporary Applied Linguistics, 137–162. London: Continuum.
Haspelmath, Martin, Matthew S. Dryer, David Gil, and Bernard Comrie (eds.) 2005 The World Atlas of Language Structures. Oxford: Oxford University Press.
Hawkins, John A. 2009 An efficiency theory of complexity and related phenomena. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 252–268. Oxford: Oxford University Press.
Hockett, Charles F. 1958 A Course in Modern Linguistics. New York: Macmillan.
Humboldt, Wilhelm von 1836 Über die Verschiedenheit des menschlichen Sprachbaues und ihren Einfluss auf die geistige Entwicklung des Menschengeschlechts. Berlin: Dümmler.
Juola, Patrick 2008 Assessing linguistic complexity. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 89–108. Amsterdam, Philadelphia: Benjamins.
Juvonen, Päivi 2008 Complexity and simplicity in minimal lexica: the lexicon of Chinook Jargon. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 321–340. Amsterdam, Philadelphia: Benjamins.
Karlsson, Fred 2009 Origin and maintenance of clausal embedding complexity. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 192–202. Oxford: Oxford University Press.
Karlsson, Fred, Matti Miestamo, and Kaius Sinnemäki 2008 Introduction: The problem of language complexity. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, vii–xiv. Amsterdam, Philadelphia: Benjamins.
Klein, Wolfgang, and Clive Perdue 1997 The basic variety (or: Couldn’t natural languages be much simpler?). Second Language Research 13: 301–347.
Kortmann, Bernd (ed.) 2004 Dialectology Meets Typology: Dialect Grammar from a Cross-Linguistic Perspective. Berlin, New York: Mouton de Gruyter.
Kortmann, Bernd, Edgar W. Schneider, Kate Burridge, Rajend Mesthrie, and Clive Upton (eds.) 2004 A Handbook of Varieties of English. Berlin, New York: Mouton de Gruyter.
Kortmann, Bernd, and Benedikt Szmrecsanyi 2004 Global synopsis: morphological and syntactic variation in English. In: Bernd Kortmann, Edgar W. Schneider, Kate Burridge, Rajend Mesthrie, and Clive Upton (eds.), A Handbook of Varieties of English, Vol. 2, 1142–1202. Berlin, New York: Mouton de Gruyter.
Kortmann, Bernd, and Benedikt Szmrecsanyi 2009 World Englishes between simplification and complexification. In: Lucia Siebers and Thomas Hoffmann (eds.), World Englishes – Problems, Properties and Prospects: selected papers from the 13th IAWE conference, 265–285. Amsterdam, Philadelphia: Benjamins.
Kortmann, Bernd, and Benedikt Szmrecsanyi 2011 Parameters of morphosyntactic variation in World Englishes: prospects and limitations of searching for universals. In: Peter Siemund (ed.), Linguistic Universals and Language Variation, 264–290. Berlin, Boston: De Gruyter Mouton.
Kuiken, Folkert, and Ineke Vedder 2008 Cognitive task complexity and written output in Italian and French as a foreign language. Journal of Second Language Writing 17 (1): 48–60.
Kusters, Wouter 2003 Linguistic Complexity: The Influence of Social Change on Verbal Inflection. Utrecht: LOT.
Kusters, Wouter 2008 Complexity in linguistic theory, language learning and language change. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 3–22. Amsterdam, Philadelphia: Benjamins.
Larsen-Freeman, Diane 1978 An ESL Index of Development. TESOL Quarterly 12 (4): 439–448.
Lindström, Eva 2008 Language complexity and interlinguistic difficulty. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 217–242. Amsterdam, Philadelphia: Benjamins.
Mayerthaler, Willi 1981 Morphologische Natürlichkeit. Wiesbaden: Akademische Verlagsgesellschaft Athenaion.
McWhorter, John 2001 The world’s simplest grammars are creole grammars. Linguistic Typology 5 (2/3): 125–166.
McWhorter, John 2008 Why does a language undress? Strange cases in Indonesia. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 167–190. Amsterdam, Philadelphia: Benjamins.
Mesthrie, Rajend 2006 World Englishes and the multilingual history of English. World Englishes 25 (3/4): 381–390.
Miestamo, Matti 2008 Grammatical complexity in a cross-linguistic perspective. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 23–42. Amsterdam, Philadelphia: Benjamins.
Miestamo, Matti 2009 Implicational hierarchies and grammatical complexity. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 80–97. Oxford: Oxford University Press.
Miestamo, Matti, Kaius Sinnemäki, and Fred Karlsson 2008 Language complexity: typology, contact, change. Amsterdam, Philadelphia: Benjamins.
Mühlhäusler, Beverly S., and Peter Mühlhäusler 2005 Simple English in the South Seas Evangelical Mission: Social context and linguistic attributes. Language Problems & Language Planning 29 (1): 1–30.
Mühlhäusler, Peter 1974 Pidginization and simplification of language. Pacific linguistics. Series B. Canberra: Dept. of Linguistics, Research School of Pacific Studies, Australian National University.
Mühlhäusler, Peter 1992 Twenty years after: a review of Peter Mühlhäusler’s pidginization and simplification of language. In: Martin Pütz (ed.), Thirty Years of Linguistic Evolution, 109–117. Amsterdam, Philadelphia: Benjamins.
Muysken, Pieter, and Norval Smith 1995 The study of pidgin and creole languages. In: Jacques Arends, Pieter Muysken, and Norval Smith (eds.), Pidgins and Creoles: An Introduction, 1–14. Amsterdam, Philadelphia: Benjamins.
Nichols, Johanna forthcoming The vertical archipelago: Adding the third dimension to linguistic geography. In: Peter Auer, Martin Hilpert, Anja Stukenbrock, and Benedikt Szmrecsanyi (eds.), Space in language and linguistics: geographical, interactional, and cognitive perspectives. Berlin, New York: Walter de Gruyter.
Nichols, Johanna 2009 Linguistic complexity: a comprehensive definition and survey. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 64–79. Oxford: Oxford University Press.
Ogden, C. K. 1934 The system of Basic English. New York: Harcourt.
Ortega, Lourdes 2003 Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics 24: 492–518.
Parkvall, Mikael 2008 The simplicity of creoles in a cross-linguistic perspective. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 265–285. Amsterdam, Philadelphia: Benjamins.
Rescher, Nicholas 1998 Complexity: a philosophical overview. New Brunswick: Transaction Publishers.
Sadeniemi, Markus, Kimmo Kettunen, Tiina Lindh-Knuutila, and Timo Honkela 2008 Complexity of European Union Languages: A Comparative Approach. Journal of Quantitative Linguistics 15 (2): 185–211.
Sampson, Geoffrey 2009 A linguistic axiom challenged. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 1–18. Oxford: Oxford University Press.
Sampson, Geoffrey, David Gil, and Peter Trudgill 2009 Language complexity as an evolving variable. Oxford linguistics. Oxford, New York: Oxford University Press.
Schneider, Edgar W. 2007 Postcolonial English. Varieties around the World. Cambridge: Cambridge University Press.
Selinker, Larry 1972 Interlanguage. International Review of Applied Linguistics 10: 209–230.
Seuren, Pieter, and Herman Wekker 1986 Semantic transparency as a factor in creole genesis. In: Pieter Muysken and Norval Smith (eds.), Substrata versus Universals in Creole Genesis, 57–70. Amsterdam, Philadelphia: Benjamins.
Shosted, Ryan K. 2006 Correlating complexity: A typological approach. Linguistic Typology 10: 1–40.
Sinnemäki, Kaius 2008 Complexity trade-offs in core argument marking. In: Matti Miestamo, Kaius Sinnemäki, and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 68–88. Amsterdam, Philadelphia: Benjamins.
Sinnemäki, Kaius 2009 Complexity in core argument marking and population size. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 126–140. Oxford: Oxford University Press.
Slobin, Dan I. 1987 Thinking for speaking. Proceedings of the Berkeley Linguistics Society 13: 435–444.
Szmrecsanyi, Benedikt 2009 Typological parameters of intralingual variability: grammatical analyticity versus syntheticity in varieties of English. Language Variation and Change 21 (3): 319–353.
Szmrecsanyi, Benedikt 2012 Analyticity and syntheticity in the history of English. In: Terttu Nevalainen and Elisabeth Closs Traugott (eds.), Rethinking the History of English. Oxford: Oxford University Press.
Szmrecsanyi, Benedikt, and Bernd Kortmann 2009a The morphosyntax of varieties of English worldwide: a quantitative perspective. Lingua 119 (11): 1643–1663.
Szmrecsanyi, Benedikt, and Bernd Kortmann 2009b Vernacular universals and angloversals in a typological perspective. In: Markku Filppula, Juhani Klemola, and Heli Paulasto (eds.), Vernacular Universals and Language Contacts: Evidence from Varieties of English and Beyond, 33–53. London, New York: Routledge.
Szmrecsanyi, Benedikt, and Bernd Kortmann 2009c Between simplification and complexification: non-standard varieties of English around the world. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 64–79. Oxford: Oxford University Press.
Szmrecsanyi, Benedikt, and Bernd Kortmann 2011 Typological profiling: learner Englishes versus indigenized L2 varieties of English. In: Joybrato Mukherjee and Marianne Hundt (eds.), Exploring Second-Language Varieties of English and Learner Englishes: Bridging a Paradigm Gap, 167–187. Amsterdam, Philadelphia: Benjamins.
Thomason, Sarah Grey 2001 Language Contact: An Introduction. Edinburgh: Edinburgh University Press.
Thomason, Sarah Grey, and Terrence Kaufman 1988 Language contact, creolization, and genetic linguistics. Berkeley: University of California Press.
Towell, Richard, and Roger Hawkins 1994 Approaches to second language acquisition. Clevedon, Philadelphia: Multilingual Matters.
Trudgill, Peter 1999 Language contact and the function of linguistic gender. Poznan Studies in Contemporary Linguistics 35: 133–152.
Trudgill, Peter 2001 Contact and simplification: historical baggage and directionality in linguistic change. Linguistic Typology 5 (2/3): 371–374.
Trudgill, Peter 2004 Linguistic and Social Typology: The Austronesian migrations and phoneme inventories. Linguistic Typology 8: 305–320.
Trudgill, Peter 2009 Sociolinguistic typology and complexification. In: Geoffrey Sampson, David Gil, and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 98–109. Oxford: Oxford University Press.


Jeff Siegel

Accounting for analyticity in creoles*

1. Introduction

This paper is about one aspect of expanded pidgins and creoles that makes them less complex than the languages that contributed to their development – that is, the use of analytic grammatical markers rather than synthetic ones. Accounting for the origins of this “analyticity” leads into a discussion of the meaning of simplification and the different forms it takes. The major part of the paper examines the hypothesis that creole analyticity is the result of simplification that occurs in adult second language acquisition. But first, as requested by the workshop organisers, I present some background about my views concerning the notion of complexity and my particular interest in it.

1.1. Defining complexity

In characterising the notion of complexity in general, I believe that there are two basic facets: componential and structural. An entity or system is complex if: a) it has numerous elements or components (componential complexity), and/or b) its internal structure is difficult to understand or analyse (structural complexity).

In defining linguistic complexity in particular, my view is that we need to specify which level or sub-domain of language we are talking about – i.e. to make a modular rather than holistic analysis (Siegel 2004: 143, 2008: 20). The main focus of studies on complexity in language appears to be on grammatical or morphological complexity. This is what I concentrate on here.

Componential morphological complexity concerns factors such as the number of marked grammatical distinctions and the amount of grammatical morphology. Structural complexity concerns factors such as semantic transparency, regularity and perceptual salience. Greater complexity, then, can arise from more positive values for the componential factors – e.g. more marked grammatical distinctions and grammatical morphology. It can also arise from more negative values for the structural factors – e.g. less semantic transparency and regularity. Greater simplicity can arise from the reverse – e.g. fewer grammatical distinctions and more semantic transparency.

The factors or criteria that other scholars use to measure complexity are generally either componential or structural. With regard to componential criteria, for example, McWhorter’s (2008) Overspecification increases complexity while Kusters’ (2008) Economy and Miestamo’s (2008) Fewer Distinctions reduce it. With regard to structural criteria, McWhorter’s Structural Elaboration and Irregularity (2007, 2008) again increase complexity and Kusters’ Transparency and Miestamo’s One-Meaning-One-Form reduce it. (See Figure 1.)

                          + Complexity             – Complexity (simplicity)
Componential complexity   Overspecification        Economy
                                                   Fewer distinctions
Structural complexity     Structural elaboration   Transparency
                          Irregularity             One-Meaning-One-Form

Figure 1: Criteria for complexity

* I am grateful to the Freiburg Institute for Advanced Studies for a three-month fellowship in 2009 that allowed me to work on the topic of linguistic complexity and to participate in the workshop that this book is based on. Thanks also go to Bernd Kortmann and Benedikt Szmrecsanyi for their help and support, and to Terry Odlin for his detailed comments on an earlier version of this chapter.

1.2. My interest in simplicity/complexity

I do research on pidgin and creole languages. These are contact languages, which by the usual definitions are less complex than the languages that contributed to their development, at least in some linguistic features. This is especially true of a pidgin or creole compared to its lexifier language – i.e. the language that provided the forms for the bulk of the lexical items for the contact language. Pidgins and creoles also appear to be less complex than their substrate languages – i.e. the first languages of the speakers involved, other than the lexifier.

According to my definition, a restricted pidgin is the most morphologically simple kind of language, since it has little if any grammatical morphology. Things that are expressed with grammatical morphology in other languages are either not expressed at all (for example, no complementisers to indicate subordination) or they are expressed lexically (for example, adverbs rather than any kind of tense/aspect markers to indicate temporal and aspectual relationships) (Siegel 2008: 26). In this way, restricted pidgins are very similar to the Basic Variety, the early stage of interlanguage in second language acquisition, described by Klein and Perdue (1997). This simplicity is shown in the following examples from early Hawai‘i Pidgin English (HPE) (1870–1899):

(1) early HPE (no complementiser) (Roberts 2005: 163)
Today go court house Ø buy license, go church Ø make marry …
‘Today I’m going to the court house to buy a license and going to a church to get married …’

(2) early HPE (adverb before used to indicate past) (Roberts 2005: 153)
before Kiku and me make marry Japanese style …
‘Kiku and I got married the Japanese way …’

One goal in my research has been to account for this morphological simplicity. Another question has concerned “complexification”, more commonly referred to in pidgin and creole studies as morphological expansion or elaboration – i.e. how a restricted pidgin gains grammatical morphology as it becomes an expanded pidgin or a creole. For example, compare the following examples from early Hawai‘i Creole (1905–1915) to the ones above from early HPE:

(3) early Hawai‘i Creole (for as complementiser) (Roberts 1998: 29)
You speak you want one good Japanese man for make cook.
‘You said you wanted a good Japanese man to cook.’

(4) early Hawai‘i Creole (bin as past tense marker) (Roberts 2005: 180)
This fella bin see.
‘This person saw (it/him/her).’

1.3. How to measure (morphological) complexity

I have just given some indication of one way that componential complexity is measured – i.e. by determining the amount of grammatical morphology. However, since there is no agreement about what specific amount of grammatical morphology makes a language complex, morphological complexity is generally comparative – i.e. we can say that one language is more complex than another in a particular grammatical area. However, a language that has little or no grammatical morphology can be said to be morphologically simple in an absolute sense – as I have claimed for restricted pidgins.

When there is grammatical morphology, measuring structural complexity is not so straightforward. An indicator that I have used in the past (Siegel 2004, 2008) is based on one of the scales or clines used in the study of grammaticalisation, although I use it for synchronic rather than diachronic analysis. Hopper and Traugott (1993: 7) describe a “cline of grammaticality” as follows:

content item > grammatical word > clitic > inflectional affix

This scale goes from lexicality on the left to grammaticality on the right. Items in the leftmost category are lexical or content morphemes, while the items in categories to the right are all grammatical or function morphemes. Hopper and Traugott (1993: 7) note: “Each item to the right is more clearly grammatical and less lexical than its partner to the left.” The assumption I adopted is that with regard to expressing particular semantic distinctions, lexicality is an absolute indicator of morphological simplicity, while increased grammaticality corresponds to greater complexity. Dahl (2004: 106) presented a similar scale, including fusional markers:

free > periphrastic > affixal > fusional

As in the grammaticalisation cline, he considered these as developmental stages, with those on the left the least developed or immature, and he also assumed that the lesser the development, the lesser the complexity. Following Dahl, here I have deleted clitics and added fused forms, and also added replacive morphemes to give the following “cline of morphological complexity”:

content item > grammatical word > affix > fused affix/replacive morpheme

Examples of each stage concerning verbal morphology follow in (5):

(5)
a. content item indicating past (Hawai‘i Pidgin English) (Roberts 2005: 156):
Garnie before eat too much Wahiawa pineapple, now get sore tooth.
‘Garnie ate so much Wahiawa pineapple that his teeth now hurt.’
b. grammatical word indicating future (English):
They will arrive tomorrow.
c. inflectional affix indicating past:
George worked last Saturday.
d. fused affix indicating both nonpast and 3rd person singular:
Lucy runs marathons.
e. replacive morpheme indicating plural:
We saw lots of geese.

It is clear that morphemes involving fusion or apophony (stem modification) decrease the semantic transparency of one form, one meaning, and therefore are more complex than plain affixes. On the other hand, the difference in complexity between analytic or periphrastic constructions (those using grammatical words) and synthetic constructions (those involving affixation or apophony) is not so clear. In fact, some linguists (e.g. McWhorter 2007: 6; Parkvall 2008: 270) say that synthesis itself is not complex, although it may lead to complexity, for example in fusion. However, I would argue that with regard to structural complexity, analytic morphemes are simpler than synthetic ones on the basis of their perceptual salience (see Bates and Goodman 1997: 542–3; Siegel 2008: 35–6). This would certainly be the case according to views that relate linguistic complexity of a language to the ease or difficulty of acquiring it as a second language (e.g. Kusters 2003, 2008). For example, van de Craats, Corver and van Hout (2000: 228) note that in second language acquisition, items with “perceptual saliency” are learned first – generally content words, and then free function morphemes and finally bound function morphemes. This is relevant to the main part of this paper, which begins with section 2.
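The comparative use of the cline of morphological complexity can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the class name `Cline`, the numeric ranks, the dictionary `MARKERS` and the helper `more_complex` are hypothetical conveniences, not part of the analysis itself, and the ranks encode order on the cline only, not measured amounts of complexity. The markers are taken from examples (2), (4) and (5).

```python
from enum import IntEnum


class Cline(IntEnum):
    """Stages of the cline, ordered from left (simplest) to right."""
    CONTENT_ITEM = 0
    GRAMMATICAL_WORD = 1
    AFFIX = 2
    FUSED_OR_REPLACIVE = 3


# Hypothetical encoding of markers discussed in the text.
MARKERS = {
    "before (early HPE, past)": Cline.CONTENT_ITEM,
    "bin (early Hawai'i Creole, past)": Cline.GRAMMATICAL_WORD,
    "will (English, future)": Cline.GRAMMATICAL_WORD,
    "-ed (English, past)": Cline.AFFIX,
    "-s (English, nonpast + 3sg)": Cline.FUSED_OR_REPLACIVE,
    "geese (English, plural)": Cline.FUSED_OR_REPLACIVE,
}


def more_complex(a: str, b: str) -> bool:
    """Return True if marker a sits further right on the cline than marker b."""
    return MARKERS[a] > MARKERS[b]


# Past marking: the English suffix compared with the creole preverbal marker.
print(more_complex("-ed (English, past)", "bin (early Hawai'i Creole, past)"))
```

Because the comparison is ordinal, the sketch can say that one marker is further along the cline than another in a given grammatical area, but not by how much, which matches the comparative character of the measure described above.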

2. Analyticity in expanded pidgins and creoles

Expanded pidgins and creoles are often characterised as being predominantly analytic (or isolating) languages as opposed to synthetic languages (Romaine 1988: 28–9; Parkvall 2008: 281–2). Here, I am concentrating on analyticity with regard to grammatical morphology. That is, when I say expanded pidgins and creoles are analytic, I mean that when they have grammatical markers, these are generally free rather than bound morphemes. For example, in English-lexified Hawai‘i Creole past tense is indicated by the preverbal marker wen rather than the suffix -ed, as in dey wen paint his skin (Morales 1988: 72) ‘they painted his skin’. I am not talking about analyticity in general. For example, the use of adverbials to indicate temporal relationships is considered less complex than the use of free grammatical tense markers, even though the two are both analytic. Thus, the main aim of this paper is to account for the predominance of analytic as opposed to synthetic grammatical morphology. Although what I say applies to both expanded pidgins and creoles, for the sake of brevity I will use the term creole to refer to both.

Table 1: Synthetic versus analytic constructions for five features in 18 creoles

                       past          prog          fut/irr       pl        poss
Angolar                ta V          θa ka V       kia V         anE N     N2 (ri) N1
Berbice Dutch          wa V          V-a(rε)       ma/si V       N-apu     N1 si N2
Cape Verdean           V-ba          (s)ta V       ad/al/a V     uns N*    N1 di N2
Guinea-Bissau Kriyol   V-ba          (s)ta V       ta/na V       uns N*    N1 di N2
Dominican              te V          ka V          ka alé V      se N      N2 N1
Haitian Creole         te V          ap V          a(va)/va V    N yo      N2 (a/pa) N1
Jamaican Creole        did/ben V     (d)a V        (a)go/wi V    N-dem     N1 N2
Korlai                 ti V          (te) V-n      lə/tεd V      ––        N1 su N2
Krio                   bin V         dè/dé pan V   go V          N dεn     N1 in/den N2
Nubi                   kan V         gi V          bi-V          ––        N2 ta N1
Ndyuka                 be V          e V           sa/o V        den N     N2 fa N1
Nagamese               V-ise/yse     V-yose        V-to          N-khan    N2 laga N1
Negerhollands          a V           le/lo V       lo/sa V       N sini    N2 fan N1
Palenquero             V-ba          ta V          tan V         ––        N2 ri N1
Papiamentu             tabata V      ta V          lo/ta V       N nan     N2 di N1; N1 su N2
Seychellois            ti V          (a)pe V       a(va) V       ban N     N2 N1
Tok Pisin              bin V         V i stap      bai V         ol N      N2 bilong N1
Zamboangueño           ya V          ta V          ay V          mana N    N2 di N1

shaded features are unmarked; boxed features are only synthetic; the others are analytic
N1 = possessor; N2 = possessum; * -s also exists.


To confirm that grammatical marking in creoles is predominantly analytic, I examined five linguistic features in the 18 creole grammars in Holm and Patrick (2007). These are features of the verb phrase (TMA marking) and the noun phrase (plural and nominal possessive marking) (see Table 1). In English, for example, four out of five of these features are normally marked synthetically (with an affix or clitic): past tense (e.g. walked), progressive aspect (e.g. walking), plural (dogs), and nominal possession (the dog’s bone); only future is marked analytically (e.g. will walk). In the creoles, out of 5 times 18, or 90 possibilities, these features are marked with an overt morpheme in 84 cases (or 93.3 percent). Out of these 84 cases, only synthetic marking is used 11 times (or 13 percent) and analytic marking is used 73 times (or 87 percent). [If we omit Nagamese, which many would argue is not a creole, then the figures are 9 percent synthetic and 91 percent analytic.] For four English-lexified creoles (Jamaican Creole, Krio, Ndyuka and Tok Pisin), only 1 out of 19 possibilities (5 percent) is synthetic compared to 4 out of 5 (80 percent) in the lexifier. The figures are summarised in Table 2.

Table 2: Percentages of synthetic vs analytic marking for 5 features

                                synthetic   analytic
all 18 creoles                  13          87
17 creoles (without Nagamese)   9           91
4 English-lexified creoles      5           95
English                         80          20
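The percentages in Table 2 follow directly from the counts given in the text, as the short Python sketch below retraces. It is an illustration only: the function `pct` and the variable names are hypothetical, and the Nagamese counts (five overtly marked features, four of them synthetic) are an assumption read off the Nagamese row of Table 1 rather than figures stated in the text.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts stated in the text.
possibilities = 5 * 18  # five features in 18 creoles
overt = 84              # possibilities marked with an overt morpheme
synthetic = 11          # overt cases with only synthetic marking
analytic = overt - synthetic

# Assumption (read off Table 1, not stated in the text): all five Nagamese
# features are overtly marked, four of them synthetically.
nagamese_overt, nagamese_synthetic = 5, 4

rows = {
    "overt marking, all 18 creoles": pct(overt, possibilities),
    "synthetic, all 18 creoles": pct(synthetic, overt),
    "analytic, all 18 creoles": pct(analytic, overt),
    "synthetic, without Nagamese": pct(
        synthetic - nagamese_synthetic, overt - nagamese_overt
    ),
    "synthetic, 4 English-lexified creoles": pct(1, 19),
    "synthetic, English": pct(4, 5),
}

for label, value in rows.items():
    print(f"{label}: {value}")
```

Rounded to whole numbers, these values correspond to the percentages reported in Table 2; the figure without Nagamese depends on the assumption noted above.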

Linguistic evidence of the analyticity of creoles also comes from the preliminary database for the Atlas of Pidgin and Creole Language Structures (Michaelis et al. forthcoming). Four of the five grammatical features mentioned above (all except future/irrealis) are specifically dealt with for 70 languages. Of the pidgins or creoles that grammatically mark past tense and progressive aspect, less than a quarter use an affix or clitic while more than two thirds use a separate word. Some use both, and other mechanisms such as stem change and reduplication. However, analytic marking of tense and aspect is by far the most frequent strategy. With regard to plurals, analytic and synthetic marking occur almost equally in the creoles, but the presence of analytic marking appears to be much more common than in the lexifier languages. Synthetic marking in nominal possession is relatively common in lexifiers other than English; however, where such possessive markers occur in the creoles, about nine tenths are analytic, including those in most of the English-lexified creoles.


So it appears that on the basis of these two sources of data, the generalisation about the grammatical analyticity of creoles has some basis. As mentioned already, creoles are generally considered to be “simpler” or less complex than their lexifiers, at least in some aspects of their grammar. Of course, there is not so much agreement about whether creoles are simpler holistically (e.g. see McWhorter 2001 vs DeGraff 2001). However, regarding grammatical morphology, and taking into account the cline of morphological complexity, the fact that creoles are more analytic than their lexifiers would be one argument for saying they are simpler, at least in this linguistic module. This point of view would be backed up by scholars who see predominantly analytic (or isolating) languages as grammatically simpler than predominantly synthetic ones (e.g. Gil 2008; McWhorter 2007, 2008).

3. Origins of analyticity in creoles

The main question I want to address in this paper is: What are the origins of the analytic grammatical structures in creoles? More specifically, what kinds of processes led to the development of these structures and in what contexts? Clearly, some analytic markers must have come directly from the lexifier. An example is the Hawai‘i Creole future marker go(n), from English to be going to or gonna:

(6) Hawai‘i Creole (Sakoda and Siegel 2003: 39)
Ai gon bai wan pikap.
‘I’m going to buy a pickup.’

More problematic, however, are the cases where the lexifier uses a synthetic construction but the creole has developed an analytic one instead – for example, the replacement of the English past tense suffix -ed with the Hawai‘i Creole preverbal past tense marker bin as in example (4). (This marker has become wen in modern Hawai‘i Creole.) The most common explanation is that the change from synthetic to analytic structures is the result of “simplification”, and that such simplification is a consequence of adult second language learning (e.g. DeGraff 1999, 2005; Chaudenson 2001, 2003; Mufwene 2001; McWhorter 2007, 2008). That is, at some stage in the development of creoles, the lexifier language was the target that was being learned as a second language (L2) to use for wider communication. The learners were adults who had various other first languages (L1s), usually referred to in this context as the substrate languages.


Simplification is an ambiguous term, however, since it can refer to the process of something becoming less complex, or to a particular state characterised by a lack of complexity. I prefer the term “simplicity” for referring to the state, and therefore use the term “morphological simplicity” to describe the absence or near absence of grammatical morphology in restricted pidgins. With regard to simplification as a process that brings about simplicity, there are again two possible interpretations: either that simplicity reflects a decrease of complexity or that it reflects a lack of development of complexity. The general view in pidgin and creole studies seems to be that simplicity is the result of complex forms of language having been made less complex. This is implied by the commonly used terms to describe morphology in pidgins – such as “decreased”, “reduced”, and “loss”. Referring to pidgins, some scholars have given the impression that a process of drastic reduction of complexity occurs; for example, Romaine (1988: 24) writes: “A pidgin represents a language which has been stripped of everything but the bare essentials necessary for communication.” McWhorter (2000: 106) refers to “radically reduced pidgins” and later (2003: 207) says that the emergence of pidgins “largely entails stripping languages of features unnecessary for basic communication”. With regard to creoles, McWhorter (2007: 17) states that they “represent a degree of simplification intermediate between the utter breakdown in pidgins and the typical baroqueness of, for instance, Navajo”. He goes on to say (p.18) that “languages can become robustly simplified from the ‘top down’ as well, without needing to touch base at a pidgin stage before rising again from the ashes”, and that “grammars can be simplified to an intermediate degree from the onset, with a preliminary dip into the pidginization stage playing no part in the process” [emphasis in original]. 
But another view of the origin of grammatical simplicity in restricted pidgins and creoles is that it represents a lack of development of complexity, rather than a reduction of complexity. This view holds that one cannot simplify what is not yet complex (Traugott 1977; Corder 1981). Therefore, the simplicity found in child language in first language acquisition and interlanguage in second language acquisition is not the result of any reductive process, but rather a reflection of an early stage of linguistic development. Thus there are two possible sources of morphological simplicity: a process of reduction and a lack of development. In order to distinguish the two, I will refer to reductive simplicity versus developmental simplicity.

These two different sources have different agents. In the case of reductive simplicity, the agents must be speakers of the complex language that is being reduced or simplified. In language contact situations, speakers do often simplify their speech when communicating with foreigners, and some languages have a “foreigner talk” register. It could be that speakers of the lexifier used reduced forms of their language (e.g. avoiding inflections) when talking to speakers of the substrate languages, and then the substrate speakers acquired these reduced forms as their L2. This is the “altered model” theory of the origins of simplicity in pidgins and creoles (Siegel 1987: 18–19). This theory could explain the change from is going to to go(n) in Hawai‘i Creole – i.e. speakers of English left out the auxiliary verb to be and the -ing of going or the end of gonna. It could also be that the lexifier speakers perceived analytic structures for grammatical marking as simpler and used them in foreigner talk as opposed to synthetic structures when such alternatives were available in the language. For example, French has two ways of marking future tense:

(7) French
a. Pierre manger-a.
b. Pierre va manger.
‘Pierre will/is going to eat.’

French-lexified creoles, such as Haitian Creole, have adopted the analytic alternative va (see Table 1). This could be the result of French speakers using the analytic alternative with non-French speakers – i.e. lexifier speaker agentivity.

In the case of developmental simplicity in creole origins, the agents would be learners of the lexifier language – i.e. the speakers of the substrate languages. They acquire only some aspects of the L2 – e.g. lexical items and analytic grammatical markers, but not bound morphemes, and this would account for the simplicity of creoles. This is the “imperfect learning theory” (Siegel 1987: 19–20), or “imperfect second language learning theory” (Muysken and Smith 1995: 10). According to Seuren and Wekker (1986) and Kusters (2003, 2008), the semantic transparency of analytic grammatical markers makes them easier to acquire than synthetic markers. The research by van de Craats, Corver and van Hout (2000: 228) mentioned above backs this up, demonstrating that learners acquire free functional morphemes before bound ones. Thus, substrate speakers may have acquired analytic morphemes (such as va indicating future in French) but not the synthetic alternatives (such as the -a suffix) – i.e. substrate speaker agentivity.

The examples we have looked at seem to show that grammatical analyticity in creoles may result from either reductive simplification (with lexifier speaker agentivity) or developmental simplification (with substrate speaker agentivity). However, many other examples indicate that reductive simplification would have been unlikely.


First of all, there are examples such as the bin past tense marker in Hawai‘i Creole (example 4). Although been is used in sentences such as I’ve been sick or They’ve been working, this is not an alternative way of marking past tense in English, and it is unlikely that speakers would think that using been before the bare verb would make their speech more intelligible to foreigners. On the other hand, it is quite feasible that learners could have misinterpreted the been in this analytic construction as marking past tense, and adopted it into their interlanguage. Similarly, Hawai‘i Creole adopted the English word stay as a preverbal progressive (ste or stei), as in the following examples:

(8) Hawai‘i Creole
a. He stay laugh.
‘He is laughing.’ (Ferreiro 1937: 68)
b. Wat yu ste it?
‘What are you eating?’ (Sakoda and Siegel 2003: 60)

It is hard to imagine that this could be derived from English speakers’ simplification of their own language. Furthermore, it is even harder to imagine that the use of combinations of analytic markers found in Hawai‘i Creole, such as bin kæn (been can) and gon kæn (gon can), could be the result of lexifier speaker agentivity – for example:

(9) Hawai‘i Creole (Sakoda and Siegel 2003: 58–9)
a. Hi bin kæn go?
‘Was it possible for him to go?’
b. De gon kæn kam o wat?
‘Will they be able to come or what?’

Rather, these seem to be the result of overgeneralisation of the contexts of use of stay and been typical of second language acquisition, and therefore the result of substrate speaker agentivity and developmental rather than reductive simplification. This conclusion contradicts the view that morphological simplicity of creoles is the result of reduction from the top down. Rather it indicates lack of development from the bottom up. Nevertheless, the position that analyticity in creoles could result from adult second language acquisition remains intact.

4. Examination of the simplification via L2 learning explanation

But let us now evaluate this position in two ways: by examining the nature of indigenised L2 varieties of English, and by taking a closer look at the analytic features of creoles.


4.1. Recent research by Kortmann and Szmrecsanyi

Szmrecsanyi and Kortmann (2009) describe their research measuring complexity in three different types of varieties of English. One frequency-based, corpus-derived metric of complexity that they use is “grammaticity and redundancy”. The grammaticity of a variety is determined by the text frequency of grammatical markers in naturalistic spoken discourse. This overall grammaticity is divided into two components: synthetic grammaticity (the incidence of bound grammatical morphemes) and analytic grammaticity (the incidence of free grammatical morphemes). The authors assume, as I do, that the greater the grammaticity, the greater the complexity. They imply as well that synthetic grammaticity is more complex than analytic grammaticity. However, they also equate grammaticity with redundancy, which I find problematic (but this is not relevant here).

The three types of varieties of English examined by Szmrecsanyi and Kortmann were: traditional (low-contact) L1 varieties (e.g. English Midlands English), high-contact L1 varieties (e.g. New Zealand English) and indigenised L2 varieties (e.g. Singapore English). Digitised corpora of spoken data were accessed for a subset of five of each type of variety, and 1000 orthographically transcribed words were randomly selected for analysis from each of the 15 varieties. The mean percentages for overall grammaticity and for its two components are shown in Table 3 (based on Table 5.5 in Szmrecsanyi and Kortmann 2009: 72).

Table 3: Mean percentages for the three kinds of grammaticity

variety type      overall grammaticity   synthetic grammaticity   analytic grammaticity
traditional L1    61                     13                       48
high-contact L1   57                     11                       46
indigenised L2    54                     9                        45

Because of the developmental simplification that occurs in L2 learning, we would expect that in indigenised L2 varieties there would be fewer grammatical morphemes than in L1 varieties, and thus a lower percentage of grammaticity. This is indeed what we find. Also because of developmental simplification in L2 learning, we would expect that, where grammatical marking does occur, it would be mainly analytic as opposed to synthetic. However, this is not what we find. On the basis of their statistical analysis, Szmrecsanyi and Kortmann (2009: 74) conclude that “there is no trade-off between analyticity and syntheticity” [italics in original]. In other words, they say that indigenised L2 varieties “opt for less overt marking, rather than trading off synthetic marking for analytic marking”, which is supposed to be easier to acquire (p. 74).

This is very different to what we find in creoles, where grammatical marking often occurs where it does in the lexifier, but the marking is analytic as opposed to synthetic. For example, if we compare English and the four English-lexified creoles in Holm and Patrick (2007) with regard to the five features examined in Table 1 above, we see that there is a trade-off between synthetic and analytic marking (see Table 4).

Table 4: Mean percentages for the three kinds of grammaticity for 5 features

variety type               overall grammaticity   synthetic grammaticity   analytic grammaticity
English                    100                    80                       20
English-lexified creoles   95                     5                        90

Thus, the differences between indigenised L2 varieties and creoles are evidence against the conclusion that the prevalent grammatical analyticity of creoles is the result of adult L2 learning, or second language acquisition (SLA).
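The contrast between Tables 3 and 4 can be retraced with a few lines of arithmetic. The Python sketch below is a simplified illustration, not the statistical analysis that Szmrecsanyi and Kortmann report: the names `TABLE_3`, `TABLE_4`, `shift` and `looks_like_trade_off` are hypothetical, and "trade-off" is operationalised here, as an assumption, as synthetic marking falling while analytic marking rises. It should also be kept in mind that Table 3 is based on text frequencies in corpora while Table 4 is based on five selected features.

```python
# (synthetic, analytic) mean percentages copied from Table 3 and Table 4.
TABLE_3 = {
    "traditional L1": (13, 48),
    "high-contact L1": (11, 46),
    "indigenised L2": (9, 45),
}
TABLE_4 = {
    "English": (80, 20),
    "English-lexified creoles": (5, 90),
}


def shift(table, source, target):
    """Change in (synthetic, analytic) percentage points from source to target."""
    s0, a0 = table[source]
    s1, a1 = table[target]
    return s1 - s0, a1 - a0


def looks_like_trade_off(delta):
    """Crude check (an assumption): synthetic falls while analytic rises."""
    d_synthetic, d_analytic = delta
    return d_synthetic < 0 and d_analytic > 0


l2_shift = shift(TABLE_3, "traditional L1", "indigenised L2")
creole_shift = shift(TABLE_4, "English", "English-lexified creoles")

print("indigenised L2 vs traditional L1:", l2_shift, looks_like_trade_off(l2_shift))
print("creoles vs English:", creole_shift, looks_like_trade_off(creole_shift))
```

On this crude check, the indigenised L2 varieties show both components dropping together, whereas the English-lexified creoles show synthetic marking replaced by analytic marking, which is the difference the argument above rests on.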

4.2. A closer look at the analytic grammatical features of creoles

Further evidence comes from a closer look at the analytic grammatical features of creoles, which shows that some of them cannot be accounted for by processes of SLA. First, some analytic grammatical morphemes in creoles result from language internal processes such as grammaticalisation. Here, as in conventional language change, a grammatical morpheme developed over several decades as the result of a lexical item gradually acquiring a grammatical function, and eventually losing its original lexical meaning. A clear example is the English adverbial expression by and by becoming the future/irrealis marker baimbai and eventually bai in Tok Pisin, and bambae and bae in Bislama:

(10) Bislama (Crowley 2003: 41)
Pikinini  blong  mi   bae  i    go  skul    nekis  yia.
child     poss   1sg  fut  3sg  go  school  next   year
‘My child will be going to school next year.’


In this case, it is clear that by and by went through all the stages typical of grammaticalisation – being used first as an adverb, and then gradually losing its lexical meaning, becoming reduced phonologically, being used redundantly and changing its syntactic properties and surface position.

(11) Older Bislama (fabricated on the basis of examples in Crowley 1990)
a. Man ia i wok long Santo.
‘This man works/worked/will work in Santo.’
b. Bambae man ia i wok long Santo.
c. Man ia i wok long Santo bambae.
‘This man will work in Santo (some time in the future).’

(12) Modern Bislama (fabricated)
a. Bae man ia i wok long Santo.
b. *Man ia i wok long Santo bae.
c. Man ia bae i wok long Santo.
‘This man will work in Santo.’

Thus, this is an example of complexification or elaboration via the process of grammaticalisation – the gradual development of a grammatical marker in the language where none had previously existed.

However, there are many more examples where a lexical item derived from the lexifier has also taken on a grammatical function in the creole, but very rapidly (within 10–20 years) without any evidence of the stages of grammaticalisation having occurred. We have already seen one instance of this – the use of stay as a progressive marker in Hawai‘i Creole (example 8). In addition, stay functions as a perfect marker:

(13) Hawai‘i Creole
a. The bell stay ring.
‘The bell has rung.’ (Ferreiro 1937: 62)
b. Ai ste kuk da stu awredi.
‘I already cooked the stew.’ (Sakoda and Siegel 2003)

Another example from Hawai‘i Creole is the negative existential marker and negative possessive marker nomo, derived from English no more.


(14) Hawai‘i Creole
a. Nomo kaukau in da haus. ‘There’s no food in the house.’ (Sakoda and Siegel 2003: 83)
b. How come I no more one real glove? [‘Why don’t I have a real glove?’] (Chock 1998: 29)

For both stay and nomo, there are no features in English with the same range of grammatical functions as those found in Hawai‘i Creole. However, these and other analytic grammatical constructions in the language have striking parallels with structures in the dominant substrate languages. These were the original languages of speakers who first shifted to the developing expanded pidgin or creole as their primary language. In the case of Hawai‘i Creole, the key substrate languages were Portuguese, Cantonese and Hawaiian (Siegel 2000). The Hawai‘i Creole marker stei behaves like the Portuguese estar, indicating both progressive and perfect:

(15) Portuguese
a. O comboio está chegando. ‘The train is arriving.’ (Prista 1966: 52)
b. As árvores estão cortadas. ‘The trees are (= have been) cut.’ (Willis 1965: 362)

The marker nomo behaves like the Cantonese móuh, indicating both negative existential and negative possessive:

(16) Cantonese (Matthews and Yip 1994: 138, 283)
a. Móuh yàhn gaau ngóh Jūngmán.
neg.have person teach 1sg Chinese
‘There’s no one to teach me Chinese.’
b. Ngóh móuh saai chín la wo.
1sg neg.have all money prt prt
‘I’m out of money.’ [‘I don’t have any money.’]

These examples illustrate the result of what Bruyn (1996: 42) calls “apparent grammaticalization” and Heine and Kuteva (2005) call “contact-induced grammaticalization” – i.e. a lexical item in one of two languages in contact taking on grammatical properties very similar to those of a corresponding item in the other language. I have argued (Siegel 2004, 2008) that while similar grammatical markers result from the process of grammaticalisation in other contexts, the markers in creoles are not the result of this process as it is normally conceived of. In other words, the outcomes are the same but the determinants are different. With regard to these features in creoles, the process is a particular kind of language transfer – what I call “functional transfer”: applying the properties of a grammatical morpheme of one language to a syntactically congruent lexical item of another language (Siegel 2003, 2008). Here the properties of the grammatical morphemes come from the dominant substrate languages, and the lexical items from the lexifier language. Bislama also provides several examples of what is most likely the result of functional transfer. First is the preverbal marker stap (from English stop) that indicates either habitual or progressive:

(17) Bislama (Crowley 1990: 217)
Hem i stap toktok. ‘She talks/is talking.’

This appears to be modeled on a marker with the same functional range typical of the dominant North/Central Vanuatu substrate languages, such as Nguna:

(18) Nguna (Schütz 1969a: 29)
e too mari a
3sg hab/prog do it
‘He does/is doing it.’

Second are the preverbal subject referencing pronouns (or agreement markers): i for 3rd person singular subjects, derived from he, and oli for plural (usually human) subjects, derived from ol ‘they’ (< all) plus i. (Ol is now obsolete as the 3rd person plural pronoun in Bislama, having been replaced by olgeta.) These are shown in examples (19a) and (19b).

(19) Bislama (Siegel 1999: 13)
a. Man ia i stil-im mane.
man det srp.3sg steal-tr money
‘This man stole the money.’

b. Ol woman oli kat-em taro.
pl woman srp.3pl cut-tr taro
‘The women cut taro.’

These have striking parallels with the subject referencing pronouns of the dominant substrate languages, again shown here in Nguna:

(20) Nguna (Schütz 1969b: 27, 33)
a. Manulapa e tiri pano.
M. srp.3sg fly go
‘Manulapa flew off.’
b. Manulapa wanogoe go koroi waogoe ero too ganikani.
M. dem and girl dem srp.3du prog eat
‘Manulapa and the girl were eating.’

Third is the prenominal plural marker ol (the now obsolete 3rd person plural pronoun), demonstrated in (19b) and the following example:

(21) Bislama (Siegel 2008: 88)
Ol gel blong Malo
pl girl from M.
‘the girls from Malo’

Similar use of the 3rd person plural pronoun as a prenominal plural marker is also found in North Central Vanuatu languages – for example, in Raga:

(22) Raga (Crowley 2002: 629)
ira vavine
3pl woman
‘the women’

Finally, there is the use of blong (from belong) as the possessive marker:

(23) Bislama (Siegel 2008: 89)
haos blong jif
house poss chief
‘the chief’s house’

Again, there is a parallel in Nguna:

(24) Nguna (Schütz 1969a: 42)
natokoana ki nawota
village poss chief
‘the chief’s village’

Individually, these examples may look like typical examples of grammatical development in a language. But the facts that they occurred so rapidly and that so many individual features match those of substrate languages lead to the conclusion that this was not ordinary grammaticalisation, but rather some other process, which I argue is functional transfer. The view for many years in pidgin and creole studies has been that such examples arose from transfer in adult second language acquisition. In the field of SLA, transfer refers to the form of cross-linguistic influence that involves “carrying over of mother tongue patterns into the target language” (Sharwood Smith 1996: 71), or more accurately, into the interlanguage. In other words, learners use linguistic features of their first language (L1) – phonemes, grammatical rules, or meanings or functions of particular words – when learning the second language (L2) (or third language, fourth language, etc.). The L1 is used either to provide a basis for constructing the grammar of the L2, or because the learner has not yet recognised differences between the L2 and the L1. Mufwene (1990) was one of the first creolists to suggest that the influence of the substrate languages (of the kind just described for Hawai‘i Creole and Bislama) could be the result of transfer in SLA at an earlier stage of development. Wekker (1996: 144) described the process of creolisation as “one of imperfect second-language acquisition, predominantly by adults, involving the usual language transfer from the learners’ L1”. Lefebvre (e.g. 1998) and Lumsden (e.g. 1999) refer to the process of “relexification” to account for substrate influence in creoles. As a result of this process, lexical items of the creole have semantic and syntactic properties from the substrate but phonological forms from the lexifier.
This appears to be very similar to what I have been calling functional transfer, and indeed, relexification is described as being closely related to L1 transfer (Lefebvre 1998: 34; Lumsden 1999: 226). More recently, Lefebvre, White and Jourdan (2006: 5) state, following Naro (1978: 337): “Relexification is a particular type of transfer.” Furthermore, relexification is seen as a process of SLA. Lumsden (1999: 226) says that relexification “plays a significant role in second language acquisition in general”. Lefebvre (1998: 10) states that “the process of relexification is used by speakers of the substratum languages as the main tool for acquiring a second language, the superstratum language”. The hypothesis also assumes that relexification is a process that occurs in “ordinary cases” of targeted second language acquisition (Lefebvre 1998: 34). Thus, the view is that the results of functional transfer or relexification are a consequence of imperfect adult L2 learning, and therefore the use of relatively less complex analytic grammatical morphemes is a reflection of an early stage of linguistic development – i.e. developmental simplification. I previously held this view myself, but changed my mind when I examined the SLA research in detail (Siegel 2006, 2008). First of all, it is very difficult to find clear examples of functional transfer as defined here in studies of SLA. There are studies that show the influence of properties of L1 grammatical markers on the use of L2 grammatical markers that learners may perceive as equivalent. For example, Collins (2002) found that French learners of English inappropriately use the English perfect construction in simple past contexts, presumably because of its superficial similarity with the French passé composé – e.g. French elle a dansé ‘she danced’ and English she has danced (see also Wenzell 1989; Rocca 2007). However, learners in these studies apply grammatical properties of TMA markers from their L1 to existing TMA markers in the L2, not to words that normally do not have a grammatical function – as we saw with words such as stay and stop, which became TMA markers in Hawai‘i Creole and Bislama (see examples 13 and 17 above). Similarly, while the functions of prepositions or case markers from the L1 may influence the choice of prepositions in the L2 (Ijaz 1986; Jarvis and Odlin 2000), learners never use an L2 lexical item to create a new preposition with L1 properties.
In other words, SLA studies may show L1 influence, but not the kind of functional transfer that appears to be responsible for analytic grammatical morphemes in creoles.1 Furthermore, some creolists have specifically examined the interlanguage of learners of languages that are the lexifiers for various pidgins and creoles – Dutch (Muysken 2001), French (Véronique 1994; Mather 2000, 2006) and Swedish (Kotsinas 1996).2 An analysis of their results shows that in each case, the learners’ interlanguage does not contain features that clearly result from functional transfer and that correspond to analytic grammatical marking in the relevant pidgin or creole.3 This is especially true for TMA marking. Thus, although it has been claimed that transfer of substrate features in L2 acquisition is responsible for the analytic TMA systems of expanded pidgins and creoles, there is no evidence that this occurs in the interlanguage of L2 learners. On this point, Mather (2000: 258) refers to “the mystery” of TMA markers in French creoles appearing to be similar to those of the substrate languages while “there is very little evidence of [such] TMA markers in any French or other European interlanguage variety”.

1 Odlin and Jarvis (2004) and Odlin (2008) describe as transfer the use of English some as a relative pronoun by Swedish learners of English, as in: Than come the woman and said it was the girl some were took the bread (Odlin and Jarvis 2004: 135). However, this appears to result from the formal similarity of some to the Swedish relative pronoun som rather than from functional transfer.

2 Swedish was not actually a lexifier, but is very similar to Norwegian, which was a lexifier for Russenorsk.

5. Alternative explanation

If processes of SLA cannot account for functional transfer and ultimately the prevalence of analytic grammatical structures in creoles, then what are their origins? First, let us look at contexts where functional transfer does occur. So far I have been talking about L2 acquisition as the study of how learners gradually attain the targeted L2 grammar. But many researchers in the field of SLA (e.g. Ellis 1994: 13; Kasper 1997: 310) also talk about L2 use, as opposed to acquisition. This is concerned with how speakers make use of what they have already learned combined with other resources in order to communicate. Transfer from the L1 is considered primarily a strategy of L2 use rather than acquisition (e.g. Kellerman 1995). L1 knowledge is therefore a resource in communication, used unconsciously to compensate for insufficient L2 knowledge. In other words, transfer is thought to occur as speakers make use of features of their L1 when the features of the L2 that they have learned are inadequate to express what they want to say or to interpret what is being said to them. Jarvis and Odlin (2000: 537) describe transfer in L2 use as a strategy for coping with the “challenges of using or understanding a second language”. Sharwood Smith (1986: 15) says that crosslinguistic influence (i.e. transfer) typically occurs in two contexts: (1) “overload” situations or “moments of stress” when the existing L2 system cannot cope with immediate communicative demands, and (2) “through a desire to express messages of greater complexity than the developing control mechanisms can cope with”. According to these views, transfer is considered to be a communication strategy, or a means for overcoming communication problems. What I have been calling functional transfer has been described specifically as one type of communication strategy in L2 use (Tarone, Cohen and Dumas 1983: 5, 11; Blum-Kulka and Levenston 1983: 132). 3

For a discussion of one possible exception, see Siegel (2008: 117).

Another important difference between acquisition and use concerns the relationship between the L1 and the target language. In acquisition, the goal is obviously to acquire the grammar of the target language (the L2). Therefore, structures used by learners that do not match input from the L2 are quickly abandoned. In use, however, the goal is to communicate in the L2. If structures based on the L1 but not found in the L2 lead to successful communication, they may be retained rather than abandoned. Therefore, in L2 use, at least in my conception of it here, there is not really a target language as such because the goal is not acquisition but communication, and speakers are not necessarily using the L2 as a basis for expanding their language. Thus, I see the development of creoles as follows: the first stage involves L2 learning as speakers of the substrate languages target the lexifier for use as a language of wider communication. In the first stages of SLA, most learners basically acquire lexical items and not grammatical markers (i.e. the Basic Variety). If the needs are only for basic communication, speakers may continue to use their own individual interlanguages. Alternatively, a restricted pidgin conventionalising common features of individuals’ interlanguages may emerge. However, if the needs for communication expand, individuals’ interlanguages or the restricted pidgin also need(s) to expand. This occurs in one of two ways: continuing L2 acquisition with the lexifier as a target or expansion without a target. As the result of L2 acquisition, substrate speakers may acquire analytic and eventually synthetic grammatical morphemes from the lexifier. But if access to the lexifier is for some reason unavailable, or if speakers do not wish to access it (e.g. for reasons of maintaining a separate identity), they expand their interlanguage or restricted pidgin by using communication strategies such as functional transfer. 
This results in the emergence of new analytic grammatical markers. (See Figure 2.)

Figure 2: Expansion in L2 learning and creole formation

A developing contact language can take both paths. For example, in the early stages of expansion Hawai‘i Creole continued to target English and adopted the -ing suffix for progressive marking, as well as going its own way and developing the preverbal marker ste or stei (derived from stay), as shown by these early examples:

(25) Early Hawai‘i Creole (Hawai‘i Education Review 1921: 11, 19)
a. This time he stay coming. ‘He is coming right now.’
b. I stay working my house. ‘I was working at home.’

Hawai‘i Creole also has other grammatical morphology from English – e.g. the -s plural marking. In contrast, expansion in Melanesian Pidgin hardly involved targeting the lexifier language at all, and there was little influence from English.

6. Conclusions

First, the use of analytic rather than synthetic grammatical marking is one aspect of the relative structural morphological simplicity of creoles compared to their lexifiers.

Second, if this simplicity is a consequence of adult second language acquisition, then the following must be true:
a. The primary agents of simplification are the adult learners, not the speakers of the lexifier (target L2).
b. The simplicity is the result of developmental rather than reductive simplification. The simplification is therefore not “top down”.
c. Most learners go through a stage of development in which their versions of the lexifier L2 are characterised by a lack of grammatical morphology (the Basic Variety), whether or not a “radically reduced” restricted pidgin emerges. Therefore the development of any morphology represents the expansion (or complexification) of interlanguage (or of a restricted pidgin if one does emerge).

Third, if the prevalence of analytic rather than synthetic morphology in creoles is a consequence of this expansion in SLA, then we would expect the following:
a. Indigenised L2 varieties would have a similar prevalence of analytic morphology.
b. The characteristics of analytic morphemes that appear in interlanguage varieties would be similar to those of creoles – e.g. L2 forms exhibiting functional properties of corresponding morphemes in the learner’s L1 (i.e. functional transfer).

The fact that neither of these expectations is met is an indication that normal adult SLA cannot account for analyticity of creoles. Fourth and finally, functional transfer does occur, however, as a strategy for expanding the grammar of an interlanguage variety without further targeting of the L2. In the case of the origin of a creole, an interlanguage variety has become a contact language. When the functions of this contact language are extended, its grammar needs to expand. This is accomplished by drawing not on the structures from the lexifier language but on the speakers’ own linguistic knowledge – which includes knowledge of their first language. Thus it appears that the analyticity of creoles is a result of untargeted but L1-influenced grammatical expansion rather than SLA.

References

Bates, Elizabeth and Judith C. Goodman 1997 On the inseparability of grammar and the lexicon: Evidence from acquisition, aphasia and real-time processing. Language and Cognitive Processes 12: 507–586. Blum-Kulka, Shoshana and Eddie A. Levenston 1983 Universals of lexical simplification. In: Claus Færch and Gabriele Kasper (eds.), Strategies in Interlanguage Communication, 119–139. London/New York: Longman. Bruyn, Adrienne 1996 On identifying instances of grammaticalization in creole languages. In: Philip Baker and Anand Syea (eds.), Changing Meanings, Changing Functions: Papers Relating to Grammaticalization in Contact Languages, 29–46. London: University of Westminster Press. Chaudenson, Robert 2001 Creolization of Language and Culture (revised in collaboration with Salikoko S. Mufwene). London: Routledge. Chaudenson, Robert 2003 La Créolisation: Théorie, Applications, Implications. Paris: L’Harmattan. Chock, Eric 1998 Da glove. In: Eric Chock, James R. Harstad, Darrell H. Y. Lum and Bill Teter (eds.), Growing Up Local: An Anthology of Poetry and Prose from Hawai‘i, 28–29. Honolulu: Bamboo Ridge. Collins, Laura 2002 The roles of L1 influence and lexical aspect in the acquisition of temporal morphology. Language Learning 52: 43–94. Corder, S. Pitt 1981 Formal simplicity and functional simplification in second language acquisition. In: Roger W. Andersen (ed.), New Dimensions in Second Language Research, 146–152. Rowley, MA: Newbury House. Crowley, Terry 1990 Beach-la-mar to Bislama: The Emergence of a National Language of Vanuatu. Oxford: Clarendon Press. Crowley, Terry 2002 Raga. In: John Lynch, Malcolm Ross and Terry Crowley (eds.), The Oceanic Languages, 626–637. Richmond, Surrey: Curzon. Crowley, Terry 2003 A New Bislama Dictionary (2nd edition). Suva/Vila: Institute of Pacific Studies/Pacific Languages Unit, University of the South Pacific. Dahl, Östen 2004 The Growth and Maintenance of Linguistic Complexity. Amsterdam/Philadelphia: Benjamins.
DeGraff, Michel 1999 Creolization, language change, and language acquisition: An
epilogue. In: Michel DeGraff (ed.), Language Creation and Language Change: Creolization, Diachrony, and Development, 473–543. Cambridge, MA/London: The MIT Press. DeGraff, Michel 2001 On the origins of creoles: A Cartesian critique of Neo-Darwinian linguistics. Linguistic Typology 5: 213–310. DeGraff, Michel 2005 Morphology and word order in “creolization” and beyond. In: Guglielmo Cinque and Richard S. Kayne (eds.), The Oxford Handbook on Comparative Syntax, 293–372. Oxford: Oxford University Press. Ellis, Rod 1994 The Study of Second Language Acquisition. Oxford: Oxford University Press. Ferreiro, John A. 1937 Everyday English for Hawaii’s Children. Wailuku: Maui Publishing Company. Gil, David 2008 How complex are isolating languages? In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 109–131. Amsterdam/Philadelphia: Benjamins. Heine, Bernd and Tania Kuteva 2005 Language Contact and Grammatical Change. Cambridge: Cambridge University Press. Holm, John and Peter L. Patrick (eds.) 2007 Comparative Creole Syntax. London: Battlebridge. Hopper, Paul J. and Elizabeth Closs Traugott 1993 Grammaticalization. Cambridge: Cambridge University Press. Ijaz, I. Helene 1986 Linguistic and cognitive determinants of lexical acquisition in a second language. Language Learning 36: 401–451. Jarvis, Scott and Terence Odlin 2000 Morphological type, spatial reference, and language transfer. Studies in Second Language Acquisition 22: 535–556. Kasper, Gabriele 1997 “A” stands for acquisition: A response to Firth and Wagner. The Modern Language Journal 81: 307–312. Kellerman, Eric 1995 Crosslinguistic influence: Transfer to nowhere? Annual Review of Applied Linguistics 15: 125–150. Klein, Wolfgang and Clive Perdue 1997 The Basic Variety (or: Couldn’t natural languages be much simpler?). Second Language Research 13: 301–347. 
Kotsinas, Ulla-Britt 1996 Aspect marking and grammaticalization in Russenorsk compared with Immigrant Swedish. In: Ernst Håkon Jahr and Ingvild Broch (eds.), Language Contact in the Arctic: Northern Pidgins and Contact Languages, 123–154. Berlin: Mouton de Gruyter. Kusters, Wouter 2003 Linguistic Complexity: The Influence of Social Change on Verbal Inflection. Utrecht: LOT. Kusters, Wouter 2008 Complexity in linguistic theory, language learning and language change. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 3–22. Amsterdam/Philadelphia: Benjamins. Lefebvre, Claire 1998 Creole Genesis and the Acquisition of Grammar: The Case of Haitian Creole: Cambridge Studies in Linguistics 88. Cambridge: Cambridge University Press. Lefebvre, Claire, Lydia White and Christine Jourdan 2006 Introduction. In: Claire Lefebvre, Lydia White and Christine Jourdan (eds.), L2 Acquisition and Creole Genesis, 1–14. Amsterdam/Philadelphia: Benjamins. Lumsden, John S. 1999 The role of relexification in creole genesis. Journal of Pidgin and Creole Languages 14: 225–258.

Mather, Patrick-André 2000 Creole genesis: Evidence from West African L2 French. In: Dicky G. Gilbers, John Nerbonne and Jos Schaeken (eds.), Languages in Contact, 247–261. Amsterdam/Atlanta: Rodopi. Mather, Patrick-André 2006 Second language acquisition and creolization: Same (i-) processes, different (e-) results. Journal of Pidgin and Creole Languages 21: 231–274. Matthews, Stephen and Virginia Yip 1994 Cantonese: A Comprehensive Grammar. London/New York: Routledge. McWhorter, John 2000 Defining “creole” as a synchronic term. In: Ingrid Neumann-Holzschuh and Edgar W. Schneider (eds.), Degrees of Restructuring in Creole Languages, 85–123. Amsterdam/Philadelphia: Benjamins. McWhorter, John 2001 The world’s simplest grammars are creole grammars. Linguistic Typology 5: 125–166. McWhorter, John 2003 Pidgins and creoles as models of language change: The state of the art. Annual Review of Applied Linguistics 23: 202–212. McWhorter, John 2007 Language Interrupted: Signs of Non-Native Acquisition in Standard Language Grammars. Oxford/New York: Oxford University Press. McWhorter, John 2008 Why does a language undress? Strange cases in Indonesia. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 167–215. Amsterdam/Philadelphia: Benjamins. Michaelis, Susanne, Philippe Maurer, Martin Haspelmath and Magnus Huber (eds.) forthcoming Atlas of Pidgin and Creole Language Structures. Munich: Max Planck Digital Library. Miestamo, Matti 2008 Grammatical complexity in a cross-linguistic perspective. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 23–41. Amsterdam/Philadelphia: Benjamins. Morales, Rodney 1988 The Speed of Darkness. Honolulu: Bamboo Ridge Press. Mufwene, Salikoko S. 1990 Transfer and the Substrate Hypothesis in creolistics. Studies in Second Language Acquisition 12: 1–23. Mufwene, Salikoko S. 2001 The Ecology of Language Evolution. 
Cambridge: Cambridge University Press. Muysken, Pieter 2001 The origin of creole languages: The perspective of second language learning. In: Norval Smith and Tonjes Veenstra (eds.), Creolization and Contact, 157–173. Amsterdam/Philadelphia: Benjamins. Muysken, Pieter and Norval Smith 1995 The study of pidgin and creole languages. In: Jacques Arends, Pieter Muysken and Norval Smith (eds.), Pidgins and Creoles: An Introduction, 1–14. Amsterdam/Philadelphia: Benjamins. Naro, Anthony J. 1978 A study on the origins of pidginization. Language 54: 314–347. Odlin, Terence 2008 Focus constructions and language transfer. In: Danuta Gabryś-Barker (ed.), Morphosyntactic Issues in Second Language Acquisition, 3–28. Clevedon: Multilingual Matters. Odlin, Terence and Scott Jarvis 2004 Same source, different outcomes: A study of Swedish influence on the acquisition of English in Finland. International Journal of Multilingualism 1: 123–140. Parkvall, Mikael 2008 The simplicity of creoles in a cross-linguistic perspective. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, 265–285. Amsterdam/Philadelphia: Benjamins.

Prista, Alexander da R. 1966 Essential Portuguese Grammar. New York: Dover. Roberts, Sarah J. 1998 The role of diffusion in the genesis of Hawaiian Creole. Language 74: 1–39. Roberts, Sarah J. 2005 The Emergence of Hawai‘i Creole English in the Early 20th Century: The Sociohistorical Context of Creole Genesis. PhD dissertation. Stanford University. Rocca, Sonia 2007 Child Second Language Acquisition. Amsterdam/Philadelphia: Benjamins. Romaine, Suzanne 1988 Pidgin and Creole Languages. London: Longman. Sakoda, Kent and Jeff Siegel 2003 Pidgin Grammar: An Introduction to the Creole Language of Hawai‘i. Honolulu: Bess Press. Schütz, Albert J. 1969a Nguna Grammar. Honolulu: University of Hawai‘i Press (Oceanic Linguistics Special Publication No.5). Schütz, Albert J. 1969b Nguna Texts. Honolulu: University of Hawai‘i Press (Oceanic Linguistics Special Publication No.4). Seuren, Pieter A. M. and Herman Wekker 1986 Semantic transparency as a factor in creole genesis. In: Pieter Muysken and Norval Smith (eds.), Substrata versus Universals in Creole Genesis, 57–70. Amsterdam: Benjamins. Sharwood Smith, Michael 1986 The competence/control model, crosslinguistic influence and the creation of new grammars. In: Michael Sharwood Smith and Eric Kellerman (eds.), Crosslinguistic Influence in Second Language Acquisition, 10–20. Oxford: Pergamon Press. Sharwood Smith, Michael 1996 Crosslinguistic influence with special reference to the acquisition of grammar. In: Peter Jordens and Josine Lalleman (eds.), Investigating Second Language Acquisition, 71–83. Berlin: Mouton de Gruyter. Siegel, Jeff 1987 Language Contact in a Plantation Environment: A Sociolinguistic History of Fiji. Cambridge: Cambridge University Press. Siegel, Jeff 1999 Transfer constraints and substrate influence in Melanesian Pidgin. Journal of Pidgin and Creole Languages 14: 1–44. Siegel, Jeff 2000 Substrate influence in Hawai‘i Creole English. Language in Society 29: 197–236. 
Siegel, Jeff 2003 Substrate influence in creoles and the role of transfer in second language acquisition. Studies in Second Language Acquisition 25: 185–209. Siegel, Jeff 2004 Morphological simplicity in pidgins and creoles. Journal of Pidgin and Creole Languages 19: 139–162. Siegel, Jeff 2006 Links between SLA and creole studies: Past and present. In: Claire Lefebvre, Lydia White and Christine Jourdan (eds.), L2 Acquisition and Creole Genesis, 15–46. Amsterdam/Philadelphia: Benjamins. Siegel, Jeff 2008 The Emergence of Pidgin and Creole Languages. Oxford/New York: Oxford University Press. Szmrecsanyi, Benedikt and Bernd Kortmann 2009 Between simplification and complexification: Non-standard varieties of English around the world. In: Geoffrey Sampson, David Gil and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 64–79. Oxford: Oxford University Press. Tarone, Elaine, Andrew D. Cohen and Guy Dumas 1983 A closer look at some interlanguage terminology: A framework for communication strategies. In: Claus Færch and Gabriele Kasper (eds.), Strategies in Interlanguage Communication, 4–14. London/New York: Longman.

Traugott, Elizabeth 1977 Pidginization, creolization, and language change. In: Albert Valdman (ed.), Pidgin and Creole Linguistics, 70–98. London/Bloomington: Indiana University Press. van de Craats, Ineke, Norbert Corver and Roeland van Hout 2000 Conservation of grammatical knowledge: On the acquisition of possessive noun phrases by Turkish and Moroccan learners of Dutch. Linguistics 38: 221–314. Véronique, Daniel 1994 Naturalistic adult acquisition of French as L2 and French-based creole genesis compared: Insights into creolization and language change. In: Dany Adone and Ingo Plag (eds.), Creolization and Language Change, 117–137. Tübingen: Max Niemeyer. Wekker, Herman 1996 Creolization and the acquisition of English as a second language. In: Herman Wekker (ed.), Creole Languages and Language Acquisition, 139–149. Berlin: Mouton de Gruyter. Wenzell, Vanessa E. 1989 Transfer of aspect in the English oral narratives of Russian speakers. In: Hans Dechert and Manfred Raupach (eds.), Transfer in Language Production, 71–97. Norwood, NJ: Ablex. Willis, R. Clive 1965 An Essential Course in Modern Portuguese. London: George G. Harrap.

Terence Odlin

Nothing will come of nothing*

1. The notion of zero in second language acquisition

This chapter examines the phenomenon of zero prepositions and zero articles in the writing of learners of English in Finland and considers some implications of the phenomenon for the problem of complexity in second language acquisition (SLA). All of the examples below come from native speakers of Finnish writing about the Charlie Chaplin film Modern Times:

(1) Chaplin go a restaurant. (F5 48)
(2) Chaplin, girl and policeman drop [fall from] a car. (F5 48)
(3) Girl stover [steals] bread and run go way. (F5 11)

In the first example, the most likely preposition that speakers of the target language might choose would be to, and in the second sentence it might be from since the movie scene showed Chaplin and others falling out of a police vehicle. In that sentence the Finnish learner does not mark policeman with an article, and the same is true of girl in both the second and third sentences. Since such ‘simplification’ (a term whose precise meaning will be discussed) often occurs in second language acquisition, the patterning of zero articles and zero prepositions and their causes constitute important topics for the study of simplicity and complexity in language. The working assumption about complexity adopted in the following analysis relies on the notion of ‘descriptive complexity’ as propounded by Rescher (1998), who defines it as the “Length of the account that must be given to provide an adequate description of the system at issue” (Rescher 1998: 9); this notion is similar to a specifically linguistic notion of Dahl (2004: 42), who considers it plausible to measure systemic complexity “in terms of the length of the shortest description of an object”. Working in a wider framework, Rescher notes that C.S. Peirce formulated a similar notion, but Peirce restricted his notion to physical systems whereas Rescher’s analysis embraces other systems as well, including ones involving language and cognition. Descriptive complexity is the first type given in a general taxonomy offered by Rescher, and his framework has had an impact on linguistics as seen, for instance, in the introduction to a recent volume on complexity in language (Karlsson, Miestamo, and Sinnemäki 2008) adopting Rescher’s taxonomy. As Rescher observes, descriptive complexity overlaps somewhat with other types such as ‘generative complexity’ (“Length of the set of instructions that must be given to provide a recipe for producing the system at issue” [Rescher 1998: 9]). Even so, using descriptive complexity as the frame of reference in this chapter has particular advantages. Such an approach foregrounds the need for an accurate and thorough account of the target language that learners encounter (in this chapter, English). In effect, an adequate description of the target language details the complexity that learners face, and it also provides a standard to measure the adequacy of theories of acquisition and, to some extent, more general theories of language. This view of complexity through the prism of the acquisition of the target by a second language learner somewhat resembles an approach of Kusters (2008), who defines complexity “as the amount of effort a generalized outsider has to make to become acquainted with the language in question” (Kusters 2008: 9). Kusters details this approach thus:

A generalized outsider learns the language in question at a later age and is not a native speaker. Therefore phenomena that are relatively difficult for a second language learner in comparison with a first language learner are the most complex. The notion “generalized” prevents positive and negative interferences of an accidental first language to cloud our account of complexity. (Kusters 2008: 9)

* I would like to thank ZhaoHong Han, Lourdes Ortega, and the editors for their helpful comments. Special thanks to Scott Jarvis and apologies to King Lear.

The main difference between the approach to complexity in this chapter and that of Kusters lies in the role of positive and negative influences of an ‘accidental’ language, whether it is a first language or perhaps some other language encountered before the complexity of the new target language is confronted (e.g., L2 influence on L3). Kusters refers to such influence as ‘interference’, but it is an oxymoron to say ‘positive interference’ when such influence actually facilitates acquisition. Such positive and negative influences will henceforth be referred to as positive or negative transfer or, more generally, as cross-linguistic influence or just transfer (cf. Odlin 1989, 2003). The main problem in Kusters’ approach stems from the fact that there is really no such creature as a ‘generalized outsider’. Kusters admits as much in a footnote (2008: 9), yet defends the notion as having a heuristic value. In any case, while the descriptive complexity of the target constitutes a key concern in second language acquisition research, the complexity of the L1 or other previously acquired languages is also relevant, especially because


Terence Odlin

such complexity is often transferable, as the evidence in this chapter will show. While descriptive complexity serves well for the following discussion, any notion of zero poses descriptive and theoretical problems. Zero has figured prominently in many analyses, and the idea of using it goes back to antiquity, as its use by the Sanskrit grammarian Panini shows (Shukla 2006). In comparatively recent times, the use of zero prompted some interesting discussions among American Structuralists about just when and how zero might be employed usefully in descriptions. For example, Bloch ([1947] 1967) posited a past tense suffix with “the phonemic shape zero” after the base of irregular verbs such as put (1967: 245). By this convention, put could be systematically included in a much larger set of both regular verbs such as helped and some irregulars with overt suffixes such as left. Nida ([1948]1967), however, urged caution in positing zero morphemes, and deemed that “it is hard to determine how extensive an item must be to demand the introduction of zero” (1967: 263). He did not reject zero as a descriptive tool, but he did propose several principles to identify cases where the tool could be used appropriately. Since the structuralist heyday, zero has continued and indeed proliferated in the kinds of concepts associated with it including, for instance, the notion of ‘zero case’ in some generative analyses (e.g., Haegeman and Guéron 1999) and zero anaphora in pragmatic as well as syntactic analyses (e.g., Givón 1983). Recent studies of variation and change in English also show that zero plays a prominent role in contemporary analyses (e.g., Tagliamonte and Smith 2005). Similar issues arise for the description of articles. In a contrastive analysis of English and Finnish, Chesterman (1991) supports analyses that propose not only a zero article but also a null article, as in boldfaced nouns in people who live in glass houses (zero article) and in prison (null article). 
The arguments for the distinction between null and zero go beyond the scope of the present chapter since the empirical issue about articles to be studied is whether or not the language learners in Finland actually supplied an article in an obligatory environment, one where native speakers would routinely supply one. While Chesterman distinguishes null and zero articles in English, he describes Finnish as a language with no articles. By his analysis, the clause Nainen juoksee (woman runs) has a noun phrase nainen which could be either definite or indefinite in meaning, and discourse context is the only clue as to which interpretation is best in any given situation. Neither the term null nor the term zero article forms part of the structural description of the Finnish clause. Chesterman’s assumption about no articles in Finnish will be considered more closely later in the chapter, but the conclusion is that his assumption proves workable for the following analysis.


Apart from the descriptive problem of the native and target languages related to zero, there is also the psycholinguistic issue of the ontogeny of zero in what can be considered products of second language acquisition processes: interlanguage speech or writing (Selinker 1992). More than one explanation seems possible. Zero may arise from any of the following: (1) a lack of vocabulary knowledge (whether receptive or productive); (2) unawareness of target language grammatical rules (e.g., when to use articles); (3) failure to monitor interlanguage production; (4) cross-linguistic influence. These explanations are not mutually exclusive, of course, and may singly or jointly account for much of what is often called ‘simplification’. Even so, simplification is perhaps more of a theoretical construct than an actual psychological process, and different analyses have come to different conclusions about the relation between transfer and simplification (cf. Meisel 1983, Jarvis and Odlin 2000). In any case, zero still presents theoretical challenges. For instance, if transfer is involved, can the absence of something in the native language be considered an ‘influence’ on interlanguage production or comprehension? Although zero has long been viewed as a useful notion in morphology and syntax, the psychological reality of zero in second language acquisition is problematic. As a common interlanguage phenomenon, it does require explanation, of course, but it would be mistaken to assume that the phenomenon is unitary. That is, the reasons for zero can be diverse. Even when cross-linguistic influence is at work the particular role that transfer plays in one kind of interlanguage zero is not necessarily identical to its role in another kind. Thus, the implication is that not all zeros are equal, and the evidence to be presented supports that conclusion. 
The plan for the following analysis includes a discussion of the main languages of Finland as well as an account of the specific groups to be studied: the comparative methodology used in relation to these groups allows for reasonably strong inferences about transfer. The specific procedures will then be described, followed by an examination of the somewhat different results involving zero prepositions and zero articles. Implications of these results will then be considered both for the Finnish context and for more general inferences about simplification and complexity.

2. Groups studied and methods used

Although Finland has a bilingual language policy for many purposes, the relation between Finnish and Swedish is definitely one of a majority language (about 93 % native speakers of Finnish) and minority language (about 6 % native speakers of Swedish). Both languages are taught as first and second


languages in schools, and English is normally a required foreign language, although the details of who studies what language and for how many years have varied in the past few decades (Ringbom 1987, 2007). Data collected in Finland by Jarvis (1998) will serve as the main evidence for the interpretation of zero prepositions and zero articles in this chapter. The methodology that Jarvis used has a number of strengths. First, his study compared speakers of two languages that are typologically quite different: Swedish, a Germanic language similar to English in many ways (albeit with some points of contrast), and Finnish, a non-Indo-European language showing many more points of contrast with English. Second, the speakers whom Jarvis compared had similar social backgrounds, all of the participants being pupils enrolled in schools in Finland. (For the sake of brevity, the native Finnish speakers will often be referred to as “Finns” and the native Swedish speakers as “Swedes”, but it should be kept in mind that all the participants were Finnish citizens.) A third strength of the Jarvis study is that the data collected come from performances of a task given to all the participants, with details of that task provided further on in this section. Finally, the fact that Jarvis collected data from Finns and Swedes with varying numbers of years of English studied allows for some inferences about interlanguage development. Table 1 summarizes the relevant characteristics of the different groups. In the first column of the table, the numbers after the letter indicate the year of school: thus F5, for instance, indicates Finns in the fifth grade, and S7 Swedes in the seventh grade. Although the F5 and S7 groups are different in their years of schooling and also in their ages (the former 11–12 and the latter 13–14), they have an important commonality in that each group was in its third year of English study at the time Jarvis collected his data. 
As noted above, some Finns began their study of English before studying Swedish (as seen with the F5 group) while others began English after considerable study of Swedish (as seen with the F9B group). The F9B group’s many years of prior study of Swedish prove important in the differences between this group and the F5 group, even though both had only two years of prior study of English, and more will be said in the chapter about the difference. The Swedes had all studied Finnish as their L2 before coming to English, but the prior study of Finnish is not evident in any of the results and so will not be considered further.

Table 1. Experimental participant groups (cf. Jarvis 1998, Odlin and Jarvis 2004)

Group  N   L1       Ages   Grade  English Instruction  Swedish Instruction
F5     35  Finnish  11–12  5      3rd year             None
F7     35  Finnish  13–14  7      5th year             1st year
F9A    35  Finnish  15–16  9      7th year             3rd year
F9B    35  Finnish  15–16  9      3rd year             7th year

Group  N   L1       Ages   Grade  English Instruction  Finnish Instruction
S7     35  Swedish  13–14  7      3rd year             5th year
S9     35  Swedish  15–16  9      5th year             7th year

As noted above, there have been changes in when particular languages are studied at school, and groups such as F9B (Table 1), who have an early start with Swedish relative to English, are not so common as they were ten or twenty years ago, as details given by Ringbom (1987, 2007) indicate. Ringbom, a member of the Swedish-speaking minority, sees the socio-economic and cultural differences between Finns and Swedes as minimal, and so the results of comparisons that he, Jarvis, and others have made of the two groups’ performance in English seem largely attributable to the linguistic differences and not to socio-cultural factors. The role of these linguistic differences can be assessed in detail because Jarvis (1998) collected data not only from the groups described in Table 1 but also from native speakers of Finnish, Swedish, and English, and some examples of constructions used by native speakers will be given further on.

The main language task was the same for the non-native speakers of English as well as for the three native-speaker groups just mentioned. All groups wrote accounts of certain episodes in the Chaplin film Modern Times. The groups described in Table 1 wrote their narratives in English, and other native speaker groups wrote narratives in Finnish, Swedish, and English (in the latter case, speakers of American English in Indiana participated). The scenes watched were presented in two segments of the film, first a 5-minute sequence and later a 3-minute sequence. After each sequence, participants were given intervals of about 30 minutes (for the first sequence) and 14 minutes (for the second) to write their narratives. Additional details about the elicitation procedure and materials are provided by Jarvis (1998: 85–93).

The analysis of data in this chapter comes from the database of narratives collected. Every narrative of the 210 individuals described in Table 1 was examined for prepositional phrases and articles. The analysis of prepositional phrases was restricted to spatial constructions with combinations of the following Verb + Argument classes:

Verbs: motion (go, walk, etc.), manipulation (take, put, etc.), posture (stand, sit, etc.)
Arguments: goal (e.g., to the restaurant), path (along the street), source (from the truck)

There were, of course, prepositional phrases that showed other patterns such as with verbs of perception (e.g., look at) and mental states (e.g., dream about). However, restricting the analysis to the three types of verbs and three types of arguments allows for straightforward assessments of the discourse context and the linguistic expressions used in context. The analysis of articles was based on the classifications of Jarvis (2002) of definite/indefinite article patterns, but where Jarvis restricted his analysis to references to the female protagonist in the film (played by Paulette Goddard), the analysis in this chapter is based on the classification of articles in every noun phrase in the Modern Times database, with that classification giving the frequency of zero, definite, and indefinite articles in the narrative of every one of the 210 participants in Table 1. The frequencies for the two article types were combined in the count undertaken, and in the tables reporting the results below, they are reported simply as overall frequencies of supplied articles (even if a non-standard usage in the discourse context). For prepositions, the count was similar, where the absence of a preposition was counted as zero and the presence of one (even if a non-standard usage in the discourse context) was counted as a supplied preposition.
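The counting convention just described can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the procedure actually used for the chapter: the Token record, its field names, and the tally function are hypothetical, and the sample codings are one possible reading of learner sentences quoted in this chapter.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One obligatory context in a narrative (hypothetical coding record)."""
    learner: str  # participant code, e.g. "F5 48"
    kind: str     # "prep" or "article"
    form: str     # what the learner wrote; "" means nothing was supplied


def tally(tokens):
    """Count zero vs. supplied forms per learner and per structure.

    Any overt form counts as supplied, even a non-standard choice,
    which mirrors the convention stated in the text.
    """
    counts = Counter()
    for tok in tokens:
        outcome = "zero" if tok.form == "" else "supplied"
        counts[(tok.learner, tok.kind, outcome)] += 1
    return counts


# Illustrative codings (assumed, not taken from the database itself).
sample = [
    Token("F5 48", "prep", ""),      # "Chaplin go a restaurant"
    Token("F5 48", "article", "a"),  # "a restaurant"
    Token("F7 03", "prep", "in"),    # "goes in the place"
    Token("F7 03", "prep", ""),      # "put he a police car"
]

counts = tally(sample)
for key in sorted(counts):
    print(key, counts[key])
```

Keeping one record per obligatory context, rather than one running total per group, makes it possible to derive both the group frequencies and the per-learner figures from the same codings.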

3. Results

The results for prepositions will be considered first, followed by those for articles. There will then be a discussion of certain facets of the contrastive analyses of English with Swedish and Finnish which will shed light on the results, and the final subsection will consider alternative explanations besides transfer for the results.1

1 The term contrastive analysis as used here simply means the comparison of languages (Odlin 1989). Some second language acquisition researchers (e.g., Selinker 1992) have used the term to refer mainly to the ideas about language learning and teaching associated with American Structuralists such as Charles Fries. Wardhaugh (1970) used the term Contrastive Analysis Hypothesis with similar ideas in mind, especially ones related to the prediction of difficulties in second language acquisition. It is beyond the scope of this article to explore such issues (but see Odlin [2006]), though it should be noted that it is not unusual for linguists in translation studies (e.g., Chesterman 1991) or cognitive anthropology (e.g., Lucy 1992) to use contrastive analysis without any assumptions such as those of Wardhaugh or Selinker.


3.1. Results for prepositions

Table 2 indicates the raw figures for zero prepositions and for supplied prepositions in phrases that require a preposition. Two patterns are salient: first, among the Finns, zero prepositions are not unusual in the F5 group, but they are somewhat rare in the others, especially in the F9A group. The percentage figures of Table 3 make this pattern even clearer. The second significant fact evident in Table 2 is that the L1 Swedish groups almost never had zero prepositions in their writing, and the percentage value for the S7 group in Table 3 (0.8 %) likewise makes this fact quite plain.

Table 2. Frequency of zero and supplied prepositions in phrases requiring a preposition (raw figures)

       F5   F7   F9A  F9B  S7   S9
Zero   43   19   2    13   2    0
Prep.  100  178  212  220  225  228

Table 3. Percentage of zero in all phrases requiring a preposition

       F5    F7   F9A  F9B  S7   S9
Zero   30.1  9.6  0.9  5.7  0.8  0

Table 4. Number of learners in each group using at least one zero preposition

F5  F7  F9A  F9B  S7  S9
18  8   2    12   1   0

Over half of the F5 group used at least one zero preposition, as Table 4 indicates. A much smaller number of F7 students (not even a quarter of the group) did likewise, and the figure for the F9A group (only two students) is comparable to the S7 group (just one student). The F9B group had a rather large number of individuals (12) whose writing showed a zero preposition. However, the ratio of such errors (13) to individuals is just slightly more than 1.0. In contrast, there were 43 zero prepositions in the work of 18 individuals in the F5 group (thus, a ratio well over 2.0) and there were some F5 individuals whose writing showed as many as four or even six zero prepositions. Such cases indicate a more systemic problem – not just an occasional lapse – among the F5 group in contrast to the F9B group. Moreover, the large numbers of zero prepositions among the F5 students cannot simply be attributed to a lack of knowledge of English vocabulary (e.g., of the prepositions in or from as vocabulary items), as will be discussed in section 3.4.
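The ratios mentioned in this paragraph can be checked with a short calculation. The sketch below takes the zero counts from Table 2, the numbers of learners from Table 4, and the group size of 35 from Table 1; the function names are illustrative only.

```python
GROUP_SIZE = 35  # learners per group, as listed in Table 1

# Raw numbers of zero prepositions per group (Table 2).
zero_preps = {"F5": 43, "F7": 19, "F9A": 2, "F9B": 13, "S7": 2, "S9": 0}
# Learners with at least one zero preposition (Table 4).
zero_users = {"F5": 18, "F7": 8, "F9A": 2, "F9B": 12, "S7": 1, "S9": 0}


def zeros_per_user(group):
    """Average number of zero prepositions among learners who produced any."""
    users = zero_users[group]
    if users == 0:
        return None  # no learner in the group produced a zero
    return zero_preps[group] / users


def share_of_group(group):
    """Proportion of the group producing at least one zero preposition."""
    return zero_users[group] / GROUP_SIZE


for group in zero_preps:
    ratio = zeros_per_user(group)
    ratio_text = "n/a" if ratio is None else f"{ratio:.2f}"
    print(f"{group}: {share_of_group(group):.0%} of learners, "
          f"{ratio_text} zeros per zero-using learner")
```

For F5 the ratio is 43/18, or about 2.39, and for F9B it is 13/12, or about 1.08, which is the contrast between a recurrent pattern and an occasional lapse drawn in the text.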


As noted above, the advantage of the Swedes in using prepositions in obligatory environments was striking. Combining the F5 and F7 groups (with about two and four years of English respectively) and contrasting them with the S7 and S9 groups will allow for further comparison. The group mean of zero prepositions for the two Finnish groups was 0.88 (s.d. = 1.5), while the mean for the Swedish groups was only 0.02 (s.d. = .7). This difference of means was statistically significant on a t-test using the independent samples method (df = 138, t = 4.47, p < .001). The rarity of zero prepositions among the Swedes thus argues for the facilitating effects of the L1 Swedish prepositional system in supplying prepositions in English, in other words, positive transfer of the L1 system. The F9B group was not part of the statistical test just described, but their six years of L2 Swedish study probably also led to much positive transfer; like the F5 group they had studied English for only two years, yet zero prepositions were far less of a problem for them.2
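For readers who wish to see how such a comparison works, the following sketch computes an independent-samples t-statistic from summary figures alone. It rests on assumptions not stated in the text: that each combined group contains 70 learners (two groups of 35, which is consistent with the 138 degrees of freedom), and that the equal-variance (pooled) form of the test was used. The function name pooled_t is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's t for two independent samples, equal variances assumed.

    Returns the t-statistic and the degrees of freedom.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# Summary figures for zero prepositions as reported in the text
# (means and standard deviations are rounded there).
t_stat, df = pooled_t(0.88, 1.5, 70, 0.02, 0.7, 70)
print(f"t = {t_stat:.2f}, df = {df}")
```

Because the means and standard deviations are given only in rounded form, the value recomputed this way (about 4.35) is close to the reported statistic but not identical to it.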

2 While the statistics given show a clear difference between the Finns and the Swedes in the frequency of zero prepositions and also zero articles, they do not fully capture the range of individual variation among the Finns. For example, with one learner (F5 01) there was only one preposition supplied in an obligatory context and five cases of zero prepositions, whereas with another learner (F5 14), there were seven prepositions supplied in obligatory contexts and only one zero, and yet another’s narrative (F5 48) showed three prepositions and four instances of zero prepositions. Space does not allow an account of all the details of the individual variation, but another analysis (Odlin 2010) provides several profiles. Among the Swedes the range of such individual variation was much smaller, and this fact warrants the inference that a contrastive analysis can have some predictive power to highlight group differences even while a deterministic contrastive approach will likely fail to account for the behavior of any particular individual.

3.2. Results for articles

In comparison with zero prepositions, zero articles are much more frequent, as Table 5 indicates. Even members of the S9 group, whose narratives had no zero prepositions, do show some zero articles. Nevertheless, the Swedes of the S7 and S9 groups enjoyed a clear advantage as the percentages of zero in Table 6 indicate: 1.2 and 2.0 respectively. The Finns with the highest percentage of zero articles are clearly the F5 group (54.2), but it is also noteworthy that the F7 and F9A groups show very high percentages (34.6 and 36.2). Although the F9A group displays greater proficiency in that zero prepositions are rare in its writing, it shows no improvement over the F7 group when it comes to zero articles. Moreover, it does considerably worse than the F9B group, which had only two years of English but six years of Swedish. With a zero-article percentage as high as 29.3, the F9B students do not perform nearly as well as the native speakers of Swedish, yet they do outperform all the other Finnish groups.

Table 5. Frequency of zero and supplied articles (indefinite or definite) in NPs requiring an article (from unpublished data of Scott Jarvis)

          F5   F7   F9A  F9B  S7    S9
Zero      398  281  403  323  14    28
Articles  335  531  709  781  1142  1241

Table 6. Percentage of zero in all NPs requiring an article

       F5    F7    F9A   F9B   S7   S9
Zero   54.2  34.6  36.2  29.3  1.2  2.0

The difficulty is also evident in the absolute numbers of learners using at least one zero article (Table 7). In the case of the F7 students, at least one zero was found in the narrative of every individual of the group, and the numbers of individuals in the other L1 Finnish groups are also high. In contrast, about one third of the Swedes in the S7 group used zero articles and about a quarter of the S9 group. The advantage that the Swedes showed in supplying an article in obligatory contexts is statistically significant. As was done with the zero prepositions of the F5 and F7 groups, a combined group mean was calculated, and the same procedure was used with the S7 and S9 groups. The group mean of zero articles for the two Finnish groups was 9.7, with a standard deviation of 6.9, while the mean for the Swedish groups was 0.6 (s.d. = 2.3). This difference of means proved very great, with a t-statistic of 10.3 (independent samples method, df = 138, p < .001).

Table 7. Number of learners in each group using at least one zero article

F5  F7  F9A  F9B  S7  S9
32  35  32   30   11  9
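The contrast between the two structures can be made concrete by recomputing the zero rates from the raw counts in Tables 2 and 5 and setting them side by side. The sketch below does only that; the dictionary and function names are illustrative, and the recomputed percentages may differ from the published tables by a tenth or two.

```python
# (zero, supplied) counts per group, copied from Table 2 (prepositions)
# and Table 5 (articles).
preps = {
    "F5": (43, 100), "F7": (19, 178), "F9A": (2, 212),
    "F9B": (13, 220), "S7": (2, 225), "S9": (0, 228),
}
articles = {
    "F5": (398, 335), "F7": (281, 531), "F9A": (403, 709),
    "F9B": (323, 781), "S7": (14, 1142), "S9": (28, 1241),
}


def pct_zero(zero, supplied):
    """Zero forms as a percentage of all obligatory contexts."""
    return 100 * zero / (zero + supplied)


for group in preps:
    prep_rate = pct_zero(*preps[group])
    article_rate = pct_zero(*articles[group])
    print(f"{group}: prepositions {prep_rate:5.1f} %, "
          f"articles {article_rate:5.1f} %, "
          f"gap {article_rate - prep_rate:5.1f} points")
```

In every group the article rate exceeds the preposition rate, though the gap is very small for the two L1 Swedish groups and large for all four L1 Finnish groups.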

3.3. Contrastive analysis of Finnish, Swedish, and English

It will now be useful to summarize the main points indicated in the results:

– A strong propensity of L1 Finnish speakers to have zero prepositions and zero articles.
– A strong advantage for Swedish speakers in supplying prepositions and articles.
– Greater overall difficulty in supplying obligatory articles as opposed to prepositions.

Although these results suggest the importance of structural differences between Finnish and English, on the one hand, and of structural similarities between Swedish and English on the other, it is necessary now to look at actual details of Swedish and Finnish structure so that the inferences of transfer can be justified. These details can also help to explain why zero prepositions prove somewhat less difficult than zero articles. The relevant structural details are sometimes evident in the narratives written in Swedish and Finnish, thus allowing for structural comparisons in the same discourse context. Like English, Swedish relies heavily on prepositions to express spatial reference, as in a clause written by a native speaker:

(4) De satte sig ner i gräset.
    They sat REFLX down in grass-the
    ‘They sat down in the grass.’ (SX 04)

Although some grammatical details differ from English (e.g., the reflexive pronoun), the preposition i corresponds to English in in this discourse context. While the prepositions are obligatory, the adverbs ner/down are not. Finnish has a handful of prepositions, but most spatial reference functions are realized through either postpositions or nominal case inflections, and in the latter option no adposition may be necessary as in

(5) Chaplin ja tyttö kävelevät tiellä ja istuvat nurmikolle.
    Chaplin and girl walk-PL road-ADES and sit-PL grass-ALLA
    ‘Chaplin and the girl walk on the road and sit on the grass.’ (FX 23)

The adessive case used with tiellä (-llä) formally contrasts with the allative case inflection -lle, which realizes the spatial relation that involves movement toward a goal (here, the grass) even while both inflections can translate into English as on.3 The prepositions that do exist in Finnish (e.g., yli ‘after’) can sometimes also function as postpositions, and as prepositions they are less important than either case inflections or postpositions in spatial reference, and seem to have little influence in forestalling the zero prepositions of Finns’ interlanguage English. Postpositions, on the other hand, play an important role and will be discussed in some detail further on.

3 The tense of istuvat is present and for some native speakers of Finnish the use of the present form with nurmikolle seems a little unnatural. For some, the collocation is more felicitous in the past tense: thus, istuivat nurmikolle. However, in the corpus for narrations of Modern Times in Finnish, both of these collocations were evident, with five cases of the present istuvat and three of the past istuivat. Perhaps the somewhat greater frequency of the present can be explained as a discourse convention similar to the convention of stage directions in English (e.g., President Lincoln stands and greets General Grant might be used by a playwright to position actors).

Like English, Swedish has an article system, albeit with a few significant differences from English articles in serial order and in number and grammatical gender distinctions. The gender distinction between the neuter (N) and common class (C) is manifest in the choice of indefinite articles: ett äpple (‘an apple’, N), ett pass (‘a passport’, N), en flicka (‘a girl’, C), en pojke (‘a boy’, C). As the latter two examples show, the common class does not mark sex distinctions; moreover, the neuter class also includes some human denotata: e.g., ett barn (‘a child’). The gender distinction is also manifest in singular definite articles, but these are bound morphemes following the noun, and in the following examples the article is in boldface letters: äpplet (‘the apple’), passet (‘the passport’), barnet (‘the child’), flickan (‘the girl’), pojken (‘the boy’). A plural definite form -na is also a bound morpheme but neutralizes the common/neuter distinction: e.g., flickorna (‘the girls’), äpplena (‘the apples’), where the plural number in flickorna is doubly marked by the article and the nominal inflection -or but where äpplena is marked as plural only by the article. These considerable systemic differences in the Swedish article system did not, however, interfere in any major way with supplying articles in English, as the low percentage of zero articles among Swedes shows (Table 6). It thus appears that Swedish-speaking schoolchildren can usually make the necessary identifications between the article morphemes in their native language and those in English.

Finnish does not have articles, although there is a pronoun se (‘it’) which can double as a determiner translatable as the or as that or as other forms (Chesterman 1991: 103). The instances where se (which is inflected for case and number) comes closest to resembling an article are where it invites a definiteness reading in a context where the noun it accompanies might otherwise be interpreted as an indefinite. Chesterman notes a tendency for subject noun phrases that follow verbs to have an indefinite reading in contrast to preverbal subject noun phrases. Thus a se in contexts such as the following overrides the indefinite reading: for example, in putosi se pullo (‘fell the/that bottle’), the subject noun pullo might be read as indefinite without the se. (Chesterman [1991: 103] gives this example in a more complicated sentence whose other details are not relevant to the question of definiteness marking.) The translatability of se as either that or the is one of the reasons not to consider it as a dedicated article. Another reason is that se is relatively rare in some discourse contexts where articles are routinely used in other languages. Thus Jarvis (2002: 405) reports infrequent use of se in referring to Paulette Goddard in the Modern Times narratives written in Finnish: forms of se were used but in most contexts the use with a noun phrase was much less frequent than the use of zero noun phrases. In comparable contexts, the Swedish narratives showed much more frequent uses of articles. As for the Finns, the figures that Jarvis provides do not distinguish uses of se as a pronoun from uses as a determiner, and if the statistics did, the figure for presumptive articles would likely be even lower.

3.4. Vocabulary and awareness of rules

As noted earlier, one conceivable explanation for zero prepositions and zero articles includes cases where learners of English may simply not have an active or even a passive knowledge of vocabulary. In the case of articles, such an explanation seems improbable given the high frequency of the and a in most English texts, even though learners may not always grasp all the semantic or pragmatic dimensions of these deceptively simple forms. In the case of prepositions, the appeal to vocabulary gaps seems more plausible. English has many more prepositions than articles, of course, and in the earlier stages of acquisition such basic vocabulary will be a challenge, especially for learners whose native language lacks shared forms or cognates: while Swedish learners can count on the virtual identity of English till and Swedish till (often translated as to), Finnish offers no such head starts on prepositions. However, the absence of any word in a Modern Times narrative does not necessarily imply such a lack of knowledge of the word. Also worthy of further consideration is the argument that the occurrences of zero arise from a lack of “knowledge” about grammatical patterns such as the occurrence of prepositions before noun phrases or of articles before singular countable nouns. However, while neither argument can be ruled out altogether, neither helps to explain many of the patterns actually observed in the data.

These arguments often fall short because several Modern Times narratives show a real knowledge yet inconsistent use of prepositions and articles. For instance, one Finn (F7 03) wrote, Then som other woman goes in the place (thus supplying the preposition in), yet the same individual wrote a few sentences later, Policeman caught he and put he a police car, where the zero before a police car could be acceptably filled by the preposition in. Such inconsistency does suggest that the learner has a problem in monitoring production but not in failing to know the vocabulary or grammatical rule required. Many similar inconsistencies appear in the data. Articles seem especially prone to inconsistent use, but examples of inconsistent uses of prepositions such as the one just given are likewise common. In a small number of Modern Times narratives (four Finns’ stories), no spatial preposition at all is used, and in these cases ignorance of vocabulary and/or ignorance of the prepositional phrase rule cannot be ruled out. However, in other cases these explanations seem less plausible in light of the evidence for transfer and monitoring problems.

Although a joint contribution of negative transfer and monitoring failure will frequently account for the occurrences of zero (cf. Han and Lew, this volume), it will not work in all cases. Sometimes only a monitoring failure seems involved, as in the case of the lone Swede who produced a zero preposition (Table 4), something not attributable to Swedish influence. Likewise, a small number of zero articles among the Swedes may have little to do with transfer, as where one person (S9 58) wrote, Then the policemen com to them and Charlie Chaplin giv the bread to policmen and the policmen taken Charlie Chaplin, where the reference to the policeman is inconsistent, with zero in one case and the definite article in two others in the same sentence. Swedish does require articles in all three contexts, and so here negative transfer is not plausible as an explanation, but a failure to monitor is. Even so, zero articles are rare in the Swedes’ narratives, and thus cases where monitoring failures alone should be invoked as explanations are likewise rare.

4. Fossilization and positive transfer

Although zero prepositions are common in the earlier stages of acquisition, the problems seem to be eventually overcome as the performance of the F9A learners (with six years of English) indicates. On the other hand, zero articles remain a serious problem. Indeed the group performance differs only slightly from that of the F7 group, as Tables 6 and 7 indicate; for most people in the F9A group, the two years of additional instruction have had little effect in reducing the problem of zero articles, even though the extra instruction may well have played a role in the avoidance of zero prepositions. With structures where no progress is evident despite opportunities to learn and practice, the apparently permanent stabilization is often termed fossilization in second language acquisition research (Long 2003, Han 2004). In contrast, the term stabilization is neutral as to whether performance can improve with even more instruction and other opportunities for acquisition for those learners who feel motivated to try to overcome the problem. If such opportunities prove to have little effect in improving performance, the argument for fossilization (cessation of learning) becomes stronger. Han and Long recognize, however, that cross-sectional studies such as the one in this chapter cannot count as strong evidence one way or the other, since the progress of individual learners is not tracked over time – only longitudinal research can persuasively address the fossilization issue (especially since the greatest difficulties can persist for decades, as Han’s research indicates). Still, the success of most of the F9A group in avoiding zero prepositions supports Han’s position that fossilization should not be considered a global but rather a local phenomenon, and thus articles but not prepositions would be the locus of any putative fossilization. Whether or not zero articles can illustrate real fossilization, they are clearly more difficult for Finns than they are for Swedes, and the similarities of the Swedish article system do account well for the success not only of the native speakers of Swedish but also of the F9B group, i.e. Finns who had studied English for only two years but had studied Swedish for six. This group outperformed all the other Finnish groups even though they had more difficulty with zero articles than the native-speaker Swedish groups. Such positive transfer is consistent with several other studies of article use: when learners of a new language that has articles already speak a language with articles, they enjoy an advantage, according to several empirical studies (Odlin 1989: 34; 2003: 461).
As noted already, the Swedish article system does differ from the English article system in some ways, yet the fundamental similarity of using articles for specific semantic and pragmatic purposes proves great enough (perhaps with the aid of some classroom instruction) for learners of English to exploit the parallels. It likewise seems probable that the success of Finns in overcoming zero prepositions owes a great deal to positive transfer as the discussion will now consider.

5.

The symbiotic relation of some positive and negative transfer

At first glance the difference between positive and negative transfer seems clear-cut: the former leads to convergences between the source and target language in the interlanguage (where source is neutral as to whether the language is a first in an L2 acquisition context or a second in an L3 context), while the latter leads to divergences between the source and target that are occasioned by something different in the source. For language teachers, this

distinction is crucial, of course, since foreign language pedagogy necessarily distinguishes real cross-linguistic correspondences such as between English stood and Swedish stod from “false friends” such as between English fabric and Swedish fabrik (which means ‘factory’). Despite the pitfalls of false friends or other illusory correspondences, however, cross-linguistic influence often facilitates lexical acquisition as Ringbom (1987: 58–59) notes. Ringbom also raises the question of the extent to which “we are justified in assuming that evidence of much negative transfer also implies an equivalent amount of positive transfer” (Ringbom 1987: 59). In other words, some “transfer errors” may have a kernel of facilitating influence despite the overt divergence that the error constitutes. Later in his analysis Ringbom offers examples of likely instances of false friends from L2 Swedish used by Finns in L3 English where there is at least a little semantic equivalence between Swedish and English. Ringbom does not give much attention to spatial reference in his books, but his insight about the symbiosis of some positive and negative transfer is indeed evident in certain erroneous prepositional choices that Swedes and Finns make. In the case of the Finns such transfer proves especially interesting since the choices show progress beyond zero prepositions, when learners start to make consistent use of some fairly straightforward correspondences between English and Finnish. For example, some Finns use the preposition to to translate the allative case ending in Chaplin ja tyttö […] istuvat nurmikolle (example 5 above). The allative ending -lle in the final word would best be translated here as on (thus, nurmikolle translates as ‘on the grass’) and many Finns (21 exactly) did correctly choose on (Jarvis and Odlin 2000: 544). However, six Finns (and no Swedes) chose to instead of on, as in […] they sat to the grass soon (F7 32). 
Yet only one F5 learner did, while three of the five cases of zero prepositions in the same discourse context (e.g., C.C and girl walking the street and sit gras, F5 51) came from the F5 group. Thus although the choice of to is an error, it does indicate progress beyond zero. The system of English prepositions also remains challenging in other ways, as example (5) shows a different case ending (the adessive) for the noun tiellä even while the translation of both the adessive ending and the allative ending in nurmikolle in English should be the preposition on; thus the occasional mapping of two different case forms to the same English preposition constitutes part of the learning problem. The translation choice of to in the lawn scene of Modern Times is not the only example of errors that indicate some positive (along with negative) transfer. Other instances include the following, with the target-like form in square brackets and the form actually supplied just to the left:

(6) […] the poliseman put in on [into] the poliscar that young lady […] (F7 15)
(7) The policeman takes Chaplin and leave to [for] the policestation [sic] with him. (F9B 09)
(8) When they had escaped in [from] the police car they sat under the tree. (F7 38)
(9) Chaplin comes behind [around] the corner (F9A 05)

The likelihood of transfer as the cause of such errors is high since each prepositional choice is compatible with something in Finnish, and because no Swede and no native speaker of English made these prepositional choices in the same context. (Although these are not unusual errors among the Finns, it should be noted that many learners did choose correct prepositions as was the case with sit on the grass.) While the anomalous uses can be specified as prepositional errors, more generally they represent problems of collocation. The verb put and the noun car can collocate in Standard English, but not where the car is the goal and the preposition indicating the goal is on. The Finnish influence here seems to involve a systemic distinction made between internal and external cases (Jarvis and Odlin 2000). Similarly in example 7, leave and to can also collocate, but for is the usual choice. (A search on Google with the string “leave for the airport” produced 263,000 hits while “leave to the airport” produced 20,000.) In the case of in the police car, the context makes the choice of in an error since the vehicle in the movie scene was the place from which Chaplin and Goddard had escaped. The choice of in might not seem to involve cross-linguistic influence since Finnish cases unambiguously distinguish sources from goals and other locations. However, the analysis of Jarvis and Odlin (2000: 547–549) offers reasons to consider this and some other anomalous uses of in as reflections of the internal/external case system of Finnish. 
The same analysis considers in as a particular type of transfer-induced lexical simplification, one not accounted for in the approach to simplification taken by Meisel (1983), which rules out transfer. Although an overgeneralized use of in indicates that there remains much about English prepositional semantics which some learners need to grasp, the supplying of a preposition does show a recognition of some correspondence between the semantic information of Finnish spatial case inflections and the semantic information of English prepositions, and thus the “wrong” preposition nevertheless manifests a greater degree of complexity over zero prepositions (cf. Doughty 2003: 274). The choice of behind instead of around (or round) in example 9 differs from the other errors in one respect; the others indicate the influence of nominal case inflections in Finnish whereas behind in this instance reflects the influence of

a Finnish postposition. Details of this type of transfer will be discussed in section 7.

6.

Articles and the difficulty of positive transfer for Finns

Although prepositional errors such as those just considered prove common in the Modern Times narratives of Finns, they often indicate interlingual identifications between the L1 and English. Such identifications are much more difficult to make in the case of articles. The problem is not the lack of a conceptual or a linguistic category of definiteness in Finnish. As the analysis of Chesterman (1991) indicates, Finnish has a number of formal devices to signal definiteness in noun phrases. He does see the salient dimensions of the definiteness category as somewhat different in Finnish as opposed to English, but the semantic and pragmatic overlap of the category in the two languages is considerable and is consistent in certain cross-linguistic correspondences as seen in his analysis of translations. Two of the main formal devices have already been considered (section 3.3): word order and determiners in the example putosi se pullo (‘fell the/that bottle’). As noted, the Verb-Subject pattern invites an indefinite reading, but the definite determiner se overrides it, and so the Verb-Subject pattern is, in Chesterman’s judgment, only a somewhat useful predictor of indefiniteness while the converse Subject-Verb pattern tends to mark definiteness but, again, not with extremely high reliability. Still another formal indicator of a definiteness contrast is the choice of accusative versus partitive case, with the former marking definite direct objects and the latter indefinite ones (1991: 92). Finnish thus does not lack formal devices to signal definite/indefinite contrasts. On the contrary, the multiplicity of devices may be an obstacle for Finns to make the appropriate interlingual identifications when attempting to use English articles. A further difficulty is statistical: that is, despite the various cues to the definiteness/indefiniteness contrast in Finnish, they are not so frequent. 
As noted, Jarvis (2002) found that the use of se with a noun phrase occurred much less than did the use of zero noun phrases. Still another difficulty for Finns is the inconsistency of the cue provided by word order, as already discussed. In general, although the definiteness/indefiniteness contrast is real in Finnish, its cue strength is not high, as seen from an emergentist perspective: “we can talk about cue validity as the product of availability and reliability […]. A fully valid cue would always be present when you need it and always give you the right answer” (MacWhinney 2008: 355). For Finns looking in their native language for such cues to help with English articles, the aid of the L1 is only occasional. The existence of some

correspondences makes it implausible to say that Finns approach English articles simply as “generalized outsiders” in the sense that Kusters (2008) posits (see section 1), but the lack of articles in Finnish suggests at least a rough fit between Kusters’ theoretical construct and the pervasive problem of zero articles in the interlanguage of Finns. In contrast with definite/indefinite reference, spatial reference in Finnish provides learners with potentially much greater help, since case inflections marking spatial relations have better cue validity. These inflections are extremely common in native speakers’ narratives of Modern Times. Moreover, the inflections are very consistent because of the agglutinative character of Finnish morphology. Thus, the same inflection (e.g., the allative -lle in nurmikolle) can be used on virtually any Finnish noun. (The agglutination patterns of Finnish are not quite as consistent as they might conceivably be, but the exceptions involve stems more than affixes.) Although Finnish inflections are fairly transparent, they are of course bound morphemes, and bound morphology itself seems to be an area of grammar less accessible to awareness than, for example, word order in the judgment of Heeschen (1978), who sees metalinguistic awareness as a capacity not so different among everyday speakers of any language, be they in New Guinea or in Europe (Heeschen 1978: 175). For the F5 learners, the bound morphology of Finnish may prove less accessible than for older learners who seem much more adept and consistent at matching the semantic information of case inflections with comparable information of English prepositions. The serial order of stem-inflection versus preposition-noun may present an additional problem, but not likely a major one, since Swedes seem to have little difficulty with the serial order difference between definite articles in Swedish and English. 
In fact, the fixed order of Finnish inflections may contribute to high cue validity since the location of the transferable information is very predictable and the serial order of the constituents of a prepositional phrase in English is likewise predictable. As for Finnish postpositions, however, certain facts about serial order combined with other grammatical properties lead to special challenges.
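The notion of cue validity quoted from MacWhinney (2008) in this section, namely validity as the product of availability and reliability, can be illustrated with a small sketch. The numbers below are invented purely to mirror the qualitative contrast drawn in the text (spatial case inflections as frequent and consistent, the determiner se as infrequent, word order as unreliable); they are not taken from the Modern Times data, and the class and field names are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    name: str
    availability: float  # share of relevant contexts in which the cue is present (0..1)
    reliability: float   # share of its occurrences in which it points to the right answer (0..1)

    @property
    def validity(self) -> float:
        # Cue validity as the product of availability and reliability.
        return self.availability * self.reliability


# Hypothetical values, chosen only for illustration.
cues = [
    Cue("spatial case inflection", availability=0.9, reliability=0.9),
    Cue("determiner se", availability=0.2, reliability=0.9),
    Cue("word order (Verb-Subject vs Subject-Verb)", availability=0.8, reliability=0.5),
]

# Rank the cues from most to least valid.
ranked = sorted(cues, key=lambda cue: cue.validity, reverse=True)
for cue in ranked:
    print(f"{cue.name}: {cue.validity:.2f}")
```

Under these assumed values a cue that is reliable but rarely available (se) ends up with lower validity than one that is both frequent and consistent, which is the asymmetry the section attributes to articles versus spatial reference.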

7.

Postpositions and agreement rules

As seen in example 9, the choice of behind sometimes figures in the preposition errors of Finns as in Chaplin comes behind [around] the corner although some Finns do produce the target-like form as in […] she ran just around the corner […] (F9B 29). A set of related words in Finnish seems to be the source for both the errors and target-like uses in English: taakse, takana, takaa, and takanaan. The semantic nuances of these forms vary, but what they have in common is that they are all postpositions. Moreover, all are used by native speakers of Finnish in their narrations of the Modern Times episodes. These postpositions normally co-occur with a noun or pronoun in the genitive case as in the following examples:

(10) Nainen juoksee kulmantaakse
Woman runs corner-GEN-around
‘The woman runs around the corner.’ (FY 22)

(11) Poliisi tulee heidän taakse
Policeman comes they-GEN behind
‘The policeman comes behind them.’ (FX 02)

(12) […] poliisi seisoo heidän takana
Police stands they-GEN behind
‘The policeman stands behind them.’ (FY 08)

In example 10, the writer does not follow the usual practice in Finnish of separating taakse from the preceding noun, but the noun (kulman) is marked for genitive case. The postpositions are polysemous in that they can indicate either of the spatial meanings that are distinctly coded in English as around and behind. Interestingly, this polysemy sometimes results not only in errors with behind but also around: e.g., Charlie didn’t notice that the policeman stood around [behind] him (F9A 20). Even more remarkable is the frequent use of English forms that reflect the Finnish genitive:

(13) Suddenly girl sow [saw] their behind is police and their run away. (F5 62)
(14) A policeman came behind of they. (F9A 14)
(15) In their back was a litle [sic] house where came man and a women out of doors (F9B 20)

While all of the examples reflect Finnish influence, each shows a unique problem-solving approach on the part of the learner. Student F5 62 keeps the L1 order of the genitive pronoun followed by a postposition whereas F9A 14 reverses the order. F9B 20 uses back instead of behind (something that other Finns also did), and it is apparently part of a prepositional phrase. Despite these individual approaches, the relation between the genitive and postposition is clear in all three examples and is, moreover, not found in the writing of the Swedes. The relation between postpositions and genitive marking in Finnish appears in the writing of several Finns, but no one attempted to mark the noun corner in any genitive form (e.g., corner’s or of the corner). All of the genitive marking signaled Chaplin and Goddard as the possessors through third-person plural forms (of they, their, of them). The sample is too small to warrant strong inferences, but there may be a constraint on genitive marking where only human or animate pronominal forms seem to be good candidates for transfer of the L1 genitive even though in Finnish the nouns used to refer to a corner were marked in genitive case (kulman and nurkan). A larger data collection might resolve this issue, but in any event some learners are susceptible not only to the spatial information of Finnish postpositions but also to a co-occurrence of case marking with the spatial form. The transfer of such agreement rules raises certain issues for the question of complexity in language contact settings.

8.

Complexity and transfer

Transfer can occur even if a superficial typological comparison might suggest otherwise. Even when languages differ radically, as with a highly inflected language such as Finnish and a rather analytical one such as English, language learners seem to discover points in common, and such discoveries can promote positive or negative transfer. In spatial constructions Finns may not discover the commonalities right away, as the frequent occurrence of zero prepositions in the F5 group indicates. Nevertheless, the prepositional errors noted in examples 6–9 demonstrate that learners can eventually make interlingual identifications between the semantic information of Finnish inflections (or postpositions) and of English prepositions. Although the errors in 6–9 show that the transfer is not always positive, they do indicate progress beyond zero prepositions. In regard to zero articles, however, Finns have much less success in overcoming the problem, and the difficulty they experience suggests that Kusters’ approach to complexity should not be categorically rejected (section 1). Finns are not “generalized outsiders” in coming to English, but their native language offers little help with the English article system. Even though definiteness is a linguistic category in Finnish, it is not marked on most noun phrases. The complexity of the English system must therefore be confronted with only minimal help from the native language. The success

of the Swedes shows that many second language learners do not approach English articles as generalized outsiders. On the other hand, Finns are not the only learners who experience great difficulty with articles. Pedagogical grammars of English (e.g., Celce-Murcia, Larsen-Freeman, and Williams 1999) have often devoted much space to the problem, as it arises in the English of speakers of many different languages. Just how the category “article” should be defined on a panlinguistic basis is problematic, since structures that may be considered articles show extensive typological variation; yet even with a fairly broad definition of the article category, there are many languages like Finnish that have no indefinite or definite articles (Dryer 2005a, 2005b). The relative ease that speakers of certain European languages such as Swedish have with articles may make it tempting to overlook the complexity of the system, but the challenges are quite evident in many other contact situations. The formal contrasts of the English article system are not extensive, yet they suffice to make choosing the target-like form difficult. The contrast is not only between definite the and indefinite a/an: a third option is no marking at all, whether or not one adopts Chesterman’s distinction between a zero and a null article (section 1). There also exist potential contrasts with determiners that are not articles such as some and any, but even the three-way choice between the, a, and zero suffices to lead to errors. The notion of ‘error’ here is not a concern of prescriptive grammarians who seek to warn native speakers away from patterns considered nonstandard. In fact native speakers of English in monolingual communities do not show much variation in their use of articles regardless of whether they speak a standard or nonstandard variety. 
In language contact, however, the problem is salient; in their synopsis of a worldwide survey of varieties of English, Kortmann and Szmrecsanyi (2004: 1158) report that the irregular use of articles is found in non-native varieties of English but not in native speaker varieties. Children acquiring English as their native language do require time before they no longer have problems such as zero articles, but research (e.g., Menyuk 1963) indicates that few problems remain by the age of six. A pair of roughly synonymous sentences can help illustrate the complexity inherent in the article system: I asked for information about when to travel and I asked for a tip about when to travel. Several erroneous variants are possible:

(16) *I asked for tip about when to travel.
(17) *I asked for an information about when to travel.
(18) *I asked for informations about when to travel

(19) *I asked for the information about when to travel (if the context is generic instead of specific).

In example (16) the count noun tip has to be marked with an article or some other determiner, while the status of information as a noncount noun disallows using an indefinite article before it and likewise disallows pluralization as in (18). Every English noun has its own countability profile, as in the case of tip (normally count) and information (normally noncount), but some profiles are more intricate as with wine, frequency, and love, which can all be either count or non-count but with important meaning differences according to the classification in a specific context. Example (19) would be correct in some discourse contexts but not all; it thus shows that the generic/specific distinction is yet another factor affecting the decision of what article if any will be used in a noun phrase.[4] The examples show that a set of formally simple oppositions (a, the, and zero) can prove hard to use since success depends partly on the specific lexical classification of nouns as count or noncount. Thus this complexity of English lexis somewhat resembles the complexity of languages that mark nouns for grammatical gender. Furthermore, acquisition depends on understanding (whether explicitly or implicitly) some semantic and pragmatic contrasts implicated in article use (e.g., old versus new information) as well as the semantic information relevant to any particular noun in any particular context (e.g., countable versus uncountable and generic versus specific). Rescher’s notion of descriptive complexity is thus useful: despite the formal simplicity of the article system, the details of how the system actually works prove very elusive for speakers of Finnish and other languages with no article system. Although characterizations of language acquisition often invoke ‘rules’ (i.e., generalizations) that have to be acquired, it is the interaction of lexis and grammar in specific contexts that makes accurate use of articles so difficult. As the earlier discussion of fossilization suggested, the challenge may persist for decades, with success coming – if it does come – one word at a time (cf. Han 2010). Although cross-linguistic similarities of article systems do not guarantee successful acquisition, they can provide considerable help in dealing with the complexity, as the success of virtually all the Swedes in this study indicates. Their relative ease in using articles suggests that a great deal of any understanding of the descriptive details must be intuitive. Indeed, native speakers of English having an intuitive understanding and an effortless ability to use definite and indefinite articles along with zero are often hard-pressed to explain the differences between I asked for information about when to travel and I asked for a tip about when to travel (even if they equate the formal simplicity of articles with a simple area of language). The point about intuitive transfer seems especially important when other factors such as language instruction are considered. Some research has indicated that instruction can foster the accurate use of articles (e.g., Master 1994). Even so, the very different success rates of Swedes and Finns evident in the current study suggest that instruction played only a secondary role: the curriculum for teaching English in Finland does not differ radically according to L1 group (Ringbom 2007). If intuition helps to grasp complex systems such as articles whenever positive transfer is involved, it also seems relevant to explaining the prevalence of articles in language contact situations such as the formation of Hawai‘i Creole, where recent analyses have favored substrate over universalist approaches (Siegel 2000, Odlin 2009).

On the other hand, transfer does not arise in every conceivable situation; some kinds of complexity may be less susceptible to cross-linguistic influence. For example, the lexical complexity of idioms and other collocations presents a mixture of results regarding transfer. Although some research indicates that idioms are transferable from the native language (e.g., Odlin 1991), other work shows a certain skepticism on the part of learners about the transferability of idioms (e.g., Kellerman 1977). Yet even in Kellerman’s research, there is considerable individual variation, where some individuals proved less skeptical than others about transferability. The issue of individual differences is just one of many problems that remain to be solved before a fully viable theory of what does and does not transfer can be developed (cf. Odlin 2003, Jarvis and Pavlenko 2008). For the moment, it is best simply to state that a good theory must allow for the transfer of some complex as well as some simple characteristics of a source language.

[4] The interaction of the categories of definiteness and countability is all the more complex because of some differences between English and the two languages of Finland. For example, food was invariably a noncount noun when used by the English native speaker control group (and thus there were no cases of a food in the native-speaker corpus), but some Swedes did use a food (e.g., When the bagery [baker] did not see she stole a food. S7 33). Although food can be countable in some discourse contexts in English (e.g., Spinach is a food rich in nutritional value), the context of the film did not encourage such uses among the native-speaker control group. In Swedish, there is a cognate form föda, but it was never used by the Swedish native-speaker control group. Several native speakers did, however, use another word, mat, which can often translate as food. In Swedish, mat can be countable, as seen in the sentence of one control-group writer: Mannen for in till en restaurang och åt en god mat (‘The man went in to a restaurant and ate a good food/meal’, SX 17, emphasis added).
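The way examples (16)–(19) depend jointly on countability, number, and discourse context can be sketched as a toy checker. This is only an illustration under strong simplifying assumptions: the two-word lexicon comes from the examples in the text, but the rule set, the function name, and the boolean parameter `definite_context` (standing for contexts in which the is licensed, i.e. specific rather than generic in the sense of example 19) are hypothetical and far cruder than actual English usage.

```python
from typing import Optional

# Toy lexicon: countability labels for the two nouns discussed in the text.
COUNTABILITY = {"tip": "count", "information": "noncount"}


def article_error(noun: str, article: Optional[str],
                  plural: bool, definite_context: bool) -> Optional[str]:
    """Return a short reason if the noun phrase is ill-formed, else None.

    `article` is "a" (covering a/an), "the", or None for zero.
    The rules are a deliberately simplified assumption of this sketch.
    """
    kind = COUNTABILITY[noun]
    if kind == "noncount" and plural:
        return "noncount noun cannot be pluralized"
    if article == "a" and (kind == "noncount" or plural):
        return "indefinite article requires a singular count noun"
    if article is None and kind == "count" and not plural:
        return "singular count noun needs an article or other determiner"
    if article == "the" and not definite_context:
        return "definite article used in a context that does not license it"
    if article != "the" and definite_context:
        return "context calls for the definite article"
    return None


# The two acceptable sentences and the four starred variants from the text.
cases = {
    "information (zero)": ("information", None, False, False),
    "a tip": ("tip", "a", False, False),
    "(16) tip (zero)": ("tip", None, False, False),
    "(17) an information": ("information", "a", False, False),
    "(18) informations": ("information", None, True, False),
    "(19) the information, generic context": ("information", "the", False, False),
}
for label, args in cases.items():
    print(label, "->", article_error(*args) or "acceptable")
```

Even this crude version needs a lexical lookup plus two contextual parameters to choose among three forms, which is the point made in the text about formal simplicity masking descriptive complexity.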

9.

Summary

In a comparison of written English accounts of a Charlie Chaplin film, native speakers of Finnish showed a much more frequent use of zero prepositions and zero articles than did native speakers of Swedish. The frequency of zero proved greatest among Finns with only two years of study of English and no Swedish instruction, while extra years of study of English or of Swedish as a second language contributed to a lower occurrence of zero. Zero articles proved more difficult for all groups, even for native speakers of Swedish, although again, the Swedes had significantly fewer problems. While typologically quite different from English, the system of spatial reference in Finnish can help to explain the success that Finns with more than two years of study of English showed in overcoming the problem of zero prepositions; Finnish students seem able to learn to make identifications between the semantic information in Finnish case inflections and postpositions and the information in English prepositions, and these interlingual identifications eventually lead to consistent supplying of obligatory prepositions in English, though sometimes with incorrect choices of specific prepositions. Very few interlingual identifications seem available to help Finns to use their native language to determine where articles are required in English, despite the existence of a definite/indefinite semantic contrast in Finnish. Accordingly, the error rate for zero articles is much higher than for zero prepositions, and the problem may illustrate an area of fossilization in the interlanguage English of Finns. These empirical findings are consistent with a theory of linguistic complexity that defines it as the degree of descriptive detail needed to account for all phenomena that might pose a challenge to learning. 
Although the notion of a ‘generalized outsider’ does not explain well the eventual success of Finns in overcoming the problem of zero prepositions, the notion seems more viable to account for the problem of zero articles in the interlanguage of Finns and speakers of other languages that do not have articles. On the other hand, the success of Swedes with articles suggests that they can intuitively identify correspondences in the complex semantics and pragmatics of articles in English and in their native language. The contrast between Finns and Swedes in their success with articles seems to have wider implications for theories of complexity. In many language contact situations, speakers of a language with no articles will have difficulties comparable to those of the Finns in acquiring a system such as the English one, whereas speakers of languages having articles may benefit from this similarity not only in cases of instructed second language acquisition but also in informal contact settings as in certain cases of creolization.

References Bloch, Bernard 1967 [1947] English verb inflection. In: Martin Joos (ed.), Readings in Linguistics, Volume 1, 243–254. Chicago: University of Chicago Press. Celce-Murcia, Marianne, Diane Larsen-Freeman and Howard Williams 1999 The Grammar Book: An ESL/EFL Teacher’s Course, 2nd ed. Boston: Heinle and Heinle. Chesterman, Andrew 1991 On Definiteness: A Study with Special Reference to English and Finnish. Cambridge: Cambridge University Press. Dahl, Östen 2004 The Growth and Maintenance of Linguistic Complexity. Amsterdam: John Benjamins. Doughty, Catherine 2003 Instructed SLA: Constraints, compensations, and enhancement. In: Catherine Doughty and Michael Long (eds.), Handbook on Second Language Acquisition, 256–310. Oxford: Blackwell. Dryer, Matthew 2005a Definite articles. In: Martin Haspelmath, Matthew Dryer, David Gil and Bernard Comrie (eds.), The World Atlas of Linguistic Structures, 154–157. Oxford: Oxford University Press. Dryer, Matthew 2005b Indefinite articles. In: Martin Haspelmath, Matthew Dryer, David Gil and Bernard Comrie (eds.), The World Atlas of Linguistic Structures, 158–161. Oxford: Oxford University Press. Givón, Talmy 1983 Topic continuity in spoken discourse. In: Talmy Givón (ed.), Topic Continuity in Discourse. Amsterdam: John Benjamins. Haegeman, Liliane and Jacqueline Guéron 1999 English Grammar: A Generative Perspective Oxford: Blackwell. Han, Zhaohong 2004 Fossilization in Second Language Acquisition. Clevedon, U.K.: Multilingual Matters. Han, Zhaohong 2010 Grammatical inadequacy as a function of linguistic relativity: A longitudinal case study. In: Zhaohong Han and Teresa Cadierno (eds), Linguistic Relativity in Second Language Acquisition: Evidence of First Language Thinking for Speaking, 154–182. Clevedon, U.K.: Multilingual Matters. Heeschen, Volker 1978 The metalinguistic vocabulary of a speech community in the highlands of Irian Jaya (West New Guinea). 
In: Anne Sinclair, Robert Jarvella and Willem Levelt (eds.), The Child’s Conception of Language, 155–187. Berlin: Springer-Verlag. Jarvis, Scott 1998 Conceptual Transfer in the Interlanguage Lexicon. Bloomington, IN: Indiana University Linguistics Club Publications. Jarvis, Scott 2002 Topic continuity in L2 English article use. Studies in Second Language Acquisition, 24: 387–418. Jarvis, Scott and Terence Odlin 2000 Morphological type, spatial reference, and language transfer. Studies in Second Language Acquisition 22: 535–555. Jarvis, Scott and Aneta Pavlenko 2008 Cross-linguistic Influence in Language and Cognition. New York: Routledge. Karlsson, Fred, Matti Miestamo and Kaius Sinnemäki 2008 Introduction: The problem of language complexity. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change, vii-xiv. Amsterdam: John Benjamins. Kellerman, Eric 1977 Towards a characterisation of the strategy of transfer in second language learning. Interlanguage Studies Bulletin 2: 58–145.


Kortmann, Bernd, and Benedikt Szmrecsanyi 2004 Global synopsis: morphological and syntactic variation in English. In: Bernd Kortmann, Kate Burridge, Rajend Mesthrie, Edgar Schneider, and Clive Upton (eds.), Handbook of Varieties of English, Volume 2: Morphology and Syntax, 1142–1202. Berlin: Mouton de Gruyter. Kusters, Wouter 2008 Complexity in linguistic theory, language learning and language change. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language complexity: Typology, Contact, Change, 3–22. Amsterdam: John Benjamins. Long, Michael 2003 Stabilization and fossilization in interlanguage development. In: Catherine Doughty and Michael Long (eds), Handbook on Second Language Acquisition, 487–535. Oxford: Blackwell. Lucy, John 1992 Grammatical Categories and Cognition. Cambridge: Cambridge University Press. MacWhinney, Brian 2008 A unified model. In: Peter Robinson and Nick Ellis (eds.), Handbook of Cognitive Linguistics and Second Language Acquisition, 341–371. New York: Routledge. Master, Peter 1994 The effect of systematic instruction on learning the English article system. In: Terence Odlin (ed.), Perspectives on Pedagogical Grammar, 229–252. Cambridge: Cambridge University Press. Meisel, Jürgen 1983 Strategies of second language acquisition: More than one kind of simplification. In: Roger Andersen (ed.), Pidginization and Creolization as Language Acquisition, 120–157. Rowley, Mass.: Newbury House. Menyuk, Paula 1963 Syntactic structures in the language of children. Child Development 34: 407–422. Nida, Eugene 1967 [1948] The identification of morphemes. In: Martin Joos (ed.), Readings in Linguistics Volume 1, 245–271. Chicago: University of Chicago Press. Odlin, Terence 1989 Language Transfer. Cambridge: Cambridge University Press. Odlin, Terence 1991 Irish English idioms and language transfer. English World-Wide 12: 175–193. Odlin, Terence 2003 Cross-linguistic influence. 
In: Catherine Doughty and Michael Long (eds.), Handbook of second Language Acquisition, 436–486. Oxford: Blackwell. Odlin, Terence 2006 Could a contrastive analysis ever be complete? In: Janusz Arabski (ed.), Cross-linguistic Influence in the Second Language Lexicon, 22–35. Clevedon, U.K.: Multilingual Matters. Odlin, Terence 2009 Methods and Inferences in the study of substrate influence. In: Markku Filppula, Juhani Klemola and Heli Paulasto (eds.), Vernacular Universals and Language Contacts: Evidence from Varieties of English and Beyond, 265–279. New York: Routledge. Odlin, Terence 2010 Accelerator or inhibitor? On the role of substrate influence in interlanguage development. Presented at symposium on English as a Contact Language, University of Zurich, 9 June 2010. Odlin, Terence and Scott Jarvis 2004 Same source, different outcomes: A study of Swedish influence on the acquisition of English in Finland. The International Journal of Multilingualism 1: 123–140. Rescher, Nicholas 1998 Complexity: A Philosophical Overview. New Brunswick, NJ: Transaction Publishers. Ringbom, Håkan 1987 The Role of the First Language in Foreign Language Learning. Clevedon, UK: Multilingual Matters.


Ringbom, Håkan 2007 Cross-linguistic Similarity in Foreign Language Learning. Clevedon, U.K.: Multilingual Matters. Selinker, Larry 1992 Rediscovering Interlanguage. London: Longman. Shukla, Shaligram 2006 Panini. In: Keith Brown (ed.), Encyclopedia of Language and Linguistics, Volume 9, 153–155. Amsterdam: Elsevier. Siegel, Jeff 2000 Substrate influence in Hawai‘i Creole English. Language in Society 29: 197–236. Tagliamonte, Sali and Jennifer Smith 2005 No momentary fancy! The zero ‘complementizer’ in English dialects. English Language and Linguistics 9: 289–309. Wardhaugh, Ronald 1970 The Contrastive Analysis Hypothesis. TESOL Quarterly 4: 123–30.


Rajend Mesthrie

Deletions, antideletions and complexity theory, with special reference to Black South African and Singaporean Englishes

1. Introduction

In their work on the morphosyntax of World Englishes, Szmrecsanyi and Kortmann (2009) have called for an approach that “would combine careful, intralingual-philological, variationist analysis with the broad, abstract bird’s eye perspective that is the hallmark of language typology”. In keeping with this desideratum, I will outline a broad morphosyntactic parameter to characterise ‘New Englishes’ (especially L2 and EFL varieties) as well as a detailed data-based characterisation of one variety in particular, viz. L2 Black South African English (henceforth BlSAfE). The syntax of the latter variety, which is fairly typical of a large class of varieties in sub-Saharan Africa, gives us a useful way of broadly conceptualising the presence (vs. absence) of grammatical morphemes that is at the heart of complexity theory. I contrast this sub-Saharan English syntax with its polar opposite, Singapore English (henceforth SgE), and suggest ways of characterising other New Englishes in relation to these poles.

2. The antideletion tendency in Black South African English

This section is based on a detailed analysis of BlSAfE syntax given in Mesthrie (2006). The current presentation is necessarily much shorter and will draw on new examples where possible, to avoid repetition. I nevertheless rely on the 8 speakers interviewed in the 1990s, who had the following characteristics:
– they were all fluent in English;
– they functioned in it as their main academic language;
– they did not learn it as a first language;
– they did not use it in the home as children;
– they were highly multilingual, often speaking about five other languages;
– they had a Southern Bantu language as mother tongue;
– they were interviewed while studying at an English-medium university;
– they used English as a means of interaction with Black peers, but not exclusively;
– they fell within the (upper) mesolect of the familiar basilectal – mesolectal – acrolectal continuum.

The data was gathered using principles of the sociolinguistic interview (Labov 1966), with a particular emphasis on students’ schooling in the turbulent 1970s and ’80s, the transition to university and their interests as students. On the whole, I found that BlSAfE was not radically different from Standard English (henceforth StE) syntax in the ways that Irish English, South African Indian English and SgE tend to be (see Hickey 2007, Mesthrie 1992, Wee 2004 respectively). However, a close examination of the data with a particular focus on the mesolectal speakers revealed an interesting regularity of the syntax when compared to Standard English. I labelled this regularity ‘anti-deletion’, which was itself made up of three subtypes: ‘undeletion’, ‘non-deletion’ and ‘addition’. The first – undeletion – restores an element that is assumed to be deleted in most analyses of StE. In generative terms an alternative conceptualisation would be to say that the StE structure has an empty node at D-level. Mesolectal BlSAfE noticeably has this element at surface level (and hence D-level too).

The syntactic framework I find most useful is an eclectic “descriptive syntax” one, which draws on insights from Chomskyan and other branches of linguistics (e.g. typology), without committing itself to a particular version of a particular theory. This approach is best exemplified by the Cambridge Grammar of the English Language (Huddleston and Pullum 2002), which presents findings based on over fifty years of syntactic theory relating to English in ways that are accessible outside the generative paradigm (e.g. to sociolinguists). Like Huddleston and Pullum, I find the notion of underlying structure to be a useful one for descriptive purposes. The first two subtypes conspire to disfavour empty nodes on surface trees. The third subtype adds elements that make the structure of BlSAfE go beyond that of StE.

2.1. Undeletions

In Mesthrie (2006) I discuss the following constructions in relation to the undeletions theme: (1) complementiser that with direct quotations, (2) adjunct that, (3) infinitival to, (4) resumptive pronouns, (5) appositional pronouns in left dislocation, (6) dummy pronoun it in adjunct comparative clauses, (7) being in small clauses, and (8) verb and auxiliary repetition in sentences having wh-movement. For reasons of space and to avoid repetition I will merely exemplify these undeletions, with the numbering system of the sentences coinciding with (1) to (8) above. For further details and complications see Mesthrie (2006).


(1) They announced that, “We are going to do all courses in Afrikaans”.
(2) As you might have heard maybe that women were quite restricted. (that for Ø)
(3) And even the teachers at school made us to hate the course.
(4) It was something that I hope it will not happen again.
(5) Yes, most of them, I call them confused scholars.
(6) As it is the case elsewhere in Africa, much can still be done for children.
(7) But this higher primary and lower primary still have schools being strictly for Tswana-speaking pupils.
(8) Come what may come.

Example (9) involves a matter of markedness. The standard equivalent without ellipsis is of course fully grammatical in StE (Though he didn’t speak English in the home previously, he does speak it at work now), where it carries a marked, emphatic nuance. An unmarked colloquial standard equivalent would be He didn’t speak English at home, but he now does at work, showing ellipsis of the V1 (given in bold in the example). The BlSAfE full form does not appear to be as marked as in StE:

(9) Though he didn’t speak English at home, but he does speak it at work.

Examples (1) to (9) above are all taken from spoken data. However, the undeletion tendency is sometimes carried over into the written language, too. Consider sentences (10) and (11), both taken from more recent graduate student essays.

(10) An exposition of what I call it Central Sotho will be provided.
(11) The thesis has six chapters. Chapter 1 is the introduction and it includes sections on the objectives of the research.

These two examples add two new types to my original analysis: undeletion in a double object construction via a resumptive pronoun (in 10) and the (admittedly not very common) surfacing of a pronoun in a simple co-ordinated sentence (in 11), where the stylistic preference in StE is to delete the second pronoun. On the basis of these types I proposed the following principle:

Principle 1: If a grammatical feature can be deleted in StE, it can be undeleted in BlSAfE mesolect.


2.2. Non-deletion

A related property is that BlSAfE shows no tendency to delete elements that are deleted in some non-standard varieties of English. It largely follows the pattern(s) of the standard variety rather than the occasional non-standard ones in the constructions listed below. This means that these would not ordinarily count as “dialect features” of BlSAfE, but they nevertheless constitute, I argue, an important characteristic of its syntax: lack of copula deletion; presence rather than absence of do-support; lack of pro-drop; lack of gapping and ellipsis. Since these are standard features, there is nothing to exemplify, except to point out that sentences like the following, which have been reported as very salient in other varieties of South African English, are rare in BlSAfE (for partial verification of this claim see Mesthrie 2006):

(12) We five in the family (phonological copula deletion in Cape Flats English – Malan 1996: 136).
(13) You go there? (lack of do-support in informal South African Indian English – Mesthrie 1992: 47).
(14) A: I was looking for some shoes in town. B: And did you find Ø? (Object pro-drop in White South African English – Branford 1991: 223).

The discussion in the present section suggests a second generalisation:

Principle 2: If a grammatical feature can’t be deleted in StE, it almost always can’t be deleted in BlSAfE mesolect either.

2.3. Insertions

Principles 1 and 2 logically imply that if there are any other features of BlSAfE grammar, they involve properties other than deletion. From a purely logico-syntactic point of view these would involve replacement of a StE morpheme, permutation of elements, and insertion of elements not found in StE. The first two of these processes are exemplified by (15) and (16); the third is discussed in greater detail below.

(15) … because the other one overlaps with the other one.
(16) … to control the amount of food or what type of food should we eat.


Sentence (15) corresponds to StE The one overlaps with the other. Although confusing to StE speakers, this is not a very complex change (replacement of form, but not of overall meaning: cf. the StE option Some were singing, some were dancing). Other types of substitution did occur: e.g. the use of I’ll for I’d; occasional present tense for past in subordinate clauses, or of past for perfect; occasional gender conflation (e.g. he for she) and preposition replacement (e.g. on for to, though not mandatorily). This subclass of substitutions involves mergers. Permutation, by contrast, moves an element or string of elements around, in relation to StE syntax. Thus in (16) the auxiliary should is moved out of the VP to precede the subject. This inversion of course has analogues in many other New Englishes. The inversion in (16) was in fact the only structural type of permutation in BlSAfE that differed from the standard.

In Mesthrie (2006) I showed that of these three logical options the most common one is, by far, the insertion of a morpheme. Since StE gives no evidence for them, these constructions count as additions rather than strict undeletions. They therefore involve either a degree of double marking for reasons of parallelism, regularisation and explicitness (see Williams 1987) or the explication of a concept not expressed in StE. The first such construction involves the insertion of a conjunction in the main clause of sentences beginning with although, though, thus, so, but etc.:

(17) Although I knew about it, but it was more than I expected.
(18) Though he didn’t speak English at home, but he does speak it at work.
(19) So I wouldn’t like to inconvenience people, so I prefer coming to campus.
(20) But during the exam times they extended the [library] times, but they changed the venues.

The StE equivalents to the above are Although … (17), Though … (18), As … (19) and … but (20). In each example the zero of the main clause COMP is explicitly filled in BlSAfE.
The balance achieved between clauses by insertion is common in many other L2 varieties in Africa and Asia. A second set involves phrase-internal insertions. Again, this is a widespread phenomenon in sub-Saharan Africa. Well-known occurrences are the following:
(a) the frequent addition of be + -ing in stative verb contexts (sentence 21);
(b) the use of can be able for ‘can’ (sentence 22);
(c) the frequent use of that one for anaphoric that (23);
(d) the existence of occasional double conjunctions like supposing if ‘if’; unless if ‘unless’; because why ‘because’; and double comparatives like more better;
(e) the presence of “underlying” prepositions with verbs like mention (about), discuss (about), voice (out) (24);
(f) the use of the partitive genitive more of, most of, too much of for StE ‘more’, ‘most’, ‘too much’ (25).

(21) We are talking Zulu at home.
(22) … how am I going to construct a good sentence so as this person can be able to hear me clearly …
(23) A.: (Interviewer cracks a joke) B.: I like that one.
(24) That time we were discussing about it.
(25) Most of high schools, they have mixed teachers.

Although there are examples of swopping of elements and other types of permutation, on the whole a third principle is suggested by the data:

Principle 3: If X is a grammatical feature of BlSAfE mesolect that is not covered by Principles 1 and 2, then X almost always involves the presence of a morpheme not found in StE.

It is now possible to extend the work reported in Mesthrie (2006) by giving an overview of the syntax of the 8 mesolectal speakers studied. Table 1 gives the number of occurrences of each process (undeletion, addition, etc.), while Table 2 collapses these into three main categories: (a) undeletions and additions vs. (b) deletions vs. (c) substitutions and mergers. (The non-deletions in which BlSAfE and StE coincide were not counted.) The deletions that did occur were of the following types: occasional non-standard missing articles and prepositions etc. and occasional standard absence of to in infinitives, absence of complementiser that, gaps in places of resumptive pronouns etc. as described in section 2.1. To provide a sense of contrast a single typical speaker of White SAE is included as a control at the bottom of Tables 1 and 2.


Table 1: A comparison of 5 processes in BlSAfE grammar

Speaker   Undeletions   Additions   Substitution   Merger   Deletion   Total
A         4             13          12             0        2          31
B         0             11          2              0        2          15
C         1             10          10             0        10         31
D         3             17          5              4        5          34
E         3             19          7              13       19         61
F         6             67          15             8        16         112
G         5             23          13             7        11         59
H         1             31          8              7        4          51
Total     23            191         72             39       69         343
control   0             3           5              2        7          17

Table 2: A comparison of three syntactic processes amongst 8 BlSAfE speakers

Speaker   Undeletions/Additions   Deletions   Substitution/Merger
A         17                      2           12
B         11                      2           2
C         11                      10          10
D         20                      5           9
E         22                      19          20
F         73                      16          23
G         28                      11          20
H         32                      4           15
Total     214                     69          111
control   3                       7           7
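The collapse of the five processes of Table 1 into the three categories of Table 2 is simple addition and can be checked mechanically. The Python sketch below is an illustration only and is not part of the original analysis: the names TABLE_1, collapse and column_totals are hypothetical, and the counts are transcribed from Table 1.

```python
# Per-speaker counts transcribed from Table 1, in the order
# (undeletions, additions, substitutions, mergers, deletions).
TABLE_1 = {
    "A": (4, 13, 12, 0, 2),
    "B": (0, 11, 2, 0, 2),
    "C": (1, 10, 10, 0, 10),
    "D": (3, 17, 5, 4, 5),
    "E": (3, 19, 7, 13, 19),
    "F": (6, 67, 15, 8, 16),
    "G": (5, 23, 13, 7, 11),
    "H": (1, 31, 8, 7, 4),
    "control": (0, 3, 5, 2, 7),
}


def collapse(row):
    """Merge the five processes into the three categories of Table 2."""
    undeletions, additions, substitutions, mergers, deletions = row
    return (undeletions + additions, deletions, substitutions + mergers)


def column_totals(table, exclude=("control",)):
    """Sum each column over the BlSAfE speakers, leaving out the control."""
    rows = [row for speaker, row in table.items() if speaker not in exclude]
    return tuple(sum(column) for column in zip(*rows))


# Table 2 is derived from Table 1 row by row.
TABLE_2 = {speaker: collapse(row) for speaker, row in TABLE_1.items()}

if __name__ == "__main__":
    for speaker, row in TABLE_2.items():
        print(speaker, row)
    print("Totals (8 speakers):", column_totals(TABLE_2))
    print("All tokens (8 speakers):", sum(column_totals(TABLE_1)))
```

Run as written, the per-speaker rows reproduce Table 2 and the speaker totals come to 214, 69 and 111. Summed over the eight speakers, the five columns of Table 1 give 394 tokens, so the grand total printed in Table 1 may be worth checking against the original publication.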

Figure 1 provides a graphic illustration of the ratio of undeletions and additions to deletions (leaving out the substitutions and mergers). The ratio of undeletions and additions to deletions amongst the 8 BlSAfE speakers is roughly 3:1 (214 to 69 in Table 2). By contrast the control L1 White speaker has a ratio of 3:7. This translates roughly as BlSAfE having an undeletion to deletion rate that is seven times higher than that of the control speaker. The number of non-standard anti-deletions listed in this paper is high and, as Figure 1 suggests, this trait is not generally shared by the historical superstrate (or indeed by other varieties of English in the country, as briefly illustrated in Section 2.2). It is regular and widespread enough to suggest that BlSAfE mesolect incorporates a typologically consistent anti-deleting principle in its grammar.
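The arithmetic behind the “seven times” comparison can be made explicit. The following Python sketch is an illustration under stated assumptions, not the author’s own procedure: the function name antideletion_ratio and the dictionary keys are hypothetical, and the counts are the Total and control rows of Table 2.

```python
from fractions import Fraction

# Totals from Table 2: the 8 BlSAfE speakers together, and the single control.
BLSAFE = {"undeletions_additions": 214, "deletions": 69}
CONTROL = {"undeletions_additions": 3, "deletions": 7}


def antideletion_ratio(counts):
    """Exact ratio of undeletions plus additions to deletions."""
    return Fraction(counts["undeletions_additions"], counts["deletions"])


blsafe_ratio = antideletion_ratio(BLSAFE)    # 214/69, i.e. roughly 3:1
control_ratio = antideletion_ratio(CONTROL)  # 3/7
relative = blsafe_ratio / control_ratio      # how many times higher BlSAfE is

if __name__ == "__main__":
    print(f"BlSAfE:   {float(blsafe_ratio):.2f}")
    print(f"Control:  {float(control_ratio):.2f}")
    print(f"Relative: {float(relative):.2f}")
```

Exact fractions are used so that the rounding to “3:1” and “seven times” happens only at the point of reporting; with the Table 2 figures the relative value comes out a little above seven.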


Figure 1: A comparison of deletions and undeletions/additions amongst 8 mesolectal BlSAfE speakers. Key: y = BlSAfE speakers; Y = BlSAfE mean; X = control White speaker.

3. The opposite of anti-deletion: A brief look at SgE

If BlSAfE shows an extreme aversion to ellipsis, there are some varieties, notably SgE, which delight in it. It is not possible to provide a detailed analysis of the variety in the same way, since I am reliant on data and analysis at second hand. However, a cursory examination of descriptions of SgE shows a strong tendency to deletions. In order to illustrate this I will work through the data in Wee (2004), adding a Ø to denote a missing element. A Ø immediately after a word (without a character space and as subscript) denotes that a bound morpheme (a suffix) is missing, and a Ø in normal font with a space on either side denotes that a free morpheme (a word) is missing.

Zero third person verb suffixes:
(26) She always borrow money from me.
Zero past tense if adverb marks time:
(27) He eat here yesterday.
Auxiliary deletion:
(28) He not yet eat lunch.
(29) The students still writing.
Copula deletion:
(30) The house very nice.
Deletion of dummy there:
(31) Here got very many people. (= ‘There are many people here’)
Article deletion:
(32) I don’t have ticket.
Plural suffix deletion:
(33) She queue up very long to buy ticket for us.
Deletion of dummy do:
(34) You buy what? (= ‘What did you buy?’)
Pro-drop:
(35) Always late! (= ‘You’re always late!’)
VP deletion in questions in context:
(36) Can or not? (= ‘Can I go home or not?’)
Passive participle deletion:
(37) John kena scold (= ‘John was scolded’)

As Wee makes clear, these are pervasive properties of SgE. On the basis of such data one is tempted to propose two principles for the variety:

Principle 4: If a grammatical feature can be deleted in StE, it can also be deleted in SgE.

This principle reflects the fact that SgE has none of the BlSAfE undeletion characteristics, which we argue restore a structural element where StE has an empty category.

Principle 5: If a grammatical feature can’t be deleted in StE, it almost always can be (variably) deleted in SgE.


This (admittedly loosely formulated) principle reflects the fact that apart from a small amount of pro-drop in narrative contexts and idiomatic elements, the deletions of SgE are non-starters in StE. Principles 4 and 5 are meant to provoke more research into SgE from this grammatical standpoint.

4. Conclusion: implications for complexity theory

Szmrecsanyi and Kortmann (2009) propose that variety X is more complex than X’ if:
(a) X has fewer features that simplify syntactic rules;
(b) X has fewer features that aid processing;
(c) X has fewer features that indicate distinctions beyond communicative necessity.

Characterisations (a) to (c) would apply to BlSAfE (as X’), in which the surface syntax is remarkably explicit. The rationale would be that the process of second language acquisition (henceforth SLA) makes speakers spell out the syntax in full. This is – almost paradoxically – a kind of simplification of the standard variety generally promulgated in books and classroom teaching. Where complexity is concerned, the motto “More is less” (i.e. more surface morphology means less complexity) might appear appropriate here. But this explanation and view of complexity is cast into doubt if we contrast the BlSAfE state of affairs with that of SgE. There, too, processes of SLA have been historically operative to produce a more-or-less focussed variety that is the polar opposite of BlSAfE. Clearly we are dealing with the influence of the substrates here. BlSAfE exists amidst a number of Southern Bantu languages. Their syntax has not so far been characterised via the antideletion motif, but I believe that such a characterisation would hold. SgE, on the other hand, looks rather like an isolating language, akin to languages of the Chinese communities (Hakka, Mandarin etc.), though some influence from other languages of the country like Tamil cannot be ruled out. The thrust of this polarity is that complexity in terms of morphemic presence or absence cannot correlate with actual processes of SLA, unless we say that, given inadequate access to StE (which was almost always the case in colonial times), speakers match the surface complexity of their L2 to that of the L1. But it is hard to accept that BlSAfE speakers are engaged in what looks, from a global English perspective, like “simplifying” whereas SgE speakers are “complexifying”.
We appear to need a return to a Whorfian view of morphology: radically different languages encode the world differently, making comparisons between their complexities a difficult matter. This difference may be carried through into the World English context, if access to the superstrate is less than optimal. The broader claim I make is that substrates made a huge contribution in the kind of group SLA that gave rise to the New Englishes. Speakers with substrates which disfavour deletion fall on the BlSAfE side of the continuum, whereas New Englishes whose substrates are isolating languages like Hakka or Mandarin fall on the SgE end. The continuum makes provision for varieties falling in between. This continuum can be overridden if there are special circumstances of acquisition – e.g. if there is sufficient access to StE to minimise substrate influence. This appears to be the case with Hong Kong English, though further research is certainly warranted here. The predictions made in this approach are that if the EFL varieties in China, Thailand, Laos etc. were to turn into more stable L2s (which is not unlikely in the global village), they would most likely fall on the SgE rather than the BlSAfE end of the continuum.

References Branford, Jean (with William Branford) 1991 A Dictionary of South African English. Cape Town: Oxford University Press. Hickey, Raymond 2007 Irish English: History and Present-Day Forms. Cambridge: Cambridge University Press. Huddleston, Rodney and Geoffrey Pullum 2002 Cambridge Grammar of the English Language. Cambridge: Cambridge University Press. Labov, William 1966 Sociolinguistic Patterns. Philadelphia: University of Philadelphia Press. Malan, Karen 1996 Cape Flats English. In: Vivian de Klerk (ed.), Focus on South Africa, 125–148. Amsterdam: Benjamins. Mesthrie, Rajend 2006 Anti-deletions in a second language variety: A study of Black South African English mesolect. English World Wide 27(2): 111–146. Szmrecsanyi, Benedikt 2009 Typological parameters of intralingual variability: Grammatical analyticity in varieties of English. Language Variation and Change 21(3): 319–353. Szmrecsanyi, Benedikt and Bernd Kortmann 2009 The morphosyntax of varieties of English worldwide: A quantitative perspective. Lingua 119(11): 1643–1663. Wee, Lionel 2004 Singapore English: Morphology and syntax. In: Bernd Kortmann et al. (eds.), A Handbook of Varieties of English, Vol. 2: Morphology and Syntax, 1058–1072. Berlin: Mouton de Gruyter. Williams, Jessica 1987 Non-native varieties of English: A special case of language acquisition. English World Wide 8: 161–199.


Peter Mühlhäusler

The complexity of the personal and possessive pronoun system of Norf’k

1. Introduction

Pitkern Norf’k has been labelled “a laboratory case of Creole formation” (Reinecke et al. 1975: 590) but has received insufficient attention from creolists. Holm (1989: 546) observed that “much remains unknown about the language itself” and this situation is changing only slowly. Because the language has been an esoteric insider language, many of its more interesting aspects have remained unrecorded. My data are the result of fifteen stints of fieldwork on Norfolk Island and an exhaustive study of previously published and unpublished materials. They confirm that descriptions of even core areas of grammar, such as its pronominal grammar, require considerable upgrading and that existing published accounts are an inadequate basis for typological studies and/or claims rejecting or confirming the simplicity or complexity of the language. The present paper is part of my ongoing work on the grammar of the language and its sociolinguistic nature. I have worked and repeatedly published on questions of simplicity and complexity of pidgin and creole languages since 1972, but do not feel that I fully understand the nature of these concepts as they apply to human languages. In this paper I shall review my own and other researchers’ definitions and suggest a re-evaluation of existing approaches. It is argued that traditional criteria such as regularity or number of distinctions made fail to address the question of communicative and social-indexical functions of grammars. The data presented show that both a structurally reduced anaphoric pronoun system and a much more sophisticated indexical pronoun system are needed to meet the specific needs of Pitkern Norf’k speakers.

2. Norf’k pronouns – existing descriptions

Before being able to pronounce on the complexity or otherwise of the Pitkern Norf’k pronouns it is necessary to establish what they actually are. This is by no means an easy task. There are three original accounts: Ross and Moverley (1964), Laycock and Buffett (1988) and Harrison (1986). Ross and Moverley (1964: 163–165) did not propose a complete paradigm of pronoun forms for the Pitkern variety and their account mixes speculation about etymology, comments on form and observations about pronominal use. What can be reconstructed from this information is:

Table 1: Summary of Ross and Moverley’s (1964) account of Pitkern-Norf’k pronouns

            1st            2nd              3rd
singular    I – mi: ai     you – ju:        it – it
dual        hmi
plural      aklan – kln    yolye – jɔljε    them – dεm

Three things are of interest in this account:
– it is used as a pronominal subject;
– it does not appear as a first and second person object pronoun;
– aklan is listed as a subject pronoun.

In her 1986 PhD thesis, Shirley Harrison proposed a pronoun chart that distinguishes the three persons and for each person a subject, object, possessive and predicate form (forms 1–4).

Table 2: Harrison’s (1986) Norf’k pronoun table

1

SINGULAR

1st person

2nd person

3rd person

Form 1

ɑi

jυ

Form 2 Form 3 Form 4 DUAL Forms 1 and 2

mi mɑis mɑin

jυ jυs jon

masculine feminine neuter ʃi hi h εm h εt his h də his h s

mi ən hεm mi ən h hεmi d εm υwəs hεmis mi ən hεms mi ən h s hεmis

jυ tu1

dεm

tu

tu jυ tus

dεm

tus

wi kln υws

jɔljε jɔljε jɔljεs

dεm dεm dεms

Form 3 Form 4 PLURAL Form 1 Form 2 Form 3 and 4 1

Harrison does not distinguish between subject ju and object ju:, but represents both with a short vowel.


There are a number of issues, which Harrison herself addressed in her comments on this table, including:
– that the subject form can also be used as object or benefactive as in giw ai wan piece or giw wan ai piece ‘give me a piece’;
– that person, number and gender are neutralized in the object position in the shape of et, which can mean ‘me, you, him, her, it, us, them’.

For reasons not known to me Harrison does not incorporate these insights into her table. This neutralization of person, number and gender in object pronouns is also quite common in my data, e.g. et in examples (1) to (4):

(1) Wataim Ø haewa pat et iin
‘When do you have to put them (bean seeds) in?’
(2) Ø naewa bin sii et fer dar long.
‘I have not seen her (= mother) for so long’.
(3) Ai korl et.
‘I called him’.
(4) Ai kant mais madda nor gwenna let et.
‘I can’t, my mother won’t let me’.

This raises an interesting issue. Bresnan (1998: 78) claimed that:

Pronominals are inherently specified for person/number/gender contrast if and only if they are overt. Pronominals are reduced if and only if they are specialized for topic anaphoricity.

She goes on to say that:

It follows that no language has an overt definite personal pronoun devoid of any distinctions of any person, number or gender, while many languages have zero pronouns with this property. (Bresnan 1998: 78)

It would seem that Norf’k’s et violates this putative language universal. Checking Harrison’s inventory of pronouns against actual data raises a number of problems. Most of these relate to the fact that in connected speech and larger texts, additional regularities can be observed. They include:

(i) Variant pronunciations: Table 2 is highly normalized; many pronunciation variants are not considered. Thus, no mention is made of ju~ju: or hi~hi:. She represents second person plural with a glide, jolje. This is extremely rare; in Flint’s phonetic texts, jo:le (you nonsingular) is preferred instead. The etymology of this form remains disputed. Hem alternates with em in these texts as well.
(ii) Occasional neutralization of number in the third person pronoun: hem or em is at times used instead of the plural form dem. Impressionistically, this is most likely with object pronouns referring to non-humans, but the conditions under which this occurs still need to be confirmed.
(iii) The first person plural pronoun aklan or orl aklan is often used as a subject, alternating with wi, wii and at times as.
(iv) The dual inclusive hemi, hVmi alternates with wi, wii: typically, once hemi has been introduced to refer to a dyad, it is not used again in the same paragraph or discourse but is replaced by wi (see comments on anaphoric and deictic pronouns below).
(v) Subject pronouns, once introduced, are frequently realized as zero, as in examples (1) and (2) and in the following one:
(5) Ei hou glad fe ouwas sullen nor tekan et layen down. Ø Onie hoep we gwen ell tek dea saen passin. ‘I’m glad our people don’t take it lying down. I only hope we are going to be able to take the same passion’.
(vi) Object pronouns are also often realized as Ø.
(vii) Possessive auwas at times becomes auwa, possibly because of English influence, as in:
(6) Foot ar Government nor give et to auwa sullen? ‘Why does the Government not give it to our people?’
(viii) The forms orl aklan and ol yorle are frequently documented. This appears to be free variation and thus not a clearly signalled distinction between dual, paucal or plural.
(ix) The plural pronoun auwa, mentioned as a rare alternative by Harrison, is becoming more frequent with some speakers.
(x) Dem is reduced to de or em in connected speech. Contraction of pronoun and following possessive marker or auxiliary is found, for instance, in:


[dez / des] for dems ‘their’
[del / des] for dem ell ‘they will, can’
[dez / dels] for dem es ‘they are, were’
[orlem] for orl dem ‘all of them’

Such contractions of certain pronouns (which involve both synthesis and fusion) increase the complexity of Norf’k by violating the one form-one meaning principle (constructional iconicity).
(xi) Her and hem can be used as subject pronouns in conjuncts in addition to pronominal conjuncts, i.e. not just miienher but also:
(7) Her en Tutta goe fu guava one days down Archies. ‘She and Tutta went for guava one day down at Archie’s place’.
(xii) There is no documented possessive form *aklans. To indicate possession, the circumlocution with fer is required or auwas + N is selected.
(8) ar said fer aklan or auwas said ‘our place’
(xiii) Second person plural yorlyi at times is replaced by you and you people, possibly under the influence of English.

The third account of Norf’k pronouns appeared in Laycock and Buffett (1988). They provide the set in Table 3.

Table 3: Norf’k pronouns according to Laycock and Buffett (1988)

Subject     Object      Possessive  Predicate
ai          mii         mais        main
yu          yuu         yus         yoen
hi          hem         his         his
shi         her         her         hers
–           et          –           –
himii       himii       himiis      himiis
miienhem    miienhem    auwas       miienhis
miienher    miienher    auwas       miienhers
yutuu       yutuu       yutuus      yutuus
demtuu      demtuu      demtuus     demtuus
wi          aklan       auwas       auwas
yorlyi      yorlyi      yorlyis     yorlyis
dem         dem         dems        dems

They note that the subject form can appear in the benefactive and indirect object under a range of conditions, and they acknowledge that the distinction between subject ju and object juu is problematic. Again, their table is not complete, nor are their statements about pronoun use in spoken Norf’k observationally adequate. This is evident, for instance, in Laycock and Buffett’s listing of hem as a singular masculine object without mention of the alternative pronunciation em or of its use as a plural third person object pronoun. Like the other descriptive accounts, they make no mention of first person plural subject and object us/as. Parkvall (1999: 46) has commented on the absence of this latter form in all other Atlantic English Creoles (which include Norf’k; see Baker and Huber 2001), other than in acrolectal varieties. In Norf’k it is the common object form when referring to groups including non-Pitcairn descendants.

3. Missing data

There are several areas of pronominal grammar that have been particularly poorly documented and described: indefinite pronouns, possessives, partitives, reflexives and the difference between anaphora and deictics.

3.1 Indefinite Pronouns

Common to the descriptive accounts considered thus far is the apparent absence of indefinite pronouns in Norf’k, though there are a number of possible candidates. One is dem ‘third person plural’, as in:

(9) Bifor d’fas moeta boet dem bin yuus ’haewa roe orlem boet. ‘Before the first motor boat one had to row all the boats’.
(10) Dem tull. ‘It is said’.


(11) Es dem tull. ‘As one says’.
(12) Dem say es invasion of privacy.

Salan ‘people’ is used in a similar way:

(13) Salan tull is d’wieh Norf’k se kam. ‘They say look at what has become of Norfolk’.
(14) Sullun shouldn’t hat fer goe. ‘One shouldn’t have to go’.
(15) Es dar sullun tull things. ‘This is the way people say it’.

3.2 Possessives

Both Harrison and Laycock and Buffett list only possessive forms ending in -s. There is another way of signifying pronominal possession, which appears to be the basis for Heine and Kuteva’s (2001: 5) unwarranted typological classification of ‘Pitcairnese’ as “possessee – invariable possessive marker – possessor Creole.” This construction is indeed found, sometimes optionally, as in:

(16) Said fer dems. ‘Their home’.

varying with:

(17) Dems said. ‘Their home’.

Sometimes it occurs obligatorily, as in the case of aklan, where we get:

(18) Thanks fer aklan. ‘Our thanks’.
(19) Ar said fer aklan. ‘Our place’.


By contrast, *aklans said is not permitted in Norf’k. There does not seem to be a semantic distinction between dems and fer dems in (20) and (21):

(20) Dems car. ‘Their car’.
(21) Said fer dems. ‘Their place, their home’.

3.3 Partitive genitive pronouns

In partitive genitive pronoun constructions, the genitive indicates the whole of which something is a part. Norf’k expresses this either as in English, by means of a preposition (a), or by juxtaposing a yet to be fully documented range of pronouns with preceding adjectives. Examples include:

(22) Dar much ucklan. ‘So many of us’.
(23) None ucklan bin doubt. ‘None of us has doubted it’.
(24) None a ucklan nor bin orf Norfolk. ‘None of us have been away from Norfolk’.
(25) Plenti uwa runnen about. ‘Lots of us are/were running about’.

3.4 Reflexives

Reflexive pronouns, with the exception of first person dual and plural, are formed by the addition of saelf or sael (no number distinction) to the object form of the personal pronoun. Laycock and Buffett (1988: 69) distinguish:


Table 4: Reflexive pronouns of Norf’k according to Laycock and Buffett (1988)

ai:        misaelf
yu:        yusaelf
hi:        hemsaelf
shi:       hersaelf
et:        etsaelf
yutuu:     yutuusaelf
demtuu:    demtuusaelf
yorlyi:    yorlyisaelf
dem:       demsaelf
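The stated formation rule (object form plus -saelf) can be checked mechanically against the forms given in Tables 3 and 4. The following is a minimal sketch in Python and not part of Laycock and Buffett’s account: the names OBJECT_FORMS, LISTED_REFLEXIVES, predicted_reflexive and rule_mismatches are hypothetical, and the forms are simply copied as spelled in the two tables.

```python
# Object forms as listed in Table 3, keyed by the labels used in Table 4.
OBJECT_FORMS = {
    "ai": "mii", "yu": "yuu", "hi": "hem", "shi": "her", "et": "et",
    "yutuu": "yutuu", "demtuu": "demtuu", "yorlyi": "yorlyi", "dem": "dem",
}

# Reflexive forms as listed in Table 4.
LISTED_REFLEXIVES = {
    "ai": "misaelf", "yu": "yusaelf", "hi": "hemsaelf", "shi": "hersaelf",
    "et": "etsaelf", "yutuu": "yutuusaelf", "demtuu": "demtuusaelf",
    "yorlyi": "yorlyisaelf", "dem": "demsaelf",
}


def predicted_reflexive(object_form: str, suffix: str = "saelf") -> str:
    """Apply the rule 'object form + saelf' (pass suffix='sael' for the more widely used variant)."""
    return object_form + suffix


def rule_mismatches() -> dict:
    """Return {label: (predicted, listed)} wherever the rule and Table 4 disagree."""
    mismatches = {}
    for label, object_form in OBJECT_FORMS.items():
        predicted = predicted_reflexive(object_form)
        listed = LISTED_REFLEXIVES[label]
        if predicted != listed:
            mismatches[label] = (predicted, listed)
    return mismatches


if __name__ == "__main__":
    for label, (predicted, listed) in rule_mismatches().items():
        print(f"{label}: rule predicts {predicted}, table lists {listed}")
```

On the spellings used here, the rule and the table disagree only for the first and second person singular, where the table has a short vowel (misaelf, yusaelf) against the long-vowel object forms mii and yuu. This may be no more than an orthographic matter.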

This table is suspect in that it includes yutuusaelf and demtuusaelf, violating the requirement that distinctions made with deictic pronouns are neutralized in anaphoric use. I do not have data that confirm the use of demtuusaelf and yutuusaelf, nor do I have texts that would require them. Note also that -saelf is the acrolectal form and that sael, without the final consonant cluster, is more widely used. Laycock and Buffett (1988: 69) note neutralization in the first person plural, however:

… whenever the pronoun is any first person non-singular pronoun (that is, any of the pronouns meaning we), it is more usual to use aklan or aklansaelf than to use the pronouns formed with -saelf:
himii: aklan instead of himiisaelf
miienhem: aklan instead of miienhemsaelf
miienher: aklan instead of miienhersaelf
wi: aklan instead of auwasaelf

It is interesting to note that aklan is the least marked of the various first person non-singular pronouns. Laycock and Buffett’s table does not include other ways of signalling reflexives in Norf’k. For instance, the first person singular can be represented either by Ø or by worn. The first possibility is illustrated by the construction:

(26) Side ai cut fer ar axe. ‘Where I cut myself with an axe’.

The second possibility is:

(27) Ai se kat mi worn. ‘I cut myself’.


Worn is possibly derived from English ‘wan’, which also exists in Norf’k but is not used as a reflexive. More commonly worn is used in emphatic constructions, where English uses ‘self’, as in:

(28) Ai se meket mi worn. ‘I made it myself’.

When we consider the pronoun forms of Norf’k it becomes obvious that:
1) There are considerable differences between Norf’k and English, Tahitian and West Indian Creole, the three languages that were involved in the genesis of the Norf’k language (see Mühlhäusler 2004);
2) The pronoun inventory is greater in Norf’k than in the other three languages;
3) Norf’k has pronoun inflections for possessives;
4) Several of these forms (aklan, sullun, yorlyi) are not encountered in any of the source languages.

Before I comment on the simplicity or otherwise of Norf’k pronouns, I would like to note an important distinction between pronouns qua anaphors and pronouns as deictic elements whose function is to carve up people space. In Mühlhäusler and Harré (1990) we made the observations that:
1) The primary function of pronouns is deictic, not anaphoric, both developmentally (in L1 and L2) and in unmonitored speech of native speakers (statistically);
2) The primary role of deictic pronouns is to carve up people space by signalling proximity and/or distance.

I feel this distinction is important and I do not agree with Klein and Perdue’s (1997: 312) characterization of the pronoun system of their ‘Basic Variety’: “The pronoun system consists of minimal means to refer to speaker/hearer and a third person (functionally, deictically and anaphorically). Anaphoric pronominal reference to animals is not observed.” Their characterization ignores the absence of anaphoric pronouns in early L2 learning and incipient pidgins, and the reduced number of pronominal forms used for anaphoric purposes, even in later developmental stages, when compared to deictic ones.
They also ignore that many full languages do not have any anaphoric pronouns, or have a much smaller inventory than that postulated for the Basic Variety. For instance, the Papuan language Morwap, as researched by Laycock (1977: 41), has only two pronouns: ‘I’ versus ‘the rest’.

4. Some observations on Norf’k anaphora

Anaphoric pronouns are placeholders for previously introduced nouns and noun phrases. There are two major differences between Norf’k and English:
1) Norf’k routinely can have Ø where English cannot;
2) Only a limited subset of all pronoun forms listed in the previous section would seem to function in an anaphoric role.

To what extent these differences are the result of simplification, substrate or independent development is not quite clear, mainly because comparative data are not available for Tahitian and St. Kitts Creole. The longitudinal development of anaphoric pronouns in L2 learners of English, by contrast, has been documented in detail by Huebner (1983), who arrives at the following generalizations. L2 learners produce:
a) first and second person pronouns before 3rd;
b) subject before object forms;
c) 3rd person singular male is the unmarked 3rd person – it is used for females and placed in early stages of L2 development;
d) zero anaphors are found in early stages only;
e) Ø is replaced first by 1st and 2nd person pronoun objects, followed by 1st and 2nd subject, 3rd subject and 3rd object in that order;
f) 1st person plural object (us) is very rare until later stages of development;
g) acquisition of pronominal case occurs regardless of the first language of learners.

Huebner does not comment on the occurrence of the object 1st person dual pronoun yu aen mii ‘you and me’ preceding the anaphoric wii in his own example:

(29) Enimii now shuwt Ø now shuwt yu aen mii enimii shuwt, o, wii dai. ‘The enemy did not shoot at us, if they had, we would be dead’ (1983: 154).

Unfortunately, we do not have developmental/longitudinal data for Norf’k, so a direct comparison is not possible. However, I note some interesting similarities:


a) the rarity of third person non-human singular subject;
b) the infrequent use of us;
c) the use of a single anaphoric wii for both inclusive and exclusive 1st person plural.

5. Anaphors and deictics in L2

Let me now ask a few questions about simplicity. The simplest system would seem to be one in which there are only Ø anaphora or full noun phrases. Such a system would be both easy to learn and easy to describe. However, Huebner suggests that very early in the learning process L2 learners develop a distinction between subject and object forms, between masculine and feminine in 3rd person, and somewhat later between singular and plural. Huebner’s L2 learners of English are taught learners, something which was not the case in the early development of the Pitcairn/Norf’k language. The resulting system of pronominal anaphors consequently is different. The most general statement that can be made is that Norf’k pronouns used anaphorically are a subset of the pronoun forms listed in Tables 2 and 3 above, and that in subject position wii is the preferred anaphoric first person plural pronoun, being the placeholder for aklan, miienhem, hemii, aua and wii.

(30) D.E. tip ucklun up en dem dress shrink way up past aua thigh. Wi ketch et when wi get home. ‘D.E. tipped us up and the dresses shrank way up beyond our thighs. We were told off when we got home’.
(31) Let auwa know wathing happen daun Kingston wi anxiously wait the outcome. Got plenti uwa runnen about with new cars. Dar problem is wi gut some sullen se spend too much in en good times. ‘Let us know what is happening in Kingston – we are waiting anxiously for the outcome. Many of us drive around in new cars. The problem is that we have some people who have spent too much in the good times’.
(32) Ucklan Norfolk Islander wi try anything. ‘We Norfolk Islanders, we try anything’.
(33) Me en hem gu aut en our yacht wi kam back Spit Bridge. ‘He and I went out in our yacht and we returned to Spit Bridge’.


(34) Fut himi gu daun Amy’s fe some hihi. Fut wi want e hihi. ‘Why did the two of us go to Amy’s place for periwinkles? Why did we want periwinkles?’

The distinction between dual and plural is neutralized in both third person subject and object position, as in:

(35) I can see demtuu daun Kingston. At least dem gut dar get up an go about dem. ‘I can see the two of them in Kingston. At least they have the get-up-and-go spirit about them’.
(36) Demtuu ess bared news enn dems parents should shame furret. ‘Those two are bad news and their parents should be ashamed of them’.

The object anaphoric pronoun, as already mentioned, is et for all persons. First person plural object can also be us, and third person plural, when referring to persons, can be dem or em. Number in second person is neutralized occasionally. The sole recorded example of second person singular being the anaphoric pronoun for a previously mentioned plural is (37):

(37) Dar sun gwenna shine gude fe yorli fe hae yus wattles daun a town. ‘The sun will shine good for yous to have your food down in town’.
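The neutralizations described in this section can be summarized as a lookup from deictic to anaphoric forms. The sketch below is only an illustration of the statements above, not a description of speaker competence: the names SUBJECT_PLACEHOLDER, anaphoric_subject and anaphoric_object_options are hypothetical, spellings follow the running text, zero realization (Ø) is left out, and forms for which the text states no neutralization are passed through unchanged.

```python
# Deictic subject forms and the anaphoric form that takes their place,
# following the statements in this section (spellings as in the text).
SUBJECT_PLACEHOLDER = {
    # first person non-singular: wii is the preferred placeholder
    "aklan": "wii", "miienhem": "wii", "hemii": "wii", "aua": "wii", "wii": "wii",
    # third person: dual and plural are not distinguished
    "demtuu": "dem", "dem": "dem",
}


def anaphoric_subject(deictic_form: str) -> str:
    """Return the anaphoric subject form; unlisted forms are returned unchanged."""
    return SUBJECT_PLACEHOLDER.get(deictic_form, deictic_form)


def anaphoric_object_options(person: int, nonsingular: bool, human: bool = True) -> list:
    """List the object forms the text mentions for a given kind of referent."""
    options = ["et"]  # et is available for all persons and numbers
    if person == 1 and nonsingular:
        options.append("us")
    if person == 3 and nonsingular and human:
        options.extend(["dem", "em"])
    return options


if __name__ == "__main__":
    print(anaphoric_subject("hemii"))
    print(anaphoric_object_options(person=3, nonsingular=True))
```

A table of this kind makes the asymmetry visible: several deictic forms map onto one anaphoric form, but not the other way round.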

6. Deictic Pronouns

The number of pronoun forms that can be used for social deixis is considerably larger than that of anaphoric pronouns, and speakers can make numerous choices: wii or aklan is used instead of ai to create solidarity, impersonal pronouns, particularly dem, create distance, and ai de wan emphasizes personal responsibility. To present all the properties of deictic Norf’k pronouns goes beyond the scope of this paper. The principal function of pronouns, according to Mühlhäusler and Harré (1990), is to carve up people space, to create proximities, solidarities or distance. From the beginning, the Pitcairners emphasized differences between insiders and outsiders: the mutineers against the Tahitians, the old families against the ‘interlopers’ who arrived in the 1820s, the Pitcairners versus the ‘English’. Norf’k has many words for outsiders: English, horse, mainlanders, outsiders, loopies and more, and the daily discussion is dominated by the tension between local Pitcairners and the Australians and New Zealanders who dominate the economy.

There are two Norf’k pronouns that refer to insiders only, ucklan/aklan and auwa,² though who is included is at times disputed and negotiated. The conventions for using ucklan, auwa and wii are social, open-ended and highly sensitive. Ucklan has been described as ‘quite the most mysterious word in Pitcairnese’ (Ross and Moverley 1964: 164), and Klingel (1998) produced an interesting but speculative account of its origin, arguing that it derives from ‘our clan’, that it was popularized by the eldest child on Pitcairn, Sally McCoy, and that it became adopted, through an act of identity, as a symbol of solidarity and common purpose. Klingel (1988: no page numbers given), quoting Källgård for Pitcairn and Harrison for Norfolk, claims that the use of ucklan is on the decline and that this reflects that ‘the need and wish for a unique communal identity is on the decline’. This is certainly not borne out by my own data and observations of the conversational practices among the ‘Aklanders’, as they are sometimes referred to. The internet debates on ‘Norfolk Forum’ also clearly disconfirm this.

Tensions between insiders and outsiders have been a key motif in the history of the Norfolk Islanders and are often manifested in the debate about the meaning of words. A recent debate in the Norfolk Window (21st April 2006) illustrates this for the pronoun aklan (see appendix 1). The meaning of pronouns employed for social deixis is not predetermined but depends on the situation and negotiating power of interlocutors, the medium used, speakers’ views and identity, and many other factors. Laycock and Buffett (1988: 11–13) first characterise aklan as the object for wi and then introduce a distinction between aklan and auwa:

“Note that some Norfolk speakers (especially the younger generation) often use auwa in place of aklan – for example:
Sam a’ auwa gwen f’ tennis.
Some of us are going to tennis.
In some cases there may be a distinction made between aklan and auwa, the former referring to a larger group, and the latter referring to the islanders of Pitcairn descent³ – as in the following example:
Orl aklan or d’ Risl gwen a’ plieh dem or aa Liigs daats, bat uni auwa gwen a’ haew a’ staat erli ko’ wi haew a’ gu rama f’ aa allen dena.
All of us from the RSL will play darts with the Leagues Club members, but only the islanders will have to leave early, because we have to gather periwinkles (for pies) for the island dinner”.

2 At a recent meeting of the Norfolk Island Council of Elders, the Chief Minister was criticized by a member for using auwa instead of aklan.
3 The examples given in Laycock and Buffett (1988: 13) include the use of aklan preceded by the pluralizer orl, which suggests larger numbers than unmarked aklan, and it is noted that all aklan or auwa can also appear in subject position in their illustrative sentences.

It is true that auwa is often used more to refer to people of Pitcairn descent, but who qualifies is often open to negotiation. Auwa can also be used to refer to people with the right to their cultural identity in contrast, for instance, to politicians of Pitcairn descent, as in the following sentence from a recent internet exchange:

(38) Let auwa know wething happen doun Kingston good for yorlyi keeping up the pressure on Mr. B. ‘Let us true Pitcairners know what is happening in Kingston, good on you for keeping up the pressure on Mr. B’.

The politicians in Kingston are often referred to as ‘dem, dem doun the town’. Dem is the pronoun used to refer to outsiders and to create distance, as in:

ucklan or dem ‘us’ or ‘them’

or in numerous passages such as:

(39) Wi gut da bass side never du mein dem ediot en thing dem wunt help fe tull. ‘We got the best place and never mind these idiots and those things they want to tell’.
(40) Es dem de one comen ya looken fe aklan unae. ‘They are the ones who come here to look at us, aren’t they’.
(41) Wi nor moosa want any dem yu. ‘We certainly don’t want any of them here’.

7. Applying linguistic criteria for simplicity and complexity

In her MA thesis comparing Norf’k with other Creoles, Gleißner (1991: 71–72) writes that “the personal pronoun system in Norf’k is rather complex. In addition to a distinction between singular and plural, which exists in Standard English as well, it comprises forms that refer to two people alone. Furthermore, certain pronouns differ as to whether they function as subject or object / prepositional complement. Including reflexives, Norf’k has about fifty different pronoun forms, considerably more than English, Tahitian or West Indian Creole. With the historical evidence we have, most of these pronouns have been in the language since the early 19th century”.

There have been a number of approaches designed to ascertain the simplicity and complexity of languages, particularly of Pidgins and Creoles. My own Pidginization and Simplification of Language was published in 1974. In 1992, I produced a critical auto-review of this book, pointing out my shortcomings with regard to an understanding of simplicity, in particular:
– whether it is possible to characterize a pidgin by means of structural criteria alone, and
– whether there is a language-neutral definition of simplification.

The answer to the first question was a cautious ‘yes’, an answer which ignored that pidgins are complexity-changing dynamic systems rather than static fixed codes. The answer to the second question introduced a distinction between simplicity of description, on the one hand (which I regarded as a problematic and potentially misleading criterion because it ignores markedness and naturalness considerations), and simplification and impoverishment (defined as reduction in referential power), on the other hand.
It also suggested that simplification can be characterized as employing maximally general context-free rules and as reducing the amount of irregularity in a language, particularly reducing the size of the lexicon, by using grammatical means such as lexical redundancy rules and derivational morphology to compensate for a smaller set of irregular lexical items. My auto-critique of 1992 included the observation that the distinction between rules of pidgins and descriptive rules was fudged and that pidgins abound in exceptions and semi-productive processes. It was noted that regularity developed late during the expansion phase of pidgins.

Miestamo (2008) has summarized some linguistic approaches to simplification and complexity, including the distinction between absolute and relative complexity and global versus local complexity. He argues that the absolute approach (the more parts a system has, the more complex it is) is methodologically preferable to the relative approach (complex for native L1 or L2 speakers, etc.) and that the local approach (complexity in a specific area of grammar) is preferable, a view also shared by Siegel (2007). Applying the criteria of absolute complexity to a system presupposes that we know what the system is. I have tried to argue that there is a distinct absence of observationally adequate data and that I am not happy with regarding anaphoric and deictic pronouns as being a single system. If we were to regard them as a single system, Norf’k grammar would be pretty complex in terms of having many parts as well as in terms of the descriptive apparatus needed to account for their use.

It would also seem of interest to consider the complexity not just of the formal pronoun system but also of the phonological substance of pronouns. It is noted that whereas the anaphoric forms are simple, some deictic forms are not. Thus aklan and its variants atlan and akland exhibit both a closed syllable structure and an earlier word-final consonant cluster. If we apply McWhorter’s (2001) criteria for global complexity, Norf’k pronominals would not seem to be typologically similar to either contact-induced or young languages, as they exhibit:
– over-specification (they mark insider/outsider, inclusive/exclusive and gender);
– structure elaboration (rules for correspondence between deictic and anaphoric pronouns; rules for pronouns found in benefactives etc.);
– irregularity (e.g. first person pronoun mais, not *ais; lexicalization of a number of pronoun forms).

Miestamo’s criteria for absolute simplicity/complexity (2008: 33) specify:
a) fewer distinctions (which makes Norf’k very complex);
b) one form – one meaning (frequently violated in Norf’k, where there is suppletion, e.g. hami (historically ‘they and mi’), not yumi; mais, not *ais).

Szmrecsanyi and Kortmann, in their introduction to this volume, have presented a list of features for World Englishes, which includes properties of pronouns.
Among the ‘simplifying’ features we find:
– them instead of demonstrative those (yes for Norf’k, which employs dem)
– regularized reflexive paradigm (Norf’k is not fully regularized but exhibits a considerable degree of variation)
– object pronoun forms serving as a base for reflexives (unclear and not straightforward in Norf’k)
– lack of number distinction in reflexives (yes for Norf’k: self in singular and plural)


– generic he/his for all genders (not in Norf’k)
– non-standard use of us (absent or marginal)

Among ‘ornamentally complex’ properties Kortmann and Szmrecsanyi list:
– she/her used for inanimate referents (not in Norf’k; Ø is used instead)
– non-coordinated subject pronoun forms in object function (common in benefactive constructions)
– non-coordinated object pronoun forms in subject function (unsure; possibly us and dem)

L2-simple features include:
– lack of number distinction in reflexives (Norf’k employs invariable -sael)
– generic he/his for all genders (Norf’k has gender distinctions)

In sum, Norf’k exhibits simplicity in only four out of eleven cases and has complexities not encountered in English. This is hardly what one expects of a language regarded as a canonical creole by a number of creolists. In this paper, I can only apply these criteria to a local domain and conclude that Norf’k is locally complex, but I would like to mention a number of other areas of considerable complexity in this language, such as consonant clusters, complex benefactive constructions, distributive vs. non-distributive plural marker, complex spatial deixis, and a very messy TMA system.

Norf’k pronouns are complex also in terms of the criteria in my abstract: large size and irregularity. Two sets of personal and two sets of possessive pronouns are distinguished, as well as at least one set of reflexive pronouns. I have argued that the full system of distinctions is used only for social deixis. The set of pronouns used anaphorically is much reduced, particularly at the lower end of the accessibility cline.

Table 5: Anaphoric pronouns of Norf’k

subject                   object                      object reduced
ai                        mi                          et
yu                        yu                          et
hi/shi                    him/her                     et
wi                        us                          et
(ol) yorlye               (ol) yorlye (you people)    et
(ol) dem (only people)    (ol) dem (only people)      et
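If absolute complexity is read as the number of parts in a system, the difference between the full paradigm and the anaphoric set can be made concrete by counting distinct forms. The sketch below is a rough illustration only: the tuples are transcribed from Table 3 and Table 5, the names TABLE_3, TABLE_5 and distinct_forms are hypothetical, optional ol and bracketed comments are ignored, the Table 3 predicate form for miienhem is read as miienhis (an assumption, on the pattern of miienhers), and spelling differences between the two tables are not reconciled.

```python
# Table 3 (Laycock and Buffett 1988): subject, object, possessive, predicate.
# None marks an empty cell.
TABLE_3 = [
    ("ai", "mii", "mais", "main"),
    ("yu", "yuu", "yus", "yoen"),
    ("hi", "hem", "his", "his"),
    ("shi", "her", "her", "hers"),
    (None, "et", None, None),
    ("himii", "himii", "himiis", "himiis"),
    ("miienhem", "miienhem", "auwas", "miienhis"),  # predicate form: assumed reading
    ("miienher", "miienher", "auwas", "miienhers"),
    ("yutuu", "yutuu", "yutuus", "yutuus"),
    ("demtuu", "demtuu", "demtuus", "demtuus"),
    ("wi", "aklan", "auwas", "auwas"),
    ("yorlyi", "yorlyi", "yorlyis", "yorlyis"),
    ("dem", "dem", "dems", "dems"),
]

# Table 5: subject, object, reduced object. "a/b" lists alternatives in one cell.
TABLE_5 = [
    ("ai", "mi", "et"),
    ("yu", "yu", "et"),
    ("hi/shi", "him/her", "et"),
    ("wi", "us", "et"),
    ("yorlye", "yorlye", "et"),
    ("dem", "dem", "et"),
]


def distinct_forms(rows, columns=None):
    """Collect the set of distinct forms in the given columns (all columns by default)."""
    forms = set()
    for row in rows:
        cells = row if columns is None else [row[i] for i in columns]
        for cell in cells:
            if cell is not None:
                forms.update(cell.split("/"))
    return forms


if __name__ == "__main__":
    print("Table 3, all columns:", len(distinct_forms(TABLE_3)))
    print("Table 3, subject and object only:", len(distinct_forms(TABLE_3, columns=(0, 1))))
    print("Table 5:", len(distinct_forms(TABLE_5)))
```

Counts of this kind depend entirely on what is taken to be the system, which is why the decision to treat anaphoric and deictic pronouns separately matters for any statement about absolute complexity.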


The reduced set is much simpler and, above all, natural. It is in agreement with the animacy hierarchy in that the singular/plural distinction is only made with reference to people. It is also in agreement with the accessibility hierarchy in that in object position and after prepositions, person, number and gender distinctions need not be made. The absence of third person singular subject and the lack of number distinctions are a consequence of the operation of both hierarchies.

The pronoun dem, with its implication of exclusion, marked both the Tahitian men (who were perceived as a menace and indeed had plotted to kill all white men) and the British (dem English), who might discover Pitcairn Island and bring the mutineers to justice. This experience of outsiders threatening the way of life of the Pitcairners is an ongoing theme. In the 1830s they experienced the religious dictatorship of a mad Englishman, and on Norfolk physical eviction, deceit and economic exploitation by the British and the Australians. The role of women throughout the history of Pitcairn and Norfolk has been important, from the Polynesian ‘foremothers’, as they are referred to on Norfolk Island, to present-day cultural and spiritual leaders on both Pitcairn and Norfolk Islands. A gender distinction in the third person singular and dual (hi, shi, mi en hem, mi en her) seems to be a natural outcome. Its absence in the third person plural pronoun is somewhat unexpected, but follows the principle (described in Mühlhäusler and Harré 1990) that in pronominal systems, dual/paucal is more natural or less marked than plural. Dyadic communication is more important than large group interaction. Combined, these factors also explain the distinction between inclusive first person non-singular pronouns and the plural distinction aklan – wi.
Again, in a small society paucal is more important than plural, and the emergence of a paucal/plural distinction auwa – (ol) aklan is not unexpected, nor is the yorlye – (ol) yorlye distinction in the second person plural.

According to Baker (1990), the members of a Creole speech community adopt the language to express their social needs. The pronoun system of Pitcairn Norf’k would thus appear to be an excellent illustration of these principles, as it allows a sufficiently calibrated system for dividing up people space. Combined with this is the ability to vary the calibration by not making optional distinctions such as that between mi en her (exclusive) and hemii (inclusive) first person plural. The question ‘Is the deictic pronoun system simple?’ is in my view a complicated one, as it makes sense only with the addition ‘for the social purpose it needs to express’. It is of course feasible to postulate criteria for evaluating the simplicity or otherwise of linguistic descriptions, but it would seem much more difficult or impossible to argue that the absence or small inventory of pronouns constitutes simplicity without also addressing the question of what pronouns are actually for. I therefore feel that one cannot meaningfully ask ‘What is the simplest tool?’ but only ‘What is the simplest tool for a certain job?’ There are excellent reasons why the deictic pronoun system of Norf’k is what it is: the Pitcairn language developed in a very small, isolated community against a background of enormous social tensions and divisions, and under the threat of outsiders.

8. Overall pattern

One of the salient properties of Norf’k is that it abounds in competing items and constructions in all areas of grammar: high variability in pronunciation, considerable lexical synonymy and competing or coexisting syntactic rules for existentials, possessives, number marking, benefactives etc. This is typical of unfocussed creoles (LePage and Tabouret-Keller 1985) and reflects the fact that different families speak different varieties of Norf’k. Existing descriptions are biased towards the Buffett variety: Shirley Harrison was the daughter of Moresby Buffett, and Alice Buffett’s account again represents the Buffetts. The data I have obtained from members of other families do not necessarily agree with statements based on Buffett usage. The description given here does not reflect any particular dialect but the overall pattern, which is not located in any individual competence or social subconscious. As De Camp (1977: 57) puts it in connection with a comparable problem for Jamaican Creole: “What is required is a concept of speech acts within the context of a communication network”. This caters for the fact that: “Not everybody speaks to everybody, nor does one discuss the same topic or use the same style with everybody”. Communication is channelled through a complex network in three dimensions: personal content, register and style. Understanding the nature of this complex network helps to keep “the incompatible elements of the compound network separated” (De Camp 1977: 57). As yet, I am in no position to describe how the numerous apparent contradictions in the overall pattern of Norf’k operate. Such understanding would seem to be necessary for making sense of the ‘equi-complexity axiom’ (Hockett 1958: 180–181). What appears to be the case is that a number of divergent historical developments have not yielded a single agreed grammar. Thus, the overall pattern of pronouns would seem to contain aspects of West Indian Creole grammar (e.g. possessive fi – fer) as well as the English genitive -s, as in said fer aklan or auwas said ‘our place’.

The dual and inclusive/exclusive distinction in first person non-singular himii pronouns alternates with a neutral pronoun: himii – wii, with the English type of system being the unmarked one. The Tahitian contribution to Norf’k pronouns is not easy to determine. Ross and Moverley (1964: 163) contrast the very messy Pitkern/Norf’k system with what they call ‘a rather elegant system of Tahitian’:

(42)
        I                              II       III
Sing.   (v)au, etc.                    ’oe      ’oia, ona
Dual    maua (excl.), taua (incl.)     ’orua    raua
Plural  matou (excl.), tatou (incl.)   ’outou   ratou

They assign Tahitian provenance to three pronouns: “hami ‘you and I’ which is functionally equivalent to Tahitian taua but etymologically derived from ‘thou and me’”. English (Ø) and Pitcairn Norf’k (h) are also found in other examples, e.g. hargoe ‘there goes’. Thou was probably known to the Scottish mutineers, but it has not been retained as a separate pronoun in Pitkern Norf’k. Harrison (1972: 110) points out that a dual pronoun weme is documented for the dialects of Devon (and possibly Cornwall, home of Matthew Quintal). Dyadic communication, as employed in mother-child interaction, was more culturally important on Pitcairn than in most Creole societies, where parents are rarely the linguistic socializers. Ross and Moverley (1964) regard yorlye as Tahitian second person dual ’orua with /r/ > /l/ ‘with some deformation due to stress conditions’. The problem is that it is distinguished from the second person dual yutuu. Harrison rejects Ross’s explanation and suggests that it comes from English olej ‘all of you’ plus prefix. The frequent use of ol yorlye makes this suggestion less likely. If Norf’k distinguishes between yu – jutuu (dual), yorlye (paucal), ol yorlye (plural), then it is certainly more complex than any of its ancestral languages, and any attempt to describe it as a set of calques of Tahitian would seem to be uncalled for. Ross and Moverley also regard aklan as of Tahitian origin. They suggest that it derives from ‘little ones’ and that, like Tahitian ta’ata ri’a ‘little people’, aklan can be used to mean ‘the general run of people’, which, over time, came to mean ‘we in general’. I note that Pitkern orkel as in orkel sullen ‘little people’ is phonetically more similar to ucklan than Norf’k lekel ‘little’. I also note

that the Tahitian phrase refers to ‘people from a lower class’ rather than to ‘children’. A very different origin, the word ‘island’, is suggested in Holland’s brief wordlist of 1954, where he contrasts uckland ‘people island’ and sullen ‘people English’. Shirley Harrison (1972) favours ‘little ones’ as the origin, but suggests another possible source – orlar salan ‘all people, everybody’ – and she notes “both developments come from phrases meaning literally ‘all the children’, presumably to be connected with the time when children made up most of the population”. Again, Norf’k seems to make a number of distinctions not found in any of its contributory languages.

Table 6: First person personal pronouns of Norf’k

      Singular   dual                           paucal   plural
1st   ai         hemii (incl.)                  auwa     (orl) aklan, wi
                 mi en hem, mi en her (excl.)

Variation in possessives and reflexives again reflects the contribution of a number of systems. For possessives there is no single paradigm: mais is irregular and *aklans is not permitted. Reflexives suggest an older system with Ø and an anglicized one with sael(f), as well as another system with wan/worn. Here, as in the pronominal subsystems discussed above, we see that one cause of the complexity of Norf’k pronouns is that the different systems have not been integrated into a coherent whole.

9. Conclusions

One of the principal findings has been that the pronominal system of Norf’k is more complex than its source languages and possibly the most complex of all English-based Pidgins and Creoles. The reason for my reluctance to commit myself to saying that it is more complex than all other English Creoles is that I do not trust existing descriptive accounts. My research into Norf’k pronouns has shown very clearly that none of the existing descriptions have attained observational adequacy: they all normalize and simplify a much more complex state of affairs. This limits the usefulness of otherwise carefully researched typological accounts such as Baker and Huber’s (2000) study of creole pronoun systems.

The data presented here would seem to challenge some of the observations made by other symposium participants. For instance, Huber’s assumption that the complexity of written languages is greater than that of spoken ones is not supported by the data considered here, as Norf’k was, until very recently, exclusively a spoken language. Nor is it easy to reconcile the Norf’k data with Siegel’s view that creoles are more analytic than their lexifiers. Again, the notion of iconicity (a difficult notion to operationalize at the best of times) proposed by Steger and Schneider fails to illuminate the nature of the Norf’k language. Applying the criteria of constructional iconicity in natural morphology (natural equals more form for more meaning), one wonders why dual should be longer than paucal or plural pronouns, and why third person plural forms can be contracted but other pronominal forms cannot. From another perspective one might wish to argue that the complexity of the deictic pronouns of Norf’k is iconic of the complexity of the society in which it is used. This would necessitate abandoning the neglect of the communicative function of grammar. In my view it makes little sense to ask questions about the simplicity of any tool (and languages are tools employed in the business of communication) unless one also considers the tasks in which it is employed. An important distinction made in this paper is that between anaphors and deictic pronouns. The system of anaphoric pronouns can be very small and include both reduced forms and Ø, as is illustrated by the non-subject anaphoric pronoun et, which is an apparent counter-universal to Bresnan’s claims about neutralization. Deictic systems, by contrast, cannot be judged in terms of simplicity but only in terms of social adequacy.
The dual pronouns of Norf’k would seem to fulfil its speakers’ need to include parties in, and exclude them from, their personal and social space, and their wish to assert their separate identity in the face of continued challenges from outsiders. The fact that it is not more complex reflects the restricted domains and functions in which the language is used. The absence of a thou/you distinction reflects the fact that Norf’k has never been used in the religious domain. There is no bible translation and all church services are held in English. Such an interpretation replaces the simplistic equation of maximal generality with maximum economy (the ‘economies of scale’ view) with an ‘economies of scope’ view: the right number of distinctions for a particular social ecology. As regards its historical origins, the pronominal forms of Norf’k reflect the principal distinctions made in the source languages (English dialects, West Indian Creole and Tahitian) as well as local innovations. The contributions from the different sources have not been well integrated, and alternative forms and constructions coexist in present-day speakers. The fact that the language has begun to be taught and written at the Norfolk Island Central School (NICS), combined with the fact that very few first language speakers remain, will probably lead to greater regularity and to an approximation to English. Whether this constitutes greater technical simplicity or impoverishment is difficult to establish with the research tools available.

Appendix 1

The Norfolk Window to the world. 21 April 2006
Norfolk Island’s Visiting Community Newspaper

AKLAN!

‘Aklan’ is Norfuk (Norfolk language) for that wonderful word ‘us’ and it is an often used one on Norfolk Island. Some examples of its use by Islanders speaking their language are ‘Dem gwen lorng f’aklan’ (They are going with us), and ‘Wosn aklan brek aa windo’ (It wasn’t us that broke that window). It is one of the most endearing ways of saying farewell, too. For example, when people who live on the island leave a party or a dinner with friends, you will hear them say, ‘Thaenks f’aklan’ (Thanks for us). Unfortunately it is becoming more prevalent for ‘Aklan’ to be used not to mean ‘us’ but rather, ‘them’, in a derogatory ‘us versus them’ context. It is interesting to reflect on the cultural maze that is the modern Norfolk Island. When Norfolk Island became home to the Pitcairn Islanders on the 8th June 1856, the community of 194 individuals represented just a handful of nationalities. In the last census for population and housing conducted in 2001, those of Pitcairn descent represented half of the permanent population. The itinerant population numbered over 230 persons from many different countries. In September each year, the Multicultural Festival highlights the fact that today, Norfolk is a diverse community of over 25 nationalities – we have people from Australia, New Zealand, Fiji, Thailand, Malta, China, Philippines, United Kingdom, France, Holland and New Guinea, and that is an incomplete list. On Norfolk, you can be served in hotels, restaurants and cafes and entertained by people whose birthplaces may surprise. While the island is best known as the ‘home of the Pitcairner’ it is the wonderful mix of peoples that completes the tapestry that makes Norfolk such an interesting place to live. In this Sesquicentenary year, and at a challenging time as we attempt to protect our ability to self-determine, we need to think about ‘aklan’ in its broadest context, because we are in this together and for the benefit of all.

References

Baker, Philip 1990 Off target? Journal of Pidgin and Creole Languages 5: 107–119.
Baker, Philip and Magnus Huber 2000 Constructing new pronominal systems from the Atlantic to the Pacific. Linguistics 38: 833–866.
Baker, Philip and Magnus Huber 2001 Atlantic, Pacific, and world-wide features in English-lexicon contact languages. English World-Wide 22: 157–208.
Bresnan, Joan 1998 Pidgin genesis in Optimality Theory. Proceedings of the LFG98 Conference. http://csli-publications.stanford.edu/LFG/3/Bresnan.ps, accessed 5th February 2009.
DeCamp, David 1977 Neutralizations, Iteratives and Ideophones: The Focus of Language in Jamaica. In: David DeCamp and Ian F. Hancock (eds.), Pidgins and Creoles, 46–60. Washington: Georgetown University Press.
Gleißner, Andrea 1997 The Dialect of Norfolk Island. Unpublished MA thesis, University of Regensburg.
Harrison, Shirley 1972 The languages of Norfolk Island. Masters thesis, Macquarie University, Sydney, Australia.
Harrison, Shirley 1986 Variation in Present-Day Norfolk Speech. Ph.D. thesis, Macquarie University, Sydney, Australia.
Heine, Bernd and Tania Kuteva 2001 Attributive possession in Creoles. Ms, Universities of Cologne and Düsseldorf.
Hockett, Charles F. 1958 A Course in Modern Linguistics. New York: Macmillan.
Holm, John 1989 Pidgins and Creoles. Vol II. Cambridge: CUP.
Huebner, Thom 1983 A Longitudinal Analysis of the Acquisition of English. Ann Arbor: Karoma.
Klein, Wolfgang and Clive Perdue 1997 The basic variety (or: couldn’t natural languages be much simpler?). Second Language Research 13: 301–347.
Klingel, Markus 1998 A sociolinguistic attempt at explaining the dynamics of languages in contact: Pitkern [’Aklan] as a lexical act of identity. The Creolist Archives Paper online.
Laycock, Donald 1977 Me and you versus the rest. IRIAN 6: 33–41.
Laycock, Donald and Alice Buffett 1988 Speak Norfolk Today. Himii Publishing Co: Norfolk Island.
Le Page, Robert B. and Andrée Tabouret-Keller 1985 Acts of Identity. Cambridge: Cambridge University Press.
McWhorter, John 2001 The Power of Babel: A Natural History of Language. New York: Harper Collins.
Miestamo, Matti 2008 Grammatical complexity in a cross-linguistic perspective. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds.), Language Complexity: Typology, Contact, Change (Studies in Language Companion Series 94), 23–41. Amsterdam: Benjamins.
Mühlhäusler, Peter 1974 Pidginization and Simplification of Language, series B-26. Canberra: Pacific Linguistics.
Mühlhäusler, Peter 1992 Twenty years after: A review of Peter Mühlhäusler’s Pidginization and Simplification of Language. In: Martin Pütz (ed.), Thirty Years of Linguistic Evolution, 109–117. Philadelphia and Amsterdam: John Benjamins.
Mühlhäusler, Peter 2004 Norfolk Island – Pitcairn English (Pitkern Norfolk): Morphology and Syntax. In: Bernd Kortmann, Edgar W. Schneider, Kate Burridge, Rajend Mesthrie and Clive Upton (eds.), A Handbook of Varieties of English, vol. 2: Morphology and Syntax, 789–804. Berlin: Mouton de Gruyter.
Mühlhäusler, Peter and Rom Harré 1990 Pronouns and People: The Linguistic Construction of Social and Personal Identity. Oxford: Blackwell.
Parkvall, Mikael 1999 Feature selection and genetic relationships among Atlantic Creoles. In: Magnus Huber and Mikael Parkvall (eds.), Spreading the Word: The Issue of Diffusion among the Atlantic Creoles, 29–68. London: University of Westminster Press.
Reinecke, John E., Stanley M. Tsuzaki, David DeCamp, Ian F. Hancock and Richard Wood (eds.) 1975 A Bibliography of Pidgin and Creole Languages. Oceanic Linguistics Special Publication No. 14.
Ross, Alan S. C. and A. W. Moverley 1964 The Pitcairnese Language. London: Andre Deutsch.
Siegel, Jeff 2007 The Emergence of Pidgin and Creole Languages. Oxford: Oxford University Press.

Lourdes Ortega

Interlanguage complexity: A construct in search of theoretical renewal*

1. Introduction

The linguistic complexity exhibited in extended discourse samples produced by second language (L2) users has long been of interest in the field of second language acquisition (henceforth SLA). The construct is generally understood as “the range of forms that surface in language production and the degree of sophistication of such forms” (Ortega 2003: 492). In practice, this has meant going through the data and calculating the average length of some syntactic unit of production, such as a sentence or an utterance, computing the density of subordination in the discourse, or more rarely counting the frequency of occurrence of selected forms that are considered to be linguistically sophisticated, such as passive voice, past perfect, or epistemic modals. At the core of the construct is the claim that the ability to produce more linguistically complex oral or written texts reflects increasingly more developed and mature capacities to use the second language. Thus, at stake is the relationship of linguistic complexity of L2 oral or written production to other constructs that are central in the field, such as linguistic proficiency and linguistic development. In this paper, I offer an evaluation of the current state of knowledge about linguistic complexity in the field of SLA. I will use the term “interlanguage complexity” as a short-hand for the linguistic complexity of interlanguage production. 
This short-hand is useful in the context of the present collection, since it is devoted to exploring the notion of linguistic complexity across a number of different fields, and given that in the field of SLA the construct has been typically applied in the narrow sense of linguistic (and most often syntactic) complexity of L2 productions. I argue that the construct begs for long-ranging theoretical renewal if we are to see improvement in definitions, operationalizations, and measurement practices in the future. I begin by examining three important purposes for investigating interlanguage complexity in SLA and I take stock of what is known and what has yet to be achieved in the context of each purpose. Next, I illustrate how interlanguage complexity might be characterized in practice via a brief analysis of two excerpts of learner production. This will allow me to pinpoint critical gaps in current conceptualizations of interlanguage complexity, which suffer from what I call construct reductionism, and offer a critique of current measurement practices. Finally, I will argue that insights from Systemic Functional Linguistics theory can help put subordination in its proper place as only one of several resources for linguistic complexification. This theoretical prism also brings to the fore a number of new questions about the relationship between interlanguage complexity, on the one hand, and genres and full developmental trajectory, on the other. I close with a forecast of future avenues for the theoretical renewal and expansion of the study of interlanguage complexity under the thrust of functional theories, including Systemic Functional Linguistics theory, usage-based linguistics, and corpus linguistics.

* I am grateful to Bernd Kortmann and Benedikt Szmrecsanyi for inviting me to join them in the multidisciplinary exploration of linguistic complexity, to the Freiburg Institute for Advanced Studies for support and intellectual stimulation through a residential four-month fellowship, to Douglas Biber and John Norris for discussion of ideas in this paper, and to Zhaohong Han and Jeff Siegel for helpful comments that improved the final chapter. I alone am responsible for any shortcomings.
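
The length-based and subordination-based calculations mentioned in this introduction can be illustrated with a minimal sketch. It assumes that a sample has already been segmented by hand into T-units and that the finite clauses in each T-unit have been counted; segmentation itself is not automated here. The `TUnit` structure, the function names, and the two example T-units are hypothetical and serve only to show the arithmetic, not any particular study's procedure.

```python
from dataclasses import dataclass


@dataclass
class TUnit:
    """One hand-segmented T-unit: its words and its finite-clause count."""
    words: list[str]
    finite_clauses: int  # main clause plus any finite subordinate clauses


def mean_length_of_t_unit(t_units: list[TUnit]) -> float:
    """Average number of words per T-unit."""
    return sum(len(t.words) for t in t_units) / len(t_units)


def clauses_per_t_unit(t_units: list[TUnit]) -> float:
    """Average number of finite clauses per T-unit (a subordination ratio)."""
    return sum(t.finite_clauses for t in t_units) / len(t_units)


# Invented example sample (not learner data from any study).
sample = [
    TUnit("I went home because I was tired".split(), finite_clauses=2),
    TUnit("The movie was long".split(), finite_clauses=1),
]

mltu = mean_length_of_t_unit(sample)   # (7 + 4) / 2 words per T-unit
c_per_t = clauses_per_t_unit(sample)   # (2 + 1) / 2 clauses per T-unit
print(mltu, c_per_t)
```

Keeping segmentation separate from counting reflects the fact that the unit of analysis (sentence, utterance, or T-unit) is a coding decision made by the researcher; the same two functions would apply unchanged to any of these units.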

2. Purposes for measuring interlanguage complexity in second language acquisition

Second language acquisition researchers use interlanguage complexity measures (to be described in more detail in section 4) with at least three main purposes in mind: (a) to gauge proficiency, (b) to describe performance, and (c) to benchmark development. When SLA researchers investigate interlanguage complexity for its value as an index of proficiency, they pose themselves the following question: Does interlanguage complexity increase as second language users become more proficient in the target language? If the answer to this question is affirmative, it would be possible to use the best indices of complexity in language production as valid indicators for proficiency. In this line of work, researchers typically sample second language users representing heterogeneous levels of linguistic ability and elicit from them extended discourse samples from which complexity measures are extracted. Proficiency scores are also gleaned from the same participants through a variety of methods (see review by Thomas 2006). The relationship is then examined, either via correlations between the two sets of numerical values, when both complexity and proficiency are treated as interval scale variables, or via statistical comparisons of complexity means among learner groups that have been deemed to be at non-overlapping levels of proficiency, in which case proficiency is treated as a categorical variable. The hope is that the scores representing the two constructs will yield strong correlation coefficients and/or that the learner groups will be distinguishable by the relative levels of interlanguage complexity they produced in writing or speaking. Metrics of syntactic complexity which show a strong and reliable relationship with proficiency in one or both of these two ways can then be used as shortcuts to gauge global proficiency. This is of practical usefulness for the conduct of SLA research because proficiency scores for learner samples are not always available or, even if available, their interpretation may be specific to a particular proficiency test or context. By contrast, complexity indices can be extracted from extended language production samples, if such data have been collected (and they may have been collected for a variety of research purposes that need not be related to proficiency scoring). The indices can also be meaningfully compared across studies and findings, since the interpretation of such numerical values remains relatively constant across samples and contexts. Comparisons might be relatively meaningful even across different target languages.

The second type of research purpose for the measurement of interlanguage complexity is found in a line of work within instructed SLA which is known as task-based language learning (see Samuda and Bygate 2008). In this domain, syntactic complexity measures are used to describe second language performances rather than to gauge proficiencies. Tasks are things L2 users are asked to do with words (e.g. reporting on the results of an experiment, negotiating a business transaction, telling a story to a friend), in which meaning-making and nonlinguistic outcomes are central.
Tasks thus stand in contrast with many other activities that are done, particularly in traditional instructional contexts, just to display language ability or for the sake of sheer language practice. Ultimately, task-based language learning researchers seek to determine whether certain manipulations of the cognitive demands which tasks place on learners can have specific, predictable effects on the quality of the language produced during task performance, in the three areas of complexity, accuracy and fluency. These manipulations target cognitive load as a function of, for example, reasoning demands, number of task elements, degree of internal task structure, and so on. In terms of complexity, the main question at stake is: Does interlanguage complexity systematically vary (i.e., can we see reliably lower, same, or higher complexity) depending on the cognitive task conditions imposed on L2 users? If this question can be answered in the affirmative (and contingent on how accuracy and fluency might be also

affected), then the hope is that task conditions might be manipulated systematically to engineer performance that supports favorable acquisition processes. The research methodology typically involves eliciting extended samples of oral (and less often written) interlanguage task performance by the same group of learners under the posited different conditions of cognitive task complexity, and then testing mean differences in the levels of linguistic complexity (as well as accuracy and fluency) produced across conditions. In other words, the studies tend to be designed as repeated-measures comparisons, where some manipulation of cognitive task complexity in the elicitation procedures is the independent variable and the resulting linguistic complexity of second language production is the dependent variable. The usefulness of this research tradition resides in its ultimate goal to generate an empirically grounded proposal for performance manipulations that can be planned and sequenced in formal instruction settings in order to place pressures on interlanguage use. Cumulative interlanguage changes can thus occur over time and second language users can gradually handle simultaneously more complex, accurate, and fluent language production during real-time, meaning-oriented communication in the target language. A third purpose for measuring complexity in the field of SLA is to be able to benchmark developmental level. Here, the following question is asked: Does the interlanguage complexity of production increase with growing grammatical development? Assuming the answer is affirmative, then researchers can search for the best complexity indices that can be used as benchmarks of broad interlanguage developmental stage. 
The rationale echoes much earlier research in the field of child language acquisition that led to the modest but useful and now ubiquitous use of the mean length of utterance, among other measures, ever since Brown (1973) established its use, and given the accepted conclusion that it is more informative for developmental benchmarking than chronological age (see Pan 1994; Rollins et al. 1996). We might wonder how well the three research programs just described have done to date in terms of advancing researchers’ understanding of the construct of interlanguage complexity. As we will see in the next section, not all three purposes have met with the same degree of interest or systematic examination in SLA. Considerable progress has been made in the investigation of complexity as an index of proficiency in a second language, while unfortunately much less systematic knowledge has been gleaned in the study of interlanguage complexity as a descriptor of performance or as a benchmark of development.
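
The two analytic designs described in this section can be sketched in a few lines: a correlation when complexity and proficiency are both treated as interval-scaled, and a paired comparison when the same learners perform under two task conditions. The sketch below uses only the Python standard library and computes the statistics by hand. All scores, variable names, and the two task conditions are invented for illustration and do not come from any of the studies cited; a real analysis would also report degrees of freedom, p-values, and effect sizes.

```python
import math
from statistics import mean, stdev


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def paired_t(cond_a: list[float], cond_b: list[float]) -> float:
    """t statistic for a repeated-measures (paired) comparison, df = n - 1."""
    diffs = [b - a for a, b in zip(cond_a, cond_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Invented scores for five hypothetical learners.
proficiency = [40.0, 55.0, 60.0, 72.0, 85.0]       # test scores
words_per_t_unit = [8.0, 10.5, 10.0, 12.5, 14.0]   # complexity index

# The same learners' clauses per T-unit under two hypothetical task conditions.
simple_task = [1.2, 1.4, 1.3, 1.6, 1.7]
complex_task = [1.3, 1.6, 1.4, 1.9, 1.8]

r = pearson_r(proficiency, words_per_t_unit)
t = paired_t(simple_task, complex_task)
print(f"r = {r:.2f}, paired t = {t:.2f}")
```

The paired statistic is computed on each learner's difference score, which is what distinguishes the repeated-measures design from a comparison between separate proficiency groups.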

3. Current achievements and pending questions

Efforts at establishing the validity of existing measures of interlanguage complexity have been most extensive in relation to complexity as an index of proficiency. There are both weaknesses and strengths in the considerable amount of knowledge that has accumulated. A weakness in this line of work is that SLA researchers have most often chosen to treat proficiency as a categorical variable and then have assessed mean differences in complexity values across proficiency groupings. Yet, this practice of converting interval variables (i.e. individual proficiency scores of some kind) into categorical ones (i.e. participants grouped by nominal proficiency levels) has always been criticized by statisticians because it discards much useful information. More specifically, it does away with the variance of continuous scores and leads to unreliability and increased likelihood of Type II errors (e.g. Skidmore and Thomson 2010), that is, the problem of failing to detect a difference, relationship, or effect that is in fact present because of some psychometric methodological problem, such as lack of power or (in the case at hand) lack of variance in the observations. It would be profitable in future work, therefore, to accumulate evidence from designs where both complexity and proficiency are treated as interval scales. This is particularly true if the data were then to be analyzed with more sophisticated statistics than simple correlations. Thus, for example, designs that lend themselves to the use of regression, discriminant analysis, factor analysis, cluster analysis, or structural equation modeling would allow researchers to investigate hierarchically how well proficiency is predicted by complexity vis-à-vis other relevant variables, such as lexical diversity, fluency, accuracy, and so on (e.g. Oh 2006, discussed in Norris and Ortega 2009: 569–570; see also Jarvis et al. 2003 for an interesting example involving writing quality ratings rather than proficiency). 
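
The information loss that results from grouping continuous scores can be made concrete with a small sketch. It correlates a complexity index first with continuous proficiency scores and then with the same scores collapsed into two groups by a median split. The median split is only one assumed way of forming proficiency groups, and all numbers are invented for illustration. In this toy data set the grouped variable yields a visibly weaker correlation, which is the kind of attenuation that makes a true relationship harder to detect.

```python
import math
from statistics import mean, median


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long lists of scores."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Invented scores for eight hypothetical learners.
proficiency = [40.0, 48.0, 55.0, 60.0, 66.0, 72.0, 79.0, 85.0]
complexity = [8.0, 9.0, 10.5, 10.0, 11.5, 12.5, 13.0, 14.0]

# Median split: learners above the median form the "high" group (1.0),
# the rest the "low" group (0.0). Within-group variance is discarded.
cut = median(proficiency)
group = [1.0 if p > cut else 0.0 for p in proficiency]

r_interval = pearson_r(proficiency, complexity)  # proficiency kept continuous
r_grouped = pearson_r(group, complexity)         # proficiency dichotomized
print(f"interval: {r_interval:.2f}, grouped: {r_grouped:.2f}")
```

With the grouped variable, a learner scoring 66 and one scoring 85 are treated as identical, so any covariation between proficiency and complexity inside each group no longer contributes to the estimate.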
Despite this design limitation, the accumulated knowledge is encouraging. Most of the work has focused on second language written data and has been synthesized in two studies: Wolfe-Quintero, Inagaki, and Kim (1998) and Ortega (2003). Both conclude that, among the complexity measures typically employed in SLA, at least two can be used with relative confidence to distinguish broad (but not fine-grained) proficiency differences in second language written data produced at intermediate (but probably not very early or very advanced) levels of linguistic ability. They are: (a) the mean length of multiclausal units (i.e. the average number of words per sentence, utterance, or terminal unit, this latter developed by Hunt 1965, and called T-unit henceforth; see further discussion in Han and Lew this volume) and (b) the amount of subordination as measured by mean number of finite clauses

produced (e.g. average number of finite clauses per sentence, utterance, or T-unit). Typical values found across the some 40 studies examined between the two syntheses suggest that relatively intermediate second language writers can produce essays of T-units that on average are 12 words long and contain a subordinate clause in every other main clause. The range of averages seen across studies is wide, however, and can vary from 5 to 18 words and from no subordination to one subordinate clause per T-unit. To my knowledge, there has been no parallel appraisal of findings regarding similar evidence for oral data. The reason is probably that only a few individual studies exist that have focused directly on the relationship between complexity and proficiency with oral discourse samples (e.g. Halleck 1995; Norris 1996). Thus, by comparison to written proficiency, much less is known about the systematic relation of the complexity of interlanguage products to learners’ oral proficiency. Evidence regarding complexity as a descriptor of performance in the domain of task-based language learning has greatly accumulated, since almost every study that investigates cognitive task demands includes linguistic complexity as one of the dependent variables. On the other hand, there has been very little systematic synthesis or validation in this area. This is unsurprising, since the main interest of this body of work lies with the independent variable, that is, the various task manipulations allegedly leading to increased or decreased cognitive demands and, in turn, to systematic differences in the quality of the language produced. Nevertheless, the lack of attention paid to the dependent variable of complexity is regrettable on three grounds. 
First, working out valid tools for the measurement of complexity as a dependent-variable construct would tremendously advance this line of work, since two extant competing theoretical rationales have been proposed (one by Skehan 1998, and the other by Robinson 2001), and properly assessing the empirical evidence in favor of each requires being able to measure complexity (as well as accuracy and fluency) reliably and validly (Norris and Ortega 2009). Second, precisely because the task-based language learning domain has grown rapidly since the mid 1990s, we now have substantial accumulation of descriptive information for task-based interlanguage complexity with comparable data gleaned via quite similar designs. Hence, the time is ripe for methodological investigations focusing on this dependent variable specifically. Third, task-based language learning researchers have concentrated on oral performances, so efforts at synthesizing and validating the measurement of interlanguage complexity in oral data in this tradition would usefully extend the relatively clear picture we already have about the relationship between proficiency and the complexity of written performances (cf. Ortega 2003;

Wolfe-Quintero et al. 1998). The caveat must be added, however, that such comparative interpretations would have to be carefully contextualized, since most of the task-based oral evidence comes from repeated-measures data while the findings for proficiency and written complexity typically represent between-groups data. Ironically, interlanguage complexity has been used or investigated in SLA the least for what is perhaps its most central measurement purpose in any language acquisition field: as a tool for benchmarking linguistic development. Thus, little concrete empirical evidence has been furnished in SLA to support the otherwise reasonable assumption that interlanguage complexity in second language production increases as individual interlanguage grammars develop, and that the increases are meaningful vis-à-vis the mapped course of second language development. Since the inception of the SLA field in the 1970s, one of the goals in the study of interlanguage has been to trace the acquisitional timing of a range of forms and form-function mappings in the second language grammar, so as to shed light on how interlanguage development proceeds over time, from initial emerging representations to a full-blown, mature system of the new language. This has given way to a long tradition pertaining to the profiling of development in sub-systems of interlanguage, using theoretically eclectic and diverse notions such as sequences, stages, markedness hierarchies, and accuracy orders. This work has been synthesized, for example, by R. Ellis and Barkhuizen (2005), Ortega (2009), and Tarone and Swierzbin (2009). It would seem natural to engage in cross-pollination between the two research traditions. 
The mutual benefits would be obvious in two complementary directions: (a) exploring metrics that can serve as easy-to-calculate indicators of broad developmental level; and conversely (b) exploring whether well-established findings about the developmental timing of various grammatical subsystems in L2 acquisition can give content to new and better complexity metrics. Efforts in the latter direction, which are largely non-existent in SLA, would involve developing new, more elaborate metrics, in an effort analogous to that of Scarborough’s (1990) Index of Productive Syntax in the field of child language acquisition. Scarborough’s Index allows a range of forms chosen on the basis of their empirically attested developmental timing to be quantified, weighed, and eventually synthesized into overall values that can be extracted from each language sample and serve as rather fine-tuned indices of linguistic development. Although some efforts at interlanguage profiling have come close to investigating the validity of hypothesizing a relationship between linguistic complexity and development (e.g. Granfeldt and Nugues 2007; Myles and

Mitchell n.d.; Pienemann 1992), the potential for cross-pollination remains unfortunately largely untapped. This neglect may be thought odd, if only for research practical reasons. Namely, complexity measurement is much less labor intensive than are developmental profiling techniques, which require not only fine-grained work that must be done largely via manual codes but also individual grammar data from dense corpora which are not always available for a given sample. In the early SLA literature, the potential benefits of investigating complexity measurement as a tool for benchmarking interlanguage development were noted by seminal scholars such as Larsen-Freeman (1978). However, subsequent SLA research in this area soon shifted away from development and instead concentrated on validating the relationship between complexity and proficiency (i.e. the first line of research identified in this discussion). Yet, proficiency and development are distinct constructs (Hudson 1993; Pienemann et al. 1993). Proficiency is a coarse-grained, externally motivated and subjective, if formal, description of what it means to be an effective language user (with all the many layers of linguistic and nonlinguistic resources that this entails and by reference to target communicative demands and contexts). Development, on the other hand, is an internally motivated trajectory of linguistic acquisition. Therefore, complexity may relate to development and to proficiency in distinguishable, non-isomorphic ways, and the possibility of developmental benchmarking merits separate investigation. To summarize the discussion thus far, some pending questions in the study of interlanguage complexity can be identified. First, more sophisticated designs and statistical-analytical choices would be welcomed when probing the relation between complexity and proficiency. Second, there is a need to balance the focus on written data with a focus on oral data. 
This is true for the purposes of studying complexity as an index of proficiency, where a clear picture has begun to emerge for written data only, as well as for the purposes of using complexity as a descriptor of task-based performance, where empirical observations have been reported in great quantities for oral data but without much focus on making sense of the accumulated findings. Third, the domain of task-based language learning would be greatly aided by validation efforts that help researchers ascertain which measures might be sensitive enough to offer relevant empirical evidence on the two competing theoretical frameworks that exist. Fourth, it behooves SLA researchers in general to turn to empirical interlanguage findings in order to investigate whether the relationship between complexity and development, as distinct from proficiency, can be elucidated. The construct of interlanguage complexity, its definition and operationalizations, and its actual measurement

would be greatly refined if what is known by now about acquisitional timing of individual second language grammars were incorporated into systematic validation programs.

4. Interlude: An example of interlanguage complexity analysis at work

I would like to turn away momentarily from the broad-stroke evaluation of this research domain and instead consider what interlanguage complexity may look like in actual learner samples. In order to do this, I will present a commentary on two excerpts extracted from two L2 Spanish narratives written by a student of Spanish as a foreign language while she was enrolled in a course at the third-year level of the university curriculum in the United States. Based on this curricular placement, we can speculate that this L2 user is at some intermediate to upper-intermediate level of proficiency that might correspond with the Intermediate-Mid or Intermediate-High descriptors in the American Council on the Teaching of Foreign Languages Proficiency Guidelines (ACTFL 1999) or the Level A2 in the Common European Framework of Reference for Languages (CEFR 2001; see Vandergrift n.d., for discussion of the equivalence between ACTFL and CEFR levels). This curriculum-based estimation is supported by the complexity values extracted from the two samples discussed in this section as well, when cross-referenced with the benchmarks distilled by Ortega (2003). The two written excerpts are reproduced here as originally written, but the data are shown segmented into T-units. Developed by Hunt (1965) to segment first language writing transcripts, a T-unit is defined as a main clause plus any subordinate or embedded clauses that may occur in it (see also discussion and examples in Han and Lew this volume). I have followed the convention in CHILDES (MacWhinney 2000) to show each segmented production unit in a different tier, called LRN for “learner,” and to have each tier end in a period, a question mark, or an exclamation mark to indicate a declarative, interrogative, or exclamatory statement.
No corrections of spelling or grammar have been made, other than adding accents and using capitalization (only after any tier-final periods where the punctuation period was used by the writer in the original writing), both with the aim to improve readability. For the present purposes, we can disregard the nontargetlike features of this learner’s interlanguage, since accuracy is a distinct construct that should be analyzed separately from linguistic complexity.

(1) Excerpt: Unpublished data, Ortega (2000), 23-D04

*LRN: En mi niñez tenía mala suerte con ruidos o golpes.
*LRN: No estoy segura de cuándo fue la primera vez.
*LRN: sino había tres veces que recuerdo.
*LRN: Una vez, tal vez la primera fue cuando estaba jugando con los vecinos.
*LRN: Estábamos haciendo carreras.
*LRN: En mi niñez podía correr muy rápido.
*LRN: y tenía ganas de gano la carrera.
*LRN: Lo de que no di cuenta fue que estaba un pared de ladrillos en la dirección en que estábamos corriendo.
*LRN: “Puedo correr más rápido, más rápido!”
*LRN: Pow!
*LRN: Dios mío!
*LRN: Gané la carrera.
*LRN: pero recibí un golpe grandísimo en mi cabeza.
*LRN: El doctor mandó a mi madre que no pobía [: podía] dormer porque fue un concusión cerebral y hubo peligro de una coma.

‘I had bad luck in my childhood with noises and falls. I am not sure when it was the first time, but there are three times that I remember. One time, perhaps the first one, was when I was playing with the neighbors. We were doing races. In my childhood I could run very fast and I wanted that I win the race [sic]. That which I did not notice was that there was a wall of bricks in the direction in which we were running. “I can run faster, faster.” Pow! My God! I won the race but I received a huge blow in my head. The doctor told my mother that I could not sleep because it was a cerebral concussion and there was danger of a coma.’

Excerpt 1 comes across as an entertaining narrative recount written with skill by a good writer. The language is effective while also relatively simple. There are 12 T-units (i.e., tiers containing at least a main clause and thus excluding the two tiers with the interjections Pow! and Dios!) and 113 word tokens. Therefore, the mean length of T-unit is 9.42 words. There are also 22 finite-verb clauses spread over the 12 tiers, which yield a mean length of finite clause of 5.14 words and an average subordination amount of 1.83 clauses per T-unit. That is, roughly three out of 4 main clauses appear with some subordination in them. The language produced thus shows relatively dense subordination, and we also find a variety of noun, relative, and adverbial clauses. The advantage in average length of the T-unit, when compared to that of the clause, can therefore be attributed to the considerable amount of subordination seen in the text. In general, however, the quality of expression is brief, mostly containing repetition of many of the same lexical items and arguments without modification.

(2) Excerpt: Unpublished data, Ortega (2000), 23-D09

*LRN: No sé cómo empieza la tradición.
*LRN: Pero cada navidad en mi recuerdo recibimos nuestro familia una caja de regalos de Japón
*LRN: y casa [: cada] año mi madre mandaba un caja a Japón.
*LRN: Actualmente, para mí, lo que me interesa la mayoría fue el periódico japonesa que estaba usado en la caja para protegar los regales.
*LRN: Estudiaba los fotos y la escritura japonesa con fascinación, imaginando un mundo japonés completamente diferente de lo mío.
*LRN: Mi hermano y yo también nos gustábamos muchísimo los dulces japonesas con envolturas hecho de arroz que también se pueden comer.
*LRN: Fantástico!

‘I don’t know how the tradition starts. But every Christmas in my memory we received [in?] our family a box of presents from Japan, and every year my mother would send a box to Japan. Actually, to me what interested me the most was the Japanese newspaper which was used in the box to protect the presents. I would study the pictures and the Japanese writing with fascination, imagining a Japanese world completely different from mine. My brother and I also liked very much the Japanese sweets with wrappings made out of rice that can also be eaten. Fantastic!’

Excerpt 2 was written by the same student six weeks later. Rhetorically, the writer has maintained her taste for alternating short and long sentences to attain an engaging style effect (e.g., Fantástico!).
Myhill (2008) reported that such alternation for staccato and other effects was typical of strong first-language writers in her large sample of school students aged between 13 and 15. The mean length of T-unit in Excerpt 2 is higher than in Excerpt 1, at 15.50 words per T-unit on average (a total of 93 words divided by 6 tiers containing at least a main finite verb). On the other hand, this narrative excerpt has not changed much with regard to subordination strategies. If anything, it contains slightly less subordination (mean clauses per T-unit is now 1.67, with

10 finite clauses spreading over 6 tiers with at least one main clause), and we find noun and relative clauses but no adverbial subordination. What has changed somewhat noticeably, and what seems responsible for the overall longer T-units, is the length of clauses, which has expanded to 9.3 words on average. This increase is due to the fact that there is now more phrasal modification (e.g. imaginando un mundo japonés completamente diferente de lo mío, con envolturas hecho de arroz). The language in Excerpt 2 is effective in the characteristic style of this good writer, and the personal narrative genre is the same as in Excerpt 1. It is less linguistically simple than before, however, and the increase in complexity, although not dramatic, is clearly associated with modification via reduced non-finite clauses, adjectives, and other such phrasal elaboration strategies. The inference of some progress between the two excerpted data is partly justified if we think this learner received 18 hours of formal instruction in the 6-week interval between production of the two pieces of writing and, even more importantly, that she completed substantial amounts of writing during that brief period, including six weekly journal entries and a 3-page draft of a research paper. Nevertheless, although it may be that some of the change we see between Excerpts 1 and 2 reflects interlanguage development in some overall sense, it is logically impossible to exclude other possibilities in view of the closeness of the two waves of data collection. In this particular case, the change may as well reflect in part the demands of the different topics chosen for each narrative, which lend themselves to a faster or a slower rhetorical pace (i.e. a race that ends in an accident versus the pleasure of opening presents). My goal in presenting this analysis has been to illustrate three points that will prove to be important in the discussion for the remainder of this chapter. 
First, I have shown in practice that the definitions and operationalizations of interlanguage complexity most widely employed in SLA research historically have relied heavily on the notions of length (e.g., average T-unit length in words) and density of subordination (e.g., mean number of finite clauses per T-unit). Second, I have raised the possibility that complexification strategies other than subordination can be important resources for writers, for example, phrasal modification in the case of the second excerpt by this L2 writer. Third, I have suggested that “development” or “change” in interlanguage production is difficult to ascertain in the absence of full-trajectory evidence (i.e., longitudinal data spanning a long enough period of time) and varied genre evidence (i.e. sufficient data from the same learner produced across sufficiently varied genres).
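The three ratios used in the commentary on Excerpts 1 and 2 are simple to reproduce from the counts reported for each excerpt. The following Python sketch is offered only as an illustration. The class, attribute, and function names are hypothetical, and the word, T-unit, and finite-clause counts are taken as given from the commentary rather than derived automatically, since segmentation into T-units and clauses depends on hand-coding decisions that are not modeled here. The sketch also makes explicit an arithmetic identity that underlies the comparison: mean length of T-unit equals mean length of clause multiplied by clauses per T-unit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleCounts:
    """Hand-coded counts for one writing sample (hypothetical container)."""

    words: int           # word tokens in the sample
    t_units: int         # tiers containing at least a main clause
    finite_clauses: int  # finite-verb clauses, main and subordinate alike

    @property
    def mean_length_of_t_unit(self) -> float:
        return self.words / self.t_units

    @property
    def mean_length_of_clause(self) -> float:
        return self.words / self.finite_clauses

    @property
    def clauses_per_t_unit(self) -> float:
        return self.finite_clauses / self.t_units


# Counts as reported in the commentary on the two excerpts.
EXCERPT_1 = SampleCounts(words=113, t_units=12, finite_clauses=22)
EXCERPT_2 = SampleCounts(words=93, t_units=6, finite_clauses=10)


def describe(label: str, sample: SampleCounts) -> str:
    """Format the three ratios to two decimals for side-by-side reading."""
    return (
        f"{label}: "
        f"MLTU={sample.mean_length_of_t_unit:.2f}, "
        f"MLC={sample.mean_length_of_clause:.2f}, "
        f"C/TU={sample.clauses_per_t_unit:.2f}"
    )


def length_identity_holds(sample: SampleCounts) -> bool:
    """Check that MLTU = MLC * C/TU (words/T-units = words/clauses * clauses/T-units)."""
    product = sample.mean_length_of_clause * sample.clauses_per_t_unit
    return abs(sample.mean_length_of_t_unit - product) < 1e-9


if __name__ == "__main__":
    for label, sample in (("Excerpt 1", EXCERPT_1), ("Excerpt 2", EXCERPT_2)):
        print(describe(label, sample), "| identity holds:", length_identity_holds(sample))
```

Because T-unit length factors into clause length and clauses per T-unit, a rise in the first can come from either factor. In the two excerpts the clause-length factor grows while the subordination factor falls slightly, which is the pattern described in the commentary.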

5. Construct reductionism

When SLA researchers measure interlanguage complexity, they typically declare an interest in the range and the sophistication of surface forms attested in oral or written samples produced by L2 users of the given target language. As illustrated in the previous analysis, in practice this has meant going through the data and calculating the average length of some syntactic unit of production or computing the density of subordination in the discourse. Some researchers have also occasionally counted the frequency of occurrence of selected forms that are considered to be linguistically sophisticated (e.g. Ellis and Yuan 2005). In this latter case, the choice of which forms will be counted as “more complex” is based on the fact that they exhibit low frequency of occurrence in mature adult first-language use and/or that they emerge late in first-language development. Obviously, timing of emergence in interlanguage development should be the most relevant criterion, but to my knowledge this information is never considered. If all three senses of complex (length, subordination, and selected form frequency) were applied to the same data in the same investigation concurrently, the evidence on which to claim “less” or “more” complexity would perhaps be more convincing. Unfortunately, SLA researchers typically apply only one of the three analytical strategies in isolation. An overwhelming reliance on subordination density is found in studies that focus on oral data (Norris and Ortega 2009) and a strong preference for operationalizing complexity as length is observed in studies that concentrate on second language written data (Ortega 2003). In the end, the strategy of operationalizing complexity exclusively as length, subordination, and more rarely as selected form frequency, might offer the conceptual virtue of simplicity and it definitely affords the practical advantage of ease of calculation. 
The downside, however, is that one must also be willing to believe that “more complex language use” can be boiled down to being able to plan and produce longer utterances full of subordination and/or of low-frequency and late-emerging forms. Even if the three qualities are posited to result from concurrent and necessary processes involved in complex language production, it is clear that there is a lot more to the construct of complexity as it relates to linguistic proficiency and interlanguage development. Furthermore, one would have to be willing to assume that the same notion of what counts as “more complex” remains largely constant and equally applicable (a) across the full developmental span of linguistic capacities by L2 users and (b) regardless of the genres, content, tasks, and modalities that they are asked to produce. It is easy to see that these premises are untenable. Thus, confinement to a single measurement of global length, subordination, or selected form frequency constitutes a case of construct reductionism. It sanctions an impoverished notion of the role of complexity in language use and language development that should be questioned in the field.

6. Challenges to measurement practices

Together with construct reductionism, several other weaknesses are obvious when one turns to evaluating the metrics that have been employed by SLA researchers in the name of interlanguage complexity measurement to date. At the methodological level, many SLA scholars have noted inconsistency and divergence in the definition and operationalization of the various measures employed (Foster, Tonkyn, and Wigglesworth 2000; Norris and Ortega 2009; Wolfe-Quintero, Inagaki, and Kim 1998). Thus, for example, the definition of “clause” may not be consistent across researchers (Ortega 2003). The definitional and operational problem is worsened by the fact that detailed coding schemes are rarely shared and intercoder reliability information is insufficiently reported. There is also a proliferation of different measures and units. Particularly striking is the variable choice of segmentation units, which then usually end up in the denominator of formulas, creating only slight differences in resulting numerical values with uncertain consequences for interpretation. For instance, the multiclausal unit of choice for segmentation of transcripts differs greatly across researchers, including sentences (in written data), utterances (in oral data), T-units (in both written data and spoken monologic data), c-units, idea units, and AS-units (this latter being a unit that has been devised specifically for interlanguage oral data by Foster et al. 2000). This variety of choices weakens overlap across studies and hampers comparison across reported values and findings. Even more important are conceptual considerations at the heart of measurement decisions. Most notably, research interpretations about interlanguage complexity are weakened by a lack of clarity in rationales that guide the choice of what to measure and why. 
Thus, some researchers might choose to measure complexity via the calculation of only one measure, a huge risk unless there is robust assurance that the specific chosen measure is itself to be trusted as a sufficient index of linguistic complexity for the particular purposes and types of data at hand. Conversely, other researchers may decide on the concurrent use of several complexity calculations that upon closer inspection turn out to be just slight variations of the same construct definition (e.g. three measures which represent three slightly different ways of calculating subordination density). This multiplies the hours of coding

work needed for analysis with no return in informational quality and, more seriously, it unwittingly creates undesirable collinearity, or the statistical problem that arises when some of the variables under study are all so highly intercorrelated to one another that when submitted to correlation-based inferential tests the results cannot be interpreted (Tabachnick and Fidell 1996; see discussion in Norris and Ortega 2009: 561, 574). The thin rationalization of what to measure and why under the name of interlanguage complexity is also seen in a recent unresolved disagreement as to whether length measures (e.g., mean length of utterance or mean length of T-unit) might index complexity, as traditionally assumed, or the very different construct of fluency, a quite unorthodox interpretive move (see Wolfe-Quintero et al. 1998, who argue for length as fluency; and Norris and Ortega 2009, who argue for length as complexity). In general, it is fair to say that SLA researchers seem to assume, first, that all linguistic complexity measures are equal, so choosing one over another does not really need any special justification and, furthermore, that all purposes and types of data can be accommodated with a given same measure, so again no accounting of purpose or data characteristics is needed when justifying complexity measurement choices. This is in stark contrast with the extensive research in the field of child language acquisition that over the years has pursued the goal to refine and validate different metrics. This program has yielded useful knowledge about, for example, the conditions under which a metric can be used in order to enable relevant interpretive inferences. 
As a way of illustration, we know that: (a) mean length of utterance is a good index of grammatical development in a first language only at a gross level of interpretation and when calculated on samples containing a minimum length of 50 utterances; (b) the word versus morpheme calculations for mean length of utterance correlate very highly with each other in early acquisition but mean length of utterance in morphemes is most reliable for samples of linguistic abilities typical of up to three years of age, whereas its predictive validity peaks past an average length of four morphemes per utterance; and (c) mean length of utterance in words may be preferable for tapping later language abilities typical of ages between 3 and 10 as well as for making better cross-linguistic interpretations (see, for example, studies by Hickey 1991; Pan 1994; Parker and Brorson 2005; Rice et al. 2006; Rollins et al. 1996). Clearly, first-language acquisition researchers expect no magic measurement strategy that is suitable across all developmental levels, purposes, types of data, and contexts. By contrast, SLA researchers seem to have given little thought to such considerations.
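The collinearity concern raised earlier in this section can be made concrete with a small sketch. It assumes, following the definition of the T-unit, that every T-unit contains exactly one main clause, so that the number of dependent clauses equals the number of finite clauses minus the number of T-units. The function names and the three measure labels are hypothetical and are not drawn from any particular study. Under that assumption, three plausible ways of expressing subordination density turn out to be deterministic transformations of one another, which suggests why computing all three adds coding effort without adding information.

```python
from __future__ import annotations


def subordination_measures(finite_clauses: int, t_units: int) -> dict[str, float]:
    """Return three subordination-density ratios from two hand-coded counts.

    Assumption (for illustration only): each T-unit has exactly one main
    clause, so dependent clauses = finite clauses - T-units.
    """
    if t_units <= 0 or finite_clauses < t_units:
        raise ValueError("need at least one T-unit and one clause per T-unit")
    dependent = finite_clauses - t_units
    return {
        "clauses_per_t_unit": finite_clauses / t_units,
        "dependent_per_t_unit": dependent / t_units,
        "dependent_per_clause": dependent / finite_clauses,
    }


def implied_by_first(clauses_per_t_unit: float) -> tuple[float, float]:
    """Recover the other two ratios from clauses per T-unit alone.

    dependent_per_t_unit = C/TU - 1
    dependent_per_clause = 1 - 1 / (C/TU)
    """
    return clauses_per_t_unit - 1.0, 1.0 - 1.0 / clauses_per_t_unit


if __name__ == "__main__":
    # Finite-clause and T-unit counts reported for Excerpts 1 and 2 in section 4.
    for label, clauses, t_units in (("Excerpt 1", 22, 12), ("Excerpt 2", 10, 6)):
        measures = subordination_measures(clauses, t_units)
        print(label, measures, implied_by_first(measures["clauses_per_t_unit"]))
```

Since the second and third ratios can be computed from the first, entering all three into a correlation-based analysis would present the same information three times.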

7. In search of theoretical renewal

Ultimately, the biggest and most urgent questions about interlanguage complexity which beg to be addressed in SLA are conceptual and go to the heart of construct definition: What exactly should count as linguistically “complex” in interlanguage production, what factors may constrain this answer, and how do we know? In order to craft good answers to these questions, fresh theoretical insights that support renewal of the construct and its associated measurement practices are desperately needed. Norris and Ortega (2009) have recently argued that Systemic Functional Linguistics (SFL; Halliday and Matthiessen 2006) has the potential to illuminate one of the best-trodden paths to operationalizing interlanguage complexity in SLA, subordination density. SFL views grammar as a resource for meaning-making that is grounded in social contexts of use. Thus, it offers a theory of language that is attuned to a general social reorientation that has been seen in much SLA at the turn of the century (Atkinson 2011) and is also compatible more specifically with usage-based and dynamic views of second language learning that are on the rise (de Bot, Lowie, and Verspoor 2007; N. Ellis and Larsen-Freeman 2006).

7.1. Dynamic and synoptic styles in Systemic Functional Linguistics

Of direct interest for the study of linguistic complexity is the continuum of dynamic and synoptic styles proposed by SFL theory. In a nutshell, the claim is that as a linguistic resource, subordination plays an important role in generating complex language use at initial levels of development and with modalities, genres, tasks, and contents that involve low formality and technicality. The importance of subordination (and therefore its usefulness for measuring interlanguage complexity) wanes, however, at advanced levels of development or proficiency and with modalities, genres, tasks, and contents that involve high formality and technicality. Under such developmental and communicative premises, language complexity shifts away from subordination and relocates in the processes of nominalization and grammatical metaphor, both SFL-specific concepts that I will explain and illustrate in this section. The language differences that arise along the dynamic-synoptic style continuum, and the various conceptual and communicative effects that are served, are illustrated in Excerpts 3 through 5, which are examples of grammatical metaphor that have been drawn from interlanguage data found in published literature. At the end of each excerpt I have juxtaposed instantiations of dynamic versus synoptic resources and have noted their meaning-making effects.

(3) Excerpt: From Byrnes (2009, p. 61), academic writing by a college-level L2 German writer

Aber sie hat sich fremd gefühlt. Dieses Gefühl …
‘But she felt foreign. This feeling …’

To feel (verb) → The feeling (noun)
– Effect: Greater cohesiveness of text

(4) Excerpt: From Mohan and Beckett (2001: 137), extract of oral exchange between teacher and L2 user student

Student: To stop the brain’s aging, we can use our bodies and heads
Teacher: So, we can prevent our brain from getting weak by being mentally and physically active?

Use bodies and heads (verb-centered action) → Be physically and mentally active (adjective-centered predication)
– Effect: Expert scaffolds student in science class to engage in a more academic style

(5) Excerpt: From Achugar and Colombi (2008: 47–48), extract from two essays written 9 months apart by same heritage language college-level writer

5.1. More grammatically intricate and subordination based complexity
Por esta razon, es importante conservar la diversidad lo cual solamente puede lograrse con la existencia de varias culturas.
‘For this reason, it is important to preserve diversity, which can only be achieved with the existence of diverse cultures’

5.2. More lexically dense and nominalization based complexity
Los revolucionarios esperaban obtener democracia, estabilidad economica y social, y principalmente la liberacion de un gobierno establecido por mas de tres decadas.
‘The revolutionary troops expected to attain democracy, financial and social stability, and above all the freedom of a government established for over three decades’

Clause combining 5.1. → Grammatical metaphor 5.2.

In terms of linguistic resources, as Excerpts 3 through 5 illustrate, more dynamic styles will involve using prototypical pairings of semantic content mapped onto grammatical categories. In other words, processes will be expressed by verbs, attributes will be denoted by adjectives, and propositions will be encoded in sentences built around finite verbs. More synoptic styles, by contrast, draw on pairings of meaning and form that are less prototypical. In other words, verbal processes, attributive qualities, and even whole propositions will often be linguistically expressed through a variety of noncongruent grammatical categories, and most often through a noun which serves to re-construe an action (to capture), a quality (beautiful), or a proposition (because they wanted) as static and detached “things” (the capture, beauty, of their own will). This idea of non-prototypical or grammatically non-congruent pairings, in essence, is formalized in the concept of grammatical metaphor (Simon-Vandenbergen, Taverniers, and Ravelli 2003), of which nominalization is the main type (see also Baratta 2010). Dynamic styles in mature language use are characterized by syntactically intricate language with reliance on clause combining, which gives way to high density of coordination and (at the highest levels of syntactic complexity within this style) subordination. On the other hand, synoptic styles are characterized, also in mature language use, by high lexical density and tightly packed information with a high degree of cohesiveness, much of which is a result of nominalization processes. Dynamic styles, according to SFL theory (e.g. Halliday 1998), serve users’ needs for oral, low-formality levels in non-technical communication typical of everyday life. By contrast, synoptic styles evolve out of the communicative purposes of written, high-formality levels of communication typical of academic, scientific, or technical expertise. 
From the perspective of individual language development, or ontogeny, dynamic styles emerge in tandem with the naturalistic acquisition of a first language or languages in the early years of life prior to schooling, whereas synoptic styles come about in the repertoire of speakers considerably later in life. In fact, synoptic styles are fostered by schooling and literacy and continue developing well into adolescence. Derewianka (2003) explains that children at around 9 or 10 years of age begin to comprehend nominalizations in English as a first language, but it is only by age 14 or 15 that they are capable of producing them as well. She helpfully emphasizes that dynamic styles are built on the non-linear and incremental addition of new language subsystems, that is, on the development of the nuts and bolts of lexico-grammar (including eventually the development of subordination). By contrast, the capacity that develops later with the emergence of synoptic styles “[…] is not a matter of constantly adding new

subsystems, but of deploying existing subsystems to serve new functions” (Derewianka 2003: 185). And indeed, from an L2 acquisition perspective, the empirical evidence has begun to corroborate that, with increasing competence in a target language, and if there is a concomitant need to meet increasingly more formal and technical communication demands, L2 users develop synoptic styles that are built out of nominalization and other forms of grammatical metaphor (see Achugar and Colombi 2008, who investigated Spanish interlanguage; and Byrnes 2009 and Ryshina-Pankova 2010, both of whom investigated German interlanguage). Importantly for a theory of interlanguage complexification, the reliance on nominalization occurs at the expense of subordination, whose importance recedes in the language used by advanced learners in formal and academic genres. Going back to the commentary on Excerpts 1 and 2 presented earlier (see section 4), it seems possible to explain the noted subtle change towards less subordination and more phrasal elaboration strategies (via reduced non-finite clauses, adjectives, and other modifiers) as a case of a gradual move from more dynamic to more synoptic complexification strategies.

7.2. Beyond subordination in the measurement of interlanguage complexity

The consequences of applying the theoretical insights of SFL to the relationship between subordination and linguistic complexity are far-reaching for researchers needing to measure interlanguage complexity for a variety of purposes. For broadly intermediate levels of proficiency, subordination density might be a telling indicator of increases in complexity, because the deployment of early-emerging, dynamic styles is at stake. As learners move towards advanced levels of proficiency, on the other hand, subordination is likely to peak or decrease and therefore its predictive or descriptive power greatly diminishes. At some point along the trajectory towards the most developed and mature second language capacities, nominalization and related grammatical metaphor processes are instead likely to be preferred as complexification strategies, and hence a measure other than subordination is needed. Ortega (2003) and Norris and Ortega (2009) thus have proposed that mean length of clause is a good global index of the complexity typical of the synoptic styles which should increasingly characterize advanced levels of mature use, particularly under formal (academic or technical) communicative demands. The reason is that grammatical metaphor impacts on the length achieved inside clauses, that is, it is likely to result in longer phrases

Lourdes Ortega

and hence longer individual clauses, but not in more clauses or in tighter integration across clauses via subordination. Subordination may still be a good choice for measuring complexity at the point in development when a higher density and range of subordinating devices begin to be deployed in production. There is evidence in the SLA literature that this occurs at low-intermediate to intermediate levels of proficiency, which in curricular terms can be expected to fall within the first two years of study for college-level second language learners (Byrnes, Maxim and Norris 2010; Norris and Pfeiffer 2003). On the other hand, subordination is likely to be dangerously uninformative in two other cases: (a) during early stages of emergent L2 use, when subordination is not yet part of the grammatical repertoire, and conversely (b) at the point of very advanced use, when L2 users have already reached a ceiling on how much subordination they can produce – after all, there is a limit to how many subordinate and embedded clauses mature speakers can keep packing into a given sentence without overtaxing working memory capacity. In the case of the writer of Excerpts 1 and 2, who in fact belonged to intermediate or upper-intermediate levels of proficiency, subordination was obviously already part of the interlanguage grammar in Excerpt 1, yet the increased complexity in Excerpt 2 was unrelated to it. Finally, Norris and Ortega (2009) argue that length represents a rather global notion of complexity and therefore a sufficiently robust operationalization to capture some kind of change via measurements. But it is also the least informative of the definitions of complexity, as it does not allow the researcher to know what kinds of complexification strategies account for one collection of utterances being longer on average than another. 
This was illustrated in the comparison of Excerpts 1 and 2, where the mean length of T-unit went from 9.42 words to 15.50 words, yet the lengthening had a different source in each case: dense subordination vs. greater phrasal modification. Byrnes, Maxim, and Norris (2010) have provided robust longitudinal and cross-sectional evidence supporting these claims for college-level German L2 learners. They demonstrate that reliable and noticeable increases in interlanguage complexity occur in their data over the span of four years of instructed development. As they explain, these changes from beginning to advanced levels were only adequately measured with the combined use of (a) a global length measure (e.g. mean length of T-unit), which is appropriate to index relatively well-sized increases in complexity regardless of their diverse provenance, (b) a subordination density measure (e.g. mean number of clauses per T-unit), which is sensitive to complexification at intermediate

Interlanguage complexity

levels, and (c) a phrasal lengthening measure (e.g. mean length of clause), which is able to capture increases in complexification that result from nominalization and other kinds of grammatical metaphor. Putting the findings in full-trajectory perspective, overall length was reflective of complexification all along the four years of college-level German study these researchers inspected but could not shed light on the different kinds of complexification involved that affected length at different time junctures; subordination density was a reliable and valid indicator only for changes that occurred during the first two years of study, as intermediate learners worked out subordination; and clausal length was a reliable and valid indicator only for changes that occurred after the third year in the data.
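The three measures just described can be illustrated with a minimal Python sketch. It assumes that a sample has already been segmented by hand into T-units, clauses, and word tokens; the names `complexity_profile` and `ComplexityProfile` and the toy sample are hypothetical and are not taken from Byrnes, Maxim and Norris (2010) or from any existing tool.

```python
from dataclasses import dataclass
from typing import List

# Assumed input format: a clause is a list of word tokens,
# a T-unit is a list of clauses, and a sample is a list of T-units.
Clause = List[str]
TUnit = List[Clause]


@dataclass
class ComplexityProfile:
    mean_length_of_t_unit: float   # (a) global length: words per T-unit
    clauses_per_t_unit: float      # (b) subordination density
    mean_length_of_clause: float   # (c) phrasal lengthening: words per clause


def complexity_profile(t_units: List[TUnit]) -> ComplexityProfile:
    """Compute the three indices for a pre-segmented sample."""
    n_t_units = len(t_units)
    n_clauses = sum(len(t_unit) for t_unit in t_units)
    n_words = sum(len(clause) for t_unit in t_units for clause in t_unit)
    if n_t_units == 0 or n_clauses == 0:
        raise ValueError("sample must contain at least one T-unit and one clause")
    return ComplexityProfile(
        mean_length_of_t_unit=n_words / n_t_units,
        clauses_per_t_unit=n_clauses / n_t_units,
        mean_length_of_clause=n_words / n_clauses,
    )


# Hypothetical toy sample (not learner data): two T-units, three clauses, nine words.
toy_sample: List[TUnit] = [
    [["the", "student", "said"], ["that", "she", "was", "tired"]],
    [["she", "left"]],
]

if __name__ == "__main__":
    profile = complexity_profile(toy_sample)
    print(profile.mean_length_of_t_unit)  # 4.5
    print(profile.clauses_per_t_unit)     # 1.5
    print(profile.mean_length_of_clause)  # 3.0
```

Because words per T-unit equals clauses per T-unit multiplied by words per clause, the sketch makes visible why the global length measure alone cannot show whether growth comes from more subordination or from longer clauses.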

8. Forecast of future developments

Many questions remain open to empirical verification for researchers who may be interested in further pursuing theoretical renewal of the construct of interlanguage complexity and its associated measurement practices. For one, most of the first-language research that investigates the SFL proposal of dynamic and synoptic styles has been carried out in English, and it will be imperative to engage in crosslinguistic comparisons in order to evaluate the importance of grammatical metaphor as a site of complexification and a developmental benchmark of the emergence of synoptic styles. For example, in a rare crosslinguistic typological comparison, Yang (2008) suggests that Chinese exhibits lower degrees of grammatical metaphor than English and a number of differences in the resources employed to create grammatically non-congruent expressions. An even more serious challenge for the crosslinguistic validity of the SFL proposal is that nominalization is highly frequent and expected in the dynamic, informal styles in other typologically distant languages, such as Fijian or Quechua (Jeff Siegel, personal communication, November 2010). While overall present support in SLA for the synoptic styles described by SFL spans several target languages (e.g. English, German, Spanish), interesting questions of not only crosslinguistic validity but also crosslinguistic influence open up. Thus, Neff et al. (2004) argued that formal written styles in first-language Spanish favor a much greater use of subordination than those in English, and that this linguistic-rhetorical preference is transferred by L2 writers to the new language, English. If Neff et al.’s interpretation of their data is correct, an improved crosslinguistic understanding of dynamic and synoptic styles is imperative if SLA researchers are to be able to generate precise appraisals of developmental patterns of complexification vis-à-vis crosslinguistic influences. In sum, it will be important to pay

closer attention to crosslinguistic comparisons and to understand grammatical metaphor from the combined perspective of the two languages of L2 users: How pervasive is grammatical metaphor in either language, in what styles does it occur, and through what lexico-grammatical resources is it instantiated? Second, it is worthwhile to investigate both subordination and grammatical metaphor not as unitary phenomena, but as comprising different dimensions and types of complexity. In her developmental longitudinal data, Derewianka (2003) distinguishes among several types of grammatical metaphor, from precursors and protometaphors all the way to mature adult use, such that we may profit from a notion of a developmental taxonomy of grammatical metaphor (see also Halliday and Matthiessen 1999). Likewise, it might be profitable to make finer complexity distinctions for different types of subordination or embedding. For example, in systemic functional grammar adverbial clauses are not considered embedded at the same level of integration as noun and relative clauses (Halliday 1985), and Diessel (2004), a child first language acquisition scholar working within the framework of usage-based linguistics, has grouped adverbials with coordinate clauses along a continuum of syntactic integration that, according to him, matters greatly for linguistic development. An even more nuanced proposal for gauging the relative structural complexity involved in formal registers, which must be mastered by mature first-language and L2 users alike, is also being worked out by Biber and colleagues. Building on corpus-based evidence of first language use in formal academic registers, Biber and Gray (2010) argue that the main source of linguistic complexity in written academic language (i.e. in synoptic styles) is phrasal compression, which is achieved by means of nominal and phrasal devices, whereas clausal elaboration characterizes the complexity of spoken registers (i.e. 
dynamic styles) and is achieved through verb-centered resources such as subordination and clausal embedding controlled by high-frequency verbs. The proposal greatly elaborates the view of subordination and grammatical metaphor offered by SFL theory. Biber, Gray, and Poonpon (2011) go on to propose a well worked-out developmental sequence in the learning of linguistic complexity along clausal elaboration and phrasal compression resources. Thus, for example, nominal subordination complexity would develop via the use of noun clauses first only in frames controlled by high-frequency verbs (e.g. think, know, say) and later by a wider range of verb types, even later in frames controlled by adjectives, and much later by nouns. Non-finite complementation, on the other hand, would be used first with high-frequency verbs such as want (want + to), then controlled by a wider range of verb types, later controlled by adjectives, and eventually deploying in extensive phrasal embedding in the noun phrase. This corpus-based framework may open fruitful avenues for future SLA work that seeks empirical support for the validity in L2 development of this L1-based developmental path of complexification resources, particularly if interlanguage analyses of complexity are also informed by useful developmental concepts from usage-based linguistics (Tomasello 2003), such as lexically specific learning and frequency-related island effects, amalgams, precursors, and protoconstructions (e.g. Diessel 2009). Particularly given the predilection in much SLA work for oral language data, in the future it will also be important to ask: Might the present proposal be most useful for the study of written language performance, but least useful for the analysis of oral language samples, even at advanced levels of L2 competence? The dynamic-synoptic distinction is meant to denote a continuum rather than a dichotomy (much as modality itself has come to be seen as a continuum rather than an oral-written dichotomy; see Biber 1988). Therefore, the relative contributions of subordination and phrasal elaboration to the make-up of interlanguage complexity should be investigated across a variety of genres along the oral and written mode (and in hybrid genres associated with computer-mediated communication, since these types of interlanguage data are also of keen interest in SLA research). It will also be important to compare monologic and dialogic oral data. Dialogue may invite the deployment of more dynamic styles, and if so subordination density might be most informative of linguistic complexity when analyzing dialogic oral L2 data. Conversely, dialogue may actually invite co-constructed turns that exhibit great interactional but low syntactic integration. In this alternative scenario, subordination would turn out to be less informative, as suggested by Foster et al. 
(2000), and some other index of linguistic-interactional complexity may be needed. The worthwhile questions just sketched really represent the tip of the iceberg of a much larger research program that would take modality, register, and genre as important in elucidating the nature of interlanguage complexity and its relationship to proficiency, performance, and development. SLA researchers interested in centrally accounting for genre in investigations of syntactic complexity will find many useful directions in the work by Berman and colleagues (e.g. Berman 2008; Berman and Nir-Sagiv 2004). Berman’s usage-based model of full language development offers analytical tools and theoretical explanations for the deployment of mature linguistic and discourse-rhetorical capacities during late language development in adolescence (up to ages 16–17). These can open up new terrain for the study of emerging mature abilities in adult interlanguage.

Finally, I have eschewed a discussion of what counts as linguistically “complex” at very early stages of development and/or proficiency because it would greatly exceed the limits of this chapter. However, I would be remiss if I did not at least point out the extreme importance of finding ways to measure “the complex” in novice performance. In Ortega and Sinicrope (2008), we investigated Spanish and German as a foreign language samples produced at a narrow range of beginning proficiencies, rated as Novice-Low, Novice-Mid, and Novice-High by the ACTFL Proficiency Guidelines (1999) and loosely equivalent to the A1 Level in the CEFR (2001). The challenges we encountered were eye-opening. For example, in our corpus we recorded mean lengths of total production per learner ranging from a low end of 28 words (for the most novice level) to a high end of 68 words (for the learners at the Novice-High level). These beginning performances were at the cutting edge of foreign language capacities for the novice stage, yet no reliable quantification of complexity in the traditional statistical senses described in this chapter can be pursued with such short samples. We also noted that successful task completion relied on verbless lists and that nouns were central to the language of these novices, with nouns accounting for at least about 50% of content words (and as much as 80% in the Novice-Low samples). Verbs and function words became gradually more visible at the Novice-Mid and Novice-High levels, whereas adjectives (and even more so adverbs) remained infrequent to nonexistent. We saw that, as novices begin to develop their interlanguage, their vocabulary also becomes more predictable and shared across individuals, that is, less idiosyncratic. In other words, the kinds of linguistic complexification that appear to be detectable at novice levels of L2 performance may have to be sought mostly in the realm of lexical resources. 
While SLA as a field has produced studies that traced emergent levels of proficiency and increasing levels of linguistic complexity and grammaticalization, these are scarce and have exclusively focused on naturalistic learners without instruction (e.g. Klein and Perdue 1997). Little is known about the incipient capacities of instructed foreign language learners before they reach the infamous and ubiquitous “intermediate” level investigated most widely in the SLA literature.
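The lexical observation above, that nouns make up a large share of content words in novice samples, can be expressed as a simple proportion. The following Python sketch is only an illustration under stated assumptions: the sample is taken to be already part-of-speech tagged, the tag labels (NOUN, VERB, ADJ, ADV) are assumed to follow the Universal POS tag inventory, and the function name and the toy tokens are hypothetical rather than drawn from Ortega and Sinicrope (2008).

```python
from collections import Counter
from typing import Iterable, Tuple

# Assumption: content words are those tagged as nouns, verbs, adjectives, or adverbs.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def noun_share_of_content_words(tagged_tokens: Iterable[Tuple[str, str]]) -> float:
    """Return the proportion of content-word tokens that are nouns."""
    counts = Counter(tag for _, tag in tagged_tokens if tag in CONTENT_TAGS)
    total_content = sum(counts.values())
    if total_content == 0:
        raise ValueError("sample contains no content words")
    return counts["NOUN"] / total_content


# Hypothetical toy sample (not learner data): three nouns, one adjective,
# one verb, and one function word that is ignored.
toy_tokens = [
    ("casa", "NOUN"),
    ("perro", "NOUN"),
    ("grande", "ADJ"),
    ("y", "CCONJ"),
    ("come", "VERB"),
    ("libro", "NOUN"),
]

if __name__ == "__main__":
    print(noun_share_of_content_words(toy_tokens))  # 0.6
```

With samples as short as those reported above (28 to 68 words per learner), such a proportion is best read descriptively rather than as a statistically reliable index.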

9. Conclusion

I hope to have shown why it is important to theorize the construct of linguistic complexity in interlanguage production. It brings to the fore a number of new questions SLA researchers should ask about the relationship between interlanguage complexity, on the one hand, and very novice and very advanced language capacities, full developmental trajectory, and genre

on the other. Under the thrust of functional theories, including SFL theory, usage-based linguistics, and corpus linguistics, it may be possible in the future to renew and expand researchers’ understandings of what is linguistically less or more “complex” in interlanguage and what different kinds of complexification resources are most available to learners at different points in their developmental trajectories. Two worthwhile goals guiding future efforts will be analytically accounting for the most novice and the most mature capacities of second language use, and devising theoretically motivated tools for the study of interlanguage complexity in second language production across a theorized continuum of modalities and genres.

References

Achugar, Mariana and M. Cecilia Colombi 2008 Systemic Functional Linguistic explorations into the longitudinal study of the advanced capacities: The case of Spanish heritage language learners. In: Lourdes Ortega and Heidi Byrnes (eds.), The Longitudinal Study of Advanced L2 Capacities, 36–57. New York: Routledge.
American Council on the Teaching of Foreign Languages [ACTFL] 1999 Proficiency Guidelines Revised. Yonkers, New York: Author.
Atkinson, Dwight (ed.) 2011 Alternative Approaches in Second Language Acquisition. New York: Routledge.
Baratta, Alexander M. 2010 Nominalization development across an undergraduate academic degree program. Journal of Pragmatics 42: 1017–1036.
Berman, Ruth A. 2008 The psycholinguistics of developing text construction. Journal of Child Language 35: 735–771.
Berman, Ruth A. and Bracha Nir-Sagiv 2004 Linguistic indicators of inter-genre differentiation in later language development. Journal of Child Language 31: 339–380.
Biber, Douglas 1988 Variation across Speech and Writing. Cambridge: Cambridge University Press.
Biber, Douglas and Bethany Gray 2010 Challenging stereotypes about academic writing: Complexity, elaboration, explicitness. Journal of English for Academic Purposes 9: 2–20.
Biber, Douglas, Bethany Gray and Kornwipa Poonpon 2011 Should we use characteristics of conversation to measure grammatical complexity in L2 writing development? TESOL Quarterly 45: 5–35.
Brown, Roger 1973 A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Byrnes, Heidi 2009 Emergent L2 German writing ability in a curricular context: A longitudinal study of grammatical metaphor. Linguistics and Education 20: 50–66.
Byrnes, Heidi, Hiram H. Maxim and John M. Norris 2010 Realizing Advanced Foreign Language Writing Development in Collegiate Education: Curricular Design, Pedagogy, Assessment. Special Issue Monograph of Modern Language Journal 94 (Supplement 1).
Council of Europe Modern Languages Division Strasbourg 2001 Common European Framework of Reference for Languages: Learning, teaching, assessment [CEFR]. Cambridge, UK: Cambridge University Press.

de Bot, Kees, Wander Lowie and Marjolijn Verspoor 2007 A dynamic systems theory approach to second language acquisition. Bilingualism: Language and Cognition 10: 7–21.
Derewianka, Beverly 2003 Grammatical metaphor in the transition to adolescence. In: Anne-Marie Simon-Vandenbergen, Miriam Taverniers and Louise J. Ravelli (eds.), Grammatical Metaphor: Views from Systemic Functional Linguistics, 185–219. Amsterdam: John Benjamins.
Diessel, Holger 2004 The Acquisition of Complex Sentences. New York: Cambridge University Press.
Diessel, Holger 2009 On the role of frequency and similarity in the acquisition of subject and non-subject relative clauses. In: Talmy Givón and Masayoshi Shibatani (eds.), Syntactic Complexity, 251–276. Amsterdam: John Benjamins.
Ellis, Nick C. 2008 The dynamics of second language emergence: Cycles of language use, language change, and language acquisition. Modern Language Journal 92: 232–249.
Ellis, Nick C. and Diane Larsen-Freeman 2006 Language emergence: Implications for applied linguistics – Introduction to the special issue. Applied Linguistics 27: 558–589.
Ellis, Rod and Gary Barkhuizen 2005 Analyzing Learner Language. New York: Oxford University Press.
Ellis, Rod and Fangyuan Yuan 2005 The effects of careful within-task planning on oral and written task performance. In: Rod Ellis (ed.), Planning and Task Performance in a Second Language, 167–192. Amsterdam: John Benjamins.
Foster, Pauline, Alan Tonkyn and Gillian Wigglesworth 2000 Measuring spoken language: A unit for all reasons. Applied Linguistics 21: 354–375.
Granfeldt, Jonas and Pierre Nugues 2007 Evaluating stages of development in second language French: A machine-learning approach. In: Joakim Nivre, Heiki-Jaan Kaalep, Kadri Muischnek and Mare Koit (eds.), NODALIDA 2007 Conference Proceedings: Tartu, Estonia, May 25–26. Retrieved August 2010 from: http://person.sol.lu.se/JonasGranfeldt/publikationer.html
Halleck, Gene B. 1995 Assessing oral proficiency: A comparison of holistic and objective measures. The Modern Language Journal 79: 223–234.
Halliday, Michael A. K. 1985 An Introduction to Functional Grammar. Baltimore, MD: University Park Press.
Halliday, Michael A. K. 1998 Things and relations: Regrammaticising experience as technical knowledge. In: James R. Martin and Robert Veel (eds.), Reading Science: Critical and Functional Perspectives on Discourses of Science, 185–235. New York: Routledge.
Halliday, Michael A. K. and Christian M. I. M. Matthiessen 1999 Construing Experience through Meaning: A Language-Based Approach to Cognition. London: Cassell.
Halliday, Michael A. K. and Christian M. I. M. Matthiessen 2006 Construing Experience through Meaning: A Language-Based Approach to Cognition. London: Continuum. [Study Edition of 1999 book]
Hickey, Tina 1991 Mean length of utterance and the acquisition of Irish. Journal of Child Language 18: 553–569.
Hudson, Thom 1993 Nothing does not equal zero: Problems with applying developmental sequences findings to assessment and pedagogy. Studies in Second Language Acquisition 15: 461–493.

Hunt, Kellogg W. 1965 Grammatical Structures Written at Three Grade Levels. Urbana, IL: The National Council of Teachers of English.
Jarvis, Scott, Leslie Grant, Dawn Bikowski and Dana Ferris 2003 Exploring multiple profiles of highly rated learner compositions. Journal of Second Language Writing 12: 377–403.
Klein, Wolfgang and Clive Perdue 1997 The basic variety (or: Couldn’t natural languages be much simpler?). Second Language Research 13: 301–347.
Larsen-Freeman, Diane 1978 An ESL index of development. TESOL Quarterly 12: 439–448.
MacWhinney, Brian 2000 The CHILDES Project: Tools for Analyzing Talk. Volume I: Transcription Format and Programs (3rd ed.). Mahwah, NJ: Lawrence Erlbaum.
Mohan, Bernard and Gulbahar H. Beckett 2001 A functional approach to research on content-based language learning: Recasts in causal explanations. The Canadian Modern Language Review 58: 133–155.
Myhill, Deborah 2008 Towards a linguistic model of sentence development in writing. Language and Education 22: 271–288.
Myles, Florence and Rosamond Mitchell no date French Learner Language Oral Corpora (FLLOC) project. http://www.flloc.soton.ac.uk/
Neff, Joanne, Emma Dafouz, Mercedes Díez, Rosa Prieto and Craig Chaudron 2004 Contrastive discourse analysis: Argumentative text in English and Spanish. In: Carol L. Moder and Aida Martinovic-Zic (eds.), Discourse across Languages and Cultures, 267–283. Philadelphia, PA: John Benjamins.
Norris, John M. 1996 A validation study of the ACTFL guidelines and the German speaking test. Unpublished Master’s thesis, University of Hawai’i.
Norris, John M. and Lourdes Ortega 2009 Towards an organic approach to investigating CAF in instructed SLA: The case of complexity. Applied Linguistics 30: 555–578.
Norris, John M. and Peter Pfeiffer 2003 Exploring the use and usefulness of ACTFL Guidelines oral proficiency ratings in college foreign language departments. Foreign Language Annals 36: 572–581.
Oh, Saerhim 2006 Investigating the relationship between fluency measures and second language writing placement test decisions. Unpublished Master’s thesis, University of Hawai’i at Manoa, Honolulu, HI.
Ortega, Lourdes 2000 Understanding syntactic complexity: The measurement of change in the syntax of instructed L2 Spanish learners. Doctoral dissertation, University of Hawai’i.
Ortega, Lourdes 2003 Syntactic complexity measures and their relationship to L2 proficiency: A research synthesis of college-level L2 writing. Applied Linguistics 24: 492–518.
Ortega, Lourdes 2009 Sequences and processes in language learning. In: Michael H. Long and Catherine J. Doughty (eds.), Handbook of Second and Foreign Language Teaching, 81–105. Malden, MA: Wiley/Blackwell.
Ortega, Lourdes and Heidi Byrnes 2008 Theorizing advancedness, setting up the longitudinal research agenda. In: Lourdes Ortega and Heidi Byrnes (eds.), The Longitudinal Study of Advanced L2 Capacities, 281–300. New York: Routledge.
Ortega, Lourdes and Castle Sinicrope 2008 Novice Proficiency in a Foreign Language: A Study of Task-Based Performance Profiling on the STAMP Test. Technical Report prepared for the Center for Applied Second Language Studies at the University of Oregon.

Pan, Barbara A. 1994 Basic measures of child language. In: Jeffrey L. Sokolov and Catherine E. Snow (eds.), Handbook of Research in Language Development Using CHILDES, 26–49. Hillsdale, NJ: Erlbaum.
Parker, Matthew D. and Kent Brorson 2005 A comparative study between mean length of utterance in morphemes (MLUm) and mean length of utterance in words (MLUw). First Language 25: 365–376.
Pienemann, Manfred 1992 Assessing Second Language Acquisition through Rapid Profile. University of Sydney: Language Acquisition Research Centre.
Pienemann, Manfred, Malcolm Johnston and Jürgen Meisel 1993 The multidimensional model, linguistic profiling, and related issues. Studies in Second Language Acquisition 15: 495–503.
Rice, Mabel L., Sean M. Redmond and Lesa Hoffman 2006 Mean length of utterance in children with specific language impairment and in younger control children shows concurrent validity and stable and parallel growth trajectories. Journal of Speech, Language, and Hearing Research 49: 793–808.
Robinson, Peter 2001 Task complexity, cognition and second language syllabus design: A triadic framework for examining task influences on SLA. In: Peter Robinson (ed.), Cognition and Second Language Instruction, 287–318. New York: Cambridge University Press.
Rollins, Pamela Rosenthal, Catherine E. Snow and John B. Willett 1996 Predictors of MLU: Semantic and morphological developments. First Language 16: 243–259.
Ryshina-Pankova, Marianna 2010 Toward mastering the discourses of reasoning: Use of grammatical metaphor at advanced levels of foreign language acquisition. Modern Language Journal 94: 181–197.
Samuda, Virginia and Martin Bygate 2008 Tasks in Second Language Learning. New York: Palgrave/Macmillan.
Scarborough, Hollis S. 1990 Index of productive syntax. Applied Psycholinguistics 11: 1–22.
Simon-Vandenbergen, Anne-Marie, Miriam Taverniers and Louise J. Ravelli (eds.) 2003 Grammatical Metaphor: Views from Systemic Functional Linguistics. Amsterdam: John Benjamins.
Skehan, Peter 1998 A Cognitive Approach to Language Learning. Oxford, UK: Oxford University Press.
Skidmore, Susana Troncoso and Bruce Thompson 2010 Statistical techniques used in published articles: A historical review of reviews. Educational and Psychological Measurement 70: 777–795.
Tabachnick, Barbara G. and Linda S. Fidell 1996 Using Multivariate Statistics. New York: HarperCollins.
Tarone, Elaine and Bonnie Swierzbin 2009 Exploring Learner Language. New York: Oxford University Press.
Thomas, Margaret 2006 Research synthesis and historiography: The case of assessment of second language proficiency. In: John M. Norris and Lourdes Ortega (eds.), Synthesizing Research on Language Learning and Teaching, 279–298. Amsterdam: John Benjamins.
Tomasello, Michael 2003 Constructing a Language: A Usage-Based Theory of Language Acquisition. Cambridge, MA: Harvard University Press.
Vandergrift, Larry (no date) Proposal for a Common Framework of Reference for Languages for Canada. Report prepared for the Department of Canadian Heritage, retrieved from: http://elp-implementation.ecml.at/IMPEL/Documents/Canada/ProposalofaCFRforCanada/tabid/122/language/fr-FR/language/en-GB/Default.aspx
Wolfe-Quintero, Kate, Shunji Inagaki and Hae-Young Kim 1998 Second Language Development in Writing: Measures of Fluency, Accuracy, and Complexity. Honolulu, HI: University of Hawai’i, Second Language Teaching and Curriculum Center.
Yang, Yanning 2008 Typological interpretation of differences between Chinese and English in grammatical metaphor. Language Sciences 30: 450–478.

Maria Steger and Edgar W. Schneider

Complexity as a function of iconicity: The case of complement clause constructions in New Englishes

1. New Englishes are simple, isn’t it?

As yet, there is no universally acknowledged and theory-independent definition of language complexity. Jeff Siegel (2004, this volume) accurately points out that researchers disagree on whether language simplicity can be judged absolutely, i.e. by some independent measure, or only comparatively, i.e. by comparison to another variety. Moreover, it is still unclear whether a variety can be classified as simple overall (holistic simplicity) or whether it is only possible to rate some part as in some sense simpler than a comparable part in the same or another variety (modular simplicity). And, above all, it is controversial whether language complexity should be understood quantitatively, i.e. in structural ‘more is more’ terms, or qualitatively, i.e. in psycholinguistic terms. Applied to indigenised second language varieties in postcolonial contexts, the issue becomes particularly delicate since simplicity is easily associated with reduction of complexity instead of, perhaps, lack of expansion and, furthermore, “[a]lleged deficiencies in these varieties are potentially interpretable as deficiencies in their speakers” (Williams 1987: 164). Despite these reservations, there appears to be an implicit understanding among experts that so-called New Englishes, i.e. institutionalised (originally scholastic) outer-circle types of English, are in some respect simple or at least simpler than Standard British English (henceforth: StBrE) (e.g. Görlach [1998, 1999] 2002: 115–116). Wong (1983), Williams (1987), and De Klerk (2003) focused on processes of simplification in a number of such varieties and their studies seem to imply that the psycholinguistic principle underlying many of the typical New English features is in fact a propensity for transparency (Steger fc.). In second language acquisition research, the one-to-one principle – i.e. 
a transparent one-to-one mapping of conceptual structure and surface form – has long been accepted as an operating principle or learning strategy (e.g. Andersen 1984). However, “the difficulty consists in making the notion of ST [semantic transparency] workable and giving it a functional place in existing linguistic theory” (Seuren and Wekker 1986: 64). Maximisation of transparency is often understood as resulting from some kind of naturalness or universality of the bond between meaning and form. If this bond is motivated, i.e. if the sign is iconic, minimal reliance on, and learning of, language-specific arbitrary symbolic conventions are required and language processing and language acquisition become easier. Haiman (1980, 1983, 1985b) has linked the one-to-one (or isomorphism) principle most firmly to the concept of (diagrammatic) iconicity and has integrated it with a cognitive-functionalist framework. Using a slightly different terminology, Givón (1985: 188) equates isomorphism with iconicity so that, to him, “an iconic code is ‘an isomorphically constructed code’”. Assuming that iconicity minimises the acquisition effort and should therefore be reflected in transitional interlanguage constructions, we expect traces of such conceptually motivated structural encoding to be found relatively more frequently in grammatical patterns of New Englishes, which are products of both individual and community second language acquisition, than in those of StBrE. This hypothesis, which seems to be supported by example-based descriptions of New Englishes (e.g. Platt, Weber, and Ho 1984; Mesthrie 2008c; Burridge and Kortmann 2008), will be empirically tested in the present study by investigating different levels of iconicity in complex complement clause constructions of verbs which, in StBrE, typically take non-finite NP to V clausal complements, as in John wanted Mary to come, John expected Mary to come, or John persuaded Mary to come. These verbs display complex – and sometimes very un-iconic – behaviour in their semantic and syntactic properties, with regard to, e.g., 
whether or not the noun phrase following the matrix verb is in itself a complement of this verb, whether the non-finite complement can also appear as a finite object clause, whether there is an overt complementiser, or whether the subject noun phrase of the embedded clause can undergo so-called raising to matrix clause subject position through passivisation. The syntactic constructions of a number of such verbs will be analysed in five International Corpus of English (ICE) corpora: four representing ESL-type New Englishes at different developmental stages, namely East African English (henceforth: EAfE), Hong Kong English (henceforth: HKE), Indian English (henceforth: IndE), and Singapore English (henceforth: SgE), and one representing StBrE (as a benchmark for quantitative comparisons). Both qualitative observations and quantitative tendencies will be reported. Adopting a cognitive-functionalist perspective (Haiman 1980, 1983, 1985a, 1985b; Givón 1985; DuBois 1985; Croft 2003), this paper opts for a comparative and modular interpretation of complexity and defines it in terms of qualitative speaker-oriented processing optimisation¹ as a function of iconicity, along the lines of Givón’s (1985: 189, emphasis original) iconicity meta-principle: “All other things being equal, a coded experience is easier to store, retrieve and communicate if the code is maximally isomorphic [in Haiman’s terms: iconic] to the experience”.

2. Complexity as a function of iconicity

2.1. Iconicity and second language acquisition

In this paper, language complexity is regarded as being inversely correlated with iconicity via learning effort. A maximally simple mode of communication relies on signs whose meanings are immediately obvious to all communicators. Solely icons, in which the form resembles the meaning by sharing, or being similar in, some of its qualities (shape, sound, taste, smell, behaviour, etc.), truly meet this condition (Peirce [1931] 1965: 158). The large majority of signs in language are not icons but symbols, in which the relation between surface form and lexical meaning or grammatical function is (at least synchronically) arbitrary and purely conventional so that it must be learnt by the language user.² While children master this task effortlessly in first language acquisition, it is common knowledge that adolescents and adults struggle with it in second language acquisition. As regards simplex linguistic signs (with the rare exception of onomatopoeia), the burden is entirely on the learner’s mnemonic skills. There is no way of figuring out form and function of a morpheme other than to memorise it. However, when it comes to complex linguistic signs, i.e. syntagmatic combinations of simplex signs, the acquisition process may sometimes be facilitated by a joint venture of several cognitive abilities. Simplex signs are intrinsically unmotivated, whereas complex signs may be motivated to some extent, and the learner may resort to general cognitive processes such as his “human pattern-finding skills” (Tomasello 2003: 4) to exploit their “relative motivation” and let “the mind succeed […] in introducing a principle of order and regularity into certain areas of the mass of signs” (de Saussure [1916] 1983: 131). Second language acquisition therefore “consists of sorting out form-function correlations” (Ellis 1985: 95; cf. also Ellis 2008: Chapter 4); the more clear-cut these get, the more efficient the interlanguage system becomes.

If a language learner has not yet learnt the meaning associated with a certain complex form, she – being, after all, a homo significans, a “meaning-maker” (Chandler 2007: 13) – may draw on her mental capacities of perception, identification, and categorisation to try to make sense of it. Knowing the (simplex) constituent elements is a prerequisite, but not enough, as the overall meaning of a complex construction is also determined by its syntactic organisation (John killed Mary vs. Mary killed John). Without external help, the learner will succeed only if the arrangement of the components of the complex form is not arbitrary but motivated and serves as a clue to the meaning by mirroring it. In other words, he will be able to infer the form-function correlation correctly if, and only if, the syntactic structure qualifies as the subtype of the icon which Peirce (1931: 159) called diagram. If the complex form-function correlation is based on an arbitrary (morpho-)syntactic convention, the learner will need external clues, e.g. explanations or rules, which he has to memorise, and the learning process will require more effort. Following Ellis (2008: 136), it is our hope that the form-function analyses presented in this study will shed some light on cognitive processes in second language acquisition.

¹ It should be noted that what makes encoding easier for the sender may not necessarily make decoding easier for the receiver. Meisel (1980: 14), for instance, differentiates between psychological simplification and perceptual simplification accordingly.

² For the purpose of this paper, we will broadly follow the Peircean model of meaning, which includes both reference and conceptual sense. Similarly, we will not distinguish between meaning and function and will, for the sake of clarity, equate semantic with conceptual structure.

2.2. Diagrammatic iconicity: isomorphism and motivation

We believe that, if second language acquisition consists of sorting out form-function correlations, it will be easier if these correlations are motivated by (diagrammatic) iconicity. In his theory of signs, Peirce (1931: 157) distinguishes between icons, indices, and symbols. He further differentiates three subtypes of the icon, namely images (“[t]hose which partake of simple qualities”), diagrams (“those which represent the relations, mainly dyadic, or so regarded, of the parts of one thing by analogous relations in their own parts”), and metaphors (“those which represent the representative character of a representamen [sign] by representing a parallelism in something else”). According to Haiman (1985b), languages bear particular similarity to diagrams. As will be discussed in more detail below, iconicity, which idealises overt coding of each and every concept, is only one of two competing cognitive phenomena that underlie structural encoding in language. Its antagonist, economy, favours zero coding in order to minimise expression where possible.

2.2.1. Syntagmatic isomorphism

Isomorphism is concerned with the way individual conceptual categories are or are not rendered in language (Bolinger 1977). While paradigmatic isomorphism relates to a one-to-one correspondence between form and meaning in the mental lexicon and favours monosemy (over synonymy, homonymy, and polysemy), this paper is interested mainly in syntagmatic isomorphism, which favours a one-to-one correspondence of meanings and forms in syntactic constructions. Double markings of e.g. tense or person are regarded as un-iconic, as they violate the principle of syntagmatic isomorphism by redundantly mapping a single concept onto two surface forms. In passive or raising constructions, the deletion of the agent, which certainly figures in the cognitive conceptualisation of the situation described by the verb, also destroys iconicity, which increases processing complexity (see also Rohdenburg 1996: 162). Table 1 (adapted from Croft 2003: 104) illustrates possible form-meaning correspondences and their underlying cognitive motivation(s).

Table 1. Possible form-meaning correspondences in syntagmatic isomorphism

| concept(s) | form(s) | iconic | economic | |
|---|---|---|---|---|
| 1 | 1 | + | – | classic iconic structure |
| 1 | 0 | – | + | zero expression of category |
| 0 | 0 | – | + | absence of category |
| >1 | 1 | – | + | cumulation |
| 0 | 1 | – | – | empty forms |
| 1 | >1 | – | – | redundancy |
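
The correspondences in Table 1 can be read as a simple lookup from the number of concepts and the number of forms to a correspondence type. The following Python sketch is purely illustrative and not part of the original analysis: the names `Correspondence`, `TABLE_1`, `bucket`, and `classify` are hypothetical, and the iconic/economic values simply restate the cells of Table 1 as given above.

```python
from typing import NamedTuple, Optional


class Correspondence(NamedTuple):
    label: str
    iconic: bool
    economic: bool


# Keys are (concepts, forms); ">1" stands for "more than one".
# The values restate the rows of Table 1; nothing here is new data.
TABLE_1 = {
    ("1", "1"): Correspondence("classic iconic structure", True, False),
    ("1", "0"): Correspondence("zero expression of category", False, True),
    ("0", "0"): Correspondence("absence of category", False, True),
    (">1", "1"): Correspondence("cumulation", False, True),
    ("0", "1"): Correspondence("empty forms", False, False),
    ("1", ">1"): Correspondence("redundancy", False, False),
}


def bucket(n: int) -> str:
    """Map a count onto the three values used in Table 1: 0, 1, >1."""
    if n < 0:
        raise ValueError("counts cannot be negative")
    return str(n) if n <= 1 else ">1"


def classify(concepts: int, forms: int) -> Optional[Correspondence]:
    """Return the Table 1 row for a concept/form count, or None if the
    combination is not listed in the table."""
    return TABLE_1.get((bucket(concepts), bucket(forms)))


# Double tense marking: one concept mapped onto two surface forms.
print(classify(1, 2))
# A form without a conceptual counterpart.
print(classify(0, 1))
```

Combinations that Table 1 does not list (for example, several concepts rendered by several forms) return `None` rather than being assigned to a row by guesswork.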

2.2.2. Iconic motivation: structural transparency

The iconicity claim is twofold: the parts of the syntactic structure correspond to the parts of the conceptual structure (syntagmatic isomorphism) and relations between the parts of the syntactic structure correspond to relations between the parts of the conceptual structure (motivation).³ It is important to repeat here that the individual signs, i.e. the points in the diagram, are symbolic in nature. It is only their combination into a motivated syntactic construction which may create a relationship of similarity. Several different – and sometimes competing – types of parallelism between formal structure and conceptualisation have been put forward (e.g. Haiman 1985b: Chapter 6). Three of them are particularly well-studied and of special relevance for our acquisition-related study: iconicity of sequence, iconicity of distance, and iconicity of independence.

Iconicity of sequence exploits the linearity of human speech to reflect extralinguistic sequentialities of time, cause, or purpose. Conceptual relations are encoded simply by the order of the elements. Conversely, such conceptual relations can be inferred intuitively from the structure: the (meaningful) sequence of events is immediately obvious in a sentence such as I sliced the apple and ate it. However, though grammatically identical, the reverse order (I ate the apple and sliced it) does not work without a formal “diacritic” (Haiman 1985b: 12), such as before (as in before I ate the apple, I sliced it), which warns the addressee that the linguistic structure does not parallel the conceptual structure and “rectif[ies] or override[s] the misleading implications of the structure of the diagram itself” (Haiman 1985b: 12). Unlike motivated constructions, such diacritic or auxiliary signs and associated structures obviously have to be learnt. The principle of post hoc, ergo propter hoc can lead to a causal interpretation of sequentially iconic structures: I was hungry and ate an apple could plausibly be understood as ‘because I was hungry, I ate an apple’. Similarly, it is logically possible to read the sender’s purpose into the example sentence above: ‘I sliced the apple in order to eat it’. There are many examples in the literature on New Englishes which show this basic motivation at work:

(1) a. You are going where? ‘Where are you going?’ (movement logically leads to goal) (CamE; Mbangwana 2008: 422)
    b. You eat finish, go out and play. ‘After you have finished eating, you go out and play.’ (SgE & MalE; Platt, Weber, and Ho 1984: 71)

³ Givón (1985: 188) does not follow Haiman’s distinction between isomorphism and motivation. He argues that isomorphism means ‘similarity in form’, and that “the notion of form, in turn, does not imply only agreement in a number of matching parts [Haiman’s isomorphism], but also in their relationships [Haiman’s motivation]”. To him, therefore, “an iconic code is ‘an isomorphically constructed code’”.

Raising constructions of the type Mary is expected to come violate not only isomorphism (by deleting the agent) but also iconicity of sequence (see Section 3.2). Iconicity of distance holds that a perceived close relationship of conceptual categories is mirrored by the adjacency of their formal counterparts (Haiman 1985b: 106–107; 111). The complement clauses under investigation have notional subjects of their own; if these are raised to matrix clause subject or object function, their affiliation to the complement verb gets stretched (see Section 3.2). Iconicity of independence may either be regarded as a third motivation in its own terms or as a subtype of iconicity of distance; the two are closely related. The conceptual independence of two propositions is iconically mirrored by the grammatical separateness and independence of the corresponding two clauses, which – if the clauses are not realised as two distinct sentences altogether – is defined by the finiteness of their verbs and may be reinforced by the insertion of a so-called ‘dividing’ complementiser. According to Haiman (1983: 799), “[a] crude approximation to the notion of conceptual independence is provided by the notion of entailment: Given two propositions S1 and S2, where S1 entails S2, S2 is dependent on S1”. In the patterns investigated in this study, the proposition expressed in the complement clause is not logically entailed by the matrix clause proposition and should therefore ideally be rendered as a grammatically independent (finite) clause. However, building on the same general idea, Givón (1980) established systematic intra-language and cross-language correlations between various conceptual parameters of complement-taking verbs and the syntactic structure of their complements. 
His findings are represented in the so-called “binding hierarchy”, which predicts the syntactic coding of the complement clauses of such semantically defined verb types.⁴ The verbs investigated in this study occupy the middle range of the scale, which – a bit simplified – implies that they typically take infinitival complements. Are there any reasons to expect these verbs to behave differently in New English varieties such that they prefer (more independent) finite object clause constructions? Prompted by Hawkins’ (1990, 1992) research, Rohdenburg (1996: 173) has demonstrated that iconicity of independence can be in conflict with a complexity (or transparency) principle, according to which more explicit or isomorphic-iconic linguistic options are preferred in more complex environments. Contexts which are said to regularly trigger more iconic grammatical alternatives include the following: “discontinuous constructions of various kinds”, “(the presence of) more or less complex surface objects preceding finite and non-finite clauses”, “heavy subject expressions (including subordinate clauses)”, “complex subordinate clauses”, and “passive constructions”. The prototypical (NP) to V complementation pattern (John expected Mary to come) and its raised counterpart (Mary was expected to come) obviously meet a number of the above criteria and should therefore qualify as complex environments. Moreover, Rohdenburg (1996: 160) shows that less common and therefore less familiar verbs tend to involve a greater processing burden and are also likely to activate the complexity principle. Consequently, even though Rohdenburg himself (1996: 166) emphasises that (in first language production) “the rivalry between finite and nonfinite complement clauses with promise or other verbs is largely determined by the distance [here: independence] principle” in terms of Givón’s (1980) binding hierarchy, the above factors may have some impact on the final choice, and perhaps particularly so in (early) second language production, assuming that foreign language verbs are less well-known to the learner and (basic) second language varieties are remarkably transparent or isomorphic (Klein and Perdue 1997: 304). On the whole, Rohdenburg (1996) clearly lends some support to the approach adopted in this paper and the hypothesis that iconicity simplifies complex syntactic structures such as the ones outlined above, which arise with the kinds of verbs investigated here.

⁴ In StBrE, for instance, the binding hierarchy accounts for different formal realisations and (subtle) semantic differences such as the following: because make is more manipulative than cause, make takes the bare infinitive without to; she told him to eat implies a stronger order than she told him that he should eat; demand is less manipulative and therefore allows only that-complements; etc.

2.3. Iconicity versus economy

It is a well-known fact that iconicity is counterbalanced by another cognitive phenomenon, namely economy, which is why not all natural languages are maximally iconic (see, for example, Kusters 2000: 227). Iconicity, however, can be conjectured to have a particular impact on New English systems for two main reasons, which will be touched upon only briefly here (see Steger fc. for more details). New Englishes are products of individual and group second language acquisition and language shift. As indigenised varieties, they are comparatively young and thus likely not yet strongly affected by iconicity-destroying effects of lexicalisation and grammaticalisation (see, for instance, Trudgill 2001). Section 2.1 has shown that iconic structures are easier to learn and employ because they
are decodable with the help of general cognitive skills and, hence, are immediately accessible to the language user. In post-colonial contexts, reliance on such general intellectual capacities in language processing and language learning has probably been reinforced by restricted access to the StBrE target (Mesthrie 2003: 453; Simo Bobda 2000: 61) and therefore limited exposure to, and entrenchment of, StBrE (symbolic) conventions. Thanks to the “lousy language learning abilities of the human adult” (Trudgill 2001: 372) and all they entail for linguistic complexity, iconicity effects may be expected to be most prominent in varieties which are closer to the acquisition stages and less advanced in the evolutionary cycle outlined in the Dynamic Model (Schneider 2007). New Englishes are spoken in multilingual and multidialectal societies. In all likelihood, intra-national speech accommodation will assure mutual intelligibility with basi- and mesolectal varieties to some extent. This again perhaps promotes iconic structures, as basic second language varieties seem to prioritise natural and transparent form-function relationships (Klein and Perdue 1997: 304). Indeed, many of the features described in the literature as typical of New Englishes (e.g. canonical subject-verb-object order, invariant tag question, no distinction between mass and count nouns, analytic and non-redundant tense marking, overgeneralisation of grammatical markers, etc.) can be regarded as being motivated by iconicity rather than economy. Williams (1987) has tried to explain similarities in the grammatical structures of diverse New Englishes in terms of psycholinguistic processes. And she relates all of the four strategies she distinguishes to the one-to-one principle at some point, although she attributes only two of them to a demand for explicitness or ‘hyperclarity’ (the other two being seen as associated with economy instead). 
Szmrecsanyi and Kortmann (2009: 77), in their study on simplification and complexification in non-standard varieties of English, also conclude that “L2-varieties trade off grammaticity for transparency”.⁵ This paper sets out to investigate a complex conceptual configuration, which in its default StBrE (NP) to V encoding violates most of the iconic motivations introduced above.

⁵ We do not claim, of course, that iconicity is the only factor determining grammatical structures in New English varieties – transfer effects, for example, certainly play a major role as well (see, for example, Odlin, this volume). We sought to control for these to some extent by choosing New Englishes with as diverse first language backgrounds as possible.

3. Hypothesis formation and testing procedure

3.1. Complement clause constructions in StBrE

Hawkins (1986) has labelled English a loose-fit language, in which there is, in comparison with German, a tendency towards weakened one-to-one correlations between the underlying conceptual structure and explicit surface realisations. Non-finite complement clause constructions of the type (NP) to V are manifestly less isomorphic-iconic than finite alternatives:

(2) a. I want [him to do that] (non-finite)
    b. I want [that he should do that] (finite)

Categories which are explicitly marked in the finite construction but which are suppressed in the less iconically independent non-finite version include tense and agreement, modality, and a complementiser. The formal mapping of the notional subject of the complement verb onto the grammatical direct object function of the matrix verb further violates iconicity of distance and independence.⁶ There are a number of verbs in English that take non-finite complements of the type described above. However, with regard to alternative constructions, not all of them behave in the same way, which explains fuzzy and sometimes conflicting categorisations in the literature (Quirk et al. 1985; Langacker 1996; Aarts 1997; Huddleston and Pullum 2002). Still, there seems to be some agreement on three broad verb classes:

– want-class (Quirk et al. 1985: S-V-DO(NP to V))
  With verbs of the want-type (hate[2], love1/2, prefer1/2, want[1], wish1/2),⁷ the noun phrase following the matrix verb cannot become the subject of a passivised matrix clause. Especially in American English, the noun phrase is sometimes preceded by for. These verbs usually do not allow an alternative finite that-clause construction.

– believe-class (Quirk et al. 1985: S-V-DO(NP)-OC(to V))
  With verbs of the believe-type (allow[1], authorise[1], believe1/2, cause[1], enable[1], expect1/2, require1/2), the direct object noun phrase can become the subject of the passivised matrix clause. Most of them (the factual verbs to be precise) allow a finite that-clause alternative with an indicative verb. The finite version is preferred with complement verbs other than be, “except that the infinitive construction provides a convenient passive form” (Quirk et al. 1985: 1204).

– persuade-class (Quirk et al. 1985: S-V-IO(NP)-DO(to V))
  In the persuade-class (advise[1], persuade[1], promise[3]), the indirect object noun phrase can become the subject of the passivised matrix clause. The iconic alternative finite that-clause construction is rare and considered more formal, with the complement verb often being in the subjunctive mood in American English or preceded by putative should in StBrE.

⁶ In the following, in order to achieve a terminological parallelism between the formal-grammatical and semantic-conceptual levels, the terms ‘notional subject’ and ‘notional object’ will be used to refer to the theta-roles of agent on the one hand and patient or goal on the other.

⁷ The subscript numbers identify a verb with the type(s) according to Tables 2 to 4. If a subscript number is in square brackets, the verb occurred only in constructions of this one type in the corpora.

Starting out from these classes, we gradually developed a scheme of syntactic structures which reflects a cline of increasing iconicity. This cline is schematically represented and illustrated in Tables 2 to 4. We distinguish between three possible basic types, depending on the conceptualisation of the noun phrase in the default structure (shaded in dark grey). For each type, the number and form-function relationships of the constituents are different. Drawing on the theoretical framework outlined in Section 2.2, within each table, we postulate a relationship that the earliest patterns are the least iconic ones, that the level of iconicity from one pattern type to another is gradually increasing, and that the relatively last pattern types are comparatively most iconic, as the examples will show. Note that not all of these patterns are grammatical in StBrE, but all are attested in some varieties. It should be added that variant meanings of the same verb (polysemes) may behave and be categorised independently of each other. With type-1 verbs, the notional subject of the matrix verb is a different entity from the notional subject of the complement verb; the latter is represented as ‘NP’ in the tables and in the following.

Table 2. Cline of iconicity: Type 1

Type 1: NP = Scomp ≠ Smatrix ≠ Omatrix (= Omatrix with persuade-type)

| type | finiteness | surface structure | example |
|---|---|---|---|
| 1-nf (a) | non-fin | [ØNP] to V | John advises to come. |
| 1-nf (b) | non-fin | [ØNP] V-ing | John advises coming. |
| 1-nf (c) | non-fin | NP to V (reduced passive relative) | [This is] a person expected to come, … |
| 1-nf (d) | non-fin | NP V-ing (reduced passive relative) | [This is] a person expected coming, … |
| 1-nf (e) | non-fin | NP to V (contact clause) | [This is] a person John expects to come, … |
| 1-nf (f) | non-fin | NP V-ing (contact clause) | [This is] a person John expects coming, … |
| 1-nf (g) | non-fin | NP to V (relative) | [This is] a person who(m) John expects to come, … |
| 1-nf (h) | non-fin | NP V-ing (relative) | [This is] a person who(m) John expects coming, … |
| 1-nf (k) | non-fin SSR-1 | NP to V (passive) | Mary is expected to come. |
| 1-nf (l) | non-fin SSR-1 | NP V-ing (passive) | Mary is expected coming. |
| 1-nf (m) | non-fin SOR-1a/b | NP to V | John expects / persuades Mary to come. |
| 1-nf (n) | non-fin SOR-1a/b | NP V-ing | John expects / persuades Mary coming. |
| 1-nf (o) | non-fin | NP compl to V | John prefers for Mary to come. |
| 1-nf (p) | non-fin | NP compl V-ing | John prefers for Mary coming. |
| 1-nf (q) | non-fin | NP to V (it-construction) | It is expected (of Mary) to come. |
| 1-int | non-fin/fin | NP [Øto] V | John expects Mary come. |
| 1-f (a) | fin | [Øcompl] NP V | John expects Mary comes. |
| 1-f (b) | fin | compl NP V | John expects that Mary comes. |
| 1-f (c) | fin | [Øcompl] NP V (it-construction) | It is expected Mary comes. |
| 1-f (d) | fin | compl NP V (it-construction) | It is expected that Mary comes. |
| 1-f (e) | fin | NP(Om)® [Øcompl] NP® V | John persuades Mary she comes. |
| 1-f (f) | fin | NP(Om)® compl NP® V | John persuades Mary that she comes. |
| 1-f (g) | fin | [ØNP(Om)] (compl) NP^new V | John advises that something happens. |
| 1-f (h) | fin | NP(Om) (compl) NP^new V | John advises Mary that something happens. |

The conceptualisation is slightly different with persuade-type verbs; here, the noun phrase preceding the complement verb not only refers to the notional subject of the complement verb but, simultaneously, to the notional object of the matrix verb (see Section 3.2). In 1-f (e) and 1-f (f), in which this double function is made explicit on the surface level, a superscript ‘®’ symbolises the referential identity between the notional object of the matrix verb ‘NP(Om)’ and the notional subject of the complement verb ‘NP’. For the sake of clarity, we decided against separating persuade[1] and advise[1] into a new type 4 although, strictly speaking, the conceptualisation underlying the default pattern 1-nf (m) is different from that of verbs such as expect1. In contrast to type 2 and type 3, however, the notional subject of all type-1 complement verbs is never referentially identical with the notional subject of type-1 matrix verbs. 1-f (g) and 1-f (h) distinguish yet another conceptualisation. Here, the notional subject of the complement verb is not coreferential with the notional object of the matrix verb but refers to a new entity, as symbolised by the superscript ‘new’ in Table 2.

Table 3. Cline of iconicity: Type 2

Type 2: NP = Scomp = Smatrix

| type | finiteness | surface structure | example |
|---|---|---|---|
| 2-nf (a) | non-fin SSR-2 | [ØNP®] to V | John expects to come. |
| 2-nf (b) | non-fin SSR-2 | [ØNP®] V-ing | John expects coming. |
| 2-int | non-fin/fin | [ØNP®] [Øto] V | John expects come. |
| 2-f (a) | fin | [Øcompl] NP® V | John expects he comes. |
| 2-f (b) | fin | compl NP® V | John expects that he comes. |

Type-2 verbs (see Table 3) are different from type-1 verbs in that the notional subject of the complement verb is coreferential with the notional subject of the matrix verb and superficially absent in the default non-finite construction 2-nf (a).

Table 4. Cline of iconicity: Type 3

Type 3: NP(Om) = IOmatrix ≠ Scomp
        NP = Scomp = Smatrix

| type | finiteness | surface structure | example |
|---|---|---|---|
| 3-nf (a) | non-fin SSR-3a | [ØNP(Om)] [ØNP®] to V | John promises to come. |
| 3-nf (b) | non-fin | [ØNP(Om)] [ØNP®] V-ing | John promises coming. |
| 3-nf (c) | non-fin SSR-3b | NP(Om) [ØNP®] to V | John promises Mary to come. |
| 3-f (a) | fin | [ØNP(Om)] [Øcompl] NP® V | John promises he comes. |
| 3-f (b) | fin | [ØNP(Om)] compl NP® V | John promises that he comes. |
| 3-f (c) | fin | NP(Om) [Øcompl] NP® V | John promises Mary he comes. |
| 3-f (d) | fin | NP(Om) compl NP® V | John promises Mary that he comes. |

Type 3 comprises only promise[3]. In the default non-finite construction 3-nf (c) – which is only superficially identical with 1-nf (m) – the noun phrase preceding the complement verb is not its notional subject but, instead, the notional object of the matrix verb and therefore symbolised as ‘NP(Om)’ in Table 4. The notional subject of the complement verb is coreferential with the notional subject of the matrix verb as becomes obvious in the finite constructions. In contrast to type 2, promise[3] is conceptualised with a notional object even though it may be absent from some surface structures (3-nf [a/b], 3-f [a/b]). Consequently, 3-nf (a) is only superficially identical with 2-nf (a).

3.2. The un-iconicity of non-finite complement clause and raising constructions in English

All three verb types allow various so-called raising constructions, which cause disparities between the semantic-conceptual and the formal-grammatical configurations. The following figures develop more specific types of raising constructions by contrasting the constituents’ representation on the formal (surface) level with their representation on the conceptual level, and by relating discrepancies to specific types of iconicity violations.

SOR-1a:            John       expects      Mary / her       to      come.
formal level:      Smatrix                 Omatrix          to
conceptual level:  Smatrix                 Scomp            [Ø]
                   John expects [Mary (will) come.]
→ Omatrix = Scomp: violation of iconicity of independence
→ to = Ø: violation of isomorphism / iconicity of distance

Figure 1. Subject-to-object raising with type-1 verbs (except persuade[1] and advise[1])

SOR-1b:            John       persuades    Mary / her       to      come.
formal level:      Smatrix                 Omatrix          to
conceptual level:  Smatrix                 Omatrix + Scomp  [Ø]
                   [John persuades Mary.] + [Mary (will) come.]
→ Omatrix = Omatrix + Scomp: violation of syntagmatic isomorphism
→ Omatrix = Scomp: violation of iconicity of independence
→ to = Ø: violation of isomorphism / iconicity of distance

Figure 2. Subject-to-object raising with type-1 verbs persuade[1] and advise[1]

Figures 1 and 2 elaborate on the type-1 default structure 1-nf (m). While, with most of the type-1 verbs, only iconicity of independence is violated by formally integrating the proposition of the complement clause into the matrix clause (by raising the independent notional subject of the complement verb to the dependent grammatical object function of the matrix verb), SOR-1b constructions with persuade[1] and advise[1] in addition impair isomorphism: the surface level does not reflect the double conceptual functions of the noun phrase in the complement clause. Finally, the formal-grammatical infinitive marker to (a diacritic which lacks a conceptual counterpart and has to be learnt) un-iconically increases the distance between the complement verb and its subject in all constructions presented in this section.

SSR-1:             Mary       is expected      to      come.
formal level:      Smatrix                     to
conceptual level:  Scomp                       [Ø]
                   [Smatrix] expect(s) [Mary (will) come.]
→ Ø = Smatrix: violation of syntagmatic isomorphism
→ separation of Scomp – Vcomp: violation of iconicity of sequence / distance / independence
→ to = Ø: violation of isomorphism / iconicity of distance

Figure 3. Subject-to-subject raising with type-1 verbs

SSR-2:             John               expects      to      come.
formal level:      Smatrix                         to
conceptual level:  Smatrix + Scomp                 [Ø]
                   John expects [John (will) come.]
→ Smatrix = Smatrix + Scomp: violation of syntagmatic isomorphism
→ separation of Scomp – Vcomp: violation of iconicity of distance / (independence)
→ to = Ø: violation of isomorphism / iconicity of distance

Figure 4. Subject-to-subject raising with type-2 verbs

Figures 3 and 4 exemplify the iconicity violations caused by so-called subject-to-subject raising constructions with type-1 and type-2 verbs (1-nf [k] and 2-nf [a], respectively). While in SSR-2, the formal separation of the complement verb from its notional subject obviously causes some distance – which is further increased by the un-isomorphic to – between these conceptually closely related constituents, the breach of iconicity of independence is only minor because the two propositions expressed by the two clauses share the same notional subject and are conceptually closely related. This is very different in SSR-1, where the splitting of the notional complement clause subject from its verb not only impedes processing due to a considerable blurring of iconicity of distance and independence, but particularly due to the violation of iconicity of sequence, the most basic type of iconicity: first someone expects, then Mary comes. Moreover, in both cases, there is no one-to-one relationship between the conceptual and the formal levels. In SSR-1, the notional subject of the matrix verb (i.e. the person who expects) is absent from the surface structure altogether. In SSR-2, the referentially identical notional matrix clause subject and notional complement clause subject are both mapped onto the grammatical subject function of the matrix clause. It is likely that learners have particular difficulties in processing the formal SSR-2 structure, for it is ambiguous in that it may represent several different conceptualisations: 2-nf (a) as outlined above or 1-nf (a), in which the formally absent notional subject of


Complexity as a function of iconicity

the complement verb is not referentially identical with the notional subject of the matrix verb, in which case it is impossible to know from the surface structure whom John expects or advises to come. Although 1-nf (a) is uncommon in StBrE, it recurs in the New English corpus data, which may be taken as an indication of processing insecurities caused by the multiple conceptualisations behind this non-finite pattern. Figure 5 demonstrates yet another conceptualisation of this surface structure, namely type 3-nf (a). SSR-3a:

formal level: John promises to come.
conceptual level: John promises [] [John (will) come.]
– Smatrix = Smatrix + Scomp: violation of syntagmatic isomorphism
– [] = Omatrix: violation of syntagmatic isomorphism
– separation of Scomp – Vcomp: violation of iconicity of distance / (independence)
– to = []: violation of isomorphism / iconicity of distance
– only superficially like SSR-2: John expects to come.

Figure 5. Subject-to-subject raising with promise[3] (notional matrix verb object absent from surface structure)

SSR-3b:
formal level: John promises Mary / her to come.
conceptual level: [John promises Mary.] + [John (will) come.]
– Smatrix = Smatrix + Scomp: violation of syntagmatic isomorphism
– separation of Scomp – Vcomp: violation of iconicity of distance / (independence)
– to = []: violation of isomorphism / iconicity of distance
– only superficially like SOR-1a/b: John expects / persuades Mary / her to come.

Figure 6. Subject-to-subject raising with promise[3] (notional matrix verb object present in surface structure)

While SSR-3a is superficially identical to SSR-2, the verb promise[3] cannot be conceptualised without a person who the promise is made to. This notional matrix object is suppressed in SSR-3a, causing a first discrepancy between conceptual and formal level. All other violations are similar to those in SSR-2. As Figure 6 shows, SSR-3b (3-nf [c]) formally renders the notional object of the matrix verb and is therefore more iconic than SSR-3a. We note, by way of an interim conclusion, that un-iconic structures like the above pose problems in language learning (Chomsky 1969) and thus certainly count as complex environments (Rohdenburg 1996: 173).


Maria Steger and Edgar W. Schneider

3.3. New Englishes as ESL varieties: (more) iconic Englishes?

Iconicity can therefore be expected to have been restored to some extent in New Englishes, particularly in (evolutionarily) less advanced varieties (Callies 2008: 208, referring to Kellermann 1979 and Kortmann 1998).

3.3.1. Structural hypotheses

If our predictions are borne out, the constructions complementing the verbs investigated should reveal a tendency in New Englishes towards
a. explicit marking of their independent conceptual status by appearing as finite rather than non-finite clauses (motivated by iconicity of independence);
b. being separated from the matrix clause proposition by a complementiser (motivated by iconicity of distance and iconicity of independence);
c. explicit expressions of modality (motivated by syntagmatic isomorphism and iconicity of independence);
d. explicit structural marking of double conceptual functions (matrix verb object and subject of the complement verb) of noun phrase constituents with persuade-type verbs (motivated by syntagmatic isomorphism and iconicity of independence); and
e. less frequent selection of raising phenomena (which violate syntagmatic isomorphism, iconicity of independence, iconicity of distance and, in some cases, iconicity of sequence).

Incidentally, these assumptions are in line with Schneider’s (2007: 86–88) claim that structural nativisation in New Englishes typically occurs at the interface of lexis and grammar.

3.3.2. Support for the plausibility of our hypotheses

The above hypotheses receive some initial support from a few idiosyncratic observations made in descriptions of some New Englishes. Thus, in CamE (Mbangwana 2008: 422), BlSAfE (de Klerk and Gough, quoted in Mesthrie 2008a: 490), and PakE (Baumgardner, quoted in Mahboob 2008: 582), an explicit complementiser that, which would not be found in British or American English, can iconically mark the conceptual distance between, and independence of, two propositions, in addition to clauses being finite rather than non-finite. We also know that a number of New Englishes – such as MltE (Manfred Krug, p.c.), IndSAfE (Mesthrie 2008b: 511), FijiE (Mugler and Tent 2008: 562), or EAfE (Schmied 2008: 453) – may display a finite or


‘intermediate’ complementation pattern where a less iconic non-finite one would be expected. Furthermore, as early as in 1972, Selinker’s classic paper on second language acquisition research and interlanguage development addressed the observation that Indian learners of English face difficulties with non-finite complement clause constructions, and prefer finite constructions with that instead thanks to fossilisation tendencies (Selinker 1972: 216).

3.3.3. Methodology

We chose the International Corpus of English (ICE) corpora as appropriate sources to test our hypotheses, for obvious reasons. The sub-corpora of the ICE project all adopt the same corpus design (roughly one million words, representing a predetermined grid of genres, with 60 % spoken material), so that the comparability of results across a wide range of countries and varieties is guaranteed (Greenbaum 1996; http://ice-corpora.net/ice/). This procedure also secures some degree of control over the transfer effects resulting from different first language inputs. The varieties selected for investigation represent different evolutionary stages in Schneider’s Dynamic Model (Schneider 2007). Thus, as a sideline of the main hypothesis, as it were, we can test the assumption that second language acquisition effects will be strongest in early developmental stages (which are closer to the acquisition process itself), while, in the course of time, the impact of this process will become weaker and will be overridden by economically motivated grammaticalisation effects and cultural or learnt conventions. Obviously, StBrE, the source of all post-colonial varieties considered here, serves as the benchmark against which possible second language acquisition effects will have to be measured.
SgE should come closest, given that, in Schneider’s (2007) classification, it has reached phase 4, endonormative stabilisation, and is characterised by a substantial proportion of the young population growing up as native speakers of English. SgE may thus be viewed as being on the move from an ESL to an ENL status. In India, English is also deeply entrenched; phase 4 seems in sight there. In contrast, both in Hong Kong and in East Africa, local Englishes are still undergoing nativisation; in practice, they are not equally deeply rooted and transformed but are typically associated with formal domains of use, and, as far as speakers are concerned, higher status and advanced levels of education. In the following section, we provide qualitative evidence (subject to the limits of the data source) and report a few individual observations which seem relevant to our hypothesis. In addition, we will test whether our predictions are borne out quantitatively. The obvious question is whether iconic structures occur more frequently in certain contexts. Frequency data will of course be based upon systematically screened corpus evidence. To test our hypothesis, we selected 15 different verbs which take non-finite complement clauses, occur with sufficient frequency, and allow the assumption that varying complementation can be expected. Table 5 identifies these verbs, along with the number of attestations analysed. Alternative meanings or cases of coincidental homonymy were excluded, so these figures present the number of relevant hits filtered from all hits. Three of these verbs (believe1/2, allow[1], and expect1/2) display a frequency far above that of the others, yielding more than 1,000 relevant examples.

Table 5. Verbs analysed and number of pertinent attestations

verbs analysed

pertinent attestations

believe1/2 allow[1]

1,273

expect1/2

1,015

wish1/2 require1/2

1,651

enable[1]

1,449

advise[1] prefer1/2

1,234

want[1]

1,174

promise[3] cause[1]

1,166

love1/2 persuade[1]

1,127

hate[2]

1,480

authorise[1]

1,444

1,056

1,542

1,181 (every 5th)

1,146 1,114

The clauses containing pertinent uses of these verbs were pasted into a spreadsheet and annotated for the following factors which we expected to be relevant for the future analysis:
– variety;
– genre (written or spoken);
– complementation type (as defined in Tables 2 to 4 above, i.e. 1-nf [a-q], 1-int, 1-f [a-h]; 2-nf [a-b], etc., and a fourth type which was used to classify debatable examples as, for instance, cases of self-monitoring, self-correction, avoidance, etc.);
– surface structure of the complement clause;
– finiteness of the complement clause;
– presence of a complementiser;
– for the matrix verb lexeme: modality, tense, voice, aspect, and negation;
– and the same categories for the verb of the complement clause.

Not all of these factors turned out to be influential, however; only the ones which we found of interest will be reported below. In the analysis itself, we counted and compared the frequency of occurrence per variety of alternative structures representing different degrees of iconicity, and calculated the relative strength of more and less iconic choices. In line with conventional variationist practice, the envelope of variation was considered, i.e. the proportion of target choices was calculated in relation to all possible contexts. When the number of hits was too small to permit statistical validation (usually

zero > who
ICE-GH: that > zero > who = which

While which and that are equally frequent (ca. 27.5 % each) and at the top of the list in BrE, followed by the zero relativizer (24.8 %) and who (18.1 %),3 GhE has a rather strong preference for that (32.9 %) over the zero relativizer (24.4 %) and which/who (about 20 % each). This could be taken as a first indication that GhE, in its strong preference for invariant that, has developed away from the historical input variety BrE. However, the bare figures in Table 1 present a rather crude picture, which is only useful as a starting point. The choice of the relativizer, at least in BrE, depends on a number of linguistic and extra-linguistic factors, which these figures do not express.
It could be, for example, that the texts included in ICE-GB are more about inanimate referents and BrE thus only appears to favour which. To be able to account for such factors determining the selection of relativizers in the two varieties, the 3,000+ restrictive relative clauses were manually coded for the variables discussed in the next section.

significance is made as follows: marginally significant (0.075 > p ≥ 0.05), significant (p < 0.05), highly significant (p < 0.01), extremely significant (p < 0.001). If any expected frequency was below 1 or if the expected frequency was less than 5 in more than 20 % of the cells, Yates’ correction was employed.

3 Note that these figures differ somewhat from those given in Biber et al. (1999: 610–611), where for restrictive relative clauses in the written registers of fiction, news and academic prose we find: that 32.2 %, which 30.0 %, who 18.6 %, zero 18.0 % and whom/whose 2.3 % (all percentages only approximate). This may be due to the different textual composition of the ICE-corpora and the Longman Spoken and Written English Corpus, the fact that the figures in Biber et al. (1999) include American English as well as other standard varieties and/or to a possibly lower representativeness of the figures in Biber et al. (1999), where findings are normalized to tokens per million words but where the absolute numbers of relative clauses are not given (it is highly unlikely that all relative clauses in the LSWE Corpus – at least 300,000 by my calculations – were analyzed).

Syntactic and variational complexity in British and Ghanaian English

3. Coding the tokens

The relative clauses extracted from the nine text types in ICE-GB and ICE-GH were coded for the dependent variable Relativizer and the independent variables Syntactic role of the relativizer in the relative clause, Voice of relative clause, and Animacy.

3.1. Relativizer

The dependent variable investigated in this contribution is the form of the relativizer. The variants are who, whom, which, that, and zero. Only these “classic” relativizers were considered here. Relative clauses containing other relativizers such as the wh-forms when and where, as well as relative clauses lacking a nominal head, were disregarded. For this reason, constructions such as

(1) anytime I receive a letter from you (ICE-GH:W1B-010#3:1)

which could be analyzed as relative clauses containing a zero relativizer, were excluded because they lack a nominal head.

3.2. Syntactic role of the relativizer in the relative clause

Following the classification of Quirk et al. (1985: 1248–1250), the syntactic positions that a relativizer can occur in were coded as follows:

(2) subject: chiefs [who abdicated their stools to ensure harmony among their people] (ICE-GH:W2C-009#9:1)
(3) object: the tour [which the president undertook] (ICE-GH:W2B-002#83:3)
(4) complement: artificial boundaries [that the single subjects present] (ICE-GH:W2A-020#39:1)
(5) adverbial: the way [I have treated you] (ICE-GB:W1B-008#123)

In addition, if the relativizer expresses the possessor in the relative clause, it was coded as genitive:


Magnus Huber

(6) genitive: a candidate [whose background warrants such approval] (ICE-GH:W2D-010#93:1)

A sizeable number of relative clauses in the data contain prepositional verbs such as approve of. There are two ways of analyzing such verbs, either as transitive followed by an object or as intransitive followed by an adjunct. In the former case the syntactic role would have to be coded as object, in the latter as an adverbial (cf. Quirk et al. 1985: 1156):

(7) Analysis 1: [She]S [looked after]V [her son]O
    Analysis 2: [She]S [looked]V [after her son]A

For the present purposes, such constructions were interpreted following Analysis 1. Thus, the role of the (zero) relativizer in a sentence such as (8) was coded as object. (8) This is congruous with the definitions [Ø we have looked at] (ICE-GH:W1A-002 #80:1)

3.3. Voice of relative clause

Although the binary values of active and passive suggest a clear-cut distinction, sentences can actually be arranged on a cline between these two poles. This paper treats as passives only those clauses which Quirk et al. (1985: 167–168) describe as “central passives”, i.e. clauses that have a clear correspondence with an active verb phrase or active clause, as in (9).

(9) the rate [at which they are degraded by the methanogens] (ICE-GB:W2A-021#054)

Relative clauses containing be V-en structures were coded as active whenever the past participle had more of an adjectival meaning, as in (10).

(10) certain fossil zones [which are known to have existed] (ICE-GB:W1A-020#093)


3.4. Animacy

Animacy is a composite of various semantic parameters which essentially have to do with empathy and sociocentric orientation (cf. Whaley 1997: 172–173), and is said to be a universal hierarchical ordering of nominals:

(11) The Universal Animacy Hierarchy (Whaley 1997: 173)
1st & 2nd person > 3rd person pronoun > proper name/kin term > human NP > animate NP > inanimate NP

The data analysed here did not contain pronominal heads and few proper nouns or kinship terms, so the latter were subsumed under “human NP” and a simplified hierarchy was used in this study:

(12) Simplified Animacy Hierarchy
human NP > animate NP > inanimate NP

The following examples illustrate the three categories in this simplified hierarchy:

(13) human (including human proper nouns and kin terms): the lady [I told you about] (ICE-GH:W1B-008#144:8)
(14) animate (animals and human groups/organisations):
a. dogs and other animals [which carry fleas] (ICE-GH:W2B-027#65:1)
b. the Pentecostal and charismatic churches [that are operating as independent organizations] (ICE-GH:W2A-013#81:1)
(15) inanimate (all others): an environment [that can promote high inflation] (ICE-GH:W2B-011#133:2)

If the head was used metaphorically, animacy was coded not on the basis of the literal but the metaphorical meaning. Thus, hawks in the following example was coded as human, not animate:

(16) I further enjoin him [the President of Ghana] to be aware of the few hawks [that he would inadvertently work with] (ICE-GH:W1A-005#52:1)

When the head was a demonstrative, it was coded according to the animacy of the noun it referred to, so in the following example all those was coded as a human head:

(17) all those [who are accused of criminal offences] (ICE-GB:W2B-020#005)

4. Linguistic variables and relativizer selection in British and Ghanaian English

With regard to the linguistic variables that have an influence on the choice of the relativizer, Quirk et al. (1985: 1250–1252) remark: When the antecedent is personal [= human] and the pronoun is the subject of the relative clause, who is favoured, irrespective of the style and the occasion […]. With the antecedent still personal but with the pronoun now object of the verb or prepositional complement, there is a much stronger preference for that or zero, perhaps to avoid the choice between who and whom. […] Avoidance of whom may not be the only factor influencing that as object with personal antecedents. Grammatical objects are more likely to be non-personal, or to carry non-personal implications, than subjects.

Starting from these observations, the following sections will investigate the role of animacy of the head, the grammatical relation of the relativizer in the clause and the voice of the relative clause in determining the choice of the relativizer in BrE and GhE. Table 2 shows the distribution of the relative pronouns who/m and which as well as the particle that in subject and object position across the three animacy categories.

Table 2. Relativizers in subject and object position

                      who/m   which   that   p
human      ICE-GB      253       0      8    n.s.
           ICE-GH      317       0     20
animate    ICE-GB       22      48     53    n.s.
           ICE-GH        9      23     43
inanimate  ICE-GB        0     233    347    0.001
           ICE-GH        0     190    421

In both BrE and GhE, who/m is used with reference to humans to the exclusion of which, and which refers to inanimate heads to the exclusion of who/m. The latter also occur with animate heads because groups and organisations of humans were coded as animate, and these can be relativized by who, just like pets (cf. (18) and (19)).

(18) those varieties of Protestant [who had formed themselves into pressure-groups] (ICE-GB:W2A-006#64)
(19) an exuberant puppy [who was frisking around Tanya’s feet] (ICE-GB:W2F-006#55)


That is used in all three animacy categories, and BrE and GhE show similarities and differences in the selection of that as opposed to the pronouns.

Figure 1. Percent of that vs. who/m + which in subject and object position by animacy4

As we descend the animacy hierarchy, invariant that becomes more and more frequent in both varieties. From the shape of the curves it appears that the major split is between humans on one side and non-humans (animate + inanimate) on the other. This is more pronounced in ICE-GH, where there is a steeper increase of that from human to animate heads (more than 50 %) but only a comparatively small difference between animate and inanimate heads (some 11 %). In both varieties, the split would have been even more prominent if human groups relativized with who/m – such as university, community or society – had been coded as human rather than animate: reclassifying the 22 animate heads relativized with who/m in ICE-GB and the 9 corresponding ones in ICE-GH as human increases the percentage of that in animates almost up to that of inanimates: ICE-GB 52.5 %, ICE-GH 65.2 %. Thus, the dataset exhibits the well-known split between human and nonhuman heads in English relative clauses. Note, however, that GhE uses a significantly higher rate of the particle that for animate and inanimate heads.

4 The differences between the two corpora range from not significant over marginally significant (animate, p = 0.052) to highly significant (inanimate, p = 0.001). Within each variety, the differences between the animacy categories are extremely significant, except for ICE-GH where the animate-inanimate distinction is merely significant at p = 0.043.
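The percentages discussed for Figure 1 can be recomputed from the counts in Table 2. The following is only a minimal sketch: the dictionary layout and the function names are illustrative assumptions, not part of the original study, and the reclassification simply drops the who/m tokens from the animate category as described in the text.

```python
# Counts from Table 2 (subject + object position): (who/m, which, that).
# The data structure and helper names are hypothetical, chosen for illustration.
TABLE_2 = {
    "ICE-GB": {"human": (253, 0, 8), "animate": (22, 48, 53), "inanimate": (0, 233, 347)},
    "ICE-GH": {"human": (317, 0, 20), "animate": (9, 23, 43), "inanimate": (0, 190, 421)},
}


def percent_that(who, which, that):
    """Share of 'that' among who/m + which + that, in percent."""
    return 100 * that / (who + which + that)


def percent_that_reclassified(who, which, that):
    """Share of 'that' for animate heads once the who/m tokens count as human."""
    return 100 * that / (which + that)


for corpus, rows in TABLE_2.items():
    shares = {cat: round(percent_that(*counts), 1) for cat, counts in rows.items()}
    reclassified = round(percent_that_reclassified(*rows["animate"]), 1)
    print(corpus, shares, "animate after reclassification:", reclassified)
```

Run on the Table 2 counts, the reclassified animate shares come out at the 52.5 % (ICE-GB) and 65.2 % (ICE-GH) quoted in the text.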


It is possible that the apparent preference of BrE for relative pronouns rather than that is just an artefact created by the exclusion of the zero relativizer from the above figures: it is imaginable that BrE avoids who/m and which just as much as GhE, but does so by resorting to the zero relativizer rather than that (at least in object position; there are no zero relativizers in subject position in the corpora). To check this, Table 3 indicates the frequency of that and zero in object relative clauses.

Table 3. That and zero in object position

                      that   zero   % zero   p
human      ICE-GB        1     28    96.6    n.s.
           ICE-GH        4     27    87.1
in/animate ICE-GB       63    246    79.6    n.s.
           ICE-GH       81    259    76.2

The numbers of animate heads in object position relativized by that or zero are very low (3 in ICE-GB and 6 in ICE-GH),5 so the figures for animate heads are given as composites along with inanimates, following the basic animacy split mentioned above. Although the percentage of the zero relativizer is somewhat higher in ICE-GB, this difference is not significant, even if the categories human and in/animate are added up. Figure 2 adds the zero relativizers to the figures displayed in Figure 1 above. The picture remains the same: in both varieties, there is a tendency to use an increasing rate of non-pronominal relativizers (that or zero) as we descend the animacy hierarchy. In other words, while close to 90 % of the relativizers in clauses following human heads are pronouns, the rate decreases to about half following animate heads and to about a quarter following inanimate heads. There is no significant difference between ICE-GB and ICE-GH as far as human heads are concerned, but in the non-human categories ICE-GH shows a significantly higher rate of pronoun avoidance than ICE-GB (15 percentage points for animates, p = 0.030; 7 percentage points for inanimates, p = 0.003).

5 In fact, there are also few object relative clauses with human heads (ICE-GB 30, ICE-GH 33). This is because humans and animates tend to be agents rather than patients.

Figure 2. Percent of that + zero vs. who/m + which in subject and object position by animacy6

It seems therefore that in the process of nativization, GhE is moving towards a typologically unmarked system in which relative pronouns are generally dispreferred compared to BrE: relative pronouns are rather marginal in the world’s languages, being attested in only 4.7 % of Velupillai’s (in press) genetically and areally balanced sample of 277 languages employing this strategy.7 The fact that this move towards a typologically more common strategy takes place in the non-human categories but not (yet?) with relative clauses modifying human head nouns may be due to the latter being the most salient animacy category, with who eroding more slowly here. It might be argued that the higher rate of that in GhE is a product of language change rather than a real difference between the two varieties, since the texts in ICE-GB were produced or published between 1990 and 1993 (Nelson, Wallis, and Aarts 2002: 4) and those in ICE-GH about a decade or more later, mainly between 2000 and 2009, thus showing an increased use of that. However, there are several counter-arguments. First, ten years seem too short to account for the observed differences; second, historically that has been receding rather than increasing in written English; and third, the fact that there is no significant difference between ICE-GB and ICE-GH with regard to human heads (in spite of the sampling time mismatch) can be taken

6 The differences between the two corpora range from not significant (human, p = 0.099) over significant (animate, p = 0.030) to highly significant (inanimate, p = 0.001). Within each variety, the differences between the animacy categories are extremely significant, except for ICE-GH where the animate-inanimate distinction is merely significant at p = 0.043.

7 Even if only the 109 languages are considered that have an overt relative clause marker, the percentage is rather low (11.9 %).

as an indication that the difference between BrE and GhE is not due to language change but results from actual divergent preferences for relativizers in the lower animacy categories. On the other hand, it is also imaginable that GhE has conserved a system that was once current in the historical input variety BrE: that is the older relativizer, which started to be replaced by the pronouns in the more formal registers of late Middle English and Early Modern English (see e.g. the summary in Levey 2006: 47–48). Further research will have to explore whether earlier input varieties of BrE showed a higher frequency of that. Such a study will be complicated by the fact that, even if we can delimitate the period when such an original input took place (probably during the second half of the 19th century) and even if we can compile comparable corpora of BrE for this period, GhE has been under uninterrupted influence from BrE from its earliest days to the present. Most Outer Circle Englishes did not just branch off from their historical mother varieties and develop in complete isolation. Rather, because BrE was and still is the professed target for many Ghanaians, ongoing change in BrE would have exerted a constant influence on GhE. Another possibility is that in its preference for that, GhE is influenced by American English, which is known to favour that over which: Biber et al. (1999: 616) found ca. 80 % relative that in American English newspaper articles, as opposed to only about 45 % in British papers, and Tottie (1997: 467–477) has similar findings. Many middle-class Ghanaians, including academics, spend time abroad to work or for education, and North America is one of the most popular destinations. Americanized accents, acquired abroad or even locally, can be heard in Ghana and American English may also have had an impact on morphology and syntax. 
Further, the more pronounced split between human and non-human heads in GhE as well as the higher rate of non-pronominal relativizers in the lower animacy categories could possibly be due to the fact that with very few exceptions, English is a second language for Ghanaians, learnt exclusively in the school context. Thus, prescriptive rules may be ingrained more strongly in Ghanaians, who learn English in a formal school context, than in British users, who acquire spoken and colloquial forms before the more formal registers. However, if the strong prescriptive rule that favours relative pronouns over that with its colloquial associations (cf. e.g. van Gelderen 2006: 218) did indeed play a role in GhE relative clause formation, we would expect a higher rate of which rather than that in the lower animacy categories, but the opposite is true. Since none of the above wholly or satisfactorily accounts for GhE’s preference for that, it is worthwhile to have a closer look at the competition between which and that in relative clauses following non-human head nouns.

Table 4. Which vs. that in subject and object positions following non-human heads

                        which         that         % that
                      SUB   OBJ    SUB   OBJ    SUB    OBJ
animate    ICE-GB      44     4     52     1    54.2     –
           ICE-GH      23     0     40     3    63.5     –
inanimate  ICE-GB     174    59    285    62    62.1   51.2
           ICE-GH     158    32    343    78    68.5   70.9

For animates in subject position, the difference between ICE-GB and ICE-GH is not significant, while for inanimates the difference in subject position is significant (p = 0.038) and the almost 20 percentage point difference in object position is highly significant (p = 0.002). Within ICE-GB, the difference between inanimate heads in subject and in object position is significant (p = 0.030), while within ICE-GH there is no significant difference. Thus, the difference in the that ~ which competition lies in relative clauses with inanimate heads: BrE has a preference for that in subject position (62.1 %), but which and that occur with roughly equal frequency in object position (51.2 % that). GhE on the other hand has the same high percentage of that in both subject and object positions (ca. 70 %). In sum, although English is cross-linguistically untypical in having relative pronouns, BrE conforms to the universal tendency to use more explicit relativizers for the lower positions of the Noun Phrase Accessibility Hierarchy (subject > direct object > indirect object > oblique > genitive > object of comparison; Keenan and Comrie 1977: 66). In Payne’s (1997: 335) words: “Chances are the more explicit strategies (relative pronoun, pronoun retention, internal head) will be used to relativize arguments farther down (to the right) the hierarchy than the less explicit strategies”. On the other hand, GhE shows a tendency towards a typologically more unmarked relativization strategy in its general preference for that but is untypical in not showing a significantly higher rate of which in object position. Table 1 already indicated that zero relativizers are equally frequent in GhE and BrE, accounting for about a quarter of all tokens. Comrie and Kuteva’s (2005) sample of the world’s languages shows that gapping is significantly (p

< 0.001) more frequent in subject relative clauses (125/166 = 75.3 %) than it is in oblique relative clauses (55/102 = 53.9 %),8 which again demonstrates that the more explicit strategies are found on the lower end of the Noun Phrase Accessibility Hierarchy. BrE and GhE are typologically uncommon in that their rate of gapping is lower in subject relativization than on lower positions of the Noun Phrase Accessibility Hierarchy (Table 5).

Table 5. Gapping (that + zero) in subject and object positions

                    overt   gap   % gap   p
subject  ICE-GB      492    344    41.1   n.s.
         ICE-GH      505    399    44.1
object   ICE-GB       64    338    84.1   0.001
         ICE-GH       34    371    91.6
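This section does not name the test behind the p-values in Table 5, although the earlier mention of Yates’ correction suggests a chi-square test. Assuming a Pearson chi-square test on a 2 × 2 table without continuity correction (an assumption, not a documented detail of the study), the comparison of ICE-GB and ICE-GH can be sketched as follows; the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Assumed test: the source reports p-values but does not spell out the procedure.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With one degree of freedom, the upper tail probability is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Table 5 counts as (overt, gap) for ICE-GB, then (overt, gap) for ICE-GH.
subject_chi2, subject_p = chi_square_2x2(492, 344, 505, 399)
object_chi2, object_p = chi_square_2x2(64, 338, 34, 371)

print(f"subject: chi2 = {subject_chi2:.2f}, p = {subject_p:.3f}")
print(f"object:  chi2 = {object_chi2:.2f}, p = {object_p:.3f}")
```

Under this assumption the subject comparison stays well above the 0.075 threshold (not significant), while the object comparison gives a p-value of roughly 0.001, in line with the values in Table 5.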

Since the choice of the relativizer depends on its syntactic role in the relative clause (see Quirk et al. 1985: 1250–1251), the following tables bring together the figures for who/m, which, that and zero in subject (Table 6) and object position (Table 7).

Table 6. Relativizers in subject position, by animacy

                      who   which   that   zero
human      ICE-GB     252      0      7      0
           ICE-GH     315      0     16      0
animate    ICE-GB      22     44     52      0
           ICE-GH       9     23     40      0
inanimate  ICE-GB       0    174    285      0
           ICE-GH       0    158    343      0

8 Note that Comrie and Kuteva (2005: 495) define the gap strategy as “cases where there is no overt case-marked reference to the head noun within the relative clause”. In English, this includes the zero relativizer but notably also the particle that, since that does not co-vary with the head.

Table 7. Relativizers in object position, by animacy

                      who(m)   which   that   zero
human      ICE-GB        1       0       1     28
           ICE-GH        2       0       4     27
animate    ICE-GB        0       4       1      2
           ICE-GH        0       0       3      3
inanimate  ICE-GB        0      59      62    244
           ICE-GH        0      32      78    256

A comparison of the two tables leads to the following generalisations: neither BrE nor GhE allows a zero relativizer in subject position. In both varieties, human head nouns tend to be relativized by who in subject position (ICE-GB 97.3 %, ICE-GH 95.1 %; n.s.) but by zero and, less so, that in object position (ICE-GB 96.7 %, ICE-GH 93.4 %). This corroborates Quirk et al.’s (1985: 1250–1251) remark that relative clauses modifying human heads prefer who when the relativizer is in subject position but that or zero when it is in object position. The preferred relativizers for animate heads in subject position are that and which (ICE-GB 81.4 %, ICE-GH 87.5 %, n.s.; note that these figures would go up to 100 % if the human groups/organizations, now subsumed under animate, were reclassified as human). The number of relative clauses with animate heads in object position is too low to make any meaningful statement, but if we look at the aggregates of non-human heads (i.e. animates plus inanimates), we see that the percentages of non-pronominal relativizers (that and zero = gapping) increase as we descend the Noun Phrase Accessibility Hierarchy: in ICE-GB from 58.4 % in subject relativization to 83.1 % in object relativization, and in ICE-GH from 66.8 % to 91.4 %.9 Note also that overall there are very few object relative clauses with human or animate head nouns: 37/402 in ICE-GB and 39/405 in ICE-GH, and this holds true although clauses like that in (19) are perfectly grammatical.

(19) She was constantly fighting with women [whom she suspected either genuinely or falsely to be her husband’s mistresses.] (ICE-GH:W2F-016#73:1)

9 This is due to the high number of zero relativizers in object position, but as mentioned above both zero and that are instances of the gapping strategy and insofar present the same alternative to relative pronouns.
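The aggregate figures for non-human heads quoted above can be reproduced from Tables 6 and 7. The sketch below is only an illustration in Python: the dictionary layout and the function name `gap_share` are assumptions made for this example, while the counts are transcribed from the two tables in the order who(m), which, that, zero.

```python
# Counts per cell: (who/whom, which, that, zero), from Tables 6 and 7.
# Only the non-human categories (animate, inanimate) are needed here.
COUNTS = {
    "ICE-GB": {
        "subject": {"animate": (22, 44, 52, 0), "inanimate": (0, 174, 285, 0)},
        "object": {"animate": (0, 4, 1, 2), "inanimate": (0, 59, 62, 244)},
    },
    "ICE-GH": {
        "subject": {"animate": (9, 23, 40, 0), "inanimate": (0, 158, 343, 0)},
        "object": {"animate": (0, 0, 3, 3), "inanimate": (0, 32, 78, 256)},
    },
}


def gap_share(corpus, position):
    """Percentage of that + zero among all non-human relative clauses."""
    cells = list(COUNTS[corpus][position].values())
    gapped = sum(that + zero for _who, _which, that, zero in cells)
    total = sum(sum(cell) for cell in cells)
    return 100 * gapped / total


if __name__ == "__main__":
    for corpus in COUNTS:
        subject = gap_share(corpus, "subject")
        obj = gap_share(corpus, "object")
        print(f"{corpus}: subject {subject:.1f} %, object {obj:.1f} %")
```

Summing animates and inanimates before taking the percentage, rather than averaging the two category percentages, matches the way the aggregates are described in the text.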

It seems that, rather than resorting to the animacy-neutral relativizers that and zero, the prevalent pattern in both GhE and BrE is to avoid “objectifying” human and animate head nouns in relativization. Instead both varieties employ the strategy of passivizing the relative clause, thus promoting the relativizer to subject position. This can be seen when comparing the number of passive subject relative clauses with that of active object relative clauses with human/animate and inanimate head nouns (Tables 8 and 9).

Table 8. ICE-GB – animacy by voice and grammatical relation

                 subject, passive     %      object, active     %       p
human/animate           20           22.7          36           9.0     0.000
inanimate               68           77.3         362          91.0

Table 9. ICE-GH – animacy by voice and grammatical relation

                 subject, passive     %      object, active     %       p
human/animate           23           26.1          39           9.7     0.000
inanimate               65           73.9         364          90.3

The figures show that in both BrE and GhE relativizers referring to human or animate heads are more likely to occur in subject position in passivized relative clauses than in object position in active relative clauses: this is 13.7 percentage points more likely in ICE-GB and 16.4 percentage points more likely in ICE-GH. The difference between the varieties is not significant, but the slightly higher likelihood in GhE may be an indicator that another factor is at work in this variety: agency. If agency plays any role in the choice of the relativizer, we can – for reasons that will become evident in due course – expect a higher likelihood for human/animate heads to be coded by who/which in active relative clauses, where they are agents, than in passive relative clauses, where they are patients. Tables 10 and 11 compare the rate of that in subject relative clauses across the animacy categories in ICE-GB and ICE-GH.10

10 In both ICE-GB and ICE-GH, the differences within subject (active) and subject (passive) are extremely significant (p < 0.001).

Table 10. ICE-GB – animacy by voice in subject relative clauses

                 subject, active                 subject, passive
                 that   who/which    %           that   who/which    %
human/animate     56       301      84.3           3        17      85.0
inanimate        244       147      37.6          41        27      39.7

In ICE-GB, the percentages of who/which are very much the same within the two animacy categories, with no significant difference between active and passive. That is, in BrE a subject relativizer referring to a human/animate head is not more likely to be a pronoun when it is an agent than when it is a patient.

Table 11. ICE-GH – animacy by voice in subject relative clauses

                 subject, active                 subject, passive
                 that   who/which    %           that   who/which    %
human/animate     49       331      87.1           7        16      69.6
inanimate        290       146      33.5          53        12      18.5

In ICE-GH, on the other hand, there are considerable and significant differences in relativizer selection between active and passive voice (human/animate p = 0.018, inanimate p = 0.015): for both human/animate and for inanimate heads, there is a 16 to 17 percentage points higher likelihood for relativizers to be pronouns when they are agents, i.e. when they are in subject position in an active clause. Conversely, patients are more likely to be relativized by that.

Quirk et al. (1985: 1252) suspect that “grammatical objects are more likely to be nonpersonal [i.e. animate or inanimate], or to carry nonpersonal implication, than subjects”. The tendency in both BrE and GhE to promote human/animate relativizers to subject relativizers through passivization but to show a greater tolerance towards inanimate object relativizers (see Tables 10 and 11) corroborates this. The figures further indicate that the split is actually not between human and animate/inanimate heads (as surmised by Quirk et al. 1985) but rather between human/animates on the one hand, and inanimates on the other. The semantic reason for the low frequency of object relative clauses in the higher animacy categories may be that humans and animates are prototypical agents, and that agents typically occur in subject position (although of course passivization turns human/animate heads into patients, but this seems to be a lesser evil than leaving them in object position). The passivization strategy just described causes virtually all tokens of human/animate who/m (ICE-GB: 274/275 = 99.6 %, ICE-GH: 324/326 = 99.4 %) and the vast majority of tokens of human/animate which (ICE-GB: 44/48 = 91.7 %, ICE-GH: 23/23 = 100 %) to end up as subjects, leaving the object position to that and the zero relativizer. For who/m in particular (and in both BrE and GhE), this distribution yields a strong association of the relative pronouns (as opposed to that and zero) with agents.11
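The active/passive comparison in Tables 10 and 11 can be checked in the same spirit. The following Python sketch assumes, as an illustration only, that the reported p-values stem from an uncorrected Pearson chi-square test on each 2 × 2 table of voice by relativizer type; the names `VOICE_COUNTS`, `pronoun_share` and `voice_effect_p` are hypothetical.

```python
from math import erfc, sqrt

# (that, who/which) counts in subject relative clauses, from Tables 10 and 11.
VOICE_COUNTS = {
    "ICE-GB": {
        "human/animate": {"active": (56, 301), "passive": (3, 17)},
        "inanimate": {"active": (244, 147), "passive": (41, 27)},
    },
    "ICE-GH": {
        "human/animate": {"active": (49, 331), "passive": (7, 16)},
        "inanimate": {"active": (290, 146), "passive": (53, 12)},
    },
}


def pronoun_share(that, pronoun):
    """Percentage of who/which among overt subject relativizers."""
    return 100 * pronoun / (that + pronoun)


def voice_effect_p(corpus, animacy):
    """p-value of an uncorrected 2 x 2 chi-square test, active vs. passive."""
    a, b = VOICE_COUNTS[corpus][animacy]["active"]
    c, d = VOICE_COUNTS[corpus][animacy]["passive"]
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(chi2 / 2))  # upper tail of chi-square with 1 df


if __name__ == "__main__":
    for corpus, categories in VOICE_COUNTS.items():
        for animacy, voices in categories.items():
            active = pronoun_share(*voices["active"])
            passive = pronoun_share(*voices["passive"])
            p = voice_effect_p(corpus, animacy)
            print(f"{corpus} {animacy}: active {active:.1f} %, "
                  f"passive {passive:.1f} %, p = {p:.3f}")
```

Note that some passive cells are small (three tokens of that for human/animate heads in ICE-GB), so an exact test might be preferred there; the sketch keeps to the chi-square test for comparability with the reported values.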

5. Relativization in Ghanaian languages

Since the patterns of GhE relativization described in the previous section may result from transfer from the mother tongues of the users of GhE, this section will have a look at the structure of relative clauses in the major Ghanaian languages. Ghana’s ca. 60 indigenous languages belong to two branches of Niger-Congo: Gur languages are spoken in northern Ghana by about a quarter of the 24+ million Ghanaians, and Kwa languages in the south by about three quarters. There are also two Mande languages with a very small number of speakers (< 1 %). The following table lists the major languages of Ghana and gives a rough estimate of the proportion of their speakers.

Table 12. The major Ghanaian languages

Languages          speakers (in % of total)
Kwa (south)
  Akan             43 %
  Ewe              10 %
  Ga-Dangme         7 %
Gur (north)
  Dagaare           6 %
  Dagbani           3 %

11 For which, this association is not quite so strong because it can also refer to inanimate heads, which occur more often in object positions. The figures for inanimate which in subject position are ICE-GB: 174/233 = 74.7 %, ICE-GH: 158/190 = 83.2 %. That is, even inanimate which is more frequent as the subject of the relative clause than as the object, but in both varieties, subject which is significantly less frequent with inanimate head nouns than it is with human/animate heads (some 17 percentage points, p < 0.001).

Akan (with the main dialects of Twi and Fante) is the biggest Ghanaian language cluster, being the L1 of about 43 % of the population. In recent years, there has been a strong increase in L2 speakers of Akan (particularly the Twi dialect), which is now spoken by over 70 % of the population and is thus the most important lingua franca in the country. This, together with the fact that in colonial times, but also afterwards, the vast majority of teachers came from the south, as well as the fact that English is much more current in Ghana’s south, may have resulted in a strong Kwa influence on GhE.

The major Ghanaian languages employ a variety of strategies in relative clause formation, but the following generalizations, based on Saah (2010; Akan), Dzameshie (1995; Ewe), Kotey (1969: 113–122; Ga), Bodomo and Hiraiwa (2004; Dagaare), Wilson (1963; Dagbani), are important in the context of the present contribution:

Position of the relative clause and the relativizer. As in English, relative clauses are postnominal in the major Ghanaian languages, with the exception of Dagbani, where object relative clauses can also be prenominal. In the three Kwa languages, the position of the relativizer is at the beginning of the clause, but in Dagaare and Dagbani, its position is preverbal and thus not necessarily at the beginning. None of these exceptions has found its way into written GhE.

Resumptive pronouns. Resumptive pronouns occur in Akan, Ewe, Dagaare and Dagbani, though the conditions (obligatoriness, syntactic role) vary. Nevertheless, constructions as in (20), which can be heard in colloquial spoken GhE, are absent from the ICE-GH relative clauses analyzed here.

(20) the old woman [who I gave her the money] (Dako and Huber 2004: 858)

Relative clause delimitation. The three main Kwa languages overtly mark the end of relative clauses: Akan with the particle nó, Ewe with the determiner la and Ga with the determiner lɛ̀. Again, this strategy has left no trace in written GhE.

Variation of the relativizer. Ewe’s relativizers covary with the number of the head noun (si = singular, siwo = plural), while those of Dagbani depend on the syntactic role of the relativizer in the relative clause (ŋun = subject, nə = object). However, Akan, Ga and Dagaare, together spoken natively by about 50 % to 55 % of Ghanaians, have invariant relativizers. Interestingly, the absence of zero relativizers in all the major languages of Ghana does not seem to have resulted in a lower rate of zero relative clauses in GhE, which marks about a quarter of all relative clauses by zero, just like BrE (see Table 1 above). However, the predominance of invariant relativizers in Ghanaian languages may be one of the reasons for the overall stronger preference for invariant that rather than who ~ which in GhE. Note also that none of the major Ghanaian languages has a relativizer that varies according to the animacy of the head noun. This may be yet another reason why GhE has a stronger preference than BrE for that.

6. Summary and Outlook

Nativization is not only characterized by the addition of completely new features to an emerging indigenized L2 variety and the deletion of others from the historical input variety but also by a more subtle reinterpretation of existing systems that can only be grasped statistically. The main claim of this contribution is that in transforming English into a new variety in the Nativization and Endonormative Stabilization Phase in Schneider’s (2003) model, users of English in Outer Circle countries reinterpret and restructure complex subsystems of the input variety in subtle ways. This reorganization proceeds without producing structures that in themselves are unacceptable in the historical input variety. The English relative clause system is especially amenable to such a kind of restructuring: its complexity (several pronouns, a particle and a zero relativizer that overlap in their functions and that are influenced by a variety of linguistic and extra-linguistic parameters) and its typological exceptionalism in many respects provide ample room and pressure for reinterpretation. By and large, GhE relative clauses, taken individually, are not ungrammatical in BrE, which is still the professed target as far as the morphology and syntax of English in Ghana are concerned. It is only when looked at in a wider variationist perspective that the GhE relative clause system emerges as organised according to a different weighting of linguistic and presumably also extralinguistic factors. One reason for such a reorganisation may be that the resulting constructions conform more closely to the structures and constraints of the adstrate languages, but this need not always be the case. A case in point is the retention of the zero relativizer in GhE, with the same structural distribution as in BrE, in spite of the fact that 1. there is no zero relativizer in the major Ghanaian languages, 2. BrE is typologically untypical in disallowing zero in subject relativization while allowing it on lower positions of the Noun Phrase Accessibility Hierarchy, and 3. zero is optional in BrE and could thus easily have been discarded.

This article has shown that during the nativization of English in Ghana, the structural factors affecting the choice of relativizers have been assigned new weights. Future research will have to explore the reorganisation of other complex systems of the historical input variety as well as further dimensions of the reorganisation of the relative clause system. In Huber (in preparation) I intend to show that this process also affects the sociolinguistic level and that, although the ICE corpora were not primarily compiled for sociolinguistic research, a careful selection and classification of their text types can yield valuable insights into stylistic variation in both Inner and Outer Circle varieties of English.

References

Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad and Edward Finegan. 1999. Longman Grammar of Spoken and Written English. Harlow: Pearson Education Limited.
Bodomo, Adams and Ken Hiraiwa. 2004. Relativization in Dagaare. Journal of Dagaare Studies 4: 53–75.
Comrie, Bernard and Tania Kuteva. 2005. Relativization strategies. In: Martin Haspelmath, Matthew S. Dryer, David Gil and Bernard Comrie (eds.), The World Atlas of Language Structures, 449–501. Oxford: Oxford University Press.
Dako, Kari and Magnus Huber. 2004. Ghanaian English: morphology and syntax. In: Bernd Kortmann and Edgar W. Schneider (eds.), A Handbook of Varieties of English. A Multimedia Reference Tool. Volume 2: Morphology and Syntax, 854–865. Berlin: Mouton de Gruyter.
Dzameshie, Alex K. 1995. Syntactic characteristics of Ewe relative clause constructions. Research Review (NS) 11: 27–42.
Greenbaum, Sidney. 1996. Introducing ICE. In: Sidney Greenbaum (ed.), Comparing English World-Wide. The International Corpus of English, 3–12. Oxford: Clarendon Press.
Huber, Magnus. In preparation. Sociolinguistic variation in British and Ghanaian English.
Keenan, Edward L. and Bernard Comrie. 1977. Noun phrase accessibility and Universal Grammar. Linguistic Inquiry 8: 63–99.
Kortmann, Bernd, Edgar Schneider, Kate Burridge, Raj Mesthrie and Clive Upton (eds.). 2004. A Handbook of Varieties of English. Berlin/New York: Mouton de Gruyter.
Kotey, Paul F.A. 1969. Syntactic Aspects of the ‘Ga’ Nominal Phrase. Ph.D. dissertation, The University of Wisconsin, Madison.
Levey, Stephen. 2006. Visiting London relatives. English World-Wide 27: 45–70.
McWhorter, John. 2001. The world’s simplest grammars are creole grammars. Linguistic Typology 5: 125–166.
Nelson, Gerald, Sean Wallis and Bas Aarts. 2002. Exploring Natural Language. Working with the British Component of the International Corpus of English. Amsterdam: Benjamins.
Payne, Thomas E. 1997. Describing Morphosyntax. A Guide for Field Linguistics. Cambridge: Cambridge University Press.
Preacher, Kristopher J. 2001. Calculation for the chi-square test: An interactive calculation tool for chi-square tests of goodness of fit and independence [Computer software]. Available from http://quantpsy.org.
Quirk, Randolph, Sidney Greenbaum, Geoffrey Leech and Jan Svartvik. 1985. A Comprehensive Grammar of the English Language. Harlow: Longman.
Saah, Kofi K. 2010. Relative clauses in Akan. In: Enoch O. Aboh and James Essegbey (eds.), Topics in Kwa Syntax, 91–108. Dordrecht: Springer.
Sampson, Geoffrey. 2009. A linguistic axiom challenged. In: Geoffrey Sampson, David Gil and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 1–28. Oxford: Oxford University Press.
Schneider, Edgar W. 2003. The dynamics of New Englishes: From identity construction to dialect birth. Language 79: 233–281.
Schneider, Edgar W. 2007. Postcolonial English. Varieties around the World. Cambridge: Cambridge University Press.
Szmrecsanyi, Benedikt and Bernd Kortmann. 2009. Between simplification and complexification: Non-standard varieties of English around the world. In: Geoffrey Sampson, David Gil and Peter Trudgill (eds.), Language Complexity as an Evolving Variable, 64–81. Oxford: Oxford University Press.
Tottie, Gunnel. 1997. Relatively speaking: relative marker usage in the British National Corpus. In: Terttu Nevalainen and Leena Kahlas-Tarkka (eds.), To Explain the Present. Studies in the Changing English Language in Honour of Matti Rissanen, 465–481. Helsinki: Société Néophilologique.
van Gelderen, Elly. 2006. A History of the English Language. Amsterdam: Benjamins.
Velupillai, Viveka. In press. An Introduction to Linguistic Typology. Amsterdam: Benjamins.
Whaley, Lindsay J. 1997. Introduction to Typology. The Unity and Diversity of Language. Thousand Oaks: Sage Publications.
Wilson, W.A.A. 1963. Relative constructions in Dagbani. Journal of African Languages 2(2): 139–144.

John McWhorter

Complexity hotspot: The copula in Saramaccan and its implications

1. Introduction

I have argued that creole languages were born as structurally reduced pidginized varieties, and eventually expanded into new natural languages. According to this genesis scenario, because grammatical complexity emerges in languages as the result of long-term processes of drift, we would expect creole languages to be less grammatically complex than older languages (McWhorter 2001, 2006). Furthermore, as we would expect, much of the grammatical complexity that creole grammars do display can be identified as having emerged after genesis, over the past few hundred years. In this light, this paper will examine an area of the grammar of Saramaccan creole that exhibits a degree of complexity unusual in the grammar as a whole, copula constructions. The question is why complexity has emerged to such a degree in the particular area of nonverbal predication, and whether the reasons for this are useful in predicting where complexity is most likely to emerge most rapidly in new grammars. Saramaccan is spoken in the Surinamese rain forest by approximately 20,000, and developed among slaves escaped from coastal plantations in the late seventeenth century. In terms of its grammar, it is an offshoot of a “father” creole, Sranan, today the vernacular lingua franca of Surinam. Saramaccan’s vocabulary is derived mostly from English and Portuguese. However, its phonology and grammar are so deeply influenced by the African language spoken by most of its creators, Fongbe, that it is in no way a dialect of any European language in the way that creoles like Jamaican Creole or Gullah can be analyzed as “types of English.” The Saramaccan data, unless otherwise indicated, were collected by the author and graduate student assistants.

2. Metric of complexity

My demonstration of emergent complexity in Saramaccan will be based on a particular metric of complexity, consisting of three aspects of grammar which contrast in degree between languages.

2.1. Overspecification

Languages differ in the degree to which they overtly and obligatorily mark semantic distinctions. For example, some grammars mark a finer grain of distinctions of person and number than others (as Mühlhäusler demonstrates in this volume in regard to Norf’k). In contrast to Indo-European languages, the Oceanic language Kwaio marks the dual and paucal (three or more), as well as an inclusivity distinction (me and you vs. me and them) in the first person plural (Keesing 1985: 28):

Table 1. Pronouns in Kwaio

                            singular    dual        paucal       plural
first person (inclusive)    (i)nau      (’i)da’a    (’i)dauru    gia
first person (exclusive)                (’e)me’e    (’e)meeru    (’i)mani
second person               (i)’oo      (’o)mo’o    (’o)mooru    (’a)miu
third person                ngai(a)     (’i)ga’a    (’i)gauru    gila

(Obviously, by this metric English’s pronouns are overspecified, with their gender distinction in the third person, compared to those of Finnish or Chinese.) There are languages that mark a “fourth person” in clauses with two third-person referents, obligatorily indicating which of the two is more prominent in the discourse. The Indo-European speaker might well marvel that a grammar would choose to mark such a distinction overtly, and indeed obviative marking is an interesting but unnecessary feature: an overspecification. Here is an example from Eastern Ojibwa where in the final verb, the young man is indexed as inverse in contrast to the new obviative focus on the foreigners (Rhodes 1990: 107):

(1) Maaba dash oshkinawe o-gii-bawaad-am-n        wii-bi-ayaa-ini-d
    this  EMPH young.man 3-PAST-dream-3INAN-OBV  FUT-coming-be.at-OBV-3

    myagi-nishnaabe-an  x-wii-bi-nis-igo-waa-d-in.
    foreign-people-OBV  REL-FUT-coming-kill-INV-3-OBV
    ‘Then this young man dreamed that foreigners would come to kill them.’

Or, in many grammars, possessive marking differs for that which belongs to one in an inherent, eternal sense, as opposed to that which is formally and less permanently owned, as in Mandinka i faamaa ‘your father’ versus i la koloŋo ‘your well’ (Lück and Henderson 1993: 23). This distinction is hardly necessary to even nuanced communication, as is readily apparent to English speakers. It is an overspecification. The foregoing is intended not as an exhaustive, but as a demonstrational, list of examples of overspecification.

2.2. Structural elaboration

An aspect of one grammar may differ from the same aspect of another in terms of the number of rules (in phonology and syntax) required to generate grammatical forms. For example, all natural languages have morphophonemic processes. But in Celtic languages, these entail an array of consonant mutations, triggered by a range of grammatically central interfaces, which synchronically are not based on assimilatory influence from preceding segments such as voicing (leaf, leaves) but are phonetically quite unpredictable. In Welsh, cath ‘cat’ occurs in citation form with eu ‘their’, but with other possessive pronouns, undergoes particular mutations:

Table 2. Welsh consonant mutations (Ball and Müller 1992: 195–6)

eu cath      ‘their cat’
fy nghath    ‘my cat’
ei gath      ‘his cat’
ei chath     ‘her cat’

Notice that the mutation alone carries the functional load of distinguishing gender in third-person singular gath versus chath. Thus the mutations often straddle a line between phonology and morphology. A grammar is also structurally elaborated to the extent that nominals vary in their concordial requirements and/or case markers according to phonological traits of the root. Latin is a typical example, in which concord was determined by a three-way gender distinction between masculine, feminine and neuter, while case marking varied according to five declension classes distinguished by phonetic traits. Other examples of structural elaboration include:

a. the tendency of clitics in some languages to seek a particular position in a clause regardless of their grammatical role. Serbo-Croatian has generally free word order, but clitics occur after the first accented constituent, and can even occur after the first accented word, intervening within a constituent:

(2) Taj  mi    je  pesnik napisao knjigu.
    that to.me AUX poet   wrote   book
    ‘That poet wrote me a book.’ (Spencer 1991: 355)

b. subject-verb inversion in interrogative sentences in many European languages, a trait rare outside of Europe (Ultan 1978). Typically, languages indicate the interrogative with intonation and/or the appendage of an interrogative particle (Indonesian apa dia sudah makan [QU she PERF eat] ‘Has she eaten?’ [Sneddon 1996: 311]).

2.3. Irregularity

Grammars differ in the degree to which they exhibit irregularity and suppletion. English’s small set of irregular plurals like children and people is exceeded by the vast amount of irregularity in German plural marking: a masculine noun may take -e for the plural, but almost equally often also takes an umlaut: der Arm, die Arme, der Besuch ‘visit’, die Besuche but der Arzt ‘doctor’, die Ärzte, der Gast ‘guest’, die Gäste. Similarly, a “regular” neuter is das Jahr ‘year’, die Jahre, but then das Buch, die Bücher, das Volk, die Völker and even das Floß ‘raft’, die Flöße. Another example of irregularity in plural marking is the broken plural patterns in Arabic:

Table 3. Broken plural patterns in Arabic

singular    plural      gloss
qalam       ’aqlām      ‘pen’
bayt        buyūt       ‘house’
kalb        kilāb       ‘dog’
kitāb       kutub       ‘book’
dawla       duwal       ‘country’
šahr        ’ašhur      ‘month’
wazīr       wuzarā’     ‘minister’
ṣadīq       ’aṣdiqā’    ‘friend’

Then grammars differ in the degree to which they exhibit suppletion. Suppletion is moderate in English, especially evident in the verb “to be” which distributes various Old English roots across person, number, and tense: am, are, is, was, were, been, be. But the Caucasian language Lezgian has no fewer than sixteen verbs that occur in suppletive forms (Veselinova 2003).

Elsewhere, suppletion is present in various corners of grammars, as in Spanish, where the third person indirect pronominal clitic le, when preceding an object clitic, transforms to se: Le dí un libro ‘I gave a book to him’ but Se lo dí ‘I gave it to him’ (*Le lo dí).

3. The complexity of the Saramaccan copula

According to the three manifestations of complexity specified above, the way that Saramaccan expresses the concept of being is one of the most complex constructions in the grammar. English-based New World creoles, a group within which Saramaccan is one of many members, have often been described as having a simple division of labor between an “equative” copula (generally of the shape /(d~n)a/) and a “locative” copula (generally /de/). The situation in Saramaccan is much more elaborated than this.

3.1. Overspecification

The default be verb in Saramaccan is dɛ́. It is indeed used with locative predicates:

(3) Mɛ́íki ku   wĩ́   bi   dɛ́ a   táfa  líba.
    milk  with wine PAST be LOC table top
    ‘Milk and wine were on the table.’

but there is nothing specifically locative in its semantics. Dɛ́ is used, for example, in the existential:

(4) A  bi   dɛ́ hángi  tẽ́.
    3S PAST be hungry time
    ‘There was a famine.’

and especially indicatively, even in equative constructions:

(5) Mɛ́íki dɛ́ wã́ soní  dí  míi   tá  bebé ◎
    milk  be a  thing REL child IMF drink
    ‘Milk is something that children drink.’

It is also used with adverbial predicates:

(6) Dí  kínɔ bi   dɛ́ fɔ́   jú   lóngi.
    DEF film PAST be four hour long
    ‘The film was four hours long.’

Dɛ́ is also used with deverbal resultatives, which are expressed with a reduplicated verb:

(7) Dí  sutúu dɛ́ boókoboóko.
    DEF chair be broken.RD
    ‘The chair is broken.’

However, in two specific semantic contexts, the copula is da instead. This is so, first, in the identificational subclass of equative predicates (as opposed to the class subclass in [5]):

(8) Mi da Gádu.
    1S be God
    ‘I am God.’

Da is optionally used rather than dɛ́ with the other subclass of equative predicates, indicating class rather than identity:

(9) Mɛ́íki dɛ́ / da wã́ soní  dí  míi   tá  bebé ◎
    milk  be   be a  thing REL child IMF drink
    ‘Milk is something that children drink.’

Milk is one of many things a child might drink, and thus a class predicate rather than an identificational one, which it would be if milk were the only thing children drink. Thus:

(10) (a) Mi da (*dɛ́) dí  kabiténi.
         1S be       DEF captain
         ‘I am the captain.’

     (b) Mi dɛ́ / da wã́ kabiténi.
         1S be      a  captain
         ‘I am a captain.’

Da is also used with possessive predicates, although it can be omitted:

(11) Dí  búku akí  (da) u   mí.
     DEF book here be   for 1S
     ‘This book is mine.’

There is, therefore, an overspecification in the copular domain in Saramaccan compared to, for example, English, in which copular morphemes do not vary according to the semantics of the predicate.

3.2. Structural elaboration

The Saramaccan copulas also occasion structural elaboration of various kinds. Da cannot appear sentence-finally. Therefore, when the predicate is fronted, there is no copula:

(12) Mí tatá,  dísi.
     my father this
     ‘This is my father.’ (i.e. My father is who this is.)

(13) Mí, dísi.
     1S  this
     ‘It’s me’ (i.e. This is me talking to you.)

This includes interrogative sentences:

(14) Ṹ     búku dí-dɛ́?
     which book that
     ‘Which book is that?’

(15) Andí dísi?
     what this
     ‘What is this?’

although in this context da can optionally occur between the wh-word and the subject:

(16) Andí da dí  búku naándɛ́?
     what be DEF book there
     ‘What is that book there (about)?’

Also, in headless relatives, there is subject-verb inversion such that da does not occur sentence-finally:

(17) Mi sá’  ambɛ́ da i. (* . . ambɛ́ i da.)
     1S know who  be 2S
     ‘I know who you are.’

in contrast to dɛ́, which can occur sentence-finally in headless relatives:

(18) Awáa mi féni naásɛ́ a  bì   dɛ́.
     now  1S find where 3S PAST be
     ‘I finally found out where he was.’

as other verbs do:

(19) Lúku andí i  dú!
     look what 2S do
     ‘Look at what you did!’

Da also manifests another aspect of structural elaboration, namely allomorphy. With possessive predicates that have been fronted, da occurs as an allomorph a:

(20) U   mí a  dí  búku akí. (*U mí da dí búku akí.)
     for 1S be DEF book here
     ‘This book is mine.’

Thus the appearance of da in Saramaccan is conditioned by a list of rules determining its absence or heterogeneous position in various contexts. Dɛ́, meanwhile, also exhibits structural elaboration, in conjunction with property items. Most property items in Saramaccan are verbs; i.e. they take tense and aspect particles like verbs:

(21) Ée i  njã́ dí  soní  akí,  i  ó   síki.
     if 2S eat DEF thing here  2S FUT sick
     ‘If you eat this thing, you will get sick.’

However, when a property item is fronted, dɛ́ occurs sentence-finally although it would be ungrammatical with a property item otherwise:

(22) Ṹ   bígi dí  wósu  dɛ́? (*Dí wósu dɛ́ bígi.)
     how big  DEF house be
     ‘How big is the house?’ (Kramer 2001: 36)

3.3. Irregularity

The Saramaccan copulas condition a degree of irregularity which stands out in a grammar in which irregularity is minimal in comparison to, for example, an Indo-European language. Da, for example, is defective if analyzed as a verbal form. Da cannot take tense marking, in which case dɛ́ occurs:

(23) Dí  fósu  líbisɛ̀mbɛ̀  bi   dɛ́ (*bi da) Adám.
     DEF first human.being PAST be          Adam
     ‘The first person was Adam.’

(24) Mí tatá   ó   dɛ́ dí  kabiténi.
     my father FUT be DEF captain
     ‘My father will be the captain.’

Da also does not occur when the predicate is marked for focus with wɛ:

(25) (*Da) Páu  wɛ.
           tree FOC
     ‘It’s a tree.’

(26) Dí  wómi dɛ́,   dáta   wɛ  o!
     DEF man  there doctor FOC INJ
     ‘(Well, look here) that man is (now) a doctor!’1

1 The usage of dɛ́ here is not copular; dɛ́ is homonymous with the word for ‘there’, which is used in the distal demonstrative construction; e.g. dí wómi dɛ́ ‘that man’, dí wómi akí ‘this man’.


Da also occasions suppletion. While dɛ́ is negated in regular fashion with predicate negator á,

(27) Mi á   dɛ́ a   wósu.
     I  NEG be LOC house
     ‘I am not at home.’

when negated, da occurs in suppletive form as ná:

(28) Nɔ́nɔ́, ná  mi. (*Nɔ́nɔ́, á da mi.)
     no    NEG 1S
     ‘No, it wasn’t me.’

Da also requires that the third person subject pronoun, normally a, occur in its tonic form hɛ́ even when no emphasis is intended:

(29) Hɛ́ da dí  mɔ́ɔ  lánga wã́  a   u  déndu. (*A da dí mɔ́ɔ lánga …)
     3S be DEF more tall  one LOC 1P inside
     ‘He is the tallest one among us.’

Da also fails to occur in a few contexts independently of semantic or syntactic regularity; namely, when the predicate is a day of the week:

(30) Tidé  díi-dé    woóko.
     today three-day work
     ‘Today is Wednesday.’

and optionally when the predicate is possessive:

(31) Dí  búku akí  (da) u   mí.
     DEF book here be   for 1S
     ‘This book is mine.’

4. Evidence of emergence rather than substrate transfer

The behavior of the copula in Saramaccan obviously evidences complexity. However, all evidence suggests that this complexity emerged gradually over time after the genesis of the creole in the late seventeenth century.


4.1. Diachronic evidence

The first piece of evidence that Saramaccan originated without a da copula is that da derives not from any English or Portuguese copular form but from a deictic item, dati, from that (preserved in Ndjuka, Saramaccan’s sister offshoot creole from Sranan, although replaced in Saramaccan by dí-dɛ́ < ‘the-there’). The development of copulas from deictic items is common cross-linguistically, but requires a period during which the lexical meaning of the deictic item is bleached away and its function changes to a grammatical one. Therefore, the development of lexical dati to grammatical da is not a process that we could expect to happen in one blow at the time when Saramaccan was emerging; it can only have occurred over a long stretch of time afterward. For example, in Archaic Chinese there was no copula:

(32) Wáng-Tái wù          zhě    yě.
     Wang-Tai outstanding person DEC
     ‘Wang-Tai is an outstanding person.’ (Li and Thompson 1975: 421)

However, shì ‘this’ was used as a subject in topic-comment constructions:

(33) Qíong   yù  jiàn,      shì  rén    zhǐ sǔo wù      yě.
     poverty and debasement this people GEN NOM dislike DEC
     ‘Poverty and debasement, this is what people dislike.’ (Li and Thompson 1975: 421)

By the first century A.D., it had been reanalyzed as a copula, as it is used today:

(34) Yú shì sǔo jià   fū-rén zhǐ fù     yě.
     I  be  NOM marry woman  GEN father DEC
     ‘I am the married woman’s father.’ (Li and Thompson 1975: 426)

This process is common in languages, having also yielded, for example, the copular use of subject pronominal forms in Modern Hebrew (David hu ganav ‘David is a thief’). Saramaccan’s da would have begun as the subject deictic in the comment, which itself had no copula:

(35) [Kobí]  [da]     Ø-copula  [mí tatá]
     Kobi    that               my father
     topic   subject            predicate


Over time, as the topic-comment structure was reinterpreted as a subject-predicate one, the topic became the subject. Da was retained, but edged out of its subject function and left as a mere semantically empty associative morpheme, i.e. a copula:

(36) [Kobí]   [da]    [mí tatá]
     Kobi     is      my father
     subject  copula  predicate

This also explains why da requires the tonic pronoun in subject position. When sentences with da were topic-comment constructions, the tonic would have been the form expected of a topic,

(37) [Hɛ́]   [da]     Ø-copula  [mí tatá]
     him    that               my father
     topic  subject            predicate

and persists today in that form despite being a subject, the result being a kind of Exceptional Case Marking:

(38) [Hɛ́]     [da]    [mí tatá]
     he       is      my father
     subject  copula  predicate

However, it is inherent to this reconstruction that there was an initial topic-comment stage, in which the copula form had yet to emerge. Also, in terms of historical syntax, if da is the result of the reanalysis of a deictic subject into a particle amidst the reinterpretation of a topic-comment structure into a subject-predicate one, then it would follow that in sentences in which the predicate occurs before the subject, such as wh-questions, there would remain no copula today. We can assume that such sentences already occurred at the original stage, when there was no copula yet; they lack a copula today because they never contained a deictic item after the subject that could be reinterpreted:

Table 4. Historical development of da in wh-sentences

Stage One                                   Stage Two

[Kobí]  [da]     ø-copula  [mí tatá]        [Kobí]   [da]    [mí tatá]
Kobi    that               my father        Kobi     is      my father
topic   subject            predicate        subject  copula  predicate

[Andí]     [dísi?]  ø-copula                [Andí]     [dísi?]  ø-copula
what       this                             what       this
predicate  subject                          predicate  subject

(Presumably, examples such as [16] in which a copula does occur in wh-question sentences would become possible, albeit optional, on the model of da’s presence in affirmative ones.) In addition, there is some evidence in eighteenth-century documents of Sranan, father language to Saramaccan, of equative sentences with zero-copula, such as:

(39) Mi blibi   joe ø wan bon  mattie fo  dem.
     I  believe you   a   good friend for them
     ‘I believe you’re a good friend of theirs.’ (Arends 1989: 160)

Sentences like this were a remnant of a zero-copula stage in the evolution of the grammar that developed into that of Saramaccan.2 In an early document of Saramaccan itself, there is also evidence that at an earlier stage, dɛ́ was even used in the identificational context, suggesting a stage where dɛ́ was the only copula and da did not yet exist. A native speaker ends two letters he wrote with the now ungrammatical Mi de Christian Grego Aliedja ‘I am Christian Grego of Aliedja’ (Arends and Perl 1995: 385, 387). It finally bears mentioning that copulas have incontrovertible deictic sources in many other creoles; e.g. in French-based ones, se from c’est or sa from ça.

2 In Sranan, from which Saramaccan’s grammar developed, there is already the deictic-derived equative copula da (evidenced as such in early documents, but na in modern Sranan), meaning that this two-way equative/locative split properly developed at this stage rather than in Saramaccan proper (made clearer by the fact that Sranan’s other offshoot creole Ndjuka has the same trait). More specific aspects of the copula, however (although not all of them), are local to Saramaccan and can be assumed to have developed after it split from Sranan.


4.2. Synchronic evidence

The reconstruction of an original zero-copula stage is also likely given the various contexts in which da does not occur in modern Saramaccan’s grammar where the semantics of the predicate would lead one to expect that it would; e.g. with days of the week and nonverbal predicates, after fronting, and with the focus marker wɛ. One might suppose that da was for some reason dropped at some point from these assorted contexts, but there is no principled reason why this would have happened. A more systematic approach is to assume that these instances are fossilizations. For example, possessive sentences and ones referring to days of the week are heavy-usage contexts of the kind most likely to yield fossilizations, just as it is heavy-usage nouns like man, woman and child that retain irregular plurals in English.

4.3. Substrate inheritance?

It has commonly been argued (e.g. Migge 2003) that copulas like Saramaccan’s were indeed present at the genesis of the creole, modeled on copulas in the African languages spoken by the creoles’ creators. This argument is superficially attractive, given that languages of the Upper Guinea coast of Africa tend to subdivide the copular domain between equative and locative morphemes. For example, Saramaccan’s grammar is modelled upon that of Fongbe, a Niger-Congo language of the Kwa group. The Fongbe equative and locative copulas are:

(40) Ùn nyí Àfíáví.
     1S be  Afiavi
     ‘I am Afiavi.’ (Lefebvre and Brousseau 2001: 144)

(41) Wémâ ɔ́   ɔ́     távò  jí.
     book DEF be.at table on
     ‘The book is on the table.’ (Lefebvre and Brousseau 2001: 147)

However, the resemblance here is, in the typological sense, insignificant. It is very common worldwide for languages to have equative and locative copulas. The following table shows examples from a few languages chosen for familial diversity; typically all or most of each language’s group also have an equative/locative subdivision between copulas, as do a great many languages of families not represented here:

Table 5. Languages other than West African with separate equative and locative copulas (Sources: Stenson 1981, Thompson 1965, Hagman 1977, Hawkins 1982, Hashimoto 1969, Sadler 1964 respectively)

             Equative   Locative
Irish        is         tá
Vietnamese   là         o
Nama         ’a         hàa
Hawaiian     he         aia
Chinese      shì        zai
CiBemba      ni         lì

Thus the fact that both Saramaccan and Fongbe have an equative/locative split between copulas is not evidence of a causal relationship, because this trait appears idiosyncratic only from the perspective of European languages.

Importantly in this light, there are quite a few creoles with African substrates in which the equative/locative copula split of the African substrate languages is absent. In French-based creoles also created by speakers of Fongbe (and related languages), despite various transfers from Fongbe grammar, even today there is no locative copula, such as in Haitian (Bouki anba tab-la ‘Bouki is under the table’). In the Portuguese creoles of the Gulf of Guinea, while their main grammatical model was the Niger-Congo language Edo (Hagemeijer 2005), which has several distinct copulas for the equative and elsewhere (cf. Baker 2008: 28), the creoles have a single copula for the equative, locative and beyond (e.g. Principense [Maurer 2009: 95–102] and Angolar [Maurer 1995: 92–5]).

This makes it especially crucial that Fongbe’s equative nyí evidences none of the particular behavioral traits of Saramaccan’s da. Nyí exhibits typical verbal behavior, taking negation and tense marking (Migge 2003: 72). It is not omitted in assorted contexts as da is, and does not share semantic space in any context with alternate copula ɔ́ in the way that da does with dɛ́ with class equative predicates (cf. Lefebvre and Brousseau 2001: 143–147). The sole specific behavioral trait nyí shares with da is that it can be omitted when the focus marker wɛ̀ is present (Àtín wɛ̀ ‘It is a tree’ [Lefebvre and Brousseau 2001: 134]), unsurprising given that Saramaccan borrowed this very focus marker.

Creolists tend to resist the argument that creole copulas are entirely disconnected from substrate patterns. However, my argument is that the three observations adduced here, not each taken alone but in combination, constitute conclusive evidence that in this case, the counterintuitive is true. The superficial resemblance between the creole and African patterns is neither evidence of a relationship nor, scientifically speaking, evidence that there “must” be “at least some” relationship. With the resemblance between Saramaccan and Fongbe copulas revealed as a false lead, the evidence that Saramaccan’s da emerged as a deictic element and was reanalyzed as a copula is even more significant. Da not only does not exhibit the behavior of nyí, but also has an etymology that suggests an origin in reanalysis rather than transfer.

4.4. Summary

Thus, while a great deal of Saramaccan’s grammar is clearly modelled on Fongbe (Migge 2003, McWhorter 2008), the evidence does not suggest that the complexity of the Saramaccan copula was one of those traits. Predictably, given the role of pidginization and/or second-language acquisition in creole genesis, copulas were not prioritized for transfer into the new language, as morphemes of little or no semantic content, unnecessary to what began as utilitarian, untutored, makeshift communication. In terms of the theme of this volume, omission of copulas from either the lexifier or the substrate languages has been typical of creole genesis, in line with work demonstrating that the process is one driven by principles familiar in second-language acquisition (e.g. Plag 2008). The development of a division of labor between copular morphemes is something we would expect to occur later, when a reduced, non-native variety became a full language. Thus the modern Saramaccan copular scenario emerged when a single copular morpheme dɛ́ – itself not present at emergence either, as demonstrated in McWhorter (1999) – was joined by a newly reanalyzed one. This new copula subdivided the semantic domain of copular constructions, while also complicating the grammar with rules and irregularities, some immediately and others developing thereafter.

5. What creates complexity in Saramaccan

5.1. Heavy usage

What appears to unite the aspects of Saramaccan grammar in which complexity has emerged the fastest is heavy usage. Grammaticalization and reanalysis are inherently connected to conventionalization, which implies that the constructions they affect are used with special frequency, i.e. are particularly conventional (Hopper and Traugott 1993: 103, 146). Thus we would expect that heavily used areas of grammar would be more likely to:

a. undergo processes of grammaticalization and reanalysis that create new morphemes and constructions,
b. bear irregularities as a token of the fact that especially heavy usage also creates loci of resistance to change.

As such,

a. first in Saramaccan a new copula encroached on the domain of the original one when heavy usage encouraged a typical process of reanalysis, in which a deictic subject between a topic and a predicate becomes a copula.
b. Later, phonetic erosion created allomorphy – a kind of structural elaboration – when da became a after fronted possessive phrases (U mí a dí búku ‘The book is mine’).
c. Also, irregularity was a natural outcome of the process. In its initial stage as a deictic lexical item serving as a subject rather than as a verb or copula, naturally da did not take negation or tense. Da, although now a copular item, continues not to take negation or tense although dɛ́ does. This aspect of da is a conservative feature: heavy usage discourages its complete reanalysis as a copula or verb just as the same factor preserves the irregularity of heavily used English plurals like children (while eliminating earlier ones such as kine for ‘cow’).

The conservative aspect of da can be analogized to that of the be verb in English. It conserves reflexes of various Old English root verbs instead of just one (beon, sindon, wesan), and is the only verb in English with forms that vary according to number and (in the present tense) person. Be is an unusually complex verb – according to my metric in this paper – in a language whose verbal inflection paradigms are notoriously streamlined in comparison to other languages in its Germanic family. This is because of especially heavy usage: it would be vanishingly unlikely that in a low-inflection language like English, such features would persist for verbs like pick or turn.

Another Saramaccan feature demonstrating the centrality of heavy usage to emergent complexity is its one other irregular verb. The imperfective marker in Saramaccan is tá, as in Mi tá wáka ‘I am walking’. With the verb gó ‘to go’, tá must occur as a prefix nán-, such that ‘I am going’ is Mi nángó. This is the only such case with any marker on any verb, and it is not accidental that it occurs with the go verb, one of the most susceptible cross-linguistically to grammaticalization (Heine and Kuteva 2002). Moreover, the irregularity of nángó is advanced (in its boundedness) but conservative, as irregularity often is, in that the second /n/ preserves the final consonant of the source of the modern marker tá, tan (< stand), long lost otherwise.

Pointedly, there are languages in the world with only three true verbs, such as Jingulu in Australia (Pensalfini 2003), and go and be are two of them. Given how central these verbs are to human expression in languages that have them, and how heavy their usage therefore is, it is unsurprising that they are the only ones in Saramaccan analyzable as irregular, i.e. to have drifted into new forms of complexity since the language’s birth.

5.2. Topic-comment syntax

The reanalysis of topic-comment constructions into subject-predicate ones as described by Li and Thompson (1975) has been a source of emergent complexity in Saramaccan beyond the development of the da copula. In general, when the subject position in a comment phrase becomes the subject in a subject-predicate construction, the morpheme occupying this slot can undergo phonetic, semantic and functional transformation that adds complexity to what was at first a relatively straightforward situation.

For example, in modern Saramaccan, the predicate negator is á: Kobí á wáka ‘Kobi does not walk’. The imperative is negated, however, with ná: Ná wáka! ‘Don’t walk!’ Ná is also, as shown above, the suppletive negative form of copula da: Mi ná i tatá ‘I am not your father.’ However, in historical documents of eighteenth-century Saramaccan, the form of the negator with both predicates and imperatives as well as elsewhere is no. This, in addition to other traits in the modern grammar, shows that today’s predicate negator á is the product of the development of topic-comment into subject-predicate structures. Specifically, the third person subject pronominal a (with low tone, in contrast to the high tone of the modern predicate negator) in the comment’s subject position fused with the following negator ná (which developed from original no), yielding a portmanteau morpheme ã retaining the nasality of the lost /n/. (This reconstruction is supported by the fact that this remains the predicate negator form in the Upper River dialect of Saramaccan.) When the topic was reinterpreted as a subject, ã was reinterpreted as a negator alone, and eventually lost its nasality (a process common in phonetically light, heavily used grammatical morphemes in the language):

Table 6. The development of the predicate negator in Saramaccan

Stage 1   Kobí, a ná wáka   ‘Kobi, he doesn’t walk’
Stage 2   Kobí, ã́ wáka      ‘Kobi, he doesn’t walk’
Stage 3   Kobí á wáka.      ‘Kobi doesn’t walk.’

This pathway explains why Saramaccan retains the ná form as the negative copula in contexts where pronominal a, for various reasons, does not occur before it and can be reconstructed never to have done so (cf. McWhorter 2005: 182–198). The development from topic-comment to subject-predicate, then, has created an allomorphy in Saramaccan negation where once there was none – as well as one of the rare cases in the language in which tone alone marks a grammatical distinction, given that a with low tone persists as the third person singular pronoun.

6. Implications of Saramaccan for theories of complexity

6.1. Syntheticity is but a subset of complexity

The complexity of the copular domain in Saramaccan is a demonstration that grammatical complexity consists of a great deal more than inflectional affixation and its consequences. Siegel’s investigation in this volume of the reason for analyticity in pidgins and creoles is well taken, for example, but carries an implication that low levels of inflectional affixation are in some sense a primary difference between creoles and their source languages in terms of complexity. Worldwide, however, inflection is only one locus of complexity. Morphologically isolating languages that were not born a few centuries ago evidence considerable complexities even without inflections (cf. Riddle 2008), such as in the application of numeral classifiers, tonal sandhi, and in marking categories typically associated by linguists with affixation but in fact also encodable with free morphemes. While surely this is not intended by analysts, the focus in much work on complexity carries an implication that analytic languages like Vietnamese and Yoruba are not grammatically complex, and/or that their only significant complexity is in their tonal systems. Specialists in analytic languages would disagree. Here is a sentence of Akha, a Sino-Tibetan language:

(42) ŋá nɛ  àjɔ́q áŋ  áshì  thì shì bìq  ma.
     I  ERG he   OBJ fruit one CL  give DEC
     ‘I was the one who gave him one fruit.’ (Hansson 2003: 243)


In this sentence from an analytic and tonal language, there is an ergative marker, a marker for animate objects, one of many numeral classifiers, and a declarative marker that differs according to conversational participant. Thus our interest in the development of complexity in Saramaccan is predicated upon the fact that Saramaccan differs in degree of complexity from analytic languages like Akha, not just from Western European languages. In general, investigations of grammatical complexity in all areas, including second language acquisition and typology, would benefit from a more general acknowledgment of what complexity consists of in the larger sense, beyond well-investigated aspects such as syntheticity and clausal embedding.

6.2. Complexity and teleology

Many of the articles in this volume address complexity in grammars as indexed to functional imperatives of various kinds, whether this be the increase in complexity or a movement away from it. Mühlhäusler traces the proliferation of pronouns in Norf’k in part to social factors. Steger and Schneider, as well as Huber, examine the degree to which speakers of new varieties of English incorporate synthetic structures from the target English variety as opposed to opting for analytic equivalents. Siegel argues that creoles change as the result of a “compensatory” strategy involving, again, the recruitment of structures from the lexifier language.

While I take issue with none of these presentations in themselves, I intend this one as a demonstration that to a considerable extent, the development of complexity in a language (or not) can be unrelated to contact factors. That is, much of the process takes place as the result of the same grammar-internal processes of change familiar from historical linguistics textbooks. The development of the two-copula system in Saramaccan, for example, was not modeled on the speakers’ native language Fongbe. More obviously, European languages were not a target. Rather, the bipartite Saramaccan copula scenario happened on its own, as the result of explainable but unpredictable processes of change. Saramaccan speakers do not process the language as a variety of any European language, and thus its development of copulas – as well as negators, irregular verbs, etc. – is not couched in a dynamic relationship with a “high” superstratal language. Nor were there sociological factors affecting these processes: for example, the omission of da in Saramaccan, where licensed, is not indexed to sociological factors as the omission of the copula is in Black English in America.

That is, while certainly the development of grammatical complexity is linked, in many situations, to language contact factors and acquisitional factors, there is a proportion of the phenomenon linked simply to the drift of a new language’s grammar over time, according to the principle of “the invisible hand” (cf. Keller 1995). I would in fact argue that in Saramaccan, essentially all of the emergence of complexity since its genesis has been grammar-internal in this fashion.

References

Arends, Jacques 1989 Syntactic developments in Sranan. PhD dissertation, University of Nijmegen.
Arends, Jacques and Matthias Perl 1995 Early Surinamese Creole texts. Frankfurt/Madrid: Biblioteca Ibero-Americana.
Baker, Mark C. 2008 The syntax of agreement and concord. Cambridge: Cambridge University Press.
Ball, Martin J. and Nicole Müller 1992 Mutation in Welsh. London: Routledge.
Hagemeijer, Tjerk 2005 The origins of serialization in the Gulf of Guinea creoles. Paper presented at Creole Language Structure: Between Substrates and Superstrates conference, Max Planck Institute of Evolutionary Anthropology, Leipzig, Germany.
Hagman, Roy S. 1977 Nama Hottentot grammar. Bloomington: Indiana University Publications.
Hansson, Inga-Lill 2003 Akha. In: Graham Thurgood and Randy J. LaPolla (eds), The Sino-Tibetan languages, 236–251. London: Routledge.
Hashimoto, Anne Yue 1969 The verb “to be” in Modern Chinese. In: John W.M. Verhaar (ed), The verb “be” and its synonyms: philosophical and grammatical studies, 72–111. New York: Humanities Press.
Hawkins, Emily 1982 A pedagogical grammar of Hawaiian: Recurrent problems. Honolulu: University Press of Hawaii.
Heine, Bernd and Tania Kuteva 2002 World lexicon of grammaticalization. Cambridge: Cambridge University Press.
Hopper, Paul and Elizabeth Closs Traugott 1993 Grammaticalization. Cambridge: Cambridge University Press.
Keesing, Roger M. 1985 Kwaio grammar. Canberra: Pacific Linguistics.
Keller, Rudi 1995 On language change: the invisible hand in language. London: Routledge.
Kramer, Marvin 2001 Substrate transfer in Saramaccan Creole. PhD dissertation, University of California, Berkeley.
Lefebvre, Claire and Anne-Marie Brousseau 2001 A grammar of Fongbe. Berlin: Mouton de Gruyter.
Li, Charles N. and Sandra A. Thompson 1975 A mechanism for the development of copula morphemes. In: Charles N. Li (ed), Word order and word order change, 419–444. Austin: University of Texas Press.
Lück, M. and L. Henderson 1993 Gambian Mandinka. Banjul, Gambia: WEC International.
Maurer, Philippe 1995 L’angolar: un créole afro-portugais parlé à São Tomé. Hamburg: Helmut Buske.
Maurer, Philippe 2009 Principense. London: Battlebridge.
McWhorter, John H. 1999 Skeletons in the closet: anomalies in the behavior of the Saramaccan copula. In: John R. Rickford and Suzanne Romaine (eds), Creole genesis, attitudes and discourse, 121–142. Amsterdam: John Benjamins.
McWhorter, John H. 2001 The world’s simplest grammars are creole grammars. Linguistic Typology 5 (3/4): 125–156.
McWhorter, John H. 2005 Defining creole. New York: Oxford University Press.
McWhorter, John H. 2006 What the creolist learns from Cantonese and Kabardian. (Review article of Phonology and morphology in creole languages. Edited by Ingo Plag). Diachronica 23: 143–184.
McWhorter, John H. 2008 Hither and thither in Saramaccan Creole. Studies in Language 32: 163–195.
Migge, Bettina 2003 Creole formation as language contact. Amsterdam: John Benjamins.
Pensalfini, Robert 2003 A grammar of Jingulu. Canberra: Pacific Linguistics.
Plag, Ingo 2008 Creoles as interlanguages: inflectional morphology. Journal of Pidgin and Creole Languages 23: 114–135.
Rhodes, Richard 1990 Obviation, inversion, and topic rank in Ojibwa. In: David J. Costa (ed), Proceedings of the Berkeley Linguistics Society 16: 101–115. Berkeley: University of California, Berkeley.
Riddle, Elizabeth M. 2008 Complexity in isolating languages: elaboration versus grammatical economy. In: Matti Miestamo, Kaius Sinnemäki and Fred Karlsson (eds), Language complexity: typology, contact, change, 133–151. Amsterdam: John Benjamins.
Sadler, Wesley 1964 Untangled CiBemba. Kitwe, N. Rhodesia: The United Church of Central Africa in Rhodesia.
Sneddon, James Neil 1996 Indonesian: a comprehensive grammar. London: Routledge.
Spencer, Andrew 1991 Morphological theory. Oxford: Blackwell.
Stenson, Nancy 1981 Studies in Irish syntax. Tübingen: Günter Narr.
Thompson, Laurence C. 1965 A Vietnamese grammar. Seattle: University of Washington Press.
Ultan, Russell 1978 Some general characteristics of interrogative systems. In: Joseph Greenberg (ed), Universals of human language Volume IV: 211–248. Palo Alto: Stanford University Press.
Veselinova, Ljuba 2003 Suppletion in verb paradigms. PhD dissertation, University of Stockholm.