Discontinuity in Second Language Acquisition: The Switch between Statistical and Grammatical Learning 9781783092475

With a focus on the morphosyntactic features of second language, this book discusses the idea that language acquisition

English Pages 272 [267] Year 2014

Table of contents :
Contents
1. Second Language Acquisition Facit Saltus (‘Takes a Leap’)
2. Discontinuity as Chunks Feed into Grammar
3. Discontinuity in the Maturing and in the Adapting Brain
4. Discontinuity and the Neurocognition of Second Language
5. Statistical Learning of a Second Language
6. Parts of L2 Grammar That Resist Statistical Learning
Conclusions
References
Index

Discontinuity in Second Language Acquisition

SECOND LANGUAGE ACQUISITION Series Editor: Professor David Singleton, University of Pannonia, Hungary and Fellow Emeritus, Trinity College, Dublin, Ireland This series brings together titles dealing with a variety of aspects of language acquisition and processing in situations where a language or languages other than the native language is involved. Second language is thus interpreted in its broadest possible sense. The volumes included in the series all offer in their different ways, on the one hand, exposition and discussion of empirical findings and, on the other, some degree of theoretical reflection. In this latter connection, no particular theoretical stance is privileged in the series; nor is any relevant perspective – sociolinguistic, psycholinguistic, neurolinguistic, etc. – deemed out of place. The intended readership of the series includes final-year undergraduates working on second language acquisition projects, postgraduate students involved in second language acquisition research, and researchers and teachers in general whose interests include a second language acquisition component. Full details of all the books in this series and of all our other publications can be found on http://www.multilingual-matters.com, or by writing to Multilingual Matters, St Nicholas House, 31–34 High Street, Bristol BS1 2AW, UK.

SECOND LANGUAGE ACQUISITION: 80

Discontinuity in Second Language Acquisition The Switch between Statistical and Grammatical Learning

Stefano Rastelli

MULTILINGUAL MATTERS Bristol • Buffalo • Toronto

Library of Congress Cataloging in Publication Data Rastelli, Stefano. Discontinuity in Second Language Acquisition: The Switch between Statistical and Grammatical Learning/Stefano Rastelli. Second language acquisition: 80 Includes bibliographical references and index. 1. Second language acquisition—Study and teaching. 2. Language and languages—Study and teaching. 3. Grammar, Comparative and general. 4. Linguistics—Statistical methods. 5. Applied linguistics. I. Title. P118.2.R38 2014 418.0071–dc23 2014016210 British Library Cataloguing in Publication Data A catalogue entry for this book is available from the British Library. ISBN-13: 978-1-78309-246-8 (hbk) Multilingual Matters UK: St Nicholas House, 31–34 High Street, Bristol BS1 2AW, UK. USA: UTP, 2250 Military Road, Tonawanda, NY 14150, USA. Canada: UTP, 5201 Dufferin Street, North York, Ontario M3H 5T8, Canada. Website: www.multilingual-matters.com Twitter: Multi_Ling_Mat Facebook: https://www.facebook.com/multilingualmatters Blog: www.channelviewpublications.wordpress.com Copyright © 2014 Stefano Rastelli. All rights reserved. No part of this work may be reproduced in any form or by any means without permission in writing from the publisher. The policy of Multilingual Matters/Channel View Publications is to use papers that are natural, renewable and recyclable products, made from wood grown in sustainable forests. In the manufacturing process of our books, and to further support our policy, preference is given to printers that have FSC and PEFC Chain of Custody certification. The FSC and/or PEFC logos will appear on those books where full certification has been granted to the printer concerned. Typeset by Techset Composition India(P) Ltd., Bangalore and Chennai, India. Printed and bound in Great Britain by the CPI Group (UK Ltd), Croydon, CR0 4YY.

Contents

1 Second Language Acquisition Facit Saltus (‘Takes a Leap’)
  1.1 Preview of the Volume: Second Language Acquisition in Adulthood is a Discontinuous Process
  1.2 The Term ‘Discontinuity’ and its Meaning for SLA
  1.3 The Core Idea of the Discontinuity Hypothesis
  1.4 ‘Gemination’ of the Same Twice-learned Items in a Learner’s Competence
  1.5 Second Language Acquisition is ‘Quantized’
  1.6 Falsifiability Criteria for the Discontinuity Hypothesis
  1.7 Increasing Amount of Exposure to Input Cannot Explain All Developmental Transition States
  1.8 Discontinuity is Neither Automatization Nor Restructuring
  1.9 Discontinuity Does Not Mirror the Lexicon/Grammar Distinction
  1.10 Discontinuity Operationalizes Two Different Kinds of L2 Grammar
  1.11 Discontinuity Differs From Developmental Theories of ‘Incrementalism’
  1.12 Diagnostics of Discontinuity
  1.13 Discontinuity and Individual Differences
  1.14 Credits
  1.15 Breakdown of the Volume

2 Discontinuity as Chunks Feed into Grammar
  2.1 Chapter Preview: Frequency Takes the Floor
  2.2 Three Aspects of the ‘Frequency Factor’ in Language Processing and Language Acquisition
  2.3 Chunks, Not Formulas, Are the Building Blocks of SL
  2.4 Chunks Feed into L2 Grammar
  2.5 Chunks Feed into L2 Constructions
  2.6 One Example of Gemination of TL Representation in L2 Italian
  2.7 How Much Grammar Can Be Found in Chunks and Constructions?
  2.8 Chunking (and SL) Operates on Sociolinguistic Variants as Well
  2.9 To Sum Up: Some Language Properties Are Not a Property of Input

3 Discontinuity in the Maturing and in the Adapting Brain
  3.1 Chapter Preview: Discontinuity Across a Learner’s Age
  3.2 Beyond the ‘Fundamental Difference’ Versus ‘Full Access’ Debate
  3.3 Learning by Patches is Typical of Adult SLA
  3.4 Discontinuity in Brain Maturation and Brain Adaptation
  3.5 The Difference Between a ‘Sensitive’ and a ‘Critical’ Period for Language Acquisition
  3.6 Discontinuity in the Maturing Brain
  3.7 Lifelong Effects of the Early Acquisition of Additional Languages
  3.8 Discontinuity and an Adult’s Brain Adaptation
  3.9 The Critical Period Hypothesis Revised
  3.10 Specific Features of SLA in Adulthood
  3.11 To Sum Up: The Balance Between Loss and Compensation in Late SLA

4 Discontinuity and the Neurocognition of Second Language
  4.1 Chapter Preview: Discontinuity Fits a Divergence Model of L1–L2 Acquisition
  4.2 The Notion of Convergence and the Single-network Hypothesis
  4.3 The Notion of Divergence and the Declarative/Procedural Model
  4.4 The Declarative/Procedural Model
  4.5 Problems with the DPM (and Possible Integrations)
  4.6 Discontinuity and L2 Neurocognition: Experimental Studies
  4.7 Studies Questioning the Developmental Relevance of the N400/P600 Dichotomy
  4.8 What is Learned at Discrete Stages of Learning is Just Combinatorial Grammar
  4.9 The N400–P600 Dichotomy Could Reveal L1–L2 Common Processing Strategies
  4.10 A Different View of the N400–P600 ‘Biphasic Pattern’: Neural Cues of Gemination in L2 Processing and Representations
  4.11 To Sum Up: SL and GL Divide the Labor of SLA in Adulthood

5 Statistical Learning of a Second Language
  5.1 Chapter Preview: What Parts of the L2 Are Affected by SL?
  5.2 The Distinction Between Combinatorial and Non-combinatorial Grammar
  5.3 Second Language Acquisition as a Form of Supervised SL
  5.4 Did the SL Approach Change the Problem of Language Acquisition?
  5.5 Is Probabilistic Information also Relevant to the Structural Properties of the Language?
  5.6 Syntax that Can Be Learned Statistically (in a Miniature Artificial Grammar)
  5.7 The Potential Contribution of SL for the Acquisition of L2 Syntax
  5.8 On Syntax as ‘Structural Integration’: P600 Effects also Signal Identical Violations in Non-linguistic Domains
  5.9 Are the Patterns that Statistical Learners Extract from the Input Actually ‘Syntactic Structures’?
  5.10 Adjacency, Non-adjacency and SL
  5.11 Non-adjacent Dependencies in SL Correspond to What We Have Previously Labeled as Combinatorial Grammar
  5.12 Neural Correlates of the Processing of Supra-regular (Phrase-Structures) Versus Regular (Finite State) Grammars
  5.13 What ‘L2 Chunks Feed Into Grammar’ Means in the Perspective of SL
  5.14 To Sum Up: The Distinctive Features of SL and Discontinuity in SLA

6 Parts of L2 Grammar That Resist Statistical Learning
  6.1 Chapter Preview: Two Kinds of Grammar, Two Kinds of Learning
  6.2 The Discontinuity from Statistical ‘Counting’ to Grammatical ‘Computation’
  6.3 The Switch Between Concatenation and ‘External Merge’
  6.4 Discontinuity and Downstream, Top-down L2 Processing
  6.5 The Benefits of Frequency Do Not Extend to Displaced Items (‘Internal Merge’) and to Empty Categories
  6.6 There are Parts of the Second Language that Cannot Be Learned like a Song
  6.7 Non-combinatorial Grammar and Non-adjacency of Items in a Sentence
  6.8 Non-combinatorial Grammar and Long-distance Dependencies: The Shallow Structure Hypothesis
  6.9 Non-combinatorial Grammar at the Interfaces
  6.10 Non-combinatorial Grammar and Uninterpretable Features
  6.11 Non-combinatorial Grammar and Functional Morphology: The ‘Bottleneck’ Hypothesis
  6.12 Is Non-combinatorial Grammar Important for SLA?
  6.13 To Sum Up: Whether Something Can Be Learned or Not Depends on How it Can Be Learned

Conclusions

References

Index

1

Second Language Acquisition Facit Saltus (‘Takes a Leap’)

1.1 Preview of the Volume: Second Language Acquisition in Adulthood is a Discontinuous Process

This book discusses the idea that the acquisition of some morphosyntactic features of the second language (L2) by adult learners is a discontinuous process. The process is discontinuous because the learning principles and the brain structures that have supported the acquisition of those features up to a certain point can be juxtaposed with different learning principles and different brain structures. The latter principles and structures do not replace the former; rather, they cohabit with them. The consequence is that L2 representations, after a point of rupture, may duplicate in an L2 learner’s competence. This fact will henceforth be referred to as ‘discontinuity with gemination’ or, more simply, as ‘discontinuity’ (Section 1.4). A representation of a discontinuous learning process with gemination is given in Figure 1.1 (Section 1.2). The graph in the figure illustrates that a discontinuous process is characterized by the presence of a cut-off point breaking a continuous line. The ascending continuous line can represent how an item/structure of the target language (TL) is increasingly mastered by a learner over time. Before the cut-off point, the item can be thought of as being represented and mastered in just one way in a learner’s competence. After the cut-off point, the item is represented and mastered in two different ways. This holds because the acquisition of that item splits along two qualitatively different developmental routes. The difference between these routes is qualitative because it is grounded in L2 neurocognition. Discontinuity in second language acquisition (SLA) arises precisely when the same item starts being represented and processed differently in a learner’s brain over time. Up to a certain point,

some morphosyntactic items of the TL may be represented and processed by learners only statistically as psychological units or chunks (words frequently occurring together to the extent that they are perceived as a whole; see Section 2.3). After that point, the same items, if certain conditions occur, may also be represented and processed by learners grammatically, as being projections of abstract features which are implicitly noted to be relevant for that and also for other, not necessarily similar, items. The consequence of discontinuity is that some morphosyntactic items might be learned (represented and processed) twice, each time in a different way (Section 1.3). Adult L2 learners – but also native speakers (albeit to a different extent) – may become capable of switching from one way of learning to the other depending on different factors. In this volume, a description is provided of how the switch between one way of learning and another could take place in adult SLA. The former way of learning is called ‘statistical learning’ (SL) and the latter ‘grammatical learning’ (GL). In Chapters 2 and 5 we will see that SL occurs via learners’ computation over ‘transition probabilities’ (henceforth, TP) and bottom-up category formation, while GL occurs via learners’ computation over symbolic, abstract rules and top-down category formation. SL and GL are developmentally independent from one another, but interact in a learner’s competence. To put it simply, SL provides the learners’ developing L2 grammar with the cognitive environment to grow and develop. In this book it is proposed that the switch between SL and GL takes place as follows. The innate capacity of implicitly tracking frequently co-occurring items (which sustains SL) would provide adult learners with the necessary cognitive ground for successive grammatical generalizations. 
This capacity, despite being non-language specific, also provides adult learners with the necessary means to acquire parts of the grammar of a second language. Conversely, the abstract rules of ‘combinatorial grammar’ (Sections 1.9 and 4.8), which sustain GL, would provide adult learners with the appropriate labels for categorizing items which are also novel (never encountered before) or not frequent enough in the input and/or that appear to be very dissimilar from the majority of other category members. In the end, SL and GL are only two different ways of learning the same kinds of grammatical structures. From a certain point onwards, these two ways both may become available to L2 adult learners. We will see (in Chapters 2 and 5) how these ways of learning may cooperate as far as the acquisition of some aspects of L2 grammar is concerned. In this book predictions for SLA in adulthood are also discussed. It is claimed that there exist features of L2 grammar which are less likely to be learned because they cannot be learned discontinuously. Features such as null subject or wh- constituents extraction, for example, cannot be learned first

statistically and then grammatically, because they imply a learner’s capacity for categorizing ‘over absences’, that is, over empty categories or displaced items (Chapter 6). It is predicted that these non-discontinuous aspects of the grammar are more difficult to learn than discontinuous aspects of the L2 grammar (such as auxiliary selection in compound tenses in L2 Italian; see Sections 1.6 and 6.3), or even that they are unlikely to be acquired by adult L2 learners, independent of the fact that they are less frequent in the TL input (Section 6.12). It will also be suggested that discontinuous acquisition in adulthood mimics the process of early language acquisition in childhood, which is discontinuous as well. Discontinuities in childhood and in adulthood differ as to the extent to which they are successful. While the former, under normal conditions, is successful, the latter is only partially successful because the optimal neuroanatomical and neurofunctional conditions that accompany a child’s brain maturation cannot be replicated in adulthood (Chapter 3). Finally, some reasons will be also listed in this book as to why innatist (modular) and cognitive (general domain) theories are not necessarily contradictory in adult SLA research. In fact, different pieces of the TL are expected to be learned in different ways. This is because the linguistic nature of those items is different and because the neural resources at a learner’s disposal change with a learner’s age. If the pure grammatical computation of some aspects has become more difficult for the aging brain, the statistical pretreatment of grammatical features, which is a non-language-specific competence, may become necessary. Grammatical features of the TL that, due to their nature, cannot undergo this statistical pretreatment are less likely to be learned in adulthood.
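As a rough illustration of what computing over transition probabilities can involve, the sketch below estimates forward TPs from a toy token sequence. It assumes the common definition TP(A → B) = frequency(A B) / frequency(A); the function name `transition_probabilities` and the Italian-like token string are invented for illustration and are not data from the studies discussed in this book.

```python
from collections import Counter


def transition_probabilities(tokens):
    """Estimate forward TPs: P(next | current) = count(current, next) / count(current).

    Only tokens that have a successor are counted in the denominator, so the
    TPs leaving any given token sum to 1.
    """
    pair_counts = Counter(zip(tokens, tokens[1:]))
    first_counts = Counter(tokens[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}


# Hypothetical toy input (not a learner corpus).
tokens = "non so non so non lo so io non so".split()
tp = transition_probabilities(tokens)

# A high TP between two words is the kind of cue that could lead a learner
# to treat the pair as a single chunk.
print(tp[("non", "so")])  # 0.75
print(tp[("non", "lo")])  # 0.25
```

In this toy sequence, non is followed by so three times out of four, so the pair has a high TP and is a plausible candidate for storage as a chunk, whereas non lo is a weaker candidate.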

1.2 The Term ‘Discontinuity’ and its Meaning for SLA

The term discontinuous identifies a kind of mathematical function which is not continuous. A function f(x) is continuous if – when one plots data on the graph – the pencil never detaches from the sheet. Instead, if the line in the graph is interrupted and then restarts from another point, whether higher or lower, the function has two limits instead of one at the point of interruption; the function underlying such a graph is discontinuous. A rupture point signaling discontinuity with gemination (Section 1.4) can be depicted as in Figure 1.1. Discontinuity as a mathematical function means that some points on the line (such as at 3 on the x axis in the graph) have a twofold limit (a;b) on the y axis because at that point the line is interrupted and facit saltus (‘takes a leap’). In acquisitional terms, the twofold limit on the y axis envisages the

Figure 1.1 A discontinuous function with gemination plotted on a graph

twofold idea that: (a) a grammatical item of the TL may be learned twice, at two different cognitive levels; and (b) the same grammatical item of the TL may have split representations in a learner’s competence. Discontinuous L2 development should not be confused with inconsistent, uneven or reversible developmental paths. For example, in the literature there is a well-known stage-developmental model called the ‘U-shaped developmental path’ or ‘U-shaped behavior’ (VanPatten & Benati, 2010: 28). According to this model, a learner initially progresses in the use of a form; then there is a sudden drop-off indicating a loss of ability or knowledge, which is eventually followed by an increase in performance. When plotted over time, the resulting performance/accuracy graph resembles a U shape (VanPatten & Benati, 2010: 164). A discontinuous developmental trend is a different concept from a U-shaped developmental trend, though. However reversible and non-uniform the U-shaped progression might be, it is a continuous progression-regression, because the line marking a learner’s mastery of the form is always one and the same. To put this a different way, in the U-shaped developmental path there is no gemination of forms. Rather, there is a replacement: eventually one form prevails and the less target-like one subsides. On the contrary, the discontinuity hypothesis predicts that, as their proficiency increases, adult L2 learners come to have two processing/

representational routes at their disposal. When this happens, the line (the developmental path) splits in two as in Figure 1.1. We have just seen that discontinuity has nothing to do with uniformity or reversibility. It instead concerns the idea of ‘quantization’ of the developmental trend. In Section 1.5 the idea will be discussed in detail that SLA is also quantized – a characteristic that has been attributed to other fundamental phenomena in nature. According to this idea, SL and GL feature in the form of two discrete, non-continuous packets or ‘levels of energy’, having different representational/processing costs and benefits (for the learning brain) in a hypothetical discontinuous scale of language development. For the sake of clarity, let us anticipate what the term ‘discontinuity’ may further suggest by looking at how it is used in the natural sciences. In the quantum physics of the 20th century the term ‘discontinuity’ was used to refer to the fact that electrons were found not to traverse all the continuous points between energy levels when they changed energy levels. Electrons orbiting around a nucleus in fact cannot change their orbit gradually along a continuous, gradient-like scale of energy. Instead, they must ‘jump’ from one energy level to another. The energy values of these discontinuous levels are predictable. The shift of electrons from one level to another does not follow a smooth curve: electrons disappear from one level and simultaneously (after a few nanoseconds) appear at the next. Ever since this 20th century physics, a system has been known as discontinuous when changes in its configuration did not cover every theoretically possible discrete transition state. Many changes in the physical world have been found to occur suddenly or in an unexpectedly fast manner (Buchanan, 2011). 
We will see that – if applied to adult SLA – the idea of discontinuity suggests that: (a) the learning curve, even though some points of its trajectory seem to be predictable, might not necessarily be smooth (Section 1.11); (b) some specific items of the TL could be learned twice, each time in a qualitatively different way; and (c) some neural processes relative to language acquisition in an adult’s brain may be carried on simultaneously and be redundant (Section 4.4).
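The ‘twofold limit’ at the rupture point described in this section can be stated compactly. The formulation below is only a sketch: it assumes that a is the value approached from the left and b the value approached from the right of x = 3, an assignment the text does not specify.

```latex
\[
  \lim_{x \to 3^{-}} f(x) = a,
  \qquad
  \lim_{x \to 3^{+}} f(x) = b,
  \qquad
  a \neq b .
\]
```

Because the two one-sided limits differ, no single value of f(3) can make the function continuous at that point, which is what the ‘leap’ in the graph expresses.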

1.3 The Core Idea of the Discontinuity Hypothesis

The core idea of discontinuity is that the process of adult acquisition of L2 grammar is not uniform and incremental, but differentiated and redundant. To learn a second language, adults apply two different procedures to the same linguistic materials: redundancy means that the same language items may happen to be learned twice. The process is differentiated because the ways of learning are different. First, learners implicitly record that some

words come together. They can do that by exploiting recurrent patterns of variable length in input distribution and by using some portions of the brain – especially in the left temporal cortex and in the left temporal lobe – which are specialized for this kind of learning from birth. In the literature (and in this book), this way of learning is referred to as SL. Afterwards, adult learners record implicitly not only that some words come together, but also why they do so. Here why means: by virtue of the identification of some formal properties of items. These abstract properties include both uninterpretable (purely formal, devoid of content) and interpretable (semantic) features (for a classification of these features in generative theory, see Adger, 2003: 22–60). Whether devoid of semantic content or not, all these abstract properties are conceived as being separate entities from the words where they are instantiated. Adult learners gradually realize that these features are abstract; that is, they live a separate life from that of the words with which they can combine. For instance, being singular rather than plural or being nominative rather than accusative are eventually understood by learners of Italian as properties that overarch other meaning-related or distributional features, respectively, of nouns and pronouns to which these properties apply. Implicit recording which does not simply abstract over word co-occurrences but also over formal properties and which drives learners to property-based categorization is often (and also in this book) referred to as GL. Learners can categorize over abstract properties because part of an adult’s brain – especially in the left frontal cortex, in the pre-motor cortex and at the basal ganglia level – could be sensitive to property-based hierarchies. In Chapter 3 we will survey current research on the brain structures that promote a kind of learning which is not based on surface similarities and analogy. 
Neither measures based on accuracy percentages in mandatory contexts nor speed of processing nor fluency rates alone are reliable markers of SL or GL. The idea of discontinuity in SLA is not new: it has sometimes been acknowledged and alluded to or mentioned by some linguists during the last 10 years (Section 1.14). This book attempts to provide the reader with a less fragmentary picture by composing different insights and sources of evidence from different fields under a unified perspective. This perspective will sometimes in this book be referred to as the ‘discontinuity hypothesis’ and at other times, more vaguely, as ‘the idea of discontinuity’ or the ‘discontinuity approach’. Crucially, the word ‘theory’ is avoided purposely in this book for the reasons that will be explained below. Basically, there cannot be a theory of discontinuity yet because the evidence provided so far can be interpreted in different ways (see Chapter 4). An expression such as ‘discontinuity hypothesis’ better conveys the image of the embryonic stage of a prospective theory of discontinuity. At this stage, data from different sources are

collected and claimed to converge towards the intended interpretation, but this interpretation still awaits further confirmation from experimental data (in the sense specified in Section 1.6). Even though the idea of discontinuity might not be new, it is only in the last decade that the motivations for discontinuity have been looked at and accounted for in terms of human neurocognition (McLaughlin et al., 2004: 705; see Chapter 4). Osterhout et al. (2008) and Tanner et al. (2009) used event-related potentials (ERPs) and found that, during the acquisition of verb morphology, learners pass through ‘discontinuous stages’. The nature of discontinuity is made explicit in McLaughlin et al. (2010: 138–142): initially L2ers do not simply memorize whole-word sequences (chunks and formulas, see Chapter 2), but are sensitive to statistical rules (TP, see Chapter 5) which involve both adjacent and non-adjacent morphemes. At a certain point, these learners go beyond statistically based patterns in the input to inducing productive rules. The central tenet of L2 neurocognition theories is that this passage is mirrored by the electrophysiological shift between N400–P600 ERP components. To give an example of this shift, in Osterhout et al. (2008), the same wrong sequence (e.g. *Tu adorez) after one month of instruction was found to elicit an N400-like effect in 14 L1 English, L2 French initial learners. After four months, the effect was replaced by a P600 component which was even larger at the third session (after 80 hours of instruction) and at that point comparable to native controls. Osterhout et al. (2008) claim that the N400–P600 ‘biphasic pattern’ is a cue of discontinuity, that is, a cue of a sudden change in the neural source of SLA of morphosyntax (see Chapter 4). McLaughlin et al. (2010: 142) also concluded that ‘there are qualitative changes in the neurocognitive mechanism underlying language processing during the first year of instruction’. 
The first step of such qualitative and physiological change is that ‘learners initially learned about words, but not rules’ (Osterhout et al., 2004: 290). Learners then step up from SL to GL not because they become able to ‘estimate from a sample’ (as in Ellis, 2009), but because they can see a grammar beyond frequently encountered examples (see Chapter 2). In the next chapter we will see that the hypothesis that adult SLA is discontinuous implies that: (a) acquisition could spread along both a linear dimension (the capacity of building categories out of chunks) and an in-depth, grammatical (eventually frequency-independent) dimension; (b) chunks and larger analogy-induced constructions represent a good cognitive environment for adult L2 grammar to grow as they can ‘feed into grammar’; (c) the frequency factor in acquisition is not ‘in its own right’: in adult SLA, it may eventually promote and support a frequency-independent rule-computation system. In the next paragraph, it will be maintained that where type/token

frequency and TP cannot help, as in the case of the acquisition of null subjects, grammatical acquisition in adult SLA is less likely to take place. Instead, when frequency helps, as in the case of auxiliary selection in L2 Italian, acquisition is speeded up. Successively, if learners succeed in ‘unpacking’ the auxiliary + past participle chunk into its formal components and go beyond the pattern of form-function mapping of the construction, the learning process becomes discontinuous and makes a leap.

1.4 ‘Gemination’ of the Same Twice-learned Items in a Learner’s Competence

The term ‘gemination’ (which has a long tradition in phonetics and phonology) is used in this book in a very different sense. It is used to convey the idea that an item of L2 morphosyntax, from a certain moment, may have split representations in an L2 learner’s competence. These ‘twin representations’ are very different in nature: one is statistical and the other is grammatical. The former is processed and stored on the basis of the frequency and of the distributional (transitional) properties of its components (see Chapter 5). The latter is processed and stored as the result of a computation which is based on the abstract properties of just one of its components, namely, the phrasal head (Section 6.3). To anticipate an example which will be further discussed in Chapter 2, an adult learner of L2 Italian can use the chunk non so ‘(I) do not know’ in two ways. She can even use it for a very long time as an unanalyzed whole, that is, without being aware of its lexical, functional and formal components. From a certain moment on, the same learner becomes capable of decomposing the chunk non so into its lexical and formal features. First of all, the learner realizes that the adverb (non) is separated from the verb (so); then she realizes that the verb is first person singular and can distinguish the meaning of the verb sapere ‘know’ from the meaning of similar verbs such as conoscere; eventually, she also comes to know under which conditions it is necessary to add the pronominal subject io ‘I’ to the verb. We will see that all these pieces of knowledge seem to become part of a learner’s competence rather suddenly and all together (Section 2.6). We will also see that chunks like non so seen before do not disappear when morphosyntactic (and lexical) features make their sudden appearance on the verb. 
On the contrary, correct (and also incorrect) TL formulaic uses survive also when they are juxtaposed with morphological (e.g. inflected) uses. Gemination is precisely the situation in which a given statistical form also comes to have a grammatical counterpart which possibly has a different distribution (in terms of language usage). The idea is that adult learners may

come to a point where they understand (and process) things twice, in two different ways. This is exactly what Townsend and Bever (2001) also claim to hold for native speakers. Gemination would therefore also characterize a ‘steady state’ condition of a mature first language speaker and not only a ‘transition state’, typical of L2 development. Tanner (2013) and Tanner et al. (2013a, 2013b) found that individual differences strongly affect the rate of the N400–P600 biphasic pattern. Advanced learners of a language may still be ‘N400 dominant’ as far as the acquisition of morphosyntax is concerned (Sections 1.13 and 6.10). Despite the fact that these learners are indistinguishable from native speakers in many other respects, their processing of L2 morphosyntax continues to be driven by frequency and lexical factors. On the other hand, we will see that native speakers may also sometimes be satisfied with running ‘good-enough’ (lexically and pragmatically driven) processing routines in situations when they cannot afford robust, detailed processing of syntactic structures (Section 1.14.2). Therefore gemination can be identified as the process that makes available the same morphosyntactic TL items on two different cognitive levels both to L2 learners and to native speakers. These cognitive levels have different processing and representational costs and benefits. Adult native speakers, in a situation of typical language development, can normally afford both costs, whereas adult non-native speakers cannot. Resorting to SL resources and to ‘shallow processing’ (Section 6.8) is not a matter of choice for them. In Section 4.10 we will see that there are also neural cues indicating that gemination in L2 representations and processing has a psychological reality in adult SLA.

1.5 Second Language Acquisition is ‘Quantized’

The term ‘quantized’ in this section explicitly references the quantum theory formulated at the beginning of the 20th century (for an accessible overview of quantum theory, see Baggott, 2011; Kumar, 2008). Energy in nature (e.g. the energy possessed by an electron in an atom) is said to be ‘quantized’ when its changes (or dispersion) do not occur smoothly, but by jumps from one value to another. This amounts to saying that different energy levels in an atom can take only given, pre-fixed values. This distribution of energy is discontinuous because it excludes other values that might theoretically lie on a scale of possible values. According to Bohr’s atomic model, an atom can exist only in a limited, pre-fixed series of discrete states of energy which are multiples of constant values called quanta. A ‘quantum state’ identifies one such multiple. Two attributes of the structures that


consist of quantum states are threshold level and instantaneity. The former refers to the fact that an electron needs a pre-fixed amount of energy to make a quantum leap, that is, a leap from one quantum state to another. This quantity is conceivable in terms of ‘packets of energy’. When an electron in a given state receives less energy than is required, nothing happens. For an electron to be excited to the extent that it changes its orbit, that precise critical threshold of the necessary energy has to be reached and exceeded. In general, this entails that changes of state are not a direct, linear function of the quantity of absorbed/emitted energy. One therefore cannot expect that x-energy = x-change where x is conceived as any value on a numeric scale. Instead, x can only be a multiple of a fixed value in an ordered set of discrete, discontinuous values such as, for instance, {2, 4, 6 . . . n}.

In developmental terms, one can conceive of the TL input to which learners are exposed as something comparable to the energy levels of electrons orbiting around a nucleus. If SLA is quantized, then one cannot expect that the amount of exposure to the TL input and interaction determines SLA straightforwardly. By adding one unit of input one cannot get a corresponding [+1] progress in SLA. The discontinuity hypothesis claims that learners do not progress along a linear scale of processing and/or representational complexity depending on their amount of exposure to the TL input. In short, more input does not result in more acquisition of increasingly difficult items. This issue will be addressed in more detail in Section 1.7.

The other attribute of changes occurring in the structure of quantum states is their instantaneity. This attribute means that transitions or jumps between one energy level and another have no observable intermediate states. They are completed within a moment.
In other words, there are no such things as approximations or gradient-like states of energy, and there are no conditions in which different states are mixed or belong partly to one state and partly to another. In developmental terms, this amounts to saying that gemination from purely statistical to split (statistical and grammatical) representations of a given item is not gradual; rather, it is associated with a particular instant. When acquisition takes a ‘quantum leap’, a learner becomes immediately capable of using two different representations and processing routes, in exactly the same manner as when a child – maybe at the end of a long series of attempts – finally learns to ride a bicycle or to float in the water. Riding a bike, skiing and swimming are not gradient states: you try hard and fail, then suddenly (you do not know how) you succeed. There is no such state as being ‘half-capable’ of swimming (of course you can swim well or not, but even in the worst case, you have become capable of floating). One of the central tenets of this book is that there are strong analogies between quantum theory and the discontinuity hypothesis, that is, the


hypothesis that adult SLA is inherently discontinuous. Adult SLA is quantized too because it proceeds by sudden, instantaneous leaps between two discrete, discontinuous values. These values are SL and GL. The leap occurs when an item that has been learned statistically also gets learned grammatically. Each one of these leaps is preceded by an accumulation of energy (TL input) up to a point where a threshold level is reached and exceeded. The ‘energy’ that is accumulated can be seen as a function of a learner’s experience with statistically relevant parts of the TL input. What gets accumulated by adult learners is a critical amount of concrete instances of a particular form via SL. When a statistically critical threshold of these concrete instances has been reached and exceeded, the statistical form itself geminates and the corresponding grammatical, abstract form takes shape in a learner’s competence. Finally, the switch from SL to GL is also assumed to be instantaneous, just like the leap of an electron from one orbit to a higher one. One of the most challenging implications of the discontinuity hypothesis described in this book is that adult SLA could also abide by the laws of nature and the laws of dispersion/conservation of energy.
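The threshold logic described in this section can be illustrated with a small toy simulation. The sketch below is only an illustration under stated assumptions, not a model proposed in this book: the names `ItemState`, `expose` and `trajectory`, and the numeric threshold, are hypothetical. It shows a form whose instances accumulate gradually while its status changes in a single step, once the accumulated count exceeds the threshold.

```python
from dataclasses import dataclass


@dataclass
class ItemState:
    """Toy learning state of one TL form (hypothetical, for illustration only)."""
    instances: int = 0         # concrete instances accumulated via SL
    grammatical: bool = False  # True once the form has geminated (SL + GL)


def expose(state: ItemState, new_instances: int, threshold: int) -> ItemState:
    """Add input; the form geminates only once the threshold is exceeded."""
    total = state.instances + new_instances
    # The switch is all-or-nothing and is never undone once it has happened.
    return ItemState(instances=total,
                     grammatical=state.grammatical or total > threshold)


def trajectory(inputs, threshold):
    """Return the grammatical status of the form after each batch of input."""
    state = ItemState()
    statuses = []
    for n in inputs:
        state = expose(state, n, threshold)
        statuses.append(state.grammatical)
    return statuses


# Five equal batches of input: exposure grows linearly, status changes once.
print(trajectory([10, 10, 10, 10, 10], threshold=35))
# [False, False, False, True, True]
```

The point of the sketch is that the output is a step function of the input: adding one more batch below the threshold changes nothing observable, whereas the batch that crosses it changes the status at once.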

1.6 Falsifiability Criteria for the Discontinuity Hypothesis

In the previous sections we have mentioned some experimental, electrophysiological evidence for discontinuity in SLA. We will review this evidence in more detail in Chapter 4. The explanations for this evidence are different, however. The behavioral and neurophysiological evidence discussed in this book in fact allows two different explanations. The first is based on the distinction between the acquisition of the L2 lexicon and the acquisition of the L2 grammar (Section 1.9). It says that SLA in adulthood is discontinuous because all L2 grammar is first learned ‘lexically’ (see Chapter 4). According to this perspective, grammar rules would emerge in the form of abstract generalizations over high-frequency items or chunks that have been preliminarily stored in declarative memory circuits of the brain. The neurophysiological evidence discussed in Chapters 3 and 4 supports the hypothesis (which is proposed in the declarative/procedural model [DPM] framework) stating that at least some morphosyntactic features of the L2 are learned discontinuously in this precise sense. One might be tempted to conclude that it is time for a complete and consistent theory of discontinuity to be formulated. The problem is that the explanation proposed in the DPM for how the L2 grammar is learned does not account for all L2 grammar, but only for a part of it (see below). An alternative explanation for


the neurophysiological evidence is needed because the distinction between the lexicon and the grammar in a second language is overspecified. This book provides such an alternative explanation. According to it, discontinuity does not result from the difference between the grammar and the lexicon but from the existence of two different ways of learning: SL and GL. SL is a non-language-specific mechanism with a broader cognitive scope (Chapter 5), which is more age insensitive than GL. An adult’s brain is much better and faster at learning the second language statistically than grammatically. Therefore it tends to use statistics³ to learn just those parts of the L2 grammar that can be approached and processed statistically.

We will see that not all L2 grammar items and properties can be processed and learned statistically (and thus be learned discontinuously). The items/properties of L2 grammar that can be learned statistically are only a subset of all the L2 grammar. They consist of predictable, recurrent combinations of whole words and movable parts of words (morphemes) which occur in given conditions and sometimes at predictable (not necessarily adjacent) intervals in the sentence. This part of the grammar is called ‘combinatorial grammar’ (for more detail, see Section 4.8). Regularity is the main feature of combinatorial grammar, that is, of the grammar that can be pretreated statistically (for a definition of regularity, see Chapter 5). There are other grammatical features of the TL which cannot be learned statistically and which therefore are unlikely to be learned at all. This part of the grammar is called ‘non-combinatorial grammar’. In this book, some arguments will be presented that this might be the case, but no direct neurophysiological evidence is available to date to support such a claim.
In fact, no experimental study so far has factored in the difference between combinatorial and non-combinatorial L2 grammar, that is, between the grammars that can or cannot be treated statistically. This book advocates the need for such experiments to be carried out. Below, some falsifiability criteria are provided in order to test the explanation of the phenomenon of discontinuity which is put forward in this book. How should future experimental studies be designed in order to test the alternative explanation of discontinuity provided in this book? What can be considered reliable falsifiability criteria for this explanation? To answer these questions, let us give an example of what is expected in developmental terms for two very different features of L2 grammar. According to the alternative explanation of discontinuity proposed in this book, one can expect that adult learners of L2 Italian, all else being equal, are more likely to acquire a rule for auxiliary selection in compound tenses than a rule for null subjects in a pro-drop language like Italian. This is not because auxiliaries occur more or less frequently than null subjects in Italian, but rather because frequency and


statistics are effective for SLA in the former case and not in the latter. The reason is that frequency may support the procedure of concatenating co-occurring items, but cannot support a computation that involves absent item features (see Sections 6.2 and 6.3). Here, ‘absent’ means both item features that are not present or that are displaced in the sentence and items that are triggered indirectly by the memory of prior words, as is claimed to happen in connectionist models.⁴ In other words, ‘absent items’ are items that are computed and represented only mentally. In more detail, the auxiliary è ‘is’ and the past participle arrivata ‘arrived’ of Sentence (1) form a chunk (Sections 2.3 and 2.4) è + arrivata, which may consolidate in a learner’s memory over time and eventually ‘feed into grammar’ (turn into a productive rule for auxiliary selection):

(1) Elena è arrivata
    E. is.AUX arrived
    ‘Elena arrived’

On the other hand, the native speaker’s decision concerning the dropped pronominal subject (pro) before the VP non ha parlato ‘(she) did not talk’ in Sentence (2) is the result of a computation over absent items. The native speaker, in order to drop the pronoun, has to decide two things: (a) whether pro and Elena in the sentence are co-referent; and (b) whether or not there is a topic shift in the sentence. In order to make her decision, the native speaker cannot rely on visible, surface cues (verb agreement alone is not a reliable cue). Rather, the computation is made over both present and absent values that make up a native speaker’s representation of that sentence. After this computation (for more detail on the computation, see Chapter 6), a native speaker of Italian knows that the pronoun must be dropped.
(2) Elena è arrivata ma pro non ha parlato
    ‘Elena arrived but (she) did not talk’

In this book, it will be claimed that the result of any computation over absent/displaced items per se cannot provide learners of Italian with anything that can be remembered and re-used in similar situations. The results of any computations that are made over abstract items cannot be memorized by learners and overextended to similar sentences. Let us compare what learners might retain after long exposure to instances of sentences like (1) and (2). L2 learners of Italian, after exposure to input, might remember that arrivata ‘arrived’ is always preceded by various forms of the auxiliary essere ‘be’ (and not by the auxiliary avere ‘have’) as in Sentence (1).


After many identical encounters with the same chunk in Italian input, L2 learners might establish an automatic procedure that entails the re-use of the same auxiliary whenever the verb arrivare ‘arrive’ occurs. By contrast, learners cannot decide whether the pronoun in (2) must be dropped whenever the verb arrivare occurs in a sentence, nor can they use analogy to induce a rule that can be generalized effectively. They have to compute it. Computation differs from any analogy-based algorithm because it excludes ‘similarity’ and ‘repetitiveness’ as valid criteria. The dropping of the pronoun cannot be predicted; it must be computed by speakers. Every time, this kind of computation must run from scratch. In fact, it involves unpredictably different syntactic and pragmatic values of the items that compose the sentence. The final result (whether or not to drop the pronoun) is a function of these values. The result in itself cannot be generalized and its null predictive value cannot support acquisition.

What the comparison between Sentences (1) and (2) tells us is that, in the case of auxiliary selection, the rule might be learned first statistically and then grammatically (when values at the syntax-semantics interface are acquired). In this case, the term ‘discontinuity’ would indicate that gemination has occurred in a learner’s brain. Instead, in the case of null subjects, the statistical processing (or ‘pretreatment’) cannot take place and gemination does not occur. In order to test our explanation of discontinuity, neurophysiological studies using blood oxygenation levels or differences in voltage potentials on the scalp should include experimental conditions which reflect a dichotomy in the nature of the grammar that is learned. The design of these experiments should allow researchers to test whether or not the acquisition of non-combinatorial grammar (e.g. null subjects, filler-gap dependencies, island constraints on wh- extraction, etc.)
by adult L2 learners shows the same neurophysiological discontinuous trajectory (e.g. the shift from N400 to P600 components in ERPs, see footnote 1) over time as has already been shown in learners’ acquisition of combinatorial grammar (e.g. agreement within the noun phrase or the verb phrase). Since no experimental study so far has included these variables in its design, the evidence for the discontinuity hypothesis, at least as it is conceived in this book, is missing. There is enough evidence about the existence of discontinuity, but we still lack evidence supporting the explanation of discontinuity proposed in this book. I will try to demonstrate that the explanation of discontinuity based on the lexicon-grammar divide put forward by the proponents of the DPM (Chapter 4) downgrades an important developmental factor. This factor is the difference between SL and GL. This difference overarches that between the L2 lexicon and the L2 grammar. An alternative explanation for the available evidence is worth exploring. Discontinuity in adult SLA in fact does not


hold between the L2 lexicon and the L2 grammar as it is envisaged in the DPM, but between SL and GL exclusively within the realm of combinatorial grammar. The remaining L2 grammar – non-combinatorial grammar – is excluded from discontinuity because it is out of reach for SL and, for this reason, is less likely to be learned in adulthood.
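The contrast drawn in this section between auxiliary selection and null subjects can be sketched in a few lines of code. What follows is a deliberately simplified illustration, not an implementation of any model discussed in this book: the toy input, the function names and, above all, the rule inside `drop_subject` (drop the pronoun when it is co-referent with the current topic and no topic shift occurs) are assumptions made for the example. Auxiliary selection is treated as recall from counted co-occurrences; subject dropping is treated as a function of values that never appear in the surface string.

```python
from collections import Counter, defaultdict

# Hypothetical toy input: (auxiliary lemma, past participle) pairs
# that a learner might encounter in Italian compound tenses.
toy_input = [
    ("essere", "arrivata"), ("essere", "arrivata"), ("essere", "arrivato"),
    ("avere", "parlato"), ("avere", "parlato"), ("essere", "arrivata"),
]


def build_chunk_counts(pairs):
    """Count how often each participle co-occurs with each auxiliary."""
    counts = defaultdict(Counter)
    for aux, participle in pairs:
        counts[participle][aux] += 1
    return counts


def recall_auxiliary(counts, participle):
    """Return the most frequent auxiliary for a participle, or None if unseen."""
    if participle not in counts:
        return None
    return counts[participle].most_common(1)[0][0]


def drop_subject(coreferent_with_topic: bool, topic_shift: bool) -> bool:
    """Assumed, simplified rule: the result depends only on abstract values
    that must be computed for each sentence, not on any countable form."""
    return coreferent_with_topic and not topic_shift


counts = build_chunk_counts(toy_input)
print(recall_auxiliary(counts, "arrivata"))                        # essere
print(drop_subject(coreferent_with_topic=True, topic_shift=False))  # True
```

In this sketch the first function improves with every additional instance in the input, whereas the second gains nothing from repetition: its arguments have to be supplied afresh for each sentence, which is the sense in which the computation must run from scratch.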

1.7 Increasing Amount of Exposure to Input Cannot Explain All Developmental Transition States

The discontinuity hypothesis has two main implications: (a) not all grammatical items of the TL can be learned in adulthood; and (b) acquisition of learnable grammatical features in adulthood is not a direct function of a learner’s exposure to the input of the TL. This concept has been anticipated in Section 1.5, where it was observed that SLA is quantized and that the amount of exposure to the TL input and interaction does not determine SLA straightforwardly. This section discusses some further implications of this point. When looking at the neurophysiological evidence presented in Chapters 3 and 4, one can claim that the acquisition of L2 items depends on a change in representations and in processing routes which is grounded in L2 neurocognition and which in turn probably correlates with a set of related environmental factors, among which are exposure to input, proficiency, length of immersion, and possibly also kind of instruction.

The role accorded to the TL input differentiates the hypothesis of discontinuity from other developmental models that are based on the idea of ‘continuity’. Developmental models of continuity (Section 1.11) assume that learners can progress through a scale of processing and/or representational complexity depending on the learners’ amount of exposure to the TL input. In short, more input results in more acquisition of increasingly difficult items. In these models, the relationship between exposure to input and acquisition is straightforward. Learners – just like electrons before quantum physics (Section 1.5) – with increasing exposure to the TL are expected to cover every theoretically possible discrete transition state. Therefore, according to the developmental models of continuity, the learning of the ultimate and most difficult features of a language is just a matter of time and exposure.
Moreover, when one figures out a developmental path, one typically expects a linear progression through discrete stages of the L2. These stages are easier or more difficult not only for linguists describing that language, but also for a learner’s mind/brain. Usually simple elements are conceived as being easier to learn than complex elements: single words or chunks are easier to learn


than phrases or constructions, which in turn are easier to learn than complex sentences, etc. Complexity seems to be a matter of both combination and inter-exchangeability of items, and complexity at the level of linguistic description is seen as somehow reflecting a degree of complexity also at the level of a learner’s mind/brain. For instance, proponents of input/usage-based approaches to SLA (Section 2.5) seem to assume that linguistic complexity is represented in a learner’s mind in terms of the number and abstractness of items which are grouped together. The developmental passage from chunks to constructions reflects the idea that a learner’s increasing ability resides in the capacity for building more complex and more abstract constructions. Complexity concerns the increasing number of involved items. Abstractness concerns the passage from token-based to type-based combinations of items (see Chapter 2). In the next chapters, it will be claimed that all these assumptions may turn out to be inaccurate because: (a) a learner’s mind/brain may find difficulty or ease where a linguist’s descriptive grammar does not (Paradis, 2009); and (b) difficult computation may also involve ‘absences’ or displaced items, that is, features which are not there (in the sentence) but must be thought of as abstract entities and must be taken into account, too. These difficult computations are not made easier by frequency. We have already seen in Section 1.6 that null subjects are an example of computation for which very long exposure to TL input and to concrete exemplars could neither be sufficient nor decisive to adult learners and do not guarantee acquisition. Very advanced and also near-native speakers of L2 Italian still cannot drop the pronominal subject when required. Furthermore, in general, it is difficult to say what is really difficult for the mind/brain to acquire. 
Possibly, our descriptive scales of linguistic complexity and the internal laws of the mind/brain are not homogeneous. Since frequency is not the only factor at play, language development may appear to be discontinuous even if a learner’s amount of exposure to the input of the TL has increased linearly. The idea of discontinuity can thus be summarized as follows: progress in adult SLA and degree of exposure to the input of the TL are not isomorphic because the ‘frequency-factor’ must interact with abstract principles (the grammar) and with the characteristics of the adult brain. Even though the biological program which allows a human to learn a second language has age-related limits and bugs, an adult’s brain is equipped to amend these limitations eventually, following its internal schedule and at some cost. A possible cost that late bilinguals have to pay is that they have to learn some parts of the L2 grammar twice. This price is fixed by L2 neurocognition, not by some inherent properties of the TL input. What role can be attributed to input in the discontinuity hypothesis? It is an indirect one: the effect of exposure to


the TL input is in fact mediated by both the grammar principles and by the characteristics of an adult’s brain. Input contributes to triggering the neurocognitive change in an adult learner’s brain which eventually causes discontinuity in learning. Therefore input and its properties together represent a covariance factor, but they do not represent the main independent variable, as it is in the usage-based ‘mono-dimensional model’ of SLA. In fact, aspects of the TL that resist acquisition despite many years of learners’ exposure to the input exist. Such residual difficulties in late SLA are not amendable only by increasing the quantity of the input learners are exposed to. They in fact depend on two endogenous factors: (a) how languages are made (the fact that there are parts of L2 grammar which resist SL, see Chapter 6) and (b) the features of the aging brain (see Chapter 3).

1.8 Discontinuity is Neither Automatization Nor Restructuring

The difference between SL and GL is qualitative (neurophysiological) in nature and cannot be revealed by any measures of automatization alone, such as processing speed and measurement of reaction times (RT). Theories of automaticity (Segalowitz, 2003; Segalowitz & Hulstijn, 2005) – at least those situated in the cognitive skill-acquisition theory paradigm (DeKeyser, 2007) – postulate that L2 learners first rely on explicit (or declarative⁵) knowledge of grammar rules. Extensive practice of the L2 would lead to the speeding up and to the automatization of these grammar rules and to consequent error-free, effortless processing. According to the automatization theory, automatization reveals the passage between explicit and implicit knowledge which occurs when the latter outranks the former. This developmental trajectory of L2 skills would follow a non-linear learning curve. This curve fits a continuous, not a discontinuous function, however. This curve in fact shows that practice of the L2 inducing a radical change in the mechanism of knowledge retrieval would result in an abrupt increase in learning rate (expressed by measures of fluency and accuracy), which is followed by a gradual stabilization. Segalowitz and Hulstijn (2005: 385) hypothesize that learners’ improvement resides ‘in greater availability of ever-larger, preassembled linguistic units and the reduced need to compute information’. Logan’s (1992) theory of automaticity predicts that retrieving full-forms from memory storage is less costly than speeding up rule computation: ‘automaticity is said to have been achieved when it has essentially become faster and more efficient to pull the instance from memory than to continually apply the rule’ (Rodgers, 2011: 298).


According to the interpretation of discontinuity data that is outlined in this book, exactly the opposite claim could be made. Learners qualitatively improve their knowledge of the TL when they break up chunks and ‘melt’ syntactic freezes (Section 2.4), not when they learn to build and automatically retrieve larger ones. Moreover, it could be claimed that neither implicit SL nor implicit GL can stem from the automatization of explicitly learned grammar rules. As Bialystok (2011: 49–50) pointed out, while continual practice in executing articulated rules eventually improves performance where motor skills are concerned, this is not the case with language learning. Adult L2 learners, rather, are often found to be ‘stuck in a stage of paralysis in which no amount of learning could consolidate the rules and no amount of practice could improve fluency’ (Bialystok, 2011: 49). This is because explicit knowledge cannot be transformed or converted directly through practice into implicit linguistic competence (Paradis, 2002: 2) and implicit knowledge does not coincide with automatized knowledge. Faster processing in fact – whether the so-called coefficient of variance (Segalowitz & Segalowitz, 1993) is included in the calculation of RT or not – may represent only a surface cue of acquisition (Paradis, 2009).

‘Restructuring’ is the name for the phenomenon of the shift from controlled to automatic processing via practice (repeated activation) of frequent form-meaning pairings or of chunks and formulas (McLaughlin, 1990; McLaughlin & Heredia, 1996). According to this theory, in the controlled processing phase learners are just capable of assembling words in a piecemeal fashion to build larger structures. Instead, in the automatic processing phase, these sequences become automatic and are stored as units in the long-term memory system. Once they are stored, these larger units can integrate other clusters and eventually feed into productive grammar rules.
Restructuring is then a mental phenomenon that occurs when previously established knowledge is ‘dismantled’ and re-used in a different way in order to build new knowledge. To an external observer, some changes in a learner’s performance due to internal restructuring may show up as sudden improvements or backslidings. The explanation of these sudden changes is to be found neither in the amount of exposure to the TL nor in instruction, but in the fact that the previous state of a learner’s mental grammar has been destabilized by the incorporation of new knowledge or by the indiscriminate application of a rule. Restructuring differs from discontinuity not in its general idea but in its biological and neurofunctional grounds. In the discontinuity hypothesis, the qualitative shift does not occur between the non-productivity and productivity of chunks and units which occur through practice, but between two ways of learning which target two different parts of the L2 grammar and which are differently wired in the brain (see Chapter 4).


1.9 Discontinuity Does Not Mirror the Lexicon/Grammar Distinction

In the literature, SL and GL are sometimes referred to respectively as ‘lexical learning’ and ‘rule learning’. These definitions are misleading and will not be used in this book because they draw a theorist’s attention to objects (the grammar and the lexicon) rather than to processes (the ways of learning). Instead, in the discontinuity hypothesis, what differentiates SL from GL is that each way of learning has its own peculiar (even though not exclusive) neural underpinnings. There are two reasons for preferring to draw our attention to the learning processes rather than to the learning products. One reason is general; another is more specific to SLA.

Firstly, the distinction between what is stored in the lexicon and what is derived via rules is problematic even for first languages (Embick & Marantz, 2005). For instance, if it is true that some irregular verb forms are stored, regular forms can also be either stored in the declarative memory or composed according to rule-governed operations in the procedural memory. Prado and Ullman (2009), in their RT and acceptability judgments study, found that English regular past-tense forms can be represented in a native speaker’s brain in more than one way. To give an example, the past tense of the verb walk can be represented as an unanalyzed whole (e.g. walked) or compositionally, via an affixation rule (e.g. walk-ed). How could we decide whether an item of the TL has been learned in either way? Prado and Ullman (2009) looked for possible independent variables (e.g. the degree of imageability) which may help draw the line between storage and composition in a speaker’s mind, but eventually concluded that ‘the line . . . is not at all static’ (Prado & Ullman, 2009: 863). Babcock et al.
(2012), in their replication of Prado and Ullman’s (2009) study, found that L2 learners of English are much more sensitive to the frequency effect than native speakers of English. Many other factors (sex, amount of exposure and age of arrival) seem to affect the storage/computation divide in a second language (Babcock et al., 2012: 16).

The second reason for drawing attention to processes instead of products is that, while uncertainty about what is lexicon and what is grammar is a problem in general, it is especially so for ‘dual-system’ and neurolinguistic theories based on a double-dissociation mechanism such as the DPM. The reason is that these theories posit that the grammar and the lexicon in both L1 and L2 are differently rooted in the brain. But what is grammar and what is lexicon in a second language? In Section 2.7 this issue will be addressed in more detail. A tentative answer would be that there are at least two possible solutions, to avoid circularity in argumentation. One is to search for


preliminary, clear-cut definitions of what is lexicon and what is grammar in both an L1 and a learner’s interlanguage and then to differentiate SL and GL on these grounds. The existence of items that are both stored and rule-generated, even in first languages, strongly speaks against the possibility of clearly setting apart what is lexicon from what is grammar. So SL cannot be considered as the learning of the lexicon and GL cannot be considered the learning of the grammar. As far as SLA is concerned, there is a viable alternative. This alternative solution would be to consider that SL and GL might apply sequentially to the same items of the L2 grammar. According to this idea, there will be items of the TL that can be learned twice by adult learners, once statistically and then grammatically. This idea suggests that SL and GL differ in their processes, not in their objects or domains of application. What is lexicon and what is grammar in a learner’s interlanguage may in fact vary across proficiency levels and across other developmental factors.

SL is not the learning of lexicon, but a peculiar way of learning that is based on the frequency patterns of visible and countable items which co-occur at fixed or regular positions in the sentence. Similarly, GL is not the learning of grammar, but a peculiar way of learning that is based on providing concatenations of words with an abstract label (Section 6.3). The discontinuity hypothesis claims that SL and GL divide the labor of acquisition over an underspecified linguistic domain in which the boundaries between the grammar and the lexicon are not recognizable and fixable on an a priori basis. SL will therefore operate by recognizing only those items that can be recognized at a certain developmental stage, while GL will operate by rule-generating only items that can be generated given the current state of a learner’s knowledge of the TL.
Items which can be first recognized and then generated over time will possibly be learned twice, in two different ways. This is precisely when discontinuity in SLA occurs.
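
To make the notion of ‘visible and countable items’ more concrete, the following minimal sketch estimates forward transitional probabilities between adjacent words in a toy corpus. The corpus, the function name and the restriction to adjacent word pairs are assumptions made purely for illustration; the sketch is not the procedure used in any of the studies discussed in this book.

```python
from collections import Counter

def transitional_probabilities(sentences):
    """Estimate forward TP(next word | current word) from adjacent word pairs."""
    first_counts = Counter()   # how often a word occurs as the first member of a pair
    pair_counts = Counter()    # how often each adjacent pair occurs
    for sentence in sentences:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            first_counts[current] += 1
            pair_counts[(current, following)] += 1
    return {pair: count / first_counts[pair[0]]
            for pair, count in pair_counts.items()}

# Invented toy input (assumption): four short sentences.
toy_corpus = [
    "the dog walked home",
    "the dog barked",
    "the cat walked away",
    "a dog walked home",
]
tp = transitional_probabilities(toy_corpus)
```

Only pairs that actually occur in the input receive a value: a pattern detector of this kind has nothing to say about items that are never overtly present, which is the limitation the following sections build on.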

1.10 Discontinuity Operationalizes Two Different Kinds of L2 Grammar

In the previous section we have seen that the difference between SL and GL should be looked for in the way items are learned, not in the nature of items themselves. Some items of the TL can be learned in two different ways. Yet, in order to be effectively learned by adult learners, those items should be learned twice, once statistically and then grammatically. In this section, we further develop the complementary idea that not all items of the L2 can be learned twice. There is in fact a subset of grammatical items of the second language which do not undergo the ‘statistical pretreatment’;


that is, there are grammatical items that cannot be learned statistically before being learned grammatically. The discontinuity hypothesis deals with the reasons why some features and not others belong to the subset of L2 items that can be learned twice. It is proposed that these reasons are both linguistic and age-related (Chapters 5 and 6). The language-related reasons have to do with the fact that, in many developmental theories, the term ‘grammar’ covers two different things: one is the combinatorial L2 grammar; another is non-combinatorial L2 grammar. Combinatorial grammar concerns a speaker’s mental operation of mapping regular changes of forms to regular changes of meaning. Moreover, combinatorial grammar relates only to ‘overt’ features, that is, to the features of the language that are visible and present (in sentences). Since these features are visible, they can also be counted and their TP can be implicitly recorded and stored by the speaker in the declarative memory system (Chapter 4). SL is effective for the acquisition of combinatorial grammar because statistics need countable things. Instead, non-combinatorial grammar concerns the computation of both visible and invisible (covert) features of a language. As we have already seen, dropping a subject pronoun or not is an example of a capacity that relies on such a computation. Extraction of wh- constituents from embedded sentences is another example of computation over absent features: a speaker (or a learner) must be aware both of the constituents’ boundaries and of the special status of some constituents which resist extraction. Constituent boundaries are not signaled in any way in sentences, nor is the status of constituents overtly marked (TP are not reliable markers of constituents’ boundaries; see Chapter 6). A native speaker has an intuitive knowledge of whether or not a wh- constituent can cross certain positions (Section 6.8).
There are reasons to believe not only that adult L2 learners do not possess such an intuitive knowledge, but also that they cannot count on their experience with the TL (and capitalize on SL) to develop it. Chapter 6 is dedicated to explaining some reasons why SL operates for combinatorial grammar, but is ineffective for non-combinatorial grammar. The features of combinatorial grammar, in both the L1 and L2, lie at the core of the ‘words and rules’ (WR) theory and are developed in neuroanatomical terms in the DPM (Chapter 4). Pinker and Ullman (2002a: 456) write that ‘the grammar is a system of productive, combinatorial operations that assemble morphemes and simple words into complex words, phrases and sentences’. Combinatorial operations are acquired gradually and can be applied probabilistically (Pinker & Ullman, 2002b: 472). We have seen that a typical example of a grammar rule is one that joins the English regular past tense morpheme -ed with the symbol V(erb) and can thus inflect any English word which has been categorized as a verb. In the WR theory, the mental


operation of applying a rule to form regular verbs is juxtaposed with another independent operation which consists of retrieving a whole, unanalyzed irregular verb form stored in lexical memory. These two mental operations coexist. Human memory is in fact partly combinatorial and partly superpositional and associative. Therefore L1 and L2 speakers have at their disposal a dual-route mechanism: if a whole form cannot be retrieved from the lexical store, then the grammar has the means to provide a regular form through combination. Full availability of the dual-route mechanism in native speakers entails integrity of declarative and procedural memory circuits. In Chapters 5 and 6 we will see that neither L1 nor L2 grammar is just about combination or stem + morpheme concatenation. There are parts of the grammar where the computation engages invisible features (constituent boundaries, traces, islands) and entails a speaker’s capacity for ‘categorizing over absences’. In such cases, there is nothing for a combinatorial grammar to combine. The speaker is instead required to operate mentally with entities about which s/he is believed to have an intuitive knowledge. The discontinuity hypothesis predicts that, for non-combinatorial grammar, gemination and the switch between SL and GL are much less likely to occur or do not occur at all. By operationalizing two kinds of grammar, the discontinuity hypothesis does not at all espouse the connectionist view that ‘no actual rules operate in the processing of language’ (McClelland & Patterson, 2002: 465). Regularities that can be detected by a ‘pattern detector device’ can in fact never approximate a categorical symbolic rule, not even when an input-output relationship is fully regular. In the rest of the book, discontinuity will refer to learners’ capacity to switch between regularities and combinatorial rules. We will see that this passage occurs when learners apply an abstract ‘label’ to a concatenation of words.
It will also be made clear that a substantial part of L1 and L2 grammar is out of reach for this capacity and is excluded from the switch between SL and GL.
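
The dual-route mechanism described in this section can be sketched in a few lines of code. This is only a schematic illustration under stated assumptions: the three-item lexical store, the function names and the simplified spelling rule (no consonant doubling, no y-to-i change) are hypothetical and are not part of the WR theory or the DPM as formulated by their authors.

```python
# Toy lexical store of whole, unanalyzed irregular forms (illustrative assumption).
IRREGULAR_PAST = {"go": "went", "sing": "sang", "bring": "brought"}

def regular_past(verb):
    """Combinatorial route: concatenate the stem with the regular -ed morpheme."""
    if verb.endswith("e"):
        return verb + "d"
    return verb + "ed"

def past_tense(verb, lexical_store=IRREGULAR_PAST):
    """Dual route: try retrieval from the store first, otherwise apply the rule."""
    stored = lexical_store.get(verb)
    if stored is not None:
        return stored, "retrieved"
    return regular_past(verb), "rule"

print(past_tense("sing"))   # whole form found in the lexical store
print(past_tense("walk"))   # no stored form, so the rule applies
print(past_tense("blick"))  # a novel verb is still inflected by the rule
```

The point of the sketch is that the rule applies to anything categorized as a verb, including items never encountered before, whereas retrieval succeeds only for forms that have already been stored.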

1.11 Discontinuity Differs From Developmental Theories of ‘Incrementalism’

The idea of discontinuity differs from developmental theories of incrementalism, according to which more input brings more acquisition in an increasingly linear fashion until basically everything that can be learned is eventually learned. In developmental theories of incrementalism, a current state of interlanguage is assumed to predict what will be acquired next. The developmental sequence follows a scale of language difficulty which is sometimes claimed to mirror increasing cognitive (functional-communicative and


processing) difficulties that a learner’s brain can put up with over time. The discovering of sequences or ‘order of acquisition’ would represent evidence that such scales of difficulty actually shape the developmental path: ‘for a substantial number of language areas, learners are seen to traverse several stages, each consisting of predictable solutions, on their way to developing the various full-fledged subsystems of the target language’ (Ortega, 2011: 83). For instance, Jordens (1997) and Klein and Perdue (1997) characterize the adult SLA process as continuous. All learners initially develop a language which is simple and functional, that is, ‘highly efficient for most communicative purposes’ (Klein & Perdue, 1997: 303). This system is called the ‘basic variety’. Fully fledged natural languages are merely further elaborations of this basic variety, that is, the addition of specific devices (inflectional morphology) and of ‘decoration’ to the same organizational principles (Klein & Perdue, 1997: 304). The nature of the principles operating and interacting in successive learning varieties beyond the initial variety does not change over time. These principles just become more complicated in order to deal with the increasing complexity of the communicative needs of a person who is adjusting to the environment where the TL is spoken. The developmental process is regarded as being inherently incremental and continuous, where no fractures or ‘leaps’ are admitted or are theoretically justifiable. Likewise, the ‘processability theory’ is a continuity theory, even though in a different sense. According to the processability theory, learners’ readiness to acquire the TL is a function of learners’ developing processing skills and learners advance in the developmental stages of a ‘processability scale’ (Pienemann, 1998; Pienemann et al., 2005). This processing scale is made up of developmental steps which are inherently continuous. 
As far as morphosyntax is concerned, the degrees of this scale mark the increasing capacity of the human processor to store and distribute the grammatical information, in the procedure used for building phrases first within the same phrase and then also across the different phrases and/or non-adjacent items that form a sentence. Such a capacity is universal (independent of learners’ L1) and determines which constructs of grammar can be handled and when (Pienemann, 2007: 139). In the latest elaboration of processability theory (Bettoni & Di Biase, forthcoming) the developmental stages of morphosyntax in L2 Italian are defined. If processability theory is right, the SLA of grammar depends on how much information a learner’s lexicon can store and a learner’s parser can handle at a certain developmental stage. In more detail, we would expect that, at the ‘lemma stage’, the parser can analyze no grammatical information at all and that L2ers can learn and use only uninflected routines and formulas. Morphological variation and consistent opposition of morphemes is what identifies the ‘category procedure stage’ of the parser. Learners at this


stage, for example, are capable of opposing different morphological markings on the same verb. At the ‘phrasal-procedure stage’ L2ers become capable of unifying formal features (or φ-features) within the VP. Finally, at the ‘sentence-procedure stage’, L2ers are capable of sharing features immediately above VP (at sentence node). For each stage, the processability theory distinguishes between ‘emergence’ (when a structure is used with more than a single lexical item and lexical items display at least two structures) and ‘mastery’ (when the structure is spread to the available lexicon and used accurately in all appropriate contexts) (Pienemann, 2007: 147). The processability theory claims that all values of diacritic (formal) features of lemmas are specified during functional processing (which also determines grammatical relations) (Bettoni & Di Biase, forthcoming). Even if the passage from the lemma stage to the category-procedure stage marks a learner’s capacity to apparently feed chunks and formulas into grammar, this passage is not discontinuous. In fact, the processability theory claims that the passage depends on a processor-developing capacity which, in turn, is a direct function of a learner’s exposure to the TL input. The passage does not depend on gemination and the shift between SL and GL, nor is it accompanied by a shift in the brain structures that support acquisition. The processability theory does not make the distinction between combinatorial and non-combinatorial grammar and disregards the fact that some items are much less likely to be represented and processed in adulthood. The processability theory claims that learners’ readiness is only processing related and that maturation only concerns processing capacities. The processability theory does not take into account the fact that some parts of L2 grammar (namely, non-combinatorial grammar) could not be learned by adults for biological reasons and because of the linguistic nature of the items.

1.12 Diagnostics of Discontinuity

Myles (2004: 140) acknowledges that ‘there is a major difficulty in knowing when a given sequence has been generated by the grammar and when it has been produced as an unanalyzed whole’. How can we recognize whether, for instance, agreement within an NP is rule generated or just learned by heart and stored in memory as a chunk? The vast majority of the studies commented upon in this book analyze electrophysiological (ERP) and neuroimaging (fMRI, PET) data. Actually, behavioral methods and RT have also been used to discriminate between what has been just stored in declarative memory and what has also been proceduralized (for a classical commentary on the use of RT in SLA, see Segalowitz & Segalowitz,


1993 and, more recently, Jiang, 2012). For instance, psycholinguistic tests of procedural memory (for a definition of ‘procedural memory’, see Chapter 4) use serial RT to ascertain the degree of automatization and interiorization of sequences (in the sense specified in Section 1.8) in associative learning (Kidd & Kirjavainen, 2011). In these experiments, a single visual stimulus moves between four spatial locations on a computer screen. Participants have to press one of four buttons on a response panel that matches the location of the visual stimuli. The stimulus moves across fixed sequences for four blocks of 60 trials. In the fifth block, the stimulus moves randomly. The variable of interest is the participants’ RT across different conditions. It is expected that RT decrease through the blocks with repeated sequences and increase in the randomly presented stimuli. This should signal the gradual interiorization of TP from one position to another following associative learning. Another example of how RT can be used is Bowden et al. (2010). These authors search for frequency effects in verbal morphology in L1 and L2 Spanish. The aim of the research is to see whether all forms (regular and irregular) are learned, represented and processed in an associative memory system. The procedure is as follows: the infinitive of a target verb is presented visually on a computer screen. Participants are asked to say out loud the present and the imperfect of that form within five seconds. Responses are digitally recorded and RT are collected. Shorter RT would indicate that a lemma is effectively associated with a corresponding inflected form in the declarative memory. The main criticism for the use of behavioral measures is that such measures can possibly distinguish between whether an item has been automatized and speeded up or not, but they are inconclusive as to whether an item has been also proceduralized in the procedural memory system (Chapter 4). 
RT cannot say anything about whether a learner has acquired an item by using one or another memory system in the brain. Instead, the recording of electrical cortical activity, electrical stimulation, functional imaging, voxel-based morphometry and tractography may say something about the brain activity and the shift between one system and another. Neuroimaging and ERP studies tell us whether the learner uses the same neural substrates and has learned the L2 in the same manner as an L1 (Sabourin, 2009: 7). Neuroimaging and ERP studies are not flawless or devoid of criticism, however. De Bot (2008) warns against an incautious use of neuroimaging, ERP and ‘introspective methods’ for assessing L2 neurocognition. Crucially, the discontinuity hypothesis is based on an interpretation of this kind of data and claims that these data are likely to be more indicative of L2 neurocognition than traditional behavioral data alone. Therefore a closer look at de Bot’s argument in the rest of the section is advisable.
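
The logic of the serial RT measure described earlier in this section (RT expected to decrease over the four sequenced blocks and to increase again in the fifth, random block) can be sketched as follows. The RT values are invented for illustration, each block is cut down to four trials instead of 60, and the function names and the particular index (random block minus last sequenced block) are assumptions rather than the scoring procedure of any specific study.

```python
from statistics import mean

def block_means(rts_by_block):
    """Mean RT (ms) for each block of trials."""
    return [mean(block) for block in rts_by_block]

def sequence_learning_index(rts_by_block):
    """Mean RT in the final (random) block minus mean RT in the last sequenced block."""
    means = block_means(rts_by_block)
    return means[-1] - means[-2]

# Invented RTs in ms (assumption): blocks 1-4 follow a fixed sequence, block 5 is random.
toy_rts = [
    [520, 510, 530, 500],
    [480, 470, 490, 480],
    [450, 460, 440, 450],
    [420, 430, 410, 420],
    [500, 510, 490, 500],
]

means = block_means(toy_rts)
index = sequence_learning_index(toy_rts)
print(means, index)
```

A positive index indicates only that responses were faster on the repeated sequence than on random trials. As noted above, a behavioral measure of this kind shows speed-up but does not by itself say which memory system supports it.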


The first criticism advanced in de Bot’s review is that the relationship between parts of the brain and cognitive functions is at most correlational, but not necessarily causal. Electrical activity in parts of the brain, changes in magnetic fields and increased levels of blood oxygenation of tissues evidenced by ERP, MEG and neuroimaging studies do not necessarily imply that that area is needed or relevant for that activity or the other way round. Proponents of the advantages of introspective methods are already aware of this criticism (e.g. Cappa, 2012; Davidson, 2010: 111; Osterhout et al., 2006; Tanner et al., 2013a, 2013b). They also know well that, more generally, functional and architectural differences are not bi-univocally linked. Phillips and Sakai (2005: 166) put it simply: ‘it is important to bear in mind that knowing where language is supported in the human brain is just one step on the path of finding what are the special properties of those brain regions that make language possible’. Nevertheless, these and other authors recognize that this methodological weakness would not be amendable if there were not a massive convergence among very different neuroanatomical and neurophysiological studies. Fortunately, this is fairly often the case (for L2 neurocognition at least). We will see in Chapters 3 and 4 that some patterns of L2 neurocognition (the kind of ERP responses with proficiency when type of instruction, length of exposure and other relevant variables are kept constant) are often (of course, not always) very similar across completely different experimental designs. 
Perhaps the reasons for these similarities are still to be explained convincingly and their interpretation is far from being clear, but their very existence appears to many to suggest that a correlation might exist at least between certain patterns of brain activation (the N400–P600 biphasic pattern, see Chapter 4) and some developmental factors (such as proficiency and age of onset of acquisition) in SLA. A similar criticism – put forward by Paradis (2009: 137–140) – is that neuroimaging or ERP studies are not correlational in nature and are not replicable. This is simply not true. To give just two examples, Mueller (2006) replicates Mueller et al. (2005) with the same participants undergoing additional training sessions before ERPs were recorded again. Davidson and Indefrey (2011) replicates Davidson and Indefrey (2009a) with the only difference being that in the latter no explicit instruction was given to participants and that there were three rather than one training sessions. Despite these differences, the ERP responses to feedback across proficiency levels of learners were found to be similar in these two pairs of studies. These pairs of studies are replications in all effects, even if the value of a single variable has been changed from the matrix experiment to its replication. Controlling (after having possibly refined) the variables is the asset of any experimental method, also used in replication studies. Neuroimaging


and ERP studies, to our knowledge, are not an exception to this widely adopted procedure in experimental research. Also, the ERP studies presented in Section 4.6 report on strikingly similar results across different conditions where all relevant variables are controlled (and one or two of them are re-tuned). In sum, de Bot’s criticism does not reflect the current situation where authors of neuroimaging and ERP studies are well aware of the non-correlational nature of their findings. The second criticism put forward by de Bot (2008) is that introspective research would be plagued by the ‘localizationist issue’ which consists of looking at exactly where the first and the second language would be neurally represented in the brain. The risk of this kind of research is to confound greater or lesser cortical recruitment for the L2 with the increased or diminished cognitive load due to variation in attention, memory and other cognitive (not necessarily language-related) factors. By doing so, one would exchange task-related enhanced brain activity for evidence of the existence of typical, task-independent L2 areas in the brain. Instead, less activation could mean that the learner is not doing anything but putting less effort in the task or is choosing among competing resources. This possibility is not ignored by the proponents of introspective methods, as we will see in the debate between the proponents of single and of dual-route mechanisms (Sections 4.2 and 4.3). Proponents of the ‘convergence model’ just posit that early and late learned languages are both languages in all effects; that is, they are made by ‘signals’ of the same nature. As such, they are likely to be represented and processed by the same brain network, whether they have been learned before or after puberty. 
According to the convergence model, the differences in activation patterns that were especially found in left prefrontal areas (namely, in the inferior frontal gyrus) – in the studies that will be described in Chapter 3 – cannot be explained in terms of differences in learners’ proficiency or exposure to the TL input. Rather, these differences should be ascribed to executive control processes which mediate between potentially interfering languages. Therefore, the issue raised by de Bot – far from being ignored in neuroimaging and ERP studies – is at the core of the current debate on memory systems in neurolinguistics. A third criticism is that a learner’s background, and especially proficiency, in neuroimaging and ERP studies is often not assessed adequately. One must agree with this criticism, which has already been seriously undertaken by many authors, some of whom use and describe neuroimaging techniques in the same book that de Bot comments on in his review article (Davidson, 2006: 233; Lamers, 2006: 276; Reiterer et al., 2009: 78; Steinhauer et al., 2009: 34; Stowe, 2006: 306; Van den Noort et al., 2006: 2293, 2010: 6). Moreover, one might observe that the same criticism may


apply to many behavioral studies (using RT and proficiency measures) where proficiency is not adequately (or just naively) factorized. Thus de Bot’s concerns are perhaps mistargeted or at least unfairly restricted to one kind of study. This does not mean that this criticism can be underrated or need not be taken seriously. A fourth criticism is that individual variation is not taken properly into account given that neuroimaging and ERP studies use grand averages in order to improve the signal-to-noise ratio. In fact, bilinguals often show a larger variation when compared to monolingual controls. Actually, it is not true that individual patterns are destined to escape the attention of neuroimaging and ERP researchers throughout. Instead, they are likely to be commented upon separately – as also happens, to take an example from a very distant field of research, in many sociolinguistic studies based on regression models of very large samples of subjects taken from a population. Individual variation is in fact handled with proper statistical means whenever qualitative and quantitative research are carried out together (Dörnyei, 2007). On the one hand, averaged data from groups of participants in a study mean one ‘can’t see the forest for the trees’. On the other hand, individual variation data can explain more subtle nuances or even account for the unexpected appearance of novel ERP components. The next section is aimed at describing how individual differences in ERP responses are taken into account in ERP studies on L2 neurocognition.

1.13 Discontinuity and Individual Differences In the last paragraph, we mentioned the fact that studying ERPs may allow researchers to detect the neural mechanisms of real-time language comprehension with a very high (millisecond-level) temporal resolution. Following a well-established tradition in psychological and neurological studies on language acquisition, in Chapters 3 and 4 two well-known ERP components (N400 and P600) will be taken as possible indices of semantic and morphosyntactic online processing both in first and second languages. We will see why and to what extent these electrophysiological effects are usually held to reflect, respectively, the contrast between lexical access and combinatorial processes that might characterize different stages of SLA. We will also see that there are of course important exceptions to this generalization. There are also studies that question the existence, the relevance and the meaning of the N400/P600 dichotomy for SLA (Section 4.7). Even though in theory ERPs could be used to observe how individual differences in processing capacity are reflected in different electrophysiological


responses, most approaches treat inter-subject variability as a source of noise (in statistical analysis) rather than as an opportunity for SLA research (Tanner et al., 2013a: 367). Most ERP waveforms that are taken as evidence for SLA represent the central tendencies after averaging the raw signal across both trials and subjects. ERP measures consist in fact of averages made on other averages. This averaging procedure is necessary in order to achieve an adequate signal-to-noise ratio (Luck, 2005). In this way, responses of individual subjects to single stimuli on a particular trial are masked and individual differences remain hidden to further investigations. As Tanner et al. (2013a: 369) put it: ‘a given electrophysiological effect may be present on most trials in a few individuals, or on a few trials in most individuals, but be obscured in the averaging process’. In this section we will describe other studies which try to investigate the impact of individual differences by using grouped design. In this experimental procedure, groups of L2 learners are defined by independent variable measures, among which the most relevant for the discontinuity hypothesis are age of acquisition and L2 proficiency. Subjects having the same age of acquisition or the same proficiency level are grouped together. Unfortunately, these studies also cannot capture individual variation in response to items in different experimental conditions. Even when groups of subjects having the same proficiency level respond with similar components to semantic or syntactic violations, wave amplitude, polarity, timing (onset) and source of electric activity on the scalp may actually differ a lot both from subject to subject and from sentence to sentence. All these variations get lost after averaging. The fact that individual differences are blurred by ERP methodology (also in grouped design experiments) undermines the reliability and the importance of the results for SLA theories. 
In fact, it could be that the shift or ‘biphasic pattern’ from N400 to P600 – rather than being a mark of progress in acquisition – is simply an artefact either of averaging or of group composition. Even when (in longitudinal experimental designs) subjects act as their own controls (across time), individual progress and the relative neural changes occurring through different experimental conditions and at different times remain invisible. It is unlikely that any serious SLA theory can rely on evidence concerning ‘groups of subjects’, regardless of individual difference, without the risk of reducing itself to a ‘theory of grouping’, having nothing to say about how real people learn a second language. Tanner (2013) and Tanner et al. (2013a, 2013b) represent an exception to the current trend in ERP methodology. These studies present a multivariate approach to assess the impact of individual factors on variability concerning both the type and the magnitude of ERP responses. To take an example, individual differences that have been taken into account in Tanner et al.


(2013b) are age of arrival (in the country where the language is spoken), length of residency, frequency of L2 use, language proficiency and learner motivation. Participants in this study were 20 native speakers of Spanish who had acquired English as a second language. They were all highly proficient learners with long-term L2 exposure. Individual differences were assessed through a written questionnaire and a proficiency test. Stimuli were sentences that were either grammatically correct or violated subject-verb agreement. ERP results were assessed in two ways: by looking at grand mean results and by looking at individual differences analysis. The former showed that ungrammatical sentences elicited a biphasic pattern, that is, a centrally distributed negativity (N400) followed by a broadly distributed positivity (P600). This is not the expected pattern in advanced learners, at least according to those studies in the literature on which the discontinuity hypothesis is based (see Chapter 4). For this reason, a closer inspection of waveforms was undertaken. A more fine-grained analysis of individual responses revealed that learners actually divided into three groups: some learners showed the biphasic pattern described above; others showed primarily an N400; and finally some learners showed primarily a P600. Then Tanner et al. (2013b) averaged ERPs separately for the latter two groups and found that both negative (N400) and positive (P600) dominance was in all effects exclusive. This means that subjects with a strong N400 effect did not show any P600-like waveforms and the other way round. This could only mean that most subjects in the experimental group showed either an N400 or a P600 and not both. Therefore, grand mean waveforms were not representative of individual neural profiles as far as the majority of subjects in the study were concerned. The next methodological step undertaken by Tanner et al.
(2013b) was to find what individual factors predicted the kind of ERP responses previously described. The dependent variable utilized in this phase (and in the statistical analysis) could not be represented any longer by an averaged ERP measure, but should reflect both dominance and individual sensitivity to ungrammaticality conditions. The regression model including the new dependent variable (which reflected dominance and individual sensitivity) was then fitted with the individual level variables listed above (age of arrival, proficiency, etc.). Tanner et al. (2013b) found that different aspects of the ERP signal (amplitude, polarity, timing and distribution) were associated with different aspects of a learner’s background. To take an example, age of arrival and motivation were strongly associated with P600-effect dominance. Instead, proficiency (assessed with a paper and pencil test) correlated neither with effect dominance nor with ERP magnitude. An interesting fact is that some of these high-proficient bilinguals showed a


persistent N400 effect and no P600. Further correlations suggest that these were the less motivated learners and those who arrived in the US at postpubertal age (this second correlation in particular fits very well with the discontinuity hypothesis). Overall, Tanner et al. (2013b) point out that processing strategies may vary also across high-proficient L2 learners. Importantly, these strategies have their neural correlates. Some advanced learners still seem to rely on memory-based heuristics (they are N400 dominant) while others would rely on combinatorial information (they are P600 dominant). In sum, we have seen that variation exists at every stage of SLA and that this variation can be handled in ERP studies by using appropriate measures as both dependent and independent variables. Regression statistics and especially mixed-effects modeling (Baayen et al., 2008) nowadays do provide SLA researchers using ERP with tools for understanding this variation ‘especially with regard to the multiple dimensions of processing those ERP recordings provide’ (Tanner et al., 2013b: 14). The discontinuity hypothesis is not undermined by closer inspection of individual waveforms. In fact, both the timing and the quality of the shift from N400 to P600 components in morphosyntactic processing by L2 learners can be modulated by individual differences. However, the existence of the shift itself and its significance are not weakened by individual differences analysis.
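
The way a grand mean can display a biphasic pattern that no individual subject shows can be illustrated with a small numerical sketch. The amplitudes, the five sampling points and the group sizes below are invented for illustration only and do not reproduce the data of Tanner et al. (2013b); the function name is likewise hypothetical.

```python
def grand_average(waveforms):
    """Point-by-point mean across subjects (all waveforms have the same length)."""
    n = len(waveforms)
    return [sum(samples) / n for samples in zip(*waveforms)]

# Invented effect amplitudes in microvolts at 300, 400, 500, 600 and 700 ms (assumption).
TIMES_MS = [300, 400, 500, 600, 700]
n400_dominant = [0.0, -4.0, -1.0, 0.0, 0.0]  # negativity around 400 ms only
p600_dominant = [0.0, 0.0, 1.0, 4.0, 1.0]    # positivity around 600 ms only

# Three hypothetical subjects of each type.
subjects = [n400_dominant] * 3 + [p600_dominant] * 3
grand = grand_average(subjects)

for t, amplitude in zip(TIMES_MS, grand):
    print(t, amplitude)
```

In this toy case the grand mean is negative at 400 ms and positive at 600 ms, that is, apparently biphasic, although each simulated subject shows only one of the two effects. This is why a dependent variable that preserves individual dominance is needed before individual-level predictors are entered in a regression model.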

1.14 Credits

1.14.1 Credits for the terms ‘discontinuous/discontinuity’ and for the idea that acquisition ‘takes a leap’

Devine and Stephens’s (2000) book Discontinuous Syntax is about hyperbaton in Greek. Hyperbaton is the grammatical phenomenon that allows interrogatives and wh- word modifiers to be sub-extracted (extracted separately from the rest of the phrase) and placed alone at the left periphery of the sentence (e.g. *which has he invited friend to dinner?). Discontinuity here refers only to the separation of one part of the same constituent from the other as a consequence of movement, and it has nothing to do with how the term ‘discontinuity’ is used in this book. The term ‘discontinuity’ is also utilized in the ‘radical discontinuity hypothesis’ (Kariaeva, 2009) to refer to those phenomena which are characterized by ‘discontinuous constituency’ (see also Alexiadou et al., 2013; Bunt, 1996). In both of these cases the term has a bearing on the tension between hierarchy and linearization, which is not a central tenet of the discontinuity hypothesis, even though one of the

consequences of gemination and the switch between SL and GL is a change in the L2 processing direction (Section 6.4). The idea of discontinuity outlined in this volume takes its inspiration from Berwick (1997, 2011). This author suggests that the notion of ‘gradualism’ (or ‘incrementalism’, see Section 1.11) does not apply to the study of the core properties of language, that is, to the syntactic competence (essentially, the operation internal merge, Section 6.3) which is species distinctive because it is shown uniquely by humans. In the discontinuity hypothesis perspective, it is claimed that the phenomenon of SLA in adulthood poses the same theoretical problems envisaged in the ‘phenotype-genotype’ opposition (the contrast between the hidden, underlying principles and what is surfaced by language uses) which is now a widely debated issue in linguistic theory. In this debate, it is questioned how much of what is visible at the surface of linguistic behaviors is accountable in terms of adaptation to the environment (in our case, to the usage of language) and how much only to the abstract properties of mind. In this book it is also claimed that SLA too cannot be totally adaptive, that is, a direct function of environmental pressure (learners’ language uses and language interactions). Consequently, language acquisition in adulthood should not be studied as a process of accumulation of small, linear and visible changes which occur as a function and in consequence of those uses and interactions. According to Berwick (1997), observable externalizations of language are in fact only secondary, that is, subsequent to conceptualization and the core principles of human language faculty which are internal and not usage driven. If this is true also for SLA, then one must expect that not every developmental progress is explicable in terms of what has been previously attained by learners. 
Instead, from an external observer’s viewpoint, some developmental mutations from one stage to another stage of adult SLA must appear inexplicable because they are partly driven by inner forces. In this case, the overall impression could be that SLA does not follow a linear, mono-dimensional and continuous path, but that at some point it facit saltus (‘takes a leap’). If the amount of exposure to the TL input and interaction was capable of determining SLA straightforwardly, we would expect the developmental process to be smooth and linear, that is, a direct function of the amount of learners’ exposure to the TL input. Since it has often been observed in the literature that very advanced and near-native learners also fail to acquire some aspects of the L2 morphosyntax after many years of exposure to the TL input (see also Chapter 6), other explanatory factors must be considered (input aside). The presence of abstract (frequency-independent) principles and the characteristics of an adult’s brain are the most important among the developmental factors that need to be considered.

1.14.2 Late assignment of syntax theory

Townsend and Bever’s (2001) book, Sentence Comprehension: The Integration of Habits and Rules, has the subtitle, We Understand Everything Twice. The late assignment of syntax theory (LAST) posits that linguistic processes come in two flavors, habits and computations, and that, in sentence comprehension, associative (non-symbolic) aspects coexist with computational (symbolic) aspects. The LAST expresses the double-edged truth that ‘we mostly behave out of habit, except when we do something novel’ (Townsend & Bever, 2001: 5) and captures the fact that a speaker’s (and a learner’s) previous experience with words and sentences is very much relevant for the pattern completion mechanism of associative learning (see note 7 and Section 5.4), but it is not so relevant as far as the acquisition of some other parts of the language is concerned (Chapter 6). Sentences in fact also have computational derivations underlying them. These computational derivations are not susceptible to direct reinforcement and to direct modeling as is assumed in constraint-based systems theory (Townsend & Bever, 2001: 6). This means that frequency and analogy cannot affect the results of a computation where the values of absent items are also included. That complex L2 syntax (binding, quantifier scope, island phenomena) cannot be learned inductively ‘by experience’ is also acknowledged by non-generativist scholars (for instance, O’Grady, 2008: 157–158). The LAST assumes that there is a switch between statistical induction and computational derivation: statistically valid perceptual templates assign sentences an initial hypothesized meaning which is then checked by the regeneration of a full syntactic structure. While frequency assigns the initial meaning, syntax has its role late in the processing or even after the processing of a sentence has already been completed by the speaker.
For instance, the correct syntax of a passive sentence is usually assigned later in comprehension, while superficial cues drive speakers to rely on pseudo-syntactic passive-like structures that allow them to grasp a tentative meaning and to go on to the next sentence in a discourse. Pseudo-syntactic passive-like structures are those in which passive morphology is ignored by speakers and the first noun phrase is typically interpreted as the agent of the sentence (see Section 4.9 for some examples of the so-called ‘good-enough processing strategy’). In the discontinuity hypothesis, the phenomenon of ‘understanding things twice’ envisaged by LAST is ‘stretched developmentally’: adult learners would understand and learn things twice and in two different ways along the developmental path. While native speakers understand things twice because their native language grammar cannot monopolize the processor (frequency has a role too), adult L2 learners must learn things twice because

this is what has already been shown in experiments to be successful in early language acquisition and because it still turns out to be the best solution for the aging brain. The LAST is also relevant for the discontinuity hypothesis because it suggests that both comprehension and acquisition are never a matter of ‘all or nothing’. Both these processes admit approximation. The mechanisms of comprehension and acquisition are in fact modulated by the peculiar situation of the speaker/learner in terms of the availability of cognitive resources. Cognitive resources vary with age. If lexical-pragmatic comprehension turns out to be resource saving for an adult native speaker, then it will become the default. As such, it will be preferred to structural comprehension whenever possible, until proved insufficient or ineffective. Likewise, if SL has been found to fit an adult learner’s brain better, then SL will be the default. As such, it will be automatically preferred to GL on every possible occasion until proved insufficient or ineffective.

1.14.3 Charles Yang’s approach to UG

Charles Yang’s (2002, 2004, 2008, 2010, 2011a, 2011b) ideas on statistics and grammar being complementary and on the role of distributional learning in syntactic acquisition have a bearing on the explanation of the phenomenon of discontinuity accounted for in this book. According to Yang, SL using TP (Chapter 5) can be successful only if it is constrained by the intuitive knowledge of relevant linguistic categories. On the other hand, experience-based learning and domain-general learning abilities can explain the course of language acquisition more effectively than the postulation of a domain-specific, triggering mechanism. Yang claims not only that statistical learning does not refute universal grammar (UG), but that it requires UG in order to be effective. Conversely, a child learning a first language does not suddenly conform to the target grammar, as would be expected if a triggering mechanism were in place. Instead, a child proceeds slowly by testing competing hypotheses, all within the hypothesis space provided by UG. The target grammar will emerge by elimination of all other competing grammars due to a gradual probabilistic access to the features of the input. To recap: ‘the timing of setting a parameter correlates with the frequency of the necessary evidence in child-directed speech’ (Yang, 2004: 454). If parameter setting is probabilistic, then UG limits itself to instructing the learner on what cues they should attend to. Some insights from Yang’s approach to UG can be extended safely to the discontinuity hypothesis, but others cannot. For instance, the discontinuity hypothesis would object that: (1) SL in adulthood cannot support the acquisition of non-combinatorial L2 grammar; and (2) adults and children rely on

partly different cognitive resources to learn an additional language, so their developmental paths are not fully overlapping. Yang’s main contribution to the hypothesis of discontinuity is the idea that the role of the language faculty in L1 and L2 acquisition has become smaller than in previous formulations of generative theory. This is especially due to the ‘minimalist turn-up’ which took place in the mid-1990s.
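
The gradual elimination of competing grammars described in this section can be illustrated with a minimal sketch. It assumes a linear reward-penalty update over grammar probabilities, which is one common way of formalizing this kind of probabilistic competition; the update rule, the learning rate `gamma`, the function name and the toy ‘grammars’ are all assumptions made for illustration and are not taken from the text above.

```python
import random

def variational_learner(sentences, grammars, gamma=0.02, seed=0):
    """Track a probability for each competing grammar across the input.

    `grammars` maps a name to a function that returns True when that
    grammar can analyze a sentence. The update is a linear reward-penalty
    rule (an assumption of this sketch, not something stated in the text).
    """
    rng = random.Random(seed)
    names = list(grammars)
    n = len(names)  # the sketch assumes at least two competing grammars
    probs = {name: 1.0 / n for name in names}
    for sentence in sentences:
        # Pick one grammar with probability equal to its current weight.
        chosen = rng.choices(names, weights=[probs[g] for g in names])[0]
        success = grammars[chosen](sentence)
        for g in names:
            if success and g == chosen:
                probs[g] += gamma * (1.0 - probs[g])  # reward the chosen grammar
            elif success:
                probs[g] *= 1.0 - gamma               # the others shrink
            elif g == chosen:
                probs[g] *= 1.0 - gamma               # punish the chosen grammar
            else:
                probs[g] = gamma / (n - 1) + (1.0 - gamma) * probs[g]
    return probs

# Hypothetical toy input: each "sentence" is only a flag saying whether it is
# disambiguating evidence that the target grammar alone can analyze.
input_rng = random.Random(1)
sentences = [input_rng.random() < 0.2 for _ in range(5000)]
grammars = {
    "target": lambda disambiguating: True,
    "competitor": lambda disambiguating: not disambiguating,
}
final_probs = variational_learner(sentences, grammars)
print(final_probs)
```

In this toy setting the competitor is penalized only when it is selected for a disambiguating sentence, so the speed with which the target grammar takes over depends on how often such evidence occurs in the input. This is the sense in which the timing of parameter setting can be expected to correlate with the frequency of the necessary evidence.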

1.14.4 The minimalist turn-up

Chomsky’s (1995) minimalist program and subsequent generative theory concern, to a much greater extent than in the past, how the language faculty is accommodated into the general properties of the mind and the other way round. The language faculty is supposed to include a cognitive system that stores information (Chomsky, 2000a: 4). Moreover, the language faculty has to meet ‘legibility conditions’ on linguistic expression which are not inherently linguistic, but are imposed by the mind (Chomsky, 2000b: 9, 2002: 61–70). The sensorimotor and conceptual-intentional systems themselves in fact predate the language faculty; that is, they are the conditions for the language faculty to work and to integrate with other modules in the mind. More technically, the sensorimotor and conceptual-intentional systems are the ‘boundary conditions’ which are imposed on language by the architecture of the mind (Chomsky, 2012: 36–38). It might even be that some principles of optimal computation which are held to be at work in some linguistic phenomena (such as island conditions, see Chapter 6) are not coded in UG but descend from more general laws of nature that define and constrain the scope and the length of constituent movement (Chomsky, 2009: 21). According to Chomsky (2007a), these laws of nature include: (a) memory limitation on storage; (b) the need to externalize; (c) the constraints on linearization; and (d) the effect of word frequency and word repetition on the ease or difficulty of neural activation (on this last point, see also Shtyrov et al., 2010). According to Yang and Roeper (2011), the minimalist program has forced researchers to reconsider the relationship between the language faculty and the general cognitive and perceptual system. This marks a shift from the earlier inclination to attribute the totality of linguistic properties to UG.
Chomsky (2007b) explains that while in the past the problem was how much must be attributed to UG to account for language acquisition, the minimalist program (MP) seeks to establish how little can be attributed to UG while relying on general-domain cognitive principles. The discontinuity hypothesis agrees with the idea that a theory of late SLA may encompass the possibility that some general-domain learning mechanisms ‘shift some explanatory burden out of the

innate UG device’ (Yang & Roeper, 2011: 555). Nowadays, the attention of researchers can be drawn to some algorithmic mechanisms of language acquisition with the aim of observing how probabilistic distributions such as TP can operate over grammatical hypotheses which are domain specific. This agenda is compatible both with Yang’s idea of how frequency constrains the acquisition of grammar and with the interpretation of discontinuity that is put forward in this book.

1.14.5 The ‘semi-modular’ perspective on language acquisition

Non-innatist, non-modular approaches to language acquisition can be more or less radical in their anti-generativist tenets. For instance, Sharwood Smith (2011: 94) describes the theoretical contrast between innatist and anti-innatist linguists as follows: ‘A hardline connectionist, emergentist account in one domain can hardly be married happily to an account that assumes symbolic representations and, say, such a thing as language faculty.’ Whether or not this is true for connectionists, assuming the existence of symbolic representations and rules of grammar does not necessarily mean neglecting the importance of input-related factors for SLA. For instance, Newport (2011) advocates a ‘semi-modular’ view of language acquisition which nevertheless admits the existence of universal principles constraining language structure, language acquisition and language processing. Once one admits that these universal principles are at work, then one can also acknowledge that ‘the structure and patterning in language might be acquired by computational mechanisms or organizational principles that are shared with other domains’ (Newport, 2011: 282). Newport asks herself whether the constraints on externalization (e.g. the minimal computation principle) apply only to language or to any domain of complex pattern learning. The answer seems to be that they apply more generally: language learning shares important features with non-linguistic skills acquisition (e.g. music, see Patel et al., 1998), even if this does not necessarily mean that systems that subserve language learning and use completely overlap with those subserving non-linguistic skills (as suggested by Ferman et al., 2009: 408). The semi-modular perspective on language acquisition envisaged by Elissa Newport and colleagues sees the acquisition of language in adulthood as both a domain-specific and a cognitive general-domain phenomenon.
The module for language acquisition is not ‘informationally encapsulated’ (as Jerry Fodor puts it) and participates with other parts of the mind and cognitive functions. In the semi-modular perspective on language acquisition, redundancy is a crucial feature of the process of language learning: as Ullman (2005b) puts

it, ‘more is sometimes more’. Retrieval from storage and computing by rule do not in fact exclude each other in the process of acquiring a language. When a child acquires the first language, the system may maximize storage rather than rule and variables computation, but both are available (Jackendoff, 2002b). Pinker (1999) argued that, in order to make sense of the world around us, native speakers use both family resemblances and rule-driven categories. The former speed up routine uses of language and the latter are more useful for predictions and generalizations. Moreover, in the semi-modular perspective on language acquisition, linguistic competence itself can be conceived as being both categorical and gradient. Some phenomena tend to be perceived and modeled as categorical by speakers and learners, while other phenomena show gradient behavior in a speaker’s judgment. Probabilistic linguistics in fact seeks to account for the full continuum between grammaticality and ungrammaticality (Bod et al., 2003). It could be that the language faculty itself displays probabilistic (statistical) properties. Such properties can account for how part of the grammar is acquired. While the acquisition of categorical grammar must rely on innate principles, probabilistic grammars can be learned from positive evidence alone. Finally, in Nooteboom et al. (2002) the concepts of ‘computation’ (rules with variables) and frequency-based ‘storage’ in language processing and representation are claimed to be entrenched and involved simultaneously at all levels in the language faculty. These authors claim that storage is not a ‘performance notion’ but ‘has become a valid component in explanatory linguistic hypotheses’ (Nooteboom et al., 2002: 16). What helped researchers move away from an axiomatic approach (where formulas are opposed to rules) is the growth of neurolinguistic studies (Nooteboom et al., 2002: 17).
The declarative/procedural model (DPM) described in the next section and in Chapter 4 is based on such studies.

1.14.6 The declarative/procedural model

The discontinuity hypothesis described in this book is mainly based on an interpretation of a number of neuroanatomical and neurophysiological studies published in the last 15 years (Chapters 3 and 4). The theoretical framework for the interpretation of these studies is based on an extended interpretation and integration of Michael Ullman’s version of the DPM (for details, see Section 4.4) which, in turn, is framed in the multiple memory systems (MMS) theory. The discontinuity hypothesis is set precisely in this neurocognitive account of SLA. Even if most neurocognitive accounts of second languages have developed independently from linguistic approaches to SLA (Morgan-Short & Ullman, 2011: 282), theories in the language

domain are acknowledged to shed light on theories of non-language cognitive domains and the other way round (see also Ullman, 2004: 233, for a discussion). The discontinuity hypothesis aims to contribute to this debate. Compared to the DPM predictions, in the discontinuity hypothesis the interpretation of neurophysiological (structural and functional) findings is more ‘interlanguage-oriented’. Unlike in Ullman’s DPM, the interpretation of experimental studies that is carried out in this book carefully distinguishes the objects of learning from the processes of learning an L2 and shifts the focus of interest from the former to the latter. It will be assumed that, especially at the early developmental stages, the lexicon and the grammar of a learner’s interlanguage do not differ in the nature of their items, but in how these items are learned. In adult SLA, some (not all) grammatical items of the L2 can be learned statistically before being learned grammatically. This holds irrespective of whether native speakers or linguists judge those items as belonging to the lexicon or to the grammar of the TL.

1.14.7 The special status of dependency relations in the brain

Yosef Grodzinsky and colleagues (Grodzinsky, 2000, 2003, 2005; Grodzinsky & Santi, 2008; Santi & Grodzinsky, 2012) assert that there are parts of the syntax that cannot be processed by neurologically impaired speakers, namely by Broca’s aphasics. The reason is that dependency relations involving movement of NPs and of phrasal constituents have a special neuroanatomical and neurofunctional status in the brain as opposed to syntactic relations based on linear distance. Dependency relations are those where the computation of a trace is involved. Phenomena of this kind are passive sentences, wh-questions or relative clauses. The computation of the trace in these kinds of sentence requires the functioning of a dedicated unit of the working memory which is located in the left inferior frontal gyrus. The location of the trace in those sentences is inaccessible to Broca’s aphasics due to the fact that this special unit of the working memory has been damaged. This was made visible, for instance, by analyzing and comparing the functional imaging (fMRI) of both aphasics and healthy subjects performing the n-back task. In this task, a subject is presented with a sequence of single letters separated by a few seconds. For each letter the subject has to decide whether it is identical to the letter that either was mentioned in the instructions (0-back), or appeared one (1-back), two (2-back) or three (3-back) items earlier in the sequence. Constituent movements are held to be somehow similar to the 2-back task because they require the presence and the functioning of a temporary memory store where the moved constituent is kept until an appropriate position is identified downstream in the sentence and a

connection between the old and the new positions can be established. The crucial fact is that Broca’s aphasics fail at the 2-back task, but not at the 1-back task, when the two connected elements are close. Similarly to a letter which has been presented two positions back in the n-back task, a moved element in the sentence cannot be held in memory by aphasics for later comparison (in the n-back task) or (in the case of language) for later linking to a trace. The reason aphasics do not fail at the 1-back task (that is, at detecting ungrammaticalities when asked to check a relation between two adjacent positions such as ‘they/*them were chased by the police’) is that the brain has different working memory units which process linear and non-linear relations differently. Each of these units can be damaged selectively and leave the other one intact. The first conclusion is that syntactic dependency and syntactic distance (in terms of intervening words between two linked items) are functionally differentiated in Broca’s aphasics. A further conclusion stemming from Grodzinsky’s studies is that dependency relations must also have a special status in healthy subjects. This special status can be described in terms of processing complexity, processing load and capacity limitation. There are parts of the syntax which are qualitatively more difficult to acquire and to process than others. Syntactic dependencies between adjacent items are easier to process than syntactic movement involving traces and silent (empty) categories. The distinction between ‘combinatorial’ and ‘non-combinatorial’ grammar, which is investigated in developmental terms in this book, at least partially reflects the distinction between morphosyntactic ‘linear relations’ and morphosyntactic ‘dependency relations’ in Grodzinsky’s terms. The former phenomena (combinatorial grammar) are easier for adult L2 learners to learn than the latter (non-combinatorial grammar).
Silent (empty) categories and displaced items are harder to track and keep in memory not only for aphasics, but also for children and late L2 learners. In Chapters 5 and 6 we will see that – whereas linear morphosyntactic relations can be learned discontinuously – dependency relations involving empty categories cannot.
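
The n-back task described in this section can be made concrete with a short sketch that computes, for each letter in a sequence, whether the correct response is ‘match’. The function name, the example letters and the convention of scoring the first n positions as non-matches are assumptions made for illustration; the sketch covers only the scoring logic of the task, not the imaging procedure.

```python
def n_back_targets(letters, n, cue=None):
    """Return, for each position, whether the correct answer is 'match'.

    n == 0: the letter is compared with a cue given in the instructions.
    n >= 1: the letter is compared with the one presented n items earlier.
    """
    answers = []
    for i, letter in enumerate(letters):
        if n == 0:
            answers.append(letter == cue)
        elif i < n:
            # Assumption of this sketch: nothing to compare with yet.
            answers.append(False)
        else:
            # The letter n items back must still be held in memory here.
            answers.append(letter == letters[i - n])
    return answers

letters = list("ABAACA")  # invented sequence, not taken from any study
zero_back = n_back_targets(letters, 0, cue="A")
one_back = n_back_targets(letters, 1)
two_back = n_back_targets(letters, 2)
print(zero_back, one_back, two_back)
```

In the 1-back condition each comparison involves two adjacent items, whereas in the 2-back condition an item must be kept available across an intervening one. This is the property that the analogy with a displaced constituent and its trace relies on.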

1.15 Breakdown of the Volume

This book is organized as follows: Chapter 2 singles out the developmental effect of frequency first in the passage from chunks to constructions and subsequently in the passage from constructions to grammar rules. In Chapter 3, the difference between early discontinuity and late discontinuity is described. The discontinuity hypothesis is tested against the notion of a ‘critical’ or ‘sensitive’ period for language acquisition. The critical issues

concern the process of brain maturation versus the process of brain adaptation, as they are characterized by, respectively, maximal brain plasticity as opposed to lifelong brain plasticity. Chapter 4 aims to place the discontinuity hypothesis in the research framework of L2 neurocognition. The discontinuity hypothesis is compared to the predictions of the DPM concerning L1 and L2 acquisition. Some differences between these two models are outlined. In Chapter 5, the main features of SL are sketched: the research question is whether SL can cover the acquisition of all L2 grammar. The answer to this question seems to be negative. The aim of Chapter 6 is to explain why discontinuity operates only over combinatorial L2 grammar and is ineffective over non-combinatorial L2 grammar. This distinction is also meant to explain why advanced L2 learners experience difficulties in the acquisition of a subset of morphosyntactic phenomena, as is noticed by other SLA theories as well. The core idea of the discontinuity hypothesis outlined in this book can be summarized as follows: what adult L2 learners are more and less likely to learn depends on how it can be learned. What can be learned discontinuously is likely to be eventually learned. Instead, what involves absent features that cannot be counted, or displaced items, cannot undergo SL and is less likely to be learned. It is the way of learning (not the lexicon-grammar distinction) that determines what is eventually and successfully learned by adult L2 learners. A short, albeit important, note on terminology is also needed in this introduction. In this book, the terms ‘learning’ and ‘acquisition’ are used interchangeably, even though the author is of course aware of the Krashenian distinction. The acquisition versus learning dichotomy is not kept because the opposition between explicit and implicit knowledge of the TL is not crucial for the discontinuity hypothesis.
SL and GL are both implicit because they occur independently of conscious, directed learners’ efforts (contra Paradis, 2004, 2009). Therefore, in this book, SL will not be paired with ‘learning’ and GL will not be paired with ‘acquisition’ (see note 8).

Notes

(1) Event-related potentials (ERPs) record electrical fluctuations on the scalp (measured in terms of microvolts) which are time-locked to the presentation of visual or auditory stimuli. In language studies, researchers place a number of electrodes on the scalp of participants and show them the same sentence twice, once in the correct version and then in a version where some mistakes are present. Differences in the electrical fluctuation from the correct to the incorrect conditions are regularly observed. These brain responses reflect postsynaptic electrical activity of a limited population of cortical neurons sharing shape and orientation. When the signal has been correctly amplified and averaged across many participants, one can observe that different errors elicit qualitatively different brain responses or brainwaves called ‘components’. Three among these components are the most studied in SLA. When sentences with gross, categorial or syntactic violations are presented, one gets a left anterior negativity (LAN) which is a negative-going brainwave with peaks 200–300 ms after presentation. When sentences with infrequent, rare words, semantic anomalies or words in implausible contexts are presented, one gets an N400. It is assumed that this negative-going wave reflects difficulty with lexical access and integration. Finally, morphosyntactic anomalies (such as violations of agreement, tense, case and verb subcategorization) elicit a large positive-going wave with a peak around 600 ms post-stimulus. It is assumed that P600s reflect controlled processing and structural reanalysis for word-order and morphosyntactic difficulties. See also Sections 1.12, 1.13, 3.6 and 3.10.

(2) The image that chunks and constructions ‘feed into grammar’ or ‘feed into a learner’s developing linguistic system’ is taken from Myles (2004: 150, see Chapter 2 in this book). The image renders the idea that chunks are not just abandoned by learners when they make room for a generative, rule-governed system. Rather, learners keep on utilizing the chunks when syntax is still developing. When doing so, learners become aware of the gap between the capacity of their current grammar and the grammatical information that would be necessary in order to process the single components of the chunks. This fact would force learners’ reanalysis and would drive morphosyntactic development forward (Myles, 2004: 163).

(3) The term ‘statistics’ is understood here in a technical sense. It refers to the capacity of the brain to track TP. TP are the probabilities that an event is preceded or followed by another event. Chapter 5 is dedicated to explaining the mechanisms of SL in more detail.

(4) Connectionist models (such as the competition model) claim that SLA emerges as a form-meaning mapping process that is largely determined, not by innate representations or rules, but by frequency of co-occurrence patterns, storage and chunking (see MacWhinney, 2005a). Words co-occurring together very frequently in the input ‘wire together’ in a learner’s brain to the extent that, when just one appears in a sentence, the other one may be implicitly evoked by memory systems as well.

(5) ‘Declarative’ and ‘explicit’ are not necessarily synonymous as in Paradis’s version of the DPM (see Chapter 4).

(6) According to many processing theories, a parser (or ‘processor’) is the unit that assigns automatically and incrementally the items of a sentence to a structure. Depending on the theoretical approach, the assignment may be driven by a learner’s intuitive knowledge of structural properties of the sentence or by a learner’s previous exposure to co-occurrence patterns of the items belonging to a sentence.

(7) Associative learning in language acquisition is often seen as a speaker’s capacity to establish efficient form-meaning relationships. Associative learning is sometimes used as synonymous with SL since TP can be regarded as measures of the strength with which some events are associated with other events (see Chapter 5).

(8) A reviewer observed that the processes of learning and of acquiring an L2 differ in terms of neurocognition and neurophysiology, and, consequently, in terms of how they affect the study of second language acquisition/learning. This distinction is particularly important in a stream of research which deals with the exact same fields from which it gathers evidence for discontinuity. The same reviewer suggested that the term ‘mastering a language’ could be used neutrally to cover the domain of both learning and acquiring. Actually, there are very few instances in this book in which a comprehensive term such as ‘mastering’ would be needed. Basically, all instances of SL and GL refer to ‘acquisition’ (in Krashenian terms). SL and GL act only in the realm of SLA. Discontinuity has nothing to do with ‘learning’ (in Krashenian terms). Perhaps future research will tell us if exposure to statistically controlled input (variation sets) in the L2 classroom can facilitate and speed up GL. This is a very different issue, though.

2

Discontinuity as Chunks Feed into Grammar

2.1 Chapter Preview: Frequency Takes the Floor

The topic of this chapter is that second language learners realize quite soon that some words of the L2 co-occur (occur together) more often than others. This fact is important because learners’ capacity to track the co-occurrence patterns in the target-like input is the essence of SL. In this chapter, I will review some studies on formulaic language, idioms, chunks and constructions in the L2. These studies are of interest for the discontinuity hypothesis because they assume that language learners first direct their attention to, and acquire, groups of words rather than words taken in isolation. The relevant developmental dimension which will be investigated in this chapter is frequency of co-occurring items. According to many scholars, frequency and repetition of items are the striking property of input to which both L1 and L2 learners are exposed (Goldberg, 2008: 523). Frequency effects pervade language representations and processing to the extent that it is now widely acknowledged that frequency must be represented somewhere in the brain from birth (Bod et al., 2003: 4). Chunks, idioms, formulas and constructions, despite being different in many respects, have frequency of co-occurrence as their main component. The most important developmental effect of frequency that will be discussed in this chapter is the passage from chunks to constructions. This is a big step for adult L2 learners. Learners move ahead from chunks to constructions when they become capable of shifting their attention from adjacent co-occurring items to the similarity of non-adjacent items that can fill some patterns of alternation in sentences. When this passage takes place, it is likely that a better habitat for an adult L2 grammar to grow is being established and that discontinuity can eventually occur over time.

In this chapter, I start to define the domain of SL in second language acquisition (more detail on SL will be given in Chapter 5). In order to do that, I will try to clarify the domain of frequency effects, the concept of ‘formulaic language’ and the role of chunks and constructions in L2 development. A possible trajectory from chunks to L2 grammar will also be sketched. The term ‘formulaic’, as it is used in the literature, is very broad. It encompasses many different kinds of co-occurrence patterns: words kept together by conventional routines or by pragmatic values, and also words that are glued together only by frequency. In the discontinuity hypothesis, we are interested only in the frequency factor because frequency is what makes chunks feed into grammar. The next section describes why the ‘frequency factor’ is important in language processing and language acquisition. In Section 2.3, the frequency factor is singled out from other components of what goes under the label of ‘formulaic language’.

2.2 Three Aspects of the ‘Frequency Factor’ in Language Processing and Language Acquisition

Some language researchers claim that words in isolation should no longer be considered the privileged units of mental representation and processing in both first and second language acquisition. What has attracted the attention of these researchers is rather the number of bonds and connections among words frequently co-occurring in texts that learners are capable of establishing. It seems that native speakers, but also L2 learners to some extent, have a strikingly precise and intuitive knowledge of how words combine and that they are capable of keeping track of these combinations after only a few encounters. Language processing research is currently being expanded to incorporate this fact into a theory of language acquisition. Some maintain that the fact that the brain keeps track of statistics among co-occurring words does suffice to pave the way for L1 (child) and L2 grammar to develop. Implicit statistics about word sequences would thus be essential to the ability to infer grammatical structures (Caldwell-Harris et al., 2012). According to this view, the developmental path would go from merely collecting statistics about word and phrase co-occurrence in initial phases to inferring the whole grammar of the TL at later stages. In Chapter 5 we will see that this passage is problematic, though; what is unclear is how learners recognize phrase boundaries and constituency from mere continuity of words (Caldwell-Harris et al., 2012: 17). An increasingly observed and studied cognitive device, which is a consequence of the fact that items co-occur, is the one that allows language speakers and language learners to make predictions
about upcoming words in a sentence. The automatic storing of frequency-weighted exemplars would help speakers/learners in predicting how a given sentence in a text will unfold. Speakers/learners can make use of their expectations because the options available as each word of a sentence is encountered are not infinite. According to Kutas et al. (2011), predicting what comes next in a sentence is a natural strategy used by the whole cognitive system (language included). Kutas et al. (2011) claim that during language processing, items are ‘pre-activated’ before they are actually encountered, through probability estimations. Can frequency of co-occurring items in the input and predictions alone make a grammar available to learners? A reasonable answer is no. Mere frequency and storage plus the ‘look-ahead’ capacity seem not to be enough to infer and learn the grammar of a language. A third ingredient that has been proposed as being essential for grammar formation is ‘structured repetition’. What permits learners to acquire dependency and constituency patterns of a language would be the speakers’ or learners’ capability of seeking and finding in the input partially alignable segments of sentences (Edelman, 2011). These segments are a partial overlap of nearby chunks of speech called ‘variation sets’. We will see (in Section 6.3) that variation sets to which learners are exposed represent a perfect environment where combinatorial grammar can grow. Learning a language from variation sets means developing expectations of partial repetition of already familiar material. The alignment of partially overlapping segments of sentences drives category recognition and category formation. According to Edelman (2011), variation sets take advantage of both mere frequency patterns and predictive processing.
To sum up: frequency of co-occurring words in the TL input, learners’ innate predictive capacity and structured repetition in the form of variation sets are three different facets of the same ‘frequency factor’ that ultimately drives SL to GL. The SL of L2 grammar possibly uses all these ingredients of frequency: higher than average co-occurrence patterns, automatic prediction and a certain amount of repetition of overlapping segments or chunks. According to Edelman (2011), the grammar spontaneously emerges as the ‘distillation of experience’ by the interaction of these three characteristics of frequency (see also Baayen, 2010).
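
The idea of a variation set can be made concrete with a small computational sketch. The Python fragment below is only an illustration under stated assumptions, not the procedure of Edelman (2011): the utterances are invented, the names `align` and `variation_pairs` and the threshold `min_shared` are hypothetical, and the alignment itself is delegated to the standard-library class `difflib.SequenceMatcher`. It pairs each utterance with the one that follows it and reports the words they share and the stretches where they differ.

```python
from difflib import SequenceMatcher


def align(first, second):
    """Split two tokenized utterances into shared words and differing stretches."""
    matcher = SequenceMatcher(a=first, b=second, autojunk=False)
    shared, varying = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            shared.extend(first[i1:i2])
        else:
            # Each differing stretch is kept as (words in first, words in second).
            varying.append((first[i1:i2], second[j1:j2]))
    return shared, varying


def variation_pairs(utterances, min_shared=2):
    """Return adjacent utterance pairs that overlap partially (hypothetical criterion)."""
    pairs = []
    tokenized = [u.split() for u in utterances]
    for first, second in zip(tokenized, tokenized[1:]):
        shared, varying = align(first, second)
        # Partial overlap: enough shared words, but the utterances are not identical.
        if len(shared) >= min_shared and varying:
            pairs.append((first, second, shared, varying))
    return pairs


# Invented input, for illustration only.
utterances = ["put the ball here", "put the cup here", "where is daddy"]
for first, second, shared, varying in variation_pairs(utterances):
    print(" ".join(first), "/", " ".join(second))
    print("  shared: ", shared)
    print("  varying:", varying)
```

In this toy input only the first two utterances count as a partially overlapping pair. The position where the two differing words alternate is the kind of slot in which, on the view summarized above, category recognition could begin.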

2.3 Chunks, Not Formulas, Are the Building Blocks of SL

We have seen that the frequency of co-occurring words in the TL input is one of the ingredients of SL. It is important to stress that not all co-occurrence patterns are important to the same extent for SLA. In this section, chunks
are identified and singled out as a subset of what is generally called ‘formulaic language’. In its broader sense, formulaic language is any multiword expression that is ‘stored and retrieved whole from memory . . . rather than being subject to generation or analysis’ (Wray, 2002: 9). According to Bardovi-Harlig (2009: 757), the term ‘formulaic’ in its proper sense includes ‘acquisitional formulas’ (high-frequency sequences too complex to be rule-computed by initial learners) but it should exclude ‘social formulas’ (pragmatic routines, conventional expressions, phraseological units and idioms). To put this differently, according to Bardovi-Harlig (2009), only in acquisitional formulas do frequency and repetition count as a determining factor. In social formulas, on the other hand, frequency of co-occurrence of items is not the main factor (L2 learners must focus instead on the semantic and pragmatic values of formulas). This distinction is important for the discontinuity hypothesis. Discontinuity in fact relates only to the role of frequency and disregards all other, non-frequency-related aspects of formulaic language. Learners’ capacity to track co-occurrence patterns in the input flow is (a) implicit, (b) wired in the brain, and (c) not specific to language (see Chapters 4 and 5). Instead, learners’ capacity for using pragmatic routines in the appropriate contexts reflects the explicit knowledge of norms and/or a sensitiveness to social norms related to the different uses of a language. Learners’ sensitivity to acquisitional and social formulas seems to be developmentally moderated, that is, it might depend on a learner’s proficiency level. Both beginner and very advanced L2ers are in fact credited with the ability to store and retrieve unanalyzed forms, i.e. forms that are not subject to generation or analysis. However, the nature of the forms that are learned differs greatly depending on the proficiency level.
Initial learners are biased by an innate, statistical sensitiveness to words that co-occur with greater than random significance in the TL input (Chapter 5). For instance, Durrant and Schmitt (2010) found that L2 learners of English, after little exposure, retain a memory of which words come together in the language they meet and that this retention appears to occur implicitly. On the other hand, very advanced or near-native learners have also developed sensitiveness to native speakers’ ‘best usage’. In fact, as learners become proficient, they discover that ‘the standard ways in which native speakers of a language refer to standard situations’ are through collocations, idioms and phrasal verbs (Erman, 2009: 326). Beginner learners access only the statistical information behind co-occurring words, but fail to use TL social formulas properly (DeKeyser, 2009: 124; Erman, 2009: 330; Ohlrogge, 2009: 380; Steinel et al., 2007) and even to perceive prosodic cues discriminating between literal and idiomatic meanings of words in the TL (Vanlancker-Sidtis, 2003). Since beginner learners lack a pragmatic competence, they overuse acquisitional formulas but

underuse social formulas (Bardovi-Harlig, 2009). Thus early interlanguage can be said to be ‘formulaic’ not because learners overuse formulas or collocations in general, but because it is developmentally parasitic on just one of the factors that make up formulaic language. This factor is frequency of co-occurring items in the input. We will see that frequency is all about TP (transition probability), which is the basic, linear, statistical relationship among co-occurring words in a chunk. To sum up: SL in this book has to do with a learner’s sensitiveness to frequency, not with her sensitiveness to pragmatic, discourse-related, interactional and social values that may also be present in formulaic language. In order to make the distinction between acquisitional and social formulas even clearer, I will use the term ‘chunk’ to refer only to the former kind. Chunks belong to a circumscribed subset of formulaic language. In this subset, only frequency of co-occurrence counts. Among all the kinds of formulas, only chunks are of interest for the discontinuity hypothesis. How exactly can the difference between chunks and the rest of the formulas (idioms, collocations, phraseological units and routines) be evaluated in representational and processing terms? Chunks are items with particularly high (higher than average) forward and backward transition probabilities (Section 5.6). Chunks are organized units of adjacent word tokens (Abney, 1991; Federici et al., 1996; Lenci et al., 2001). Of course there exist kinds of word relationships other than TP: therefore frequently co-occurring adjacent words can be chunks only, or formulas as well. For instance, in Sentence (1), è andata ‘(she) went’ is a chunk, while in Sentence (2) it is also an idiom:

(1) Elena è andata a casa
    ‘E. went home’
(2) Questa volta è andata
    ‘this time we got off’

In Sentence (1) the words è and andata form a finite-verb chunk in Italian.
These words are related through a linear, dependency chain between a semantic head (andata ‘gone’) and an auxiliary (è ‘is’). On the other hand, in the formulaic-idiomatic Sentence (2), the same words form also a semantic constituent in which the constituency relation between members is opaque (Cruse, 1997). Not all chunks must be semantic constituents or collocations and abide by the ‘idiom principle’ (Sinclair, 1991: 110; Wray, 2000: 466). For this reason, chunks in this chapter and in this book will not be confounded with formulas (idioms, collocations, etc.) in the broader sense. Even if the distinction is blurred quite often (e.g. Ellis, 2002: 155), in this book I assume – unlike Wray (2000) and Hall (2010) – that chunking is only

a frequency-based process, totally independent of meaning and of social, pragmatic and variational uses (see also Malec, 2010). Chunking is understood as a neurophysiologically grounded phenomenon by which repeated sequences of words are packaged together in cognition in order to form larger sequential units which in their turn feed into constructions (Bybee, 2010: 7). At a processing level, chunks are repeated sequences of words automatized by the speaker to the extent that they are perceived, stored, retrieved and executed as single units at every level of linguistic organization (Bybee, 2002: 112, 2003). At a neuroanatomical level, chunks, just like sequences of actions, are credited with having correlates in neuromotor routines which are speeded up by practice. They are wired in the brain and are a property of the procedural memory system (Bybee, 2008: 220; Ellis, 1996; Ellis & Cadierno, 2009). We will see that, in order for SL to occur, L2 learners need massive exposure to chunks with higher than average TP and repeated practice (e.g. in a classroom setting); ‘representational strength is determined by the power law of practice rather than raw frequency’ (Ellis, 2002: 160). Since chunks represent a well-defined subset of a broader ‘formulaic language’, the teaching/learning problem envisaged by Wray (2000, 2009) fades out: if chunks are destined to be learned, they will be learned only automatically, implicitly (without conscious effort) and irrespectively of whether learners are trained or simply exposed to them in the input of the TL. Finally, in this chapter, and also in Chapter 5, I will not discuss in depth the debated issue of whether L2 learners, when exposed to the TL input, implicitly form chunks when they parse the input or they just develop a sensitivity to TP present in sequences (Franco & Destrebecqz, 2012). 
At the core of this debate is whether learners’ sensitiveness to TP among words operates only for adjacent words or also for non-adjacent ones. For the purpose of this chapter, it will suffice to say that chunks are adjacent words glued together. Constructions (Section 2.5), instead, may have open slots separated by intervening words. The developmental passage from chunks to constructions (Section 2.4) is made possible precisely because TP begin to operate regardless of adjacency (in Chapter 5 this passage will be described in more detail). Learners are therefore initially sensitive only to TP between adjacent words, but afterwards they become capable of tracking TP between non-adjacent words as well.
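
The forward and backward transition probabilities mentioned in this section can be illustrated with a short calculation. The Python sketch below rests on several assumptions that are not part of the text: the four Italian sentences are an invented toy corpus, the function name `transition_probabilities` and its `gap` parameter are hypothetical, and the probabilities are estimated in the simplest possible way, as the count of a word pair divided by the overall count of its first word (forward TP) or of its second word (backward TP). Setting `gap=0` considers adjacent words only; a larger value considers words separated by that many intervening items.

```python
from collections import Counter


def transition_probabilities(sentences, gap=0):
    """Estimate forward and backward TP for word pairs separated by `gap` words."""
    unigrams = Counter()
    pairs = Counter()
    step = gap + 1
    for sentence in sentences:
        words = sentence.split()
        unigrams.update(words)
        for i in range(len(words) - step):
            pairs[(words[i], words[i + step])] += 1
    # Forward TP: how often the first word is followed by the second.
    forward = {p: n / unigrams[p[0]] for p, n in pairs.items()}
    # Backward TP: how often the second word is preceded by the first.
    backward = {p: n / unigrams[p[1]] for p, n in pairs.items()}
    return forward, backward


# Invented toy corpus, for illustration only.
corpus = [
    "elena è andata a casa",
    "maria è andata a scuola",
    "questa volta è andata",
    "elena è a casa",
]

forward, backward = transition_probabilities(corpus)
print(forward[("è", "andata")], backward[("è", "andata")])

# The same statistic over non-adjacent words (one intervening item).
forward_gap, _ = transition_probabilities(corpus, gap=1)
print(forward_gap[("è", "a")])
```

In this toy corpus the adjacent pair è andata has a high TP in both directions, which is what the text takes to characterize a chunk. The second call shows that the same bookkeeping extends to non-adjacent pairs once the adjacency restriction is lifted.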

2.4 Chunks Feed into L2 Grammar

Syntax-driven and usage-based theories disagree about the relationship between chunks and SLA (Eskildsen & Cadierno, 2007), especially as to whether chunks may feed into grammar or not. The former position assumes
that chunks are ‘syntactic freezes’ learners must melt to extract the rules of grammar (e.g. functional categories, case, inflection, but also argument structure). The latter position instead assumes that there are no abstract rules hidden in chunks: learners’ systematic exposure to type-frequent chunks is enough for them to extract linguistic regularities which gradually consolidate in a learner’s mind in the form of constructions. Let us now address the syntactic account of the relationship between chunks and L2 grammar. In the next section, the usage-based perspective will be described. SLA at initial stages can be boosted by the fact that learners soon become implicitly aware of the presence of chunks in the input flow. Therefore chunks may initially serve as the ‘learning units’ of SLA. Three properties of chunks make them good candidates for being acquired early and, subsequently, for paving learners’ way to L2 grammar. These properties are adjacency, maximality and non-recursivity. Adjacency is the most important of the three features. Chunks are only made of adjacent words. Adjacency means that (all the rest being equal) words that are close are more likely to be connected to each other and learned together than words that are distant. The experimental studies presented in Chapter 5 show that an increasing linear distance among items in a sequence is a delaying factor for acquisition. One of the consequences of the developmental primacy of adjacency is also that dependency relationships among distant words could be more difficult for adult learners to detect and to learn (Section 6.8). The second feature of chunks is maximality. Maximality means that, in a speaker’s mental representation, a chunk cannot be embedded in another chunk. In other words, elements inside a chunk are not hierarchically ordered.
The experimental studies presented in Chapter 5 show that embedded (ABBA-like) structures are more difficult to learn than iterative ones (ABAB . . .). Finally, chunks are not recursive. Non-recursivity means that, in order to include a chunk (AB) into a larger unit ((AB)^n), it suffices for learners to replicate the same local relationship (an A must be followed by a B which must precede an A, etc.); they do not need to know and project an abstract, property-based rule. Therefore, non-recursivity implies that, when a chunk combines with another chunk to form a larger unit (as is described by Ellis, 2001: 38), the relation between the latter and the former is always linear (sisterhood) and never structural (asymmetric, due to merge and c-command). Learners could be at an advantage if they can rely on local dependencies rather than on structural ones. All they have to know in order to build larger units is ‘what follows what’ and not ‘what c-commands what’. In the syntax-driven perspective, adjacency, maximality and non-recursivity of chunks make them play a decisive role in the acquisition of L2 grammar. Chunks play a role in the shaping of L2 grammar because – when the managing of all grammatical features exceeds initial learners’ parsing
capacity – they provide learners with a databank of complex, but already shallowly processable structures (Myles, 2004: 150–153). These structures are easily processable because they are made of adjacent, maximal and non-recursive elements. Since these structures can also be practiced by beginner learners, they can function as the ideal ‘playground’ for tentative grammar rules to be observed and practiced by learners and thus to be developed and consolidated. For instance, Myles et al. (1998) followed the progression of 16 beginner learners of French (aged 11–13) for two years by means of unplanned oral production tasks. Research focused on the integration of three chunks in learners’ interlanguage. The chunks were: j’aime ‘I like’, j’adore ‘I love’ and j’habite ‘I live’. At the beginning of the observation period, learners often over-extended those chunks, such as in Sentence (3):

(3) Un famille . . . j’habite un maison
    a family I live a house
    ‘a family lives in the house’ (Myles et al., 1998: 330)

Myles et al.’s study longitudinally tracked these chunks in order to see if and when they were segmented and their components were used separately by learners. At a certain point during the observation period, approximately one-third of the learners started to break up or ‘unpack’ the chunks, that is, to use the verb with other pronouns and the first-person pronoun with other verbs. Myles et al. (1998) reported that this process ‘was clearly linked with the emergence of the pronoun system’ in learners’ interlanguage (Myles et al., 1998: 347). Explicit self-monitoring revealed that learners realized that the first-person pronoun form j’ in Sentence (3) was inappropriate for expressing third-person reference. The authors suggested that chunks have a role of facilitating entry into communication and speeding up production in the early stages, and especially in dialogic, classroom-oriented conversations.
When learners realized that chunks are inadequate to establish explicit third-person reference, this apparently triggered the breakdown of the chunks in non-formulaic contexts. Learners at this grammatical stage did not drop chunks but rather worked on them by analyzing and recycling their components (Myles et al., 1998: 359). According to Bardovi-Harlig (2006: 6–7), two claims about the relation between chunks and the process of L2 grammar formation are possible: (a) the grammar catches up to the chunk (when it manages to break it up); (b) the chunk drives the acquisition of grammar (when the former provides the latter with input). In both cases, chunks are used by a learner for speeding up communication at a stage where the internal grammar of the chunk exceeds a learner’s grammar in general. Maximal, non-recursive chunks may also represent the processing and the representational units of SL in

adulthood because ‘they form the explicit knowledge base on which implicit associative mechanisms operate’ (Robinson, 2005: 259). After generative grammar takes over, chunks are not discarded, but actively feed into the developing linguistic system.

2.5 Chunks Feed into L2 Constructions

In the usage-based perspective, chunks do not prepare the ground for L2 grammar to develop. Instead, chunks in the TL stand autonomously, in their own right. They are themselves the building blocks of language categories which gradually take shape in a learner’s mind in the form of ‘constructions’ (Ellis, 2002, 2009). In the usage-based perspective, L2 learners step up when they move from chunks to constructions. Learners’ innate sensitiveness to the presence of chunks in the input flow is only the starting point. With time and practice, L2 learners realize that chunks enter into more sophisticated patterns of language usage. These patterns are the constructions of the TL. Constructions are form-meaning correspondences which are represented both in a language a community uses and in a learner’s mind (Ellis, 2009: 139). Combinations of chunks which are specified by item-based constructions are processed through functional neural circuits in the short-term sentence memory (MacWhinney, 2005a: 53). These constructions are the units of a linguistic system ‘accepted as conventions in the speech community and entrenched as grammatical knowledge in the speakers’ mind’ (Ellis & Ferreira, 2009: 370). Constructions emerge when the cognitive principles of category learning shape learners’ experience with the language input (Ellis & Collins, 2009: 330). They can be concrete and formed by a morpheme or a single noun (such as dog), or they can be abstract and display a more complex internal structure. For instance, the passive construction ‘Subj Aux VP (PPby)’ is a complex construction (Boyd & Goldberg, 2009: 418). The main factor for construction learning in an L2 is learners’ exposure to type frequency and not just to token frequency. While chunks are formed through mere repetitions of identical sequences (tokens), constructions are acquired when learners encounter different lexical items or chunks (e.g.
past participles) alternating in the same positions within what they eventually infer to be an identical structure or type (e.g. the passive). While frequency alone ensures the storage and retrieval of single words and chunks, high type frequency ensures that what is strengthened is a representational schema (Wulff et al., 2009). This is also true for L1 acquisition by children (Tomasello, 2003: 107). Constructions as abstract schemas emerge from the ‘conspiracy of memorized instances’ (Ellis, 2008: 241). The usual route for naturalistic acquisition of a second language
is from formula through limited scope pattern to creative construction (Ellis, 2005: 321). Unlike Goldberg (2003: 219), I do not take constructions to be simply ‘pairings of forms and function’, because this definition eclipses the distinctive feature of constructions as compared to chunks and formulas. In most constructionist studies, learners’ capacity for making complex constructions out of chunks is held to be the driving factor of both L1 and L2 acquisition. There is experimental evidence that chunks precede constructions developmentally. Bartning et al. (2012) showed that morphosyntactic deviance in very advanced and near-native users of French correlates inversely with the presence or absence of two different kinds of formulaic language. By using acceptability judgments and a cloze test, these authors found that, if those formulaic expressions contained open slots for gender and verb agreement and implied rule application, L2 learners were more likely to commit some morphosyntactic mistakes. Conversely, chunks where only invariant words occurred rarely contained errors. This developmental trajectory from chunks to constructions occurs along a unary, linear dimension: a learner’s experience with the TL. The more learners hear and practice the language, the more they notice that words co-occur, the more they can pack and retrieve words together, and the more they internalize the categories of language from prior experience with exemplars or instances of those categories. The emergence of grammatical categories from the processing of stored chunks is driven only by analogy (MacWhinney, 2005a: 55). Analogy defines the developmental relationship between chunks and constructions. With increasing exposure to TL input, learners become sensitive to type frequency and not only to token frequency. Analogy is what learners use to establish and memorize complex patterns of usage out of simple chunks. In those patterns or constructions, some parts are fixed and others are movable.
Analogy ensures that all forms entering a pattern or construction ‘resemble’ each other in some respects. Over time, learners come to recognize that those usage-based patterns or constructions are both clearly identifiable (in their form-to-function mappings) and productive (in the interchangeability of their components). In Chapter 5, we will see how the definition of category membership is problematic and that the items that substitute for one another in a construction may happen to share properties that go beyond analogy or formal resemblance of any kind.
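
The contrast between token frequency and type frequency discussed in this section can be shown with a small count. The following Python sketch is an illustration only, built on assumptions that do not come from the studies cited: the English utterances are invented, the frame with one open slot is a hypothetical simplification of the passive pattern, and the names `SLOT` and `slot_fillers` are made up for the example. Token frequency is taken here to be the number of times the frame is instantiated, and type frequency the number of different items found in its open slot.

```python
from collections import Counter

SLOT = None  # marks the open position in a frame (hypothetical convention)


def slot_fillers(utterances, frame):
    """Count the items that fill the open slot(s) of `frame` across utterances."""
    fillers = Counter()
    width = len(frame)
    for utterance in utterances:
        words = utterance.split()
        for start in range(len(words) - width + 1):
            window = words[start:start + width]
            # The fixed parts of the frame must match; open slots accept any word.
            if all(f is SLOT or f == w for f, w in zip(frame, window)):
                filler = tuple(w for f, w in zip(frame, window) if f is SLOT)
                fillers[filler] += 1
    return fillers


# Invented input, for illustration only.
utterances = [
    "the ball was kicked by anna",
    "the ball was kicked by anna",
    "the cake was eaten by tom",
    "the door was opened by sam",
]

fillers = slot_fillers(utterances, ("was", SLOT, "by"))
token_frequency = sum(fillers.values())  # how often the frame occurs at all
type_frequency = len(fillers)            # how many different fillers occur
print(token_frequency, type_frequency)
print(fillers.most_common(1))
```

A sequence repeated identically raises only the token count, which corresponds to what the text says about chunks. It is the number of distinct fillers alternating in the same position that, on the usage-based account, strengthens the schema.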

2.6 One Example of Gemination of TL Representation in L2 Italian

In Section 2.4 we described the situation in which L2 learners of French eventually manage to break up or ‘unpack’ the chunk j’aime ‘I like’ by
using the same verb with other pronouns and the first-person pronoun with other verbs. Myles et al. (1998) conclude that this process marks the emergence of the pronoun system in learners’ interlanguage. It is interesting to consider how this could happen. It is likely that, after repeated encounters with chunks in the native input or in the classroom input, learners start looking more carefully at chunks’ neighborhood/vicinity. Given that some items occur and alternate very often in a chunk’s neighborhood, learners are pushed into facing novel linguistic material. Somehow, they have to begin processing (systematizing, categorizing) this material. Learners may end up postulating that regularities in a chunk’s neighborhood are due to some abstract properties of the chunk itself. Since the chunk is the invariable part and the building block of the constructed units, learners may well conclude that chunks attract some peculiar linguistic material into their orbit. The attractive power of chunks is what we may tentatively call the ‘property’ of chunks. L2 learners might eventually generalize these abstract properties when they unpack chunks into their components. It is not clear, however, whether there is a trajectory from L2 chunks to L2 grammar and, if so, why this trajectory should be discontinuous. The aim of this section is to suggest a possible way in which the reiterative use of chunks might help learners pave their way through the grammatical system of the TL. Moreover, by looking at performance data it is also possible to hypothesize that neither chunks nor constructions themselves can provide learners with the necessary means for ‘categorizing over instances’. This is instead the job of grammar. The trajectory from L2 chunks to L2 grammar described in this chapter is argued to be discontinuous because at a certain point representations geminate; that is, a statistical representation has a grammatical counterpart (Section 1.4).
Geminations mark a qualitative change in the developmental path. Geminations can also be observed directly by looking at longitudinal data from learner corpora. The qualitative change in learners’ mental grammar would take place when learners recognize grammatical, abstract categories as being the ‘gravity force’ that attracts linguistic material into the orbit of chunks. In the rest of this section, we will focus on a qualitative and quantitative (statistical) analysis based on examples taken from a longitudinal corpus of spoken learner Italian.¹ Of course, performance data like those described below cannot be conclusive about the discontinuity hypothesis. By their nature, these kinds of data can be considered only ‘surface cues’ that must be integrated with other kinds of experimental evidence about learners’ real-time processing. Nevertheless, performance data from Italian learner corpora, to a certain extent at least, seem to converge with the interpretation of electrophysiological studies that will be commented upon in Chapter 4. The data have been collected at the Universities of Pavia, Bergamo and Milan, Italy
(see Andorno & Bernini, 2003; Giacalone Ramat, 2003, for more details). The data analyzed in this section concern just one L2 Italian learner, MK, whose L1 is Tigrinya, a Semitic language of Eritrea. The information about this learner is summarized in Table 2.1.

Table 2.1 The learner’s background

Acronym  Age  Home country  L1        Job          No. of recordings  Period of recording  Time spent in Italy at first recording
MK       20   Eritrea       Tigrinya  Electrician  12                 7 months             1 month

The data are from oral narratives, both autobiographical and descriptions of film scenes. They were recorded approximately every two weeks for 12 recordings spanning seven months in total. It is important to underline that, at the time of the first recording, MK had been living in Italy for just one month and had no previous knowledge of Italian. During the timespan of the interviews, MK attended regular classes in the Italian language at private schools. We here focus on how MK uses the verb sapere ‘know’. Since the learner takes part in a conversation with a native interviewer, it should not come as a surprise if the forms of the first- and second-person singular are in the majority. What is striking, however, is the disproportion between the formulaic and the non-formulaic uses of the verb sapere. Moreover, the chunk-based uses of sapere are also initially overextended by MK into contexts in which other ‘basic’, very frequent verbs would be expected. Table 2.2 provides samples from the sentences using the verb sapere uttered by MK throughout the period of recordings. What seems to emerge from Table 2.2 is the following: (a) Lexical underspecification. From recording No. 1 to recording No. 5, the chunks (io) non so and (io) so cover a range of verbs equivalent to the English ‘read’, ‘become aware’, ‘possess information’, ‘remember’. (b) Exclusive use of first person. From recording No. 1 to recording No. 5, the chunks in the first-person singular ((io) non so and (io) so) cover 100% of all occurrences of sapere. (c) Early transitivity pattern. Chunks with the verb sapere have object NPs very early, from the second recording onwards. (d) Use of overt pronominal subject. From recording No.
1 to recording No. 5, the information about the person in the verb sapere seems not be encoded by verb morphology but only by the overt pronominal subject io ‘I’. Instead, from sixth recording on, the overt first-person subject

Table 2.2 Samples from the sentences with the verb sapere across recordings

No. of recording | Utterance | Literal translation | Reconstructed meaning
1 | io non so | I don’t know | I do not remember
2 | non so io | don’t know I | I don’t know
2 | io non so la piassa | I don’t know the square |
2 | io no so l’aficio | I don’t know the office |
2 | cosa non so happen | what I don’t know happen | I don’t know what happen
3 | io non so deli nomi | I don’t know some names | I don’t know their names
4 | adesso io so questa storia in un libro | Now I know this story in a book | Now I read this story in a book
4 | io lo so tutta questa | I know it all this | I remember it all
5 | io non lo so niente | I do not know nothing | I was not aware
6 | non sappiamo niente di genti di Sudan | (We) do not know anything of people from Sudan | (We) do not know anything of people from Sudan
6 | perché non sapete la nostra lingua | Because (you) do not know our language | Because (you) do not know our language
6 | lui non sa la lingua | He does not know the language | He does not know the language
9 | io non saputo | I not known | I did not know
9 | per sapere che cosa c’è la differenza | to know what there is the difference | to know which is the difference
10 | se io sapevo questa | If I knew this | If I knew this
11 | non sai arabo? | not (you) know Arabic? | Don’t you know Arabic?
12 | perché non sapevo questa parola | because I did not know this word | because I did not know this word
12 | lo sapevo | I knew it | I knew it


pronoun with sapere starts being dropped and becomes more and more rare in the remaining recordings.
(e) Emergence of verb morphology. Inflected forms of the verb sapere (see Note 2) emerge from recording No. 6 onwards (that is, after four months). There are only 16 occurrences of inflected forms out of 70 occurrences of sapere in all recordings. As we have already said, all the rest are chunks.

A statistically informed analysis of the data presented above confirms the presence of discontinuity in the form of a ‘morphology burst’. A chunk-dominating phase is suddenly followed by a burst of morphosyntactic competence. According to Kendall’s rank correlation statistics, the appearance of morphology on the verb sapere strongly depends on the interview (correlation between time of interviews and presence of inflected forms, z = 3.3831, p-value = 0.0007166, τ = 0.7946721), while the presence of chunks does not correlate significantly with the time of interview (z = −1.1148, p-value = 0.2649, τ = −0.2542567). Figures 2.1 and 2.2 show graphically the trend of inflected forms and of chunks. The inspection of both graphs reveals that there is a burst of verb morphology between Interviews 5 and 6, while – on the other hand – MK keeps on using chunks over time.

Figure 2.1 Presence of inflected forms of sapere across interviews

Figure 2.2 Presence of chunks with sapere across interviews

In more detail, our data show that initially (from Interview 1 to Interview 5) MK uses the verb sapere only with two chunks. The qualitative analysis of data suggests that the grammar (in this case, verb morphology) might have developed from MK’s labor on these chunks. Maybe MK uses chunks as a mental laboratory for implicit grammar acquisition. We argue this because from recording No. 1 to recording No. 5 chunks are overextended even though they have no formal properties: no person, no inflection, no lexical meaning (transitivity in the sentence cosa non so happen ‘what I don’t know happen’ seems to be a property of linearization, not a property of the verb). Instead, from the sixth recording on, chunks are suddenly juxtaposed with inflected, lexically specified, target-like forms of sapere. Four facts suggest that L2 grammar may have geminated from MK’s implicit mental elaboration on L2 chunks and that gemination would mark a discontinuous developmental change:
(1) the acquisition of morphology is not gradual; rather, it is sudden (placed between the fifth and sixth recordings);
(2) chunks with sapere do not disappear; rather, they accompany the morphological uses;
(3) the improvement is definitive (newly acquired forms are always used correctly, no U-shaped effect);
(4) different formal properties of the verb (morphology, prodrop, lexical specification) are displayed together on the verb sapere.
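As a rough illustration of the rank-correlation test reported above, the following Python sketch computes Kendall's tau-b between interview number and the count of inflected forms. The per-interview counts are hypothetical placeholders, chosen only so that 16 inflected forms fall in Interviews 6 to 12 as the text states; they are not MK's actual per-interview figures, and the function name `kendall_tau_b` is introduced here for illustration. The z and p values quoted in the text are not reproduced.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Return Kendall's tau-b for two equally long numeric sequences."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from tau-b
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied_x = concordant + discordant + tied_y_only  # pairs not tied on x
    untied_y = concordant + discordant + tied_x_only  # pairs not tied on y
    if untied_x == 0 or untied_y == 0:
        return 0.0  # one variable is constant: report 0 by convention here
    return (concordant - discordant) / sqrt(untied_x * untied_y)


# Hypothetical counts of inflected forms of sapere in Interviews 1-12.
interviews = list(range(1, 13))
inflected = [0, 0, 0, 0, 0, 3, 1, 1, 3, 2, 3, 3]

tau = kendall_tau_b(interviews, inflected)
print(f"tau-b = {tau:.3f}")
```

A series that stays at zero and then rises yields a clearly positive tau, which is the pattern the text describes for inflected forms; a series that fluctuates without trend, as the text reports for chunks, would yield a tau near zero.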


Facts 1 and 2 suggest that we could be in the presence of gemination of L2 representations (Section 1.4). Fact 3 suggests that MK may have acquired a deep-rooted capacity for automatically abstracting over instances. Facts 1 and 4 suggest that this capacity could not have emerged gradually as the result of MK’s exposure to L2 input. Only a profound and sudden change in MK’s competence may bring it about that chunks are quickly juxtaposed with a whole range of inflected target-like forms.

Finally, in the view sketched above, the L2 grammar in MK is an emerging property neither of chunks nor of constructions to which MK may have been exposed. While constructions may result from regularities that L2 learners find in chunks, abstract symbols and formal properties do not. If there had been a developmental continuum between chunks, constructions and abstract symbols (of the kind envisaged for L1 children’s acquisition by Bannard et al., 2009), then it is likely that different formal properties of the verb sapere would have been displayed gradually and at different rates depending on the rate of MK’s exposure to the input. Instead, MK’s progress to TL productions was anything but gradual. The progress was both sudden and effective possibly because chunks and constructions per se do not contain anything symbolic (or, to put it differently, the symbolic level is out of reach for chunks and constructions). In other words, what provided MK with the possibility to generalize, immediately, effectively and irreversibly over frozen, unanalyzed chunks is some kind of pre-existing linguistic knowledge that one can well call ‘the grammar’. Discontinuity in this volume is assumed to hold between chunks and constructions on one side and the L2 grammar on the other side.
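One simple way to picture the difference between a sudden and a gradual change is to ask where a single step fits a series best. The sketch below, which is only an illustration and not the analysis used in this chapter, tries every possible split of a series of counts and keeps the one that minimizes the squared error when each side is summarized by its own mean. The counts are again hypothetical, and `best_split` is a name introduced here.

```python
def best_split(counts):
    """Return k such that modelling counts[:k] and counts[k:] by their own
    means gives the smallest total squared error (a single-step model)."""

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    candidates = range(1, len(counts))  # both sides must be non-empty
    return min(candidates, key=lambda k: sse(counts[:k]) + sse(counts[k:]))


# Hypothetical counts of inflected forms of sapere in Interviews 1-12.
inflected = [0, 0, 0, 0, 0, 3, 1, 1, 3, 2, 3, 3]

k = best_split(inflected)
print(f"largest step falls between Interviews {k} and {k + 1}")
```

With these placeholder counts the step is located after the fifth value, which matches the position the text assigns to the change. A gradually rising series would give no split that fits markedly better than its neighbors.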

2.7 How Much Grammar Can Be Found in Chunks and Constructions?

In the last section we observed a developmental change that was anything but linear. The development of verb morphology in MK was (1) sudden (it can be safely placed between Interviews 5 and 6); (2) multidimensional (because it included prodrop and lexical-actional specification); and (3) additive, not subtractive (because inflected forms and chunks cohabit and do not outrank one another). A first legitimate question is how the grammar can be extracted from chunks and constructions. In Section 6.3 we will see that learners become able to unpack chunks and constructions and categorize over instances occurring in variation sets, which are repetitions of partially identical structures. A second important question is how much grammar can be found by learners in chunks and constructions. Do chunks and
constructions contain everything that is necessary for L2 learners to build the TL grammar? The answer from researchers within the constructivist field is no doubt yes. Basically everything that is needed to learn the grammar of the TL can be found in chunks and constructions. The central tenet of construction grammar is the idea of a continuum between the lexicon, grammar and syntax (Goldberg, 2006). The position of Ray Jackendoff on this issue is even more radical. Jackendoff’s (2002b) argument is that the assumption that lexicon and grammar are fundamentally different kinds of mental representations should be completely given up by linguists. According to Jackendoff, there is no line to draw between words and rules because ‘there are things you need to store in the lexicon that are progressively more and more rulelike, so there seems less and less reason to distinguish them from things that everyone accepts as rules’ (see Note 3). Jackendoff claims that many idioms have argument structure (e.g. take NP for granted) and that most idioms also have canonical syntactic structure (e.g. day in, day out; what about XP?). According to Jackendoff, syntax is in the lexicon because rules of grammar are just schemas with productive or semiproductive variables, stored in the brain in the very same format as words. Since both constructions and items computed by rules are pieces of stored structure, they can be accessed and learned in the same way.

Paradis (2004, 2009) holds exactly the opposite view. Paradis’ view is based on a neurophysiological distinction between what can be learned implicitly and what can be learned explicitly in two different memory systems, respectively: the procedural and the declarative memory systems (see Chapter 4). Paradis in parallel distinguishes the lexicon from the vocabulary of a language. He refers to the lexicon as the set of implicit grammatical properties of items.
Items of a language become ‘lexical items’ when their implicit grammatical features are implicitly recorded by learners through the communicative use of the language. For instance, native speakers and also language learners (to a lesser extent, though) may come to notice that a certain verb such as obey must take three arguments. This knowledge is implicit because it has been acquired exclusively by repeated encounters and uses of the language and it has not been explicitly taught. Instead, items from the vocabulary are simply form-meaning pairs that are stored in the declarative memory as unanalyzed wholes. Since the vocabulary items cannot be analyzed, they cannot be decomposed into their components (see Chapter 4). According to Paradis, what has been stored as an unanalyzed whole (form-meaning pair) cannot be further processed by the same procedural mechanism (supported by procedural memory) that serves the implicit recognition of the elements of the lexicon. Therefore Paradis keeps a strict distinction between chunks and constructions on the one side (the vocabulary) and the analyzable lexicon (that is, the grammar) on the other side.

The discontinuity hypothesis puts forward an intermediate position between Jackendoff’s and Paradis’ views: a continuum exists between chunks/constructions and combinatorial grammar. Instead, what cannot be found in chunks and constructions is non-combinatorial grammar. Non-combinatorial grammar is the part of the grammar that cannot be learned statistically because some of its elements are ‘absent’ or displaced and cannot be counted up and tracked by any statistical mechanism (Chapters 4 and 6).

2.8 Chunking (and SL) Operates on Sociolinguistic Variants as Well

We have seen (in Section 1.10) that in the discontinuity hypothesis the difference between combinatorial and non-combinatorial grammar is important. It predicts what can or cannot be counted, statistically assessed and used for pattern/construction building. Thus, items whose co-occurrence can be evaluated statistically by learners can eventually feed into L2 grammar. In order for a given grammatical item to be assigned to combinatorial or non-combinatorial grammar, what is important is being systematically ‘present’ or systematically ‘absent’ when some other items occur in the same sentence. Nevertheless, what does ‘being absent’ mean in terms of linguistic theory? There exist linguistic items that are omitted under some circumstances and are present under others. Can these alternating items be said to be ‘absent’ to all intents and purposes just because they do not occur in a given circumstance or phrasal context? To take an example, ne is very often omitted in ne … pas negation in some varieties of oral French (see below). Does this mean that the frequently omitted ne is not likely to enter chunks and (later) constructions and be learned statistically, unlike the never omitted pas?

The aim of this section is to clarify what ‘absent’ means to a linguistic theory and to distinguish between which absent items can and which cannot be learned statistically. The core argument is that items of a language have different ways of being ‘absent’. Items whose absence is the result of a computation cannot be learned statistically. Instead, those items whose absence is due to a stylistic or sociolinguistic choice can be learned statistically, provided that they are frequent enough to form chunks and to turn into constructions later on. Sociolinguistic variants of a form may alternate in different contexts. These variants are just different ways of saying the same thing.
The fact that one variant is absent and another one is present does not mean anything in processing terms, but it may mean a lot in terms of pattern building. If one variant is used more frequently than another, then maybe the former is likely to enter a construction more quickly and more steadily than the latter.
However, the lesser used variant is also not prevented from entering a pattern-building process. Let us see this through a concrete example. It is a well-known phenomenon that some constructions or patterns may delete one of their components under some circumstances. We have just mentioned the case of ne deletion in ne … pas negation in informal, oral French. Ne deletion and ne maintenance are two sociolinguistic variants. Ne is deleted widely in the whole French-speaking world in all varieties of the language (Dewaele, 1992; Dewaele & Regan, 2000, 2001; Regan, 1995, 1996). Whether it is omitted or not, ne is always present in a native speaker’s competence because all native speakers know that both variants exist and can be used. The fact that a native speaker of French knows when to use either variant of the negation does not mean that this choice is the result of a real-time computation. A computation would be involved if the choice between ne deletion and ne maintenance could affect the meaning of the sentence. But in the case of sociolinguistic variants, what is absent in the sentence is linguistically the equivalent of what is present. This is why SL can operate both on the variant which occurs in a given context and on the variant which is absent in that context. Of course, as we have said before, the rate and effectiveness of SL of either variant will depend on their absolute frequency of use. There is evidence that at least the first step of SL (the chunk formation process) can also operate on all sociolinguistic variants of a single form in a second language. Fixed expressions such as Je ne sais pas ‘I do not know’ are more likely to be further reduced to Je sais pas than less predictable, nonformulaic expressions by L2 learners of French (Regan, 1995, 1996). This would mean that SL is capable of forming (and of storing in a learner’s memory) a new chunk by reducing a larger, more complex chunk. 
Data from learner corpora would even show that the more formulaic the verbal or nominal frame where a standard, double negation occurs (such as Je ne sais pas) the more ne tends to be deleted by L2 learners of French (Je sais pas). This possibility does not violate the principle of chunk maximality (a chunk cannot be embedded in another chunk; elements inside a chunk are not hierarchically ordered) (Section 2.3). A shorter chunk where ne is omitted simply substitutes for a chunk with ne … pas in speech linearization in some oral varieties of French. Both chunks are parts of the competence of an advanced L2 learner of French. These learners can learn both chunks statistically just because there is nothing for these learners to compute. Ne deletion in fact does not oppose ne maintenance. Each form is just retrieved from the memory storage when needed. Null pronominal subjects (in a null subject language) or gaps left by NP-movement in a relative clause and other similar phenomena (such as those listed in Chapter 6) are in a very different category from ne … pas
negation in French. They are not ‘variants’ of the same form. Null subjects are not a variant of overt subjects. They mean something else. The absence of overt pronominal subjects gears the whole sentence towards a different syntactic and pragmatic interpretation. When subject pronouns do not occur in the sentence, it is not because they have been ‘omitted’ by speakers for reasons of appropriateness. Their presence/absence is the result of a different mind/brain process, which is called a ‘computation’. The computation links the presence/absence of an item to systematic, fully predictable changes in sentence processing and interpretation. Null or overt pronouns and similar phenomena are not retrieved from a learner’s memory storage. They are computed in real time. This also means that, when a speaker/learner of Italian chooses whether to drop a pronoun or not, the context (that is, TP) does not count. Items whose omission/maintenance must be computed cannot form chunks and be learned statistically. Their presence/absence depends on the meaning of the sentence, not on whatever adjacent items or n-grams are nearby. In Chapter 6 we will see how the computation of empty categories and of displaced items abstracts away from distribution. At the beginning of this chapter, we stated that the process of chunk formation and a learner’s capacity for selecting among sociolinguistic variants are different developmental phenomena. Learners’ statistical capacity for detecting co-occurrence patterns is likely to emerge inexorably a few months after birth (Section 3.4). Learners’ capacity for choosing among linguistically equivalent (sociolinguistically non-equivalent) variants instead emerges later and to a more variable extent. This implies that all chunks and variants of chunks, which alternate in different contexts, can be learned statistically regardless of the fact that learners may not be aware from scratch of their sociolinguistic values.
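The claim that both sociolinguistic variants leave countable traces in the input can be illustrated with a toy calculation. The sketch below estimates forward transitional probabilities (TP) between adjacent words in a small invented sample of French negation. The sample sentences, the function name `forward_tp` and the `<s>` utterance-boundary symbol are all assumptions made for illustration; this is not a model of any learner corpus cited in this section.

```python
from collections import Counter


def forward_tp(utterances):
    """Map each adjacent word pair (a, b) to P(b | a) = count(a b) / count(a),
    where count(a) only counts occurrences of a that have a successor."""
    bigrams, firsts = Counter(), Counter()
    for utterance in utterances:
        words = ["<s>"] + utterance.split()
        for a, b in zip(words, words[1:]):
            bigrams[(a, b)] += 1
            firsts[a] += 1
    return {pair: n / firsts[pair[0]] for pair, n in bigrams.items()}


# Invented toy input mixing ne maintenance and ne deletion.
sample = [
    "je ne sais pas",
    "je sais pas",
    "je sais pas",
    "je ne veux pas",
    "je sais pas",
]

tp = forward_tp(sample)
print(tp[("je", "sais")])  # share of 'je' tokens followed directly by 'sais'
print(tp[("je", "ne")])    # share of 'je' tokens followed by 'ne'
```

Both je sais and je ne receive a non-zero estimate, so a frequency-sensitive learner has something to count for either variant. An item that is never pronounced, such as a null subject, contributes no token at all to counts of this kind.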

2.9 To Sum Up: Some Language Properties Are Not a Property of Input

When they parse the TL input, adult L2 learners are very likely to analyze (and learn) words together rather than analyze and learn words one at a time. This tendency is confirmed by acquisitional data and computational evidence (see Chapter 5). The cognitive forces that drive learners to group words together are frequency and analogy. The former has an immediate impact, while the latter plays its role especially later on. Frequency of co-occurring items (along with learners’ predictive capacity and repetitivity) is the measure that learners implicitly record and use to chunk the input into manageable (and easily learnable) pieces. Analogy of items entering
well-defined positions in patterns is the measure that learners use to form and memorize constructions. Chunks and constructions are the immediate products of SL. In Chapter 5, we will see in more detail how SL extracts regularities from sequences of items. In the discontinuity hypothesis outlined in this book, learners’ initial use of chunks and constructions reflects the learners’ earliest way to organize cognitively and to store the primary linguistic data. Chunks and constructions are the by-products of an early, tentative systematization which fits well with adult learners’ communicative needs and an adult’s brain capacities. Chunks and constructions are not the end of the story, though. At a certain point, discontinuity – the breakup which parallels gemination and the switch between SL and GL – may occur. We have seen the case of MK, a learner of L2 Italian, whose gemination from chunks-only to the TL fully inflected use of the verb sapere is sudden and definitive. When in fact discontinuity occurs, a learner becomes capable not only of processing larger chunks, but also of extracting abstract properties out of them. We will see that at some point in the developmental path, a L2 learner’s grammar brackets chunks and constructions under some ‘labels’. These labels are abstract features that an item projects over other items belonging to the same concatenation. Labels are abstract because they are not self-generated: they are not an emerging by-product of frequency and of language uses. They precede chunks and constructions; that is to say, they are already in place in a learner’s mind when statistics does its job.
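The step from chunks to constructions by analogy, as summarized above, can be pictured as finding a shared frame with one open position. The following sketch groups invented chunk strings by everything except their last word and collects the different fillers of that final slot. Treating a construction as a fixed prefix plus one slot is a deliberate simplification, and the chunk list and the name `slot_frames` are hypothetical.

```python
from collections import defaultdict


def slot_frames(chunks, min_fillers=2):
    """Group chunks by all words but the last; keep only frames whose final
    position is filled by at least `min_fillers` different words."""
    frames = defaultdict(set)
    for chunk in chunks:
        *frame, filler = chunk.split()
        if frame:  # skip one-word chunks, which have no frame
            frames[" ".join(frame)].add(filler)
    return {f: sorted(s) for f, s in frames.items() if len(s) >= min_fillers}


# Invented chunks, loosely modelled on the io non so pattern.
chunks = [
    "io non so questo",
    "io non so niente",
    "io non so perché",
    "io so questo",
]

print(slot_frames(chunks))
```

Only io non so survives as a frame here, because it is the only sequence followed by several different items. Nothing in this procedure assigns person, tense or any other abstract label to the frame, which is consistent with the point that labels are not a by-product of frequency.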

Notes

(1) We bear in mind all the caveats relative to the use of learner corpora, especially of qualitative analysis, without measures of significance, as developmental evidence (see Yang, 2010).
(2) Sappiamo ‘we know’, sapete ‘you all know’, sapevo ‘I knew’, sai ‘you know’, sa ‘she/he knows’, saputo ‘known’, sapere ‘knowing’.
(3) See http://ling50.mit.edu/wp-content/uploads/Jackendoff-handout.pdf.

3 Discontinuity in the Maturing and in the Adapting Brain

3.1 Chapter Preview: Discontinuity Across a Learner’s Age

This chapter describes the idea that discontinuity characterizes both early and late language acquisition, in a child’s maturing brain and in an adult’s adapting brain, respectively. The nature and the outcomes of discontinuity in childhood and in adulthood are different, however. According to a recent theory, discontinuity in infancy would mark the neuroanatomical passage from an earlier SL through the hippocampus and the temporal cortices to a later GL of morphosyntax which is made possible by the slow establishment of efficient fiber connections between temporal and frontal areas in the left hemisphere around the age of 6–7 years. Unlike in children, discontinuity in adulthood is not neuroanatomically driven. It is rather an adaptive, cognitive fallback mechanism to which an adult’s brain resorts when it copes with representational and processing problems that are harder to deal with after the critical period.

It has also been hypothesized that a child’s brain naturally acquires the language it is exposed to first statistically and then grammatically. This sequence seems to fit the natural course of neuroanatomical maturation of the child’s brain. Instead, in order to learn a second language, an adult’s brain – which is already mature – must adapt itself to the task. An adult’s brain is adaptive (although not completely plastic), and also capable of responding with rapid anatomical and functional changes to environmental challenges. In this chapter, it will be claimed that, in order to face the learning task, an adult’s brain might find it natural to replicate an already built-in mechanism that uses statistics to learn the grammar. If this is true, the switch between
SL and GL would replicate the original blueprint for early language acquisition. The trick works again in adulthood, but possibly to a lesser extent. While SL is in fact fully available to adult learners, GL has some limitations and bugs. Notably, GL does not cover all L2 grammar and syntax. Therefore this chapter elaborates the idea that discontinuity is differently age related: it characterizes in the same way – but with very different results – both pre- and post-pubertal L2 acquisition. While Chapters 5 and 6 describe which parts of the TL can or cannot be learned statistically, this chapter aims at explaining why, although SL has primacy in both early and late language acquisition, it represents just the initial, starting option for children learning their L1 and the default option for adult L2 learners.

In the literature it is widely debated whether adults, like young L2 learners, can attain a native-like proficiency in a second language. Rodríguez-Fornells et al. (2009: 3713) quote a dozen studies estimating percentages ranging between 5% and 30% of adult learners who could attain native-like performances even in phonology and syntax. Rodríguez-Fornells et al. (2009) also add that such studies and percentages do not explain the implication of the same cognitive resources or processes. This must always be kept in mind when one argues whether adults can or cannot attain native-like proficiency in a second language with respect to pre-pubertal L2 learners (e.g. Montrul, 2009; Singleton, 2007; Singleton & Ryan, 2004). Therefore, the research question in this chapter is not whether or not adults can attain a second language to a native-like level, but how they do it. It is indeed evident that adults can also learn a second language quite well (even if our judgments of learners’ native-likeness are often divergent and misclassified; see Magnusson & Stroud, 2012).
This is true to the extent that some speakers who learned the second language at different ages are apparently indistinguishable. The fact that a difference between child and adult learners is not so apparent does not mean that it does not exist. The research described in this chapter discusses whether this difference does exist at a neurophysiological level. Abrahamsson (2012: 211) claims that the hypothesis that children and adults acquire L2s in fundamentally different ways, using fundamentally different brain mechanisms, should be highly prioritized in future research: ‘There is already an entirely clear answer to the question of whether children are more successful learners of L2s. What still remains are answers to the question why this is so.’ These answers are to be investigated as to the differences between successful discontinuity in a child’s maturing brain and in an adult’s adapting brain.

In the next paragraph we focus on whether the hypothesis of discontinuity can shed light on how the differences between child and adult SLA have been interpreted so far in generative theory. ‘Fundamental difference versus full access’ is a theoretical debate which has a long history in the
tradition of generative studies. Quite recently, however, the attention of generative-oriented researchers has shifted. Instead of asking whether or not adult L2 grammars are UG constrained, some started considering that adults might have their own ways to learn a second language, regardless of the fact that UG is still accessible to them.

3.2 Beyond the ‘Fundamental Difference’ Versus ‘Full Access’ Debate

Among generative approaches to SLA, two competing theories question whether access to UG, which is held to modulate a learner’s ultimate attainment, is age dependent (for a summary, see White, 2003, Chapter 3). Proponents of the fundamental difference hypothesis (Bley-Vroman, 1990) maintain that adult and child SLA are not comparable, while proponents of the full access/full transfer/full parse hypothesis (Dekydtspotter et al., 2006; Schwartz & Sprouse, 1996) maintain that the initial state of L2 development is the L1 grammar and further grammatical development is UG constrained. For instance, the first position, according to Herschensohn (2009: 281), is as follows: since adult L2 learners are already ‘language-endowed’ with their first language, they are fundamentally different from monolingual learners regardless of their age. The reasons for the deterioration in adults’ capacity to learn a second language would be only partly maturational. In fact, the previous experience with the L1 shapes the neural networks of the brain dedicated to language. SLA is partially affected by age and partially affected by the amount of exposure to the first language (Herschensohn, 2007). Song and Schwartz (2009) maintain that adults and children have the same possibilities of learning a second language and following the same route to TL convergence. Song and Schwartz (2009) point out that this is particularly clear when one looks at how learners with the same L1 learn ‘poverty of stimulus’ syntactic features (such as wh- constructions with negative polarity items), where previous experience with linguistic data counts less or not at all.
There is a methodological and a theoretical reason for suggesting that the ‘initial versus final-attainment comparison’ perspective seen above, which is elaborated in the UG-access debate, does not explain where exactly early and late language acquisition differ and which mechanisms adults put into action to learn a second language as compared to children. The methodological reason is that so far as researchers are aware, early and late learners may appear to achieve the same native speakers’ effects by relying on completely different underlying representations and processing routes
(Kotz, 2009; Kotz et al., 2007; Mueller, 2005). This conclusion was unapproachable before the middle of the 1990s, when the majority of studies were based mostly or solely on acceptability judgment tests and on performance data. Since neuroimaging and electrophysiological investigations have taken the floor in the investigation of second languages, the research paradigm has undergone some changes. Nowadays, any developmental hypothesis is required to provide evidence – in favor of or against the critical period hypothesis – that early and late SLA are characterized by different ways of learning which are rooted in different brain mechanisms and which may be also relatively independent of behavioral measures about the degree of ultimate attainment. Yet these deep (brain-rooted) learning differences may turn out to be obscured by accuracy percentages, acceptability judgments and even by an incautious analysis of reaction times of the respondents.

There is also a theoretical reason behind a hope for a change in the research paradigm on the relationship between age and acquisition. We have seen (in Sections 1.13.4 and 1.13.5) that this reason resides in a change of perspective within generative linguistics. In the minimalist program, the language faculty retreats from carrying the whole burden of language acquisition. Chomsky (2007a) proposes that this burden has somehow to be divided between specific computational devices of the language faculty and the domain-general cognitive devices which are related to how human minds work. Chomsky (2007a) lists these principles: (a) some properties of the brain; (b) general learning principles; (c) information on word frequencies (see also Yang & Roeper, 2011). There is no reason to believe that principles (b) and (c) are age related; that is, that they are not available in adulthood. For instance, there is evidence that adults, like infants, remain highly sensitive to word frequency.
If the labor of language acquisition is divided, the discontinuity hypothesis proposes that the process of acquisition displays a point of rupture between a general-domain way of learning (SL) and a more language-specific way of learning (GL). Again, the main research question can no longer be whether adult and child L2 learners can attain the same proficiency level, but whether, in cases where children and adults appear to have achieved the same effect, they are using the same neural and cognitive resources. Therefore it will be claimed that discontinuity is differently age related not because adults have no access to UG in general, but because: (a) there could be aspects of the TL that adults are unlikely to learn and other aspects that are learnable; (b) the different degree of learnability of language items could have some neural correlates. The field of research about the age-related effects on L2 acquisition is defined by the intersection between linguistic theory – which explains why some items are
more learnable than others – and the neurocognition of second language, which tells us why different and age-related ways of learning could be differently grounded in the brain. These are some reasons why the fundamental difference hypothesis debate could be updated and seen under the lens of L2 neurocognition and of the discontinuity hypothesis.

3.3 Learning by Patches is Typical of Adult SLA

The idea that adult learners can also be good learners of a second language is not excluded even by the most radical tenets of the critical period hypothesis. For instance, Lenneberg (1967: 176) wrote that: ‘Most individuals of average intelligence are able to learn a second language after the beginning of their second decade, although the incidence of language-learning-blocks rapidly increases after puberty.’ In the updated version of the fundamental difference hypothesis, which is meant to be integrated in the minimalist program, Bley-Vroman (2009) does not retreat from his original idea that only child L1 and L2 acquisition stem from UG. But he also proposes that, although adults no longer have access to the core grammar of the second language, they can still rely on a cognitive device that mimics the original outcomes of a dedicated language acquisition system which is no longer available. This cognitive device resorts to ‘patches’. Patches are linguistic expressions which superficially resemble the products of a grammar-driven processor of a native language, but are instead the products of simplified, ‘shallow’ rule-driven processing. Therefore adult learners’ performance in the second language is modulated not only by UG, but also by cognitive (not language-specific) mechanisms such as word frequency, salience and noticing effects (Kweon & Bley-Vroman, 2011: 225). Bley-Vroman’s patches are comparable to the chunks and item-based constructions we discussed in Chapter 2.

Researchers who study formulaic language agree that, regardless of the final attainment state, the aging brain may modulate the way people learn a second language. For instance, Wray (2008) claims that children can learn a second language in a formulaic way, while most adults learn the second language ‘intellectually’, by explicitly drawing their attention to forms and to form-mapping variations.
Arnon and Ramscar (2012) examined the hypothesis that the order in which the items of the second language are presented in a syllabus affects what gets learned by late bilinguals. The idea is that exposure to smaller units (such as simple nouns in isolation) before exposure to larger units (e.g. nouns in the context

of sentences) can impair the acquisition of within-item relationships (e.g. the article + noun sequence). This could mean that the intellectual (or analytic) experience that adults bring into language learning (which is typical of language courses) is a hindrance when it comes to learning second languages more naturally. According to Lenet et al. (2011), older adults benefit from language instruction that encourages incidental, implicit learning, where feedback is given without explicit grammar rules. To sum up: the idea is becoming increasingly popular in SLA research that there might exist particular ways to learn a second language depending on a learner’s age. We have seen that, according to some (e.g. Wray, 2008), learning by blocks and patches is peculiar to children and is precluded for adults. According to others, exactly the opposite holds. According to the discontinuity hypothesis outlined in this book, learning by patches is precisely the way adult learners pave their way to learning some difficult features of the TL grammar once their brains have reorganized the neural resources at their disposal.

3.4 Discontinuity in Brain Maturation and Brain Adaptation

The core idea of this chapter is that L2 acquisition in adulthood tries to mimic and recapitulate the sequenced mechanisms of early language acquisition.¹ Both early and late acquisitions are in fact discontinuous: in both cases, SL not only precedes, but possibly also paves the way to GL. The difference is that, while discontinuous early acquisition is successful, discontinuous late acquisition is only partially so. Despite an adult’s brain being highly adaptive, it seems that it cannot recreate the exact conditions in which early language acquisition took place. The reason is that early language acquisition accompanies and takes advantage of the neuroanatomical and functional maturation of the child’s brain. A brain’s maturation (as it is defined below) occurs once in a lifetime and offers some acquisitional opportunities just once. These opportunities are due to the way the brain takes on its definitive configuration and establishes its functions. In the following paragraphs, the neurobiological evidence about the different features of a child’s brain maturation and an adult’s brain adaptation is presented and briefly commented on. In Chapter 4 we will discuss exactly how and in which parts of the TL early and late language acquisition may differ. We will introduce the idea that what late language learners cannot catch up with any more is a subset of L2 grammar and L2 syntax which lies beyond the scope of an SL. The first difference between the processes of maturation and adaptation

has to do with the definition of the ‘sensitive’ and ‘critical’ periods for language acquisition.

3.5 The Difference Between a ‘Sensitive’ and a ‘Critical’ Period for Language Acquisition

The biological differences underlying language learning by children and adults were first addressed in the late 1950s by the proponents of the critical period hypothesis, namely by the neurologists Wilder Penfield and Lamar Roberts. The core tenet of the stronger version of the critical period hypothesis, which only indirectly accounts for L1–L2 acquisition differences, is rooted in human biology. Being a single case of a more general biologically regulated phenomenon, the acquisition of any language from mere exposure would be fully available to a young child and would be limited in older adolescents and young adults. The traditional biological account maintains that language-learning capacity declines along with brain maturation: ‘for the purposes of learning languages, the human brain becomes progressively stiff and rigid after the age of nine’ (Penfield & Roberts, 1959: 255). Most research traditionally concentrated on why stiffness/rigidity is so age related and what age effects mean in neuroanatomical and functional terms. As to the first issue, Knudsen (2004) distinguishes between ‘sensitive’ and ‘critical’ periods in neuroanatomical and neurofunctional terms. The term ‘sensitive period’ (used by Susan Oyama in 1976 to define an age-related limitation on the learning of L2 phonology) applies to some environmental effects of experience. These effects are particularly strong during a limited period in the development of brain circuits, but they do not disappear after this period is over. By contrast, a ‘critical’ period is strictly defined by a terminus (endpoint) after which the environmental stimulus has no further effect on the connectivity patterns of neural circuits and the changes in brain functions are irreversible. Therefore, the distinction suggests that a critical period is just a subset of a sensitive period.
In other words, the environmental experience (onset, length and degree of the exposure to the TL input) consolidated early in life may have two different effects: (a) it may lead to irreversible changes in neural circuits, or (b) it may bring about always-reversible modifications of the architecture of some neural circuits, leading to a stabilization of connectivity patterns which are energetically preferred given the current cognitive demand. The architectural changes underlying this time window involve at least three kinds of biological phenomena: (a) the elaboration of a new axonal

projection field which establishes new connections among neurons; (b) the loss of dendritic spines due to the elimination of unused synapses (Section 3.6); (c) synaptic consolidation by an increased number of pre-synaptic vesicles and/or neurotransmitter receptors (Knudsen, 2004: 1415). These modifications as a whole, which occur when the brain’s development meets the necessary external stimuli at the due time, are referred to as brain plasticity. The crucial point is that some of these modifications are reversible, while others are not. If these modifications are irreversible (cannot occur further after the critical period even if the stimulus continues), the brain can be said to suffer from a loss of plasticity. A clinical example of loss of plasticity occurring after the end of a critical period is the age-related decreasing equipotentiality of the brain hemispheres recovering after a focal brain injury. Instead, the typical example of plasticity occurring after the end of a critical period is the capacity to reorganize the neural substrates for novel tasks, such as learning a new language in adulthood. In the former case, the neural plasticity of children (before the critical period is over) should come into play when injuries to the brain cause some components of the language to be shifted to another part of the brain (often in the opposite hemisphere). Most research points out that this capacity for plasticity is lost after a brain has reached complete maturation. On the other hand, recent research based on structural morphometric measures provides evidence that the human brain even after puberty may achieve functional and structural reorganization – in terms of adult neurogenesis, angiogenesis and cell proliferation – after central and peripheral nervous system damage (Draganski & May, 2008: 139; May, 2011). In this latter case, neural plasticity is regarded as being a normal feature of the developing intact brain. 
The human brain seems in fact to have a prolonged postnatal development that displays considerable regional variability in time course (Stevens & Neville, 2009). This developmental plasticity is the prerequisite for some brain functions to change their structural-anatomical support over the lifespan as a consequence of maturation and experience. Therefore, plasticity is not only a characteristic of the maturing brain before the critical period, but it also characterizes the capacity of the adapting brain to cope with environmental changes. This capacity does not disappear in adulthood (Hiscock & Kinsbourne, 2008: 253–254; see also Birdsong, 2006a; Osterhout et al., 2004). By keeping apart the distinction on the one hand between critical and sensitive periods and on the other hand between the two kinds of plasticity (maximal and lifelong), it has been argued that some higher level cognitive functions and related brain systems display a period of maximal plasticity, whereas other functions display the same degree of plasticity throughout life

(Neville & Sur, 2009: 89). The issue here is to establish whether the acquisition of a second language in adulthood – or the acquisition of specific parts of the second language, at least – must take place within a period of maximal plasticity or whether it can rely on a lifelong plasticity instead. Plasticity of the former kind has been linked mostly to the phenomenon of brain maturation from birth to pre-puberty, while plasticity of the latter kind has been mostly linked to the degree of an adult’s brain adaptation over a whole lifespan. In the remaining part of this chapter, maturation and adaptation will always be kept apart. Therefore the initial question about whether and why loss of plasticity is age related becomes a more specific question on human brain anatomy and functioning. Can adult SLA still take place despite the fact that brain maturation has come to an end? Is an adult’s brain capable of adaptation in order to acquire additional languages? We have seen that there are some modifications which typically cannot occur after a brain has reached maturation. We ask ourselves whether these modifications are also crucial for SLA.

3.6 Discontinuity in the Maturing Brain

3.6.1 General features of left-hemisphere language function specialization

As we have seen, maximal plasticity occurring within a critical period (and not just in a sensitive period) is considered to be the typical maturational (time-locked) feature of the pre-pubertal brain. Maximal plasticity is accounted for in studies about lateralization of the language network. In the neurocognitive literature, the word ‘lateralization’ alternates with the word ‘specialization’, and the two terms are also used interchangeably in this book. Lateralization/specialization is held to be the process by which the left hemisphere becomes the dominant hemisphere for language functions. This process was traditionally considered to start at the age of two and to be completed before puberty (Lenneberg, 1967: 142). The reasons behind the language-related advantage for the left hemisphere are still unclear. Some recent functional imaging studies claim that the presence of a structural asymmetry (higher gray matter density) would predict functional asymmetry and consequent lateralization for language functions (e.g. Josse et al., 2011). Others (e.g. Corballis, 2009) maintain the position that cerebral asymmetry in humans and in animals is not attributable to environmental influences because it has a genetic basis. Recent neurofunctional and neuroanatomical studies would confirm that a bias for left hemisphere

language lateralization may be present very early in development, that is, before two years of age or even at birth. For instance, Kikuchi et al. (2011) investigated 78 preschool Japanese children aged 1–4 years with magnetoencephalography (MEG), by looking for temporal coordination of neural activity. When oscillation in theta-band activity (between 6 and 8 Hz) was coordinated between different regions in the left hemisphere parietotemporal cortex, these children scored better at vocabulary recognition tests. Kikuchi et al. (2011) conclude that higher language performance in infants and very young children is correlated with the coordination of brain activity of different regions in the left hemisphere. These and other similar results suggest that lateralization is not really gradual (from bilateral to left) and that linguistic competence actually is left lateralized from the start (see also Paradis, 2005: 413). According to other studies, on the other hand, specialization is not completely defined at birth, but is a function of a number of different phenomena which contribute to establishing, gradually and slowly, more efficient connections which replace earlier connections. For instance, in Ojemann et al.’s (2003) study, essential language areas of 26 children ranging from four to 16 years were electrically stimulated during surgery for epilepsy. A map of the brain sites whose stimulation resulted in arrest in speech or in impairing object-naming capacity was recorded. It was found that the number of these crucial language sites around the perisylvian cortices increased significantly with the subjects’ age. These results support theories of gradual, age-dependent intra-hemispheric organization of the left cortex for language function. A number of specific phenomena were found to be linked to a gradual brain specialization process. Among these phenomena are synaptic pruning, myelinization, dendrite migration and axon rewiring. 
For instance, axonal rewiring is the neural migration and dendrite projection from subcortical structures to the neocortex. After different axonal systems have differentiated from one another a few years after birth, cognitive functions based on axonal rewiring across a long distance cannot occur any more. If language acquisition crucially depends on axonal rewiring, then it cannot be replicated with similar outcomes in adulthood (Uylings, 2006).

3.6.2 Early discontinuity and the maturation of intra-hemispheric connectivity patterns

Some studies support the view that the maturation process accompanies the establishment of a ‘default language network’ from birth to childhood rather than the neuroanatomical development of well-defined ‘language areas’ segregated in the left hemisphere. In this perspective, maturation

means that some functional capabilities of language production and comprehension are gradually localized and concentrated within some major ‘hubs’ which, in turn, are interconnected with other hubs of the same hemisphere and of other brain regions in contralateral hemispheres as well (Porter et al., 2011: 1865). Two of these language-specific hubs are located in the frontal and in the temporal cortical regions in the left hemisphere. Maturation of language capacities could depend on the increase of a functional connectivity between the frontal and posterior temporal regions in the left hemisphere, which parallels a decrease in contralateral connectivities (Friederici et al., 2011). Early discontinuity, as opposed to a late discontinuity, depends on whether these connections are already effective or not in a child’s brain. Brauer et al. (2011), using a combination of diffusion tensor imaging (DTI) and functional magnetic resonance imaging (fMRI) techniques, targeted the anatomy and functioning of the arcuate fasciculus/superior longitudinal fasciculus, which serves as a pathway connecting Broca’s and Wernicke’s areas in children’s and adults’ left hemisphere. They found that the dorsal pathway of fiber tracts connecting the inferior frontal gyrus and the superior temporal gyrus/sulcus has not yet matured by seven years of age. This result indicates that children learning any language(s) would first rely on a more locally organized connectivity pattern between close regions of both hemispheres rather than on long-distance connections between the classical language areas (Broca’s and Wernicke’s areas) in the left hemisphere alone. The biologically determined bias for a short-distance, more efficient interhemispheric connectivity would determine the preference for the SL of the first language. SL would have primacy in early language acquisition. Only when efficient intra-hemispheric connectivity patterns are established would GL take the floor. Friederici et al. 
(2012) describe three behavioral milestones in the development of language functions in infants. These language functions closely depend on the degrees of brain maturation. At step 1 (1–6 months of age), SL by children can be observed in the speech input. This capacity is supported by auditory and association cortices located in the temporal cortex. The early-established connectivity patterns between the temporal cortex in both hemispheres ensure that children are sensitive to TP among words much earlier than they are to rules of grammar (in the sense better specified in Sections 5.12 and 5.13). At step 2 (2–3 years of age), the syntactic phrase structure of clauses and subordinated clauses can be mastered to a sufficient degree. This capacity could be supported by a network involving the inferior temporal gyrus and the temporal cortex connected by a ventrally located pathway. Finally, at step 3 (6–7 years of age), complex and embedded syntactic structures are processed because the dorsally located

connection between Broca’s area and the left posterior superior temporal gyrus and the left superior temporal sulcus is established. According to this view, child brain maturation determines the gradual passage from SL to the mastery of complex syntax (Friederici et al., 2012: 209). This passage is discontinuous and it is due to a brain’s maturation. Similarly to the phenomenon of axonal rewiring seen in the last section, once these functional connections among language areas in the left hemisphere are established, the change is irreversible and it can neither be modified nor replicated in adulthood. There are converging cues of a change occurring between the pre- and post-pubertal periods which could be linked to the difference between early and late discontinuity in language acquisition. The establishment and the strengthening of the language network seem to have detrimental consequences on learners’ capacity to extract regularities from the input to which very young and adult learners are exposed. Children before puberty were found to display more sensitivity to low-frequency regularities and subtle statistical cues in comparison to adults. If this is true, SL in infants is more powerful and accurate than in adult learners. In McNealy et al.’s (2010, 2011) fMRI studies, adult and child subjects are exposed to a speech stream in an artificial language. The fMRI shows that the signal increase in the temporal and inferior parietal cortices is progressively lateralized (from right to left) as a function of aging: adults display a left hemisphere bias (towards the temporal lobes, namely the superior temporal gyrus), while children display a bilateral left-hemisphere-right-hemisphere pattern of activation. Crucially, when listening to the random syllables stream with very small statistical regularities, only children before six exhibited significant signal increase (in the right hemisphere). 
Finally, the amount of increase was positively correlated with both children’s age and proficiency in a second language. It is suggested not only that age and proficiency determine the functional organization of regions engaged during language learning, but also that being exposed to more than one language during childhood may influence the ease with which another language is learned (McNealy et al., 2011: 1278).
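The statistical regularities that listeners are assumed to track in a continuous speech stream are usually quantified as transitional probabilities (TP) between adjacent syllables, that is, the probability that syllable b follows syllable a. The following minimal sketch is only an illustration and is not taken from the materials of the studies cited above: the three ‘words’, their syllables and the presentation order are invented for the example, and the function name is hypothetical. It shows why TP is high inside a word and lower across word boundaries.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return {(a, b): P(b | a)} for every adjacent syllable pair in the stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor (i.e. not the last one).
    first_counts = Counter(syllables[:-1])
    return {(a, b): n / first_counts[a] for (a, b), n in pair_counts.items()}


# Invented artificial language: three trisyllabic 'words' (an assumption).
words = {
    "A": ["tu", "pi", "ro"],
    "B": ["go", "la", "bu"],
    "C": ["da", "ko", "ti"],
}
order = "ABCACBABC"  # arbitrary presentation order, with no pauses between words
stream = [syllable for w in order for syllable in words[w]]

tp = transitional_probabilities(stream)

# Within-word transitions versus transitions that straddle a word boundary.
within = {pair for w in words.values() for pair in zip(w, w[1:])}
within_tp = [p for pair, p in tp.items() if pair in within]
boundary_tp = [p for pair, p in tp.items() if pair not in within]

print("lowest within-word TP:", min(within_tp))
print("highest boundary TP:", max(boundary_tp))
```

In this toy stream every within-word transition is fully predictable, whereas transitions across word boundaries are not, so a learner sensitive to TP could in principle segment the stream without any pauses or grammatical rules. Real experimental streams are much longer and may contain far subtler regularities than this example.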

3.6.3 Early discontinuity and other (ir)reversible features of brain maturation

There are also phenomena other than fiber-tract establishment that occur as a child’s brain matures. Some of these phenomena take place during the period of maximal plasticity and are neither reversible nor replicable in adulthood. Some others (e.g. changes in the anatomy of the corpus callosum, see below) seem to occur both during maximal and lifelong plasticity. As a

whole, all these features of a brain’s maturation are in theory potentially linkable to the phenomenon of early discontinuity discussed above, that is, to the passage between SL and GL. Research on interhemispheric connectivity in the maturing brain is also based on the study of the anatomy and of the functioning of the corpus callosum. The corpus callosum is a wide, subcortical fiber tract connecting the right and left hemispheres. The myelination of the corpus callosum (the process by which the neuron axons are wrapped with a sheath of glial cells which increases efficiency and signal speed across neurons) is held to be incomplete before the age of 6–7 years. On the one hand, insufficient myelination of axons is the reason why, according to Hellige (2008: 263), young children before this age would find it more difficult to perform a task that requires information to be shared across the hemispheres, while older learners can rely on efficient callosal function and on interhemispheric connectivity to acquire a language. On the other hand, these results are contradicted by other studies claiming a child’s advantage for interhemispheric connectivity due to early maturation (myelination of neural axons) of a specific part of the posterior end of the corpus callosum, the splenium (Dubois et al., 2008). Studies on the corpus callosum suggest that there are parts of the brain for which the distinction between brain maturation and brain adaptation is not crystal clear. In fact, modifications in the corpus callosum seem to begin early or even at birth and continue in adulthood without discontinuity. An fMRI study on the callosal thickness of 190 children and adolescents (aged 5–18) revealed a different (anterior-to-posterior) pattern in corpus callosum maturation and suggested that there might be alternating phases of callosal growth and shrinkage which mirror the fine-tuning of fibers connecting homologous cortical areas in both hemispheres (Luders et al., 2010). 
This fine-tuning process would also last after puberty because – unlike what happens for axon rewiring and fiber tract maturation – the gray matter increase of corpus callosum is environmentally modulated and may vary across a lifespan as a function of learning (not only language learning). Another phenomenon which is related to brain maturation is synaptic pruning. Synaptic pruning is the reduction of the number of neurons and synapses in order to leave only more efficient synaptic configurations (see Green et al., 2006). A hypothesis has been advanced that monolingualism correlates with more extensive synaptic pruning due to non-use of a second language (De Bot, 2006: 130). By observing the rate of synaptic pruning, one might say that some brain areas become specialized for language only after puberty (and after the critical period is over). For instance, cortical gray matter decrease in thickness and volume has been observed only starting from adolescence. This decrease is taken as a cue for increased neural

efficiency in language-related regions because the volume and thickness of gray matter decrease as a function of synaptic pruning. Porter et al. (2011) found that a decrease in cortical gray matter correlates with performance in a controlled oral word association test. In this test, subjects are asked to produce as many unique words as possible (all beginning with the same letter) while timed. A total of 167 subjects aged nine through 23 years were scanned with fMRI. Estimates of the cortical thickness were obtained by processing the high-resolution anatomical images. Improved verbal performance correlated with decreased thickness in the superior temporal gyrus and middle temporal gyrus (in the left temporal lobe) and the inferior frontal gyrus, but also in the temporal-parietal junction. Moreover, the relationship between cortical thickness and performance on the controlled oral word association test was found to be stronger in frontal regions for the older compared to younger subjects. The authors concluded that cortical thickness in crucial language networks begins maturation toward an adult-like processing efficiency by middle childhood (around the age of nine years). To conclude: while there is still a lot of discussion and criticism about the existence of cut-off points and about the age at which the effects of maturation must be placed, there is some consensus that at least some of these effects on the brain are anatomically and functionally distinguishable from the effects of a lifelong brain adaptation. The distinction between the effects of brain maturation and the possibilities offered to adult language learners by brain adaptation is crucial for the discontinuity hypothesis. We have seen that the process of establishment of a network for language accompanies a child’s brain maturation and eventually comes to an end some years before puberty (6–7 years of age) but also, for some phenomena at least, around nine years of age or even later. 
Whether or not the onset of the maturation process can be anticipated to be one year or even a few months after birth, it seems reasonable to think that the establishment of a specialized network for the primary language(s) causes the reduction of a learner’s potential for acquiring additional languages at a later age, when the process is over. This would happen because the process serves to accommodate language functions into the broader cognitive system and especially to interconnect the language networks with the memory systems, with executive and attention control systems and with the motor-sensory areas. Once synaptic pruning, myelinization, dendrite migration and long-distance axonal rewiring have occurred, no-one can return the system to its ‘state of innocence’ and start the process of language acquisition again from scratch. It is questionable whether, after these neuroanatomical and functional changes, the ‘window of opportunity’ for additional languages to be learned subsequently in life is definitely shut or if it remains half-open.

3.7 Lifelong Effects of the Early Acquisition of Additional Languages

In the last section, I introduced the idea that neuroanatomical and functional changes occurring exclusively during brain maturation from birth to childhood may make the difference between early and late language acquisition (see note 1 in this chapter for the uses of the expressions ‘early’ and ‘late’ language acquisition). Actually, acquiring just one or many languages during childhood has different consequences for the maturing brain. In this section, we see how early SLA may have lifelong effects on successive structural and functional brain development and on the development of cognitive functions. Successful early bilingualism provides the classical example of a phenomenon that must necessarily occur within the period of maximal plasticity. The hypothesis that the experience of early bilingualism influences children’s cognitive development in the brain’s maturation process was advanced some years ago. Nowadays it seems evident that early bilingualism brings both cognitive advantages and linguistic disadvantages. The latter would show in the slower retrieval of words in the domain of lexical competence and at the syntax-discourse interface (for a review, see Sorace, 2011b). The cognitive advantage, on the other hand, would definitively change a child’s brain organization and would persist over the lifespan. It is known that bilingual children at the age of 4–8 years outperform their monolingual peers in executive and inhibitory control tasks, such as drawing a continuous path in sequential order through numbers arranged randomly on a page (Bialystok, 2010; Martin-Rhee & Bialystok, 2008). The management of two languages from early childhood would also cause bilingual young adults to show and keep a processing advantage in tasks that require attentional control to ignore or inhibit misleading cues. 
Early bilinguals, in fact, have had massive practice in exercising inhibitory control in keeping the two languages separate since birth. When solving non-linguistic cognitive problems, early bilinguals continue to rely on already well-trained neural resources (left prefrontal cortex and anterior cingulate cortex), and also in adulthood (Bialystok et al., 2005). Bialystok et al. (2004: 297–301) concluded that the age-related processing decline related to a specific set of cognitive tasks is more severe for the monolinguals than for comparable bilinguals and that lifelong bilingualism provides a defense against this decline. Luk et al. (2011) tested whether there is a relationship between the onset age of bilingualism (seen as the degree of experience in handling two languages) and enhancement in cognitive control. The study does not take into consideration only the age at which the second language is learned, but

the age at which both languages are used on a daily basis. To access these data, a detailed language history questionnaire for each participant was taken into account. Early bilinguals reported that they started active bilingualism before the age of 10, and late bilinguals reported that their onset age of active bilingualism was after the age of 10. Thus an earlier onset indicates a longer history in using two languages. Participants were then categorized as monolinguals or early and late bilinguals. The cut-off age for categorizing early or late bilingualism was taken to be 10 years old. Participants were asked to indicate the direction of a red chevron on a computer monitor. The chevron was preceded and followed by other black chevrons pointing in the same (congruent condition) or in the opposite (incongruent condition) direction. Reaction times were calculated. Results show that early bilinguals show less interference than late bilinguals and monolinguals when indicating the direction of the red chevron. For better evaluation of the impact of these studies it would be crucial to separate the length of exposure to active bilingualism from age of onset of acquisition. Unfortunately, these two factors are not disentangled in the study as the operationalized factor was ‘age of active bilingualism’. Luk et al. (2011: 594) conclude that early and persistent bilingual experience is related to increased cognitive efficiency and higher L2 proficiency. By using a combination of fMRI and VBM techniques, Abutalebi et al. (2011) found that adult bilinguals especially use the dorsal anterior cingulate cortex (ACC) – a structure connected to domain-general executive control functions – more efficiently (with less effort) than monolinguals when facing conflicts and choices. Behavioral measures also correlated positively with an increase of gray matter volume in the ACC. Abutalebi et al. 
(2011) suggest that the early learning and lifelong practice of two languages impacts upon human neocortical development and that the bilingual brain adapts better to resolving cognitive conflicts in domain-general cognitive tasks (for disconfirmatory findings, see Kousaie & Phillips, 2012). The capacity of highly proficient bilinguals to switch from one language to the other seems to be associated not only with the ACC, but also with a complex neural network which comprises the dorsolateral prefrontal cortex and the supramarginal gyrus. Gray matter volume increase was also found in the left caudate nucleus of 27 adult, spoken and sign-language bimodal bilinguals as compared to 13 Chinese spoken and sign-language bimodal monolinguals (Zou et al., 2011). Results of functional imaging and anatomical (voxel-based morphometry) analyses indicate that structural plasticity in the head of the left caudate nucleus occurs in response to the demand of speaking two languages. These results support the claim that early bilingualism shapes both the anatomy and the functioning of a whole network of structures which supervise executive cognitive control.

3.8 Discontinuity and an Adult’s Brain Adaptation

Unlike in the process of brain maturation, in terms of plasticity (i.e. the brain’s lifelong adaptation) it seems that there is no critical period or decline in the capacity for experience-based learning by adults ‘because of the plasticity potential of the underlying brain circuitry elements necessary for these abilities’ (Uylings, 2006: 60). In fact, learning from experience does not seem to require long-distance axon rewiring (which is age-sensitive), but local dendritic and synaptic alterations with thickness increase or decrease of gray matter. These modifications and this kind of plasticity begin in childhood and do not seem to stop in adulthood (Uylings, 2006: 73). Rodríguez-Fornells et al. (2009: 3727) maintain that new discoveries about neurogenesis and epigenetics in adults might confirm some claims about the brain’s plasticity and also about the neural changes that result from language learning. For instance, it seems that acquiring a second language in adulthood, which is allowed by brain plasticity, may also confer protection against the onset of Alzheimer’s disease, significantly contributing to ‘cognitive reserve’ (Craik et al., 2010). Cognitive reserve is an active process of the brain attempting to cope with or to compensate for pathology (for a definition, see Stern, 2002). Bilinguals who learned and used the language in adulthood can also be said to have more enhanced attention and cognitive control than monolinguals. These factors would delay the onset of Alzheimer’s disease by five years on average. Other structural and functional changes occurring in an adult’s brain may be due to the direct environmental effects of the experience of learning another language, independent of age of onset of acquisition. These effects have well-known anatomical correlates. Coggins et al.
(2004) examined midsagittal corpus callosum variability in 12 bilinguals compared to seven monolingual individuals by using magnetic resonance imaging (MRI). Significant differences were found in the size of the anterior midbody of the corpus callosum, which was larger in bilinguals. This was attributed to either an increase in axon number or to an increase in myelination. Since myelination and the number of axons do not necessarily correlate (Uylings, 2006), Coggins et al. (2004) explain these results in two ways: (a) when adult individuals learn another language, the processing demands of multiple language organization in the cortex require an increase in communication and higher signal-conduction speed from symmetric cortical regions through the corpus callosum; (b) if the increase in axon numbers is the cause of corpus callosum enlargement, then the switching device which keeps one language from interfering with another may be enhanced (Coggins et al., 2004: 72). Mechelli et al. (2004) used voxel-based morphometry (a technique which distinguishes

the sizes of gray and white matter by spatial normalization of high-resolution MRI of the brain; see Mechelli et al., 2005) to assess whether and how environmental demands determine changes in white and gray matter in adults’ brains. The hypothesis is that the structure of the brain alters when one faces novel tasks, such as learning a second language, and that the degree of this structural reorganization depends on both the age of onset of acquisition and proficiency. To assess this, Mechelli et al. (2004) compared the gray and white matter of 25 monolinguals, 25 early (5 years solved the problem Bill cannot be interpreted as the subject of solve (Berwick, 2011) because there is an abstract, invisible space < i > between the words thinks and solved that is already occupied in a speaker’s mental representation. It has been proposed (see Belletti & Rizzi, in press) that certain relations between adjacent items cannot hold across an intervening word of a certain kind. The critical notion here is ‘intervention’: children cannot compute a local relation across an intervener which is structurally similar to the target (the landing site) of the displacement relation. The difficulty related to the acquisition of the constraints on movement is visible in L1 and L2 acquisition. For instance,

children until after the age of five are unable to properly comprehend and produce object relatives (while subject relatives are understood and produced as soon as the relative construction is mastered around the age of three). Friedmann et al. (2009) explain that the acquisitional difficulty arises from the intervention of the subject (e.g. the elephant) in the chain connecting the head of the relative (the lion) with the object gap < i >. The object gap is the place left empty after the constituent NP the lion (object of the verb wets) has been fronted to merge with the relative clause as in Sentence (4):

(4) The lion_i that the elephant wets < i >

In elicitation experiments where an object relative such as Sentence (5) was targeted, children often opted for the production of a corresponding subject relative with a passive, like Sentence (6):

(5) I would rather be the child that the mother covers
(6) I would rather be the child that is covered by the mother

Adult L2 learners too – like the L1 children – seem to find object relative clauses more difficult to process than subject relative clauses. For instance, Havik et al. (2009), in a self-paced reading experiment, found that L1 German, L2 Dutch learners exhibit greater reading difficulties with Sentence (7) than with Sentence (8) (the English translation of the Dutch is provided):

(7) There is the train driver_i who the conductors have freed < i > from the burning train-carriage
(8) There is the train driver_i who < i > has freed the conductors from the burning train-carriage

One explanation of this well-known fact is that L2 learners experience a general difficulty in constructing hierarchical syntactic representations which include the gaps < i > originated by fronted NPs (such as the train driver). These structural representations should take into account the relationship between the moved element and its invisible copies, or the filler-gap dependencies.
If these representations drive sentence processing, a sentence like (7) could take longer and require more processing resources to compute than Sentence (8) for two reasons, which are not incompatible: (i) the greater distance between the copy and the gap and (ii) the adjacent intervener (the conductors), which potentially misleads a learner’s interpretation towards a resource-saving, local (or ‘good-enough’, Section 4.9) interpretation under which the subject is the closest NP (see

also Felser & Roberts, 2007; Jackson & Bobb, 2009; Rah & Adone, 2010). Marinis et al. (2005) found that the presence of more than one gap in a sentence facilitates wh- dependency resolution by native speakers, but not by L2 learners. For instance, reading times in a self-paced reading experiment revealed that native speakers found it easier to link the pronoun (who) to its proper gap < i > when there is another suitable (structural) intermediate gap < i >2, as in Sentence (9), than in Sentence (10), where there is no such additional gap:

(9) The manager who_i the consultant claimed < i >2 that the new proposal had pleased < i > will fire five workers tomorrow
(10) The manager who_i the consultant’s claim about the new proposal had pleased < i > will fire five workers tomorrow

The authors concluded that adult L2 learners cannot make use of abstract syntactic representations which contain all kinds of filler-gap dependencies, intermediate ones included (the so-called successive cyclic wh- movements). This same study was replicated with different kinds of very advanced L1 Greek learners of English by Pliatsikas and Marinis (2012), where the effect of naturalistic exposure (and not only of formal instruction) on the processing of wh- dependencies was also factored in. Twenty-six advanced Greek-English L2 learners of English with an average of nine years of naturalistic exposure, 30 with classroom exposure, and 30 native speakers of English completed a self-paced reading task with sentences involving intermediate gaps such as those in Sentence (9). Only L2 learners with naturalistic exposure showed evidence of native-like processing of the intermediate gaps.
In more detail, reading and comprehending the segment of the sentence immediately after the intermediate gap was easier (faster) for adult learners who had spent many years (>9) abroad compared to learners in the classroom setting, suggesting that linguistic immersion can lead to native-like abstract syntactic processing in the L2. Overall, these findings do not support the idea that the distance between concordant elements is a delaying factor for acquisition and would instead stress the importance of the structural factor (see also Aldwayan et al., 2010). It seems also that advanced adult L2 learners cannot see structural gaps in a native-like way, but linguistic immersion can bring about native-like processing. Did the learners in Pliatsikas and Marinis (2012) succeed because linguistic immersion gave them the possibility (time and density of instances) to perform statistical computation of empty categories such as wh- dependencies? Although the authors of the research did not indicate the precise source of the L1–L2 convergence in the processing of empty categories, they clearly

excluded the idea that frequency can be considered a factor. Constructions involving gaps and successive cyclic movement are in fact extremely rare in the colloquial input to which learners were exposed. Moreover, the verbs used in the experiment (claim, argue, prove, suggest, conclude, decide) were purposely selected for having a sentential complement bias rather than a direct object bias in native English, the latter being a far more frequent construction than the former, in general. If we exclude frequency as a possible factor for learners’ performance, it remains unclear how ‘linguistic immersion’ characterizes itself in experimental terms, especially when it is also said to differ from proficiency in general (Pliatsikas & Marinis, 2012: 13). Maybe if a learner’s L2 becomes dominant, L2 processing becomes undistinguishable from L1 processing (Birdsong, 2006b: 47). Maybe the facilitation effect on reaction times at intermediate gaps of cyclic wh- extraction in object relatives is not as conclusive as these authors suppose it to be in order to establish that L1 and L2 processing become similar over time. For sure, it is not enough to conclude that late learners can generalize over invisible entities such as gaps, nodes and constituent borders. Kim and Goodall (2011) offer an example of how age of arrival in the new language environment differently affects learners’ capacity to represent and to process island constraints and that-trace effect. This capacity presupposes a twofold ability: a representational and a processing one. Firstly, learners are credited with representing and computing the syntactic structures of the sentence, that is, they must know whether certain positions at node boundaries can be filled with certain elements or must remain empty. Secondly, learners must overcome some performance-related difficulties that rise above a threshold to create the perception of unacceptability of sentences. 
According to Hofmeister and Sag (2010), the acceptability of sentences with violations of island constraints depends on processing costs and not on learners’ representational capacity. Kim and Goodall (2011) found that native speakers of English and native speakers of Korean who had immigrated to the US at various ages did not differ in rejecting sentences that violate wh- island constraints like Sentence (11):

(11) *Who do you wonder whether Ann saw?

By contrast, the acceptability of sentences with subject and object extraction across the complementizer that clearly sets apart native speakers of English and non-native learners, with the latter giving more positive ratings to ungrammatical sentences like Sentence (12):

(12) *Who do you think that will see Mary?

Kim and Goodall (2011) advanced the hypothesis that processing island constraints overwhelms both native and non-native capacities, while sensitivity to the that-trace effect is purely grammatical and, as such, is subject to age-related effects and to the availability of relevant examples in the input at the proper window of opportunity. These authors found that early arrivals are far more comparable to native speakers in that they at least reject subject extraction across that, while late arrivals accept both object and subject extraction. One explanation for the fact that the acquisition of sensitivity to the that-trace effect could be age-related follows from the consideration that late arrivals could have had less opportunity than early arrivals to be exposed to the relevant input at the proper age. Since such constructions are rare in the native input, this very fact could have been decisive for acquisition.

6.6 There are Parts of the Second Language that Cannot Be Learned like a Song

Pliatsikas and Marinis (2012) is noteworthy because it represents a clear attempt to demonstrate that some very abstract properties of the TL can be learned (represented and processed) in a native-like manner. What seems to determine success in acquisition in Pliatsikas and Marinis’ (2012) study is ‘linguistic immersion’. Unfortunately, this notion is not specified further or broken down into factors. There are, on the other hand, many more studies which support a very different idea: late learners cannot learn some parts of the TL grammar. This is claimed to happen not because those learners were not exposed to a sufficient quantity of TL input, but because of the existence of age-related representational or processing deficits. Some parts of the L2 grammar cannot be represented and processed by adult learners after the critical period is over. The reason is that here the switch between SL and GL cannot occur. This does not mean that there are parts of the second language that cannot be acquired by adult learners. This means only that, for this purpose, adult learners cannot recruit and rely on well-practiced, well-experienced and fully available cognitive mechanisms. They cannot rely on discontinuity and cannot replicate and mimic the devices that worked so well in early language acquisition. Similarity metrics, chunking, type frequency, Zipfian laws, TP and attentional biases are parts of SL which we mentioned and discussed in the last chapter. These devices together are those that enable us to learn implicitly and perfectly by heart – after only a few repetitions – the melody and lyrics of the songs we listen to in the morning when driving to work. These cognitive capacities even enable us to remember the order of our favorite songs in a compilation to the extent that, when a

song ends, we can predict which one comes next and also guess what key the tune is in. Unfortunately these cognitive mechanisms do not help with the acquisition of a subset of linguistic phenomena we have dealt with in the last paragraphs: empty categories and constituents’ movement due to internal merge. In the next sections we will also deal with phenomena at the language interfaces because they too are good candidates for resisting and being completely impermeable to statistical predictions and to SL. Before addressing phenomena at the interfaces (Section 6.8), it is important to stress that (a) the dichotomy between adjacency and non-adjacency does not characterize non-combinatorial grammar (Section 6.6), and (b) non-combinatorial grammar is only about structural distance, not about linear distance.

6.7 Non-combinatorial Grammar and Non-adjacency of Items in a Sentence

In Section 6.3 we focused on just two examples of non-combinatorial grammar: null pronominal subjects and filler-gap dependencies after constituents’ movement. One important part of non-combinatorial grammar (not all non-combinatorial grammar) is in fact characterized by internal merge, which is an instance of the general phenomenon of displacement. Displacement is not simply an alteration or disruption of linear word order. Displacement is when an item pronounced/read in a certain place in a sentence is interpreted as in another (e.g. Chomsky, 2000a, 2007a, 2013). As a consequence, the mental structural representation that the speaker/learner must construct in real time must take into account the relationship between the moved element and its invisible copy. We have said that (a) statistics over word distribution cannot help this kind of computation and (b) the notion itself of ‘frequency of empty categories’ is nonsense. In order to be ‘non-combinatorial’, the grammar must involve displacement, not just disruption/alteration of linear word order. In Chapter 5 we saw that the fact that two connected items are not adjacent is not a problem for SL because non-local adjacencies can be learned discontinuously. Even though non-adjacency and displacement are different phenomena, sometimes they are treated on a par. To take an example, the grammatical phenomena listed in Bunt (1996) are gathered under the label of ‘discontinuous constituency’. Discontinuous constituents are those made up of words which do not stand next to one another. For instance, a very short discontinuous VP is ‘Mary woke me up at seven-thirty’, where the pronoun intervenes between the verb and the preposition. To give another example, in a sentence like ‘this is a better movie than I expected’ the word movie separates two parts of the adjectival phrase. Since

linear word order may also be altered by structural principles of a different nature, Bunt (1996) lists other phenomena under the same label of discontinuous constituency. These further phenomena are: (1) relative clause extraction; (2) extraposed sentences (‘knights appeared who were riding white horses’); and (3) scrambling (in German languages, Russian, Ukrainian, etc.). We argue that non-adjacency in phrasal verb components (‘wake me up’) and displacement in object relative cannot be compared, neither from a theoretical nor from a developmental point of view. Non-adjacency in phrasal verbs can be learned statistically. Non-adjacency in (1)–(3) has a different nature because the NP has been displaced and has a relationship with an invisible item. Alexiadou et al. (2013) make another list of grammatical phenomena under the label ‘non-local dependencies in syntax’. These dependencies can involve two positions in a syntactic structure whose correspondence is not dictated by predicate/argument structure. Relevant phenomena explored in Alexiadou et al. (2013) include long-distance movement, long-distance reflexivization, long-distance agreement, control, non-local deletion, long-distance case assignment, consecutio temporum, extended scope of negation, and semantic binding of pronouns. The list is very heterogeneous. Here again, non-local dependencies and non-combinatorial grammar are put under the same label. From the discontinuity hypothesis perspective, the fact that many words intervene between two linked items does not hinder SL in doing its job. Instead, the NP-fronting in a sentence like (4) (‘the lion that the elephant wets’) poses problems (to children, aphasics and adult second language learners) which go well beyond linear distance. These problems are common to all non-combinatorial grammar phenomena. In conclusion, non-combinatorial grammar is a small subset of those more numerous phenomena which involve word-order disruption. 
But word-order disruption and linear distance between connected items are not the keys to defining non-combinatorial grammar. In the lists provided by Bunt (1996) and Alexiadou et al. (2013), we can isolate just four phenomena and label them as non-combinatorial grammar: (a) passive sentences; (b) relative clauses; (c) anaphora binding; and (d) wh- extraction. In the next section we will discuss structures involving movement and non-local dependencies such as filler-gap relations and constituent extractions.
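
The contrast drawn in this section can be made concrete with a small sketch. It assumes that TP stands for transitional probability, as in Chapter 5, and estimates a non-adjacent TP for the phrasal verb frame ‘woke _ up’ over a toy corpus. The corpus and the function name nonadjacent_tp are invented for illustration and do not come from any of the studies cited here.

```python
# Toy corpus invented for illustration; it is not data from any cited study.
CORPUS = [
    "mary woke me up at seven",
    "john woke her up early",
    "they woke him up again",
    "mary woke the baby gently",
    "she picked me up at noon",
]


def nonadjacent_tp(sentences, first, second, gap=1):
    """Estimate P(second | first) with exactly `gap` words in between.

    Hypothetical helper: it counts how often `first` is followed, gap + 1
    positions later, by `second`, relative to all occurrences of `first`
    that have a word at that position.
    """
    frame_count = 0  # occurrences of `first` with a word gap + 1 slots later
    pair_count = 0   # occurrences where that later word is `second`
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            j = i + gap + 1
            if word == first and j < len(words):
                frame_count += 1
                if words[j] == second:
                    pair_count += 1
    return pair_count / frame_count if frame_count else 0.0


# Non-adjacent dependency between overt words: countable from the input.
tp_woke_up = nonadjacent_tp(CORPUS, "woke", "up")

# The gap in Sentence (4) has no surface token, so there is nothing to count.
tp_gap = nonadjacent_tp(["the lion that the elephant wets"], "wets", "<gap>", gap=0)
```

On this toy input the frame ‘woke _ up’ yields a usable estimate because both members of the dependency are pronounced. The object gap of Sentence (4) never appears in the word string, so any count over it is zero by construction, which is one way of restating the point that absences cannot be tallied.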

6.8 Non-combinatorial Grammar and Long-distance Dependencies: The Shallow Structure Hypothesis

The shallow structure hypothesis (SSH) makes predictions that seem to fit those made in the discontinuity hypothesis. The difference is that the

SSH does not identify in a precise manner the role of SL in discriminating between the grammar that can be learned (combinatorial grammar) and the grammar (categorization over absences) that is very unlikely to be learned by adult L2 learners. The SSH’s basic claim is that – while children’s processing mechanisms are the same as in mature adults and do not undergo developmental changes – adult L2ers who have learned their L2 after acquiring their native language have limited access to or ‘under-use’ L2 syntactic representations, namely, hierarchical phrase structure and empty categories (Clahsen & Felser, 2006a, 2006b, 2006c; Clahsen et al., 2010; Felser & Roberts, 2007; Marinis et al., 2005). Adult L2 learners do not have problems with computing grammatical representations ‘that lack complex hierarchical structure and abstract, configurationally determined elements such as movement traces’ (Clahsen & Felser, 2006c: 117). This means that adult L2 learners’ processing is complete and robust as far as it is restricted to local domains (argument structures, word composition and morphosyntactic agreement between adjacent constituents), but it turns out to be defective when it faces structures involving movement and non-local dependencies such as filler-gap relations and constituent extractions. Clahsen and Felser (2006a) claim that the crucial distinction involves local and non-local dependencies. The former concern adjacent words or constituents; gender agreement in the NP and subject-verb agreement are examples of local dependencies. Non-local dependencies arise, for example, in wh- questions such as Sentence (13):

(13) Which book did Mary think John believed the student had borrowed _?
where – in order for the sentence to be interpreted correctly – the fronted NP (which book) needs to be linked to its proper subcategorizer (the verb borrowed). In turn, in order to establish this link, a learner’s processing capacity must include the ability to cross two embedded sentences and disregard two potentially available gaps. This capacity could be impaired in late L2ers for four reasons: (a) limitations of the L2 grammar (impossibility of computing a-chains); (b) the limited role of L1 transfer in non-local dependencies; (c) cognitive resources limitations (having to identify words and sentences in a second language causes an additional drain on working memory resources); (d) maturational constraints (late learners rely on declarative rather than on procedural memory).

The SSH is supported by a number of findings coming from behavioral and online developmental studies. For instance, Felser and Roberts (2007) found that adult L2 learners fail to postulate syntactic gaps when processing filler-gap dependencies, unlike native speakers, who show position-specific shorter reaction times in a cross-modal picture priming experiment. In two reading comprehension experiments with the eye-tracking technique, Felser et al. (2012) wanted to test whether the processing of wh- dependencies is mediated by structural information to the same extent as in the native language. The non-native group in the experiment was comprised of very proficient German-speaking learners of L2 English. They were told to read sentences without and with wh- extraction, as in Sentence (14), where a relative clause [who . . .] formed an island that is expected to block dependency formation between the fronted NP (the magazine or the shampoo) and the verb read and to favor the placement of the correct gap after the verb bought:

(14) Everyone liked the magazine (shampoo) that the hairdresser [who read extensively and with such enormous enthusiasm] bought before going to the salon

Moreover, sentences were balanced between a plausible reading (NP = the magazine) and an implausible one (NP = the shampoo). Since sensitivity to extraction through islands depends on an interaction between plausibility and structural constraints, native speakers were expected to show the effects of implausibility (longer reading times) only in non-island environments, with sensitivity to island constraint violations (shown by a disruption in early reading measures) taking absolute precedence over the effects of semantic violations on reading. By contrast, the effects of plausibility in non-native speakers were not expected to be modulated by syntactic constraints.
The results showed that only native controls were sensitive to the syntactic constraints signaled by the presence of a relative clause boundary [who . . .] independently of plausibility effects. Non-native speakers showed instead greater sensitivity to plausibility violations. Felser et al. (2012) conclude that, while filler integration in L1 processing is mediated by subcategorization requirements and by constituent structure, the initial stages of filler integration in L2 processing are semantically driven. Some commentaries on the SSH are also significant about the issues raised by the discontinuity hypothesis. A first criticism arose about the SSH concerning the existence of native-like responses by adult L2ers in some ERP experiments. Evidence of L1–L2 electrophysiological convergence for grammatical violations would undermine the basic tenets of the SSH. For instance,

Frenck-Mestre (2006) points out that variability in anterior negativities and N400–P600 ERP responses concerns both native speakers and L2 learners and cannot be taken as a marker of differential processing. Clahsen and Felser (2006c: 119) reply that the native-like responses (e.g. in the form of a delayed P600 component) which were found in many studies are actually found only in some domains of the L2 grammar, ‘such as simple concatenative morphology’. Also, gender agreement in the NP – where L2ers showed native-like patterns, according to Steinhauer (2006: 122) – is ‘still a very local phenomenon’. Clahsen and Felser’s (2006c) reply is in line with our interpretation of fMRI and ERP studies presented in Chapter 4. L1–L2 convergences, when they occur, concern exclusively combinatorial grammar, that is, the concatenation of items which eventually end up being labeled (recognized by learners as being headed) and whose features turn out to be extracted from mere repetition of co-occurrence patterns in a flat sequence and projected in a vertical (hierarchical) structure thanks to the frequency effect (TP, variation-sets and context-frames). No electrophysiological or neuroimaging study so far has shown full L1–L2 convergence for wh- extraction, null subjects, object relatives and other phenomena involving the computations of invisible (mental) features such as traces, constituent boundaries and phenomena at the interfaces. A second criticism – exemplified by Juffs (2006) – is that the SSH is not sufficiently specified in grammatical terms. Juffs claims that it is not clear why L2 learners do not face difficulties with the formal features-checking mechanism involved in lexical integration while they would face difficulties with structural gaps involving [±wh] features. In substance, Juffs (legitimately) wonders why some uninterpretable features (case and agreement) are checked successfully by adult L2 learners, while other abstract features are not.
Also Gillon Dowens and Carreiras (2006: 50) ask themselves whether (given that Clahsen and Felser believe that L2 learners are not able to fully parse sentences because of developmental factors) they can account for why there is a critical period for sentence parsing and not for morphosyntactic processing. Finally, Libben (2006) notices that the differences that Clahsen and Felser pose between morphological and sentence processing are not intrinsic to these domains, but may be due to the choice of the materials used as stimuli. An indirect answer to these questions is provided by Clahsen and Felser (2006c). These authors posit that the procedural memory system is partly accessible to adult L2 learners only for the processing of local morphology. Local morphology is combinatorial grammar. This could explain why morphology is easier than complex syntax to be learned. Ullman (2006a: 100), on the contrary, claims that the declarative system also supplies the functions of a less available procedural memory system for a number of rules

of grammar. So it is not the procedural memory which is available for morphology, but the declarative memory that tentatively takes its role. Ullman also makes it clear that not all the rules of grammar can be equally learnable in declarative memory: complex forms that are shorter and more frequent are easier to learn than those involving long-distance dependencies. But it is just a matter of (developmental) time: the DP model – unlike the SSH – posits that experience with the L2 leads to proceduralization of grammar resulting in L1-like grammatical processing to any extent (syntax comprised). Therefore the DPM, unlike the SSH, does not differentiate between the degree of learnability (ultimate attainment) of morphology and of complex syntax. In fact Brovetto and Ullman (2001) and Birdsong and Flege (2001) found frequency effects for the acquisition of past tense morphology only in low-experienced L2 speakers but not in high-experienced L2 speakers and in native speakers. This suggests that the path from declarative to procedural memory concerns both morphology and syntax, despite the former being probably more likely to be learned faster than the latter. A third important point of controversy about the SSH is to ascertain whether the SSH concerns only processing or also representational difficulties. Sorace (2006) emphasizes that the SSH has the merit to suggest that grammatical processing is not parasitic to grammatical representation. L2 learners may have incomplete/defective processing procedures while having the corresponding grammatical representations perfectly in place (Sorace, 2006: 88). Clahsen and Felser (2006c) actually suggest a different interpretation of the SSH. L2 processing is defective because the L2 grammar of adult learners is defective and cannot ‘feed’ the parser with the appropriate symbolic representations. 
Clahsen and Felser (2006c) assume that the parsing mechanisms are universal, do not need to be learnt and as such are also available to adult learners. Clahsen and Felser (2006c: 137) furthermore assume (with Townsend & Bever, 2001; Section 1.13.2) that the human language processing system has two different routes for comprehending sentences. These two routes – in both native speakers and L2 learners – work in parallel. Successful robust parsing depends on the availability of ‘sufficiently detailed, implicit grammatical knowledge’. The alternative processing route which is not driven by grammar is shallow parsing. Shallow parsing in computational terms is all about: (a) recognizing parts of speech; (b) segmenting the input into meaningful chunks; (c) concatenating (determining local relations). It should not go unnoticed that this definition of shallow parsing is roughly similar to that we gave to SL (Sections 5.1 and 5.3).
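
The three steps listed above can be sketched in a few lines of code. This is only a minimal illustration under strong assumptions: the mini-lexicon, the tag labels and the ‘nearest preceding NP is the subject’ heuristic are all hypothetical choices made for the example, not part of the SSH or of any parser used in the studies cited.

```python
# Hypothetical mini-lexicon; entries and tag labels are illustrative only.
LEXICON = {
    "the": "DET", "a": "DET",
    "lion": "N", "elephant": "N", "mother": "N", "child": "N",
    "wets": "V", "covers": "V",
    "that": "COMP",
}


def tag(words):
    """(a) Recognize parts of speech by simple dictionary lookup."""
    return [(w, LEXICON.get(w, "UNK")) for w in words]


def chunk(tagged):
    """(b) Segment the tagged input into flat chunks (no embedding)."""
    chunks = []
    i = 0
    while i < len(tagged):
        word, pos = tagged[i]
        next_is_noun = i + 1 < len(tagged) and tagged[i + 1][1] == "N"
        if pos == "DET" and next_is_noun:
            chunks.append(("NP", [word, tagged[i + 1][0]]))
            i += 2
        elif pos == "N":
            chunks.append(("NP", [word]))
            i += 1
        elif pos == "V":
            chunks.append(("VP", [word]))
            i += 1
        else:
            # Anything else becomes a one-word chunk labeled with its tag.
            chunks.append((pos, [word]))
            i += 1
    return chunks


def local_relations(chunks):
    """(c) Link each VP to the nearest preceding NP (purely linear heuristic)."""
    relations = []
    last_np = None
    for label, words in chunks:
        if label == "NP":
            last_np = " ".join(words)
        elif label == "VP" and last_np is not None:
            relations.append((last_np, " ".join(words)))
    return relations


def shallow_parse(sentence):
    """Run steps (a) to (c) on a whitespace-tokenized sentence."""
    return local_relations(chunk(tag(sentence.lower().split())))


relations = shallow_parse("The lion that the elephant wets")
```

Applied to Sentence (4), this sketch links the elephant to wets but returns nothing that relates the lion to the object position of wets. A procedure of this kind only sees overt, linearly local material, which is the sense in which shallow parsing leaves filler-gap dependencies unrepresented.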

In Townsend and Bever’s (2001) LAST model, it is assumed that native speakers ‘understand sentences twice’, because both processing routes are run simultaneously as words in the sentences are encountered. Shallow processing gives speakers a quick idea of the meaning of the sentence, while robust processing is called into action in cases of ambiguity and of competing cues. According to the SSH, L2 learners would also understand a sentence twice: at least in theory, they would have the two processing routes at their disposal as well. In practice, when the grammar is defective and cannot feed the parser, shallow processing/parsing becomes the obligatory choice. When defective grammar hinders robust parsing, adult L2ers resort to a shallow parsing route which is driven by lexical-semantic, pragmatic information, world knowledge and strong associative meaning or form patterns. In the discontinuity hypothesis, learners do not understand sentences twice as native speakers do (according to the LAST and the SSH perspectives). Rather, they may happen to learn sentences twice over time. At a certain point in the developmental path, L2 representations geminate: a statistical representation turns out to have a grammatical counterpart. This occurs when – after a long exposure to the TL input – statistics eventually feed into grammar (combinatorial grammar). Complex syntax is excluded by this developmental discontinuity because absences cannot be counted up. What the SSH could encompass in future extensions is the fact that complex syntax involving movement is not the only domain out of reach for shallow processing. Any computations concerning phenomena at the interface, features that cannot be interpreted and the mapping of semantics into functional morphology are at risk of overwhelming an adult learner’s processor capacity and a learner’s representational capacity. 
Clahsen and Felser (2006a) seem to acknowledge this point when they include anaphora resolution and binding in the realm of non-local dependencies. They in fact posit that the difference between the reflexive pronoun herself referring to Alice in Sentence (15) and to Jane in Sentence (16) is due to the hierarchical structural position of the two NPs (I have added the brackets to represent the binding domain of the anaphora):

(15) Jane believed [Alice to have over-exerted herself].
(16) Jane_i seemed to Alice [t_i to have over-exerted herself].

Actually, it is not only the structural position (domain) of Alice and Jane that determines which of the two can enter into a co-reference relationship with the reflexive. Of course L2 learners must know (and process) that linear distance is not what matters for the right interpretation of who has over-exerted herself in those sentences. But L2 learners must also know something

about co-reference itself. They must, for instance, know the difference between herself, her and Jane/Alice in terms of constraints on constituent binding. Moreover, they must be aware of the possibility – which might have been provided by previous knowledge of the subcategorization frame of the verb seem – that Jane might have been raised (to occupy the sentence spec IP position) from an embedded sentence subject position (so that Jane seemed to Alice is equal to it seemed to Alice that Jane . . .) where it leaves a trace. These kinds of computations are not only ‘structural’: they involve more than awareness of constituents’ boundaries, knowledge of islands and psychological sensitivity to displacement traces. They also engage an ability to bridge anaphoras, establishing analogies between different subcategorization frames of the same lexical entry and, in some cases, the capacity to map semantic values of the lexicon onto syntactic positions (as in the case of binding). The importance of the SSH can be fully acknowledged given the theoretical background also provided by the discontinuity hypothesis. The discontinuity hypothesis uses the notion that there are areas of human cognition that supply inputs to and use outputs of the mental module dedicated to language (the language faculty). When adults are engaged in learning a second language, the computational system must interact with other cognitive systems to allow communication and the use of those parts of the TL which are difficult to fit into a learner’s mental grammar. SL in adulthood comes about at exactly this point by supplying a shallow processing which makes use exclusively of a simplified form of mental representation (constructions; see Section 2.5). These tentative mental representations at a certain point can feed into grammar. Now we turn to other phenomena that are also good candidates for resisting and being completely impermeable to statistical predictions and to SL.

6.9 Non-combinatorial Grammar at the Interfaces

Non-combinatorial grammar includes more than empty category and internal merge. It also includes some phenomena at the syntax-semantics and syntax-pragmatics interfaces. Studies that will be described in the following paragraphs predict that adult L2 learners will find it difficult to compute over abstract values when these values are of a different nature (responding to different organizing principles). We have already seen that the decision about whether to drop a pronoun or not depends on a computation over syntactic and pragmatic (± topic shift) values. The correlation between these different values is not overtly signaled in the sentence. There are no

visible cues that the pronoun must be dropped. L2 learners must calculate it every time from scratch. In such cases, analogy and similarity metrics cannot work as they usually do in SL. A limited number of phenomena at the syntax-semantics interface pose similar problems for adult L2 learners. To take an example, in order to choose between the perfective and imperfective past in Italian, L2 learners must compute the relationship between the chosen aspectual perspective on the event and functional morphology in the VP. Again, analogy and learners’ previous experience with similar structures of the TL may be of little help. L2 learners of Italian must run the ‘aspectual computation’ every time from scratch based on the elements at hand in the sentence. This computation entails the comparison of two very different pieces of information: (a) is the event bounded or not? and (b) how is (un)boundedness morphologically encoded in Italian? Information (a) concerns semantics (and discourse grounding to some extent), while information (b) concerns morphosyntax. The integration of (a) and (b) may be difficult and costly for L2 learners. Moreover, the aspectual computation also entails that learners must take into account other aspectual cues (adverbials, prepositions, time expressions and adjuncts) besides functional morphology. All these cues can steer the interpretation of the sentence towards a perfective or an imperfective reading regardless of functional morphology in ways that cannot be predicted by TP. This is why phenomena at the syntax-semantics interface (such as the perfective-imperfective dichotomy in the past tense) are also difficult to acquire for very experienced L2 learners. Those phenomena are very frequent in the TL input, but frequency cannot support their computation. L2 learners in fact do not calculate TP when they choose between the perfective and imperfective past in Italian.
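To see why frequency-based statistics are blind to this kind of computation, it may help to recall what a TP is computed over. The sketch below assumes that TP stands for forward transitional probability in the usual statistical-learning sense, that is, the count of a word pair divided by the count of its first member. The function name and the mini-corpus are invented for illustration and do not come from any learner data.

```python
from collections import Counter

def forward_tp(tokens):
    """Forward TP: P(next | current) = count(current, next) / count(current)."""
    bigrams = Counter(zip(tokens, tokens[1:]))
    # Only tokens that have a successor can open a bigram.
    firsts = Counter(tokens[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in bigrams.items()}

# Invented mini-corpus (hypothetical, for illustration only).
corpus = "she has eaten . she has slept . she was eating .".split()
tp = forward_tp(corpus)

# The statistic sees only adjacent surface forms, e.g. how often
# 'has' follows 'she' relative to all continuations of 'she'.
she_has = tp[("she", "has")]
```

The only inputs to the calculation are adjacent word forms and their counts. Whether the event is bounded, or whether an adverbial elsewhere in the sentence favors one aspectual reading, is simply not part of what is being counted.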

6.9.1 What is a linguistic interface?

In the latest version of generative theory, interfaces in a system are points of interaction between the modules of the system and/or between the modules themselves and entities which are external to the system. They are abstract devices that put into relation elements that are in the language faculty with elements that are in the mind (where the language faculty is embedded) or that put into relation the various elements that are in the language faculty itself (Chomsky, 2000a: 1–17; 2000b: 9–12; 2002: 157–160). In the former case – when they involve language and non-language cognitive modules – they are called ‘external interfaces’. In the latter case they are called ‘internal interfaces’. In a general model of the architecture of the human mind where the language faculty is autonomous and informationally encapsulated, external interfaces also function as abstract validators. Interfaces

convert a piece of information (whether input or output) originated from within the language faculty so that it becomes interpretable by other modules which are external to the language faculty, but which are designed to process that piece of information to produce meaningful and pronounceable utterances in a natural language. In Chomsky’s minimalist program, these cognitive, external modules are called the articulatory-perceptual system and the conceptual-intentional system, while phonetic form and logical form are the respective interfaces to these modules. Instead, internal interfaces convert a piece of information (whether input or output) originated from one module so that it becomes interpretable by all other modules of the language faculty: syntax, semantics and phonology. Much has been discussed about the timing and the way in which internal and external interfaces come about from the moment a sentence is planned and derived through phases, to the moment when it is physically uttered (and heard by the listener). It seems, however, that syntax has a privileged role in the whole process since it is the computational, grammatical core of the system. For this reason, syntax is assumed to operate before the point of spellout (when a sentence is uttered). Its job is to cyclically derive intermediate abstract configurations (phases) by merging elements which share some abstract features and by deleting copies (byproducts of the merge operation) and evaluating features that are not interpretable. In spite of its exceptional role, syntax does not work in isolation and it is constantly faced with other internal interfaces. According to López’s (2009) interpretation of Chomsky’s phase-theory, every phase of the syntactic derivation is immediately (in real time) brought to semantics and phonology modules where it is validated thanks to the general interface properties (see below). All interface validations would happen before the moment of spell-out. 
By contrast, the phonetic form and logical form interfaces which link linguistic information to the articulatory-perceptual and conceptual-intentional modules of the mind would operate only after the point of spell-out and determine the conditions by which the elements of the sentence are both linearized (so that they can be pronounced) and valued in logical terms (as entities, variables or operators) in order to be interpreted. To fulfill its task, any interface of the system must be designed so that it shares a degree of organizational commonality between different modules of the language architecture (Rothman & Slabakova, 2011: 569). Moreover, the language architecture system itself requires – by design – that every piece of information originated by a single module be readable at all the system interfaces. Models of linguistic interface architecture are, for instance, those proposed by Jackendoff (2002a) and Reinhart (2006). According to Tanya Reinhart, syntax is the core computational module of the architecture.

Syntax is a computational system that works between the other brain systems, which operate over concepts and context on the one hand (the conceptual-intentional module), and over the sensorimotor system on the other hand (the articulatory-perceptual module). The computational system is ‘optimally designed’; that is, it is capable of converting and translating information between brain systems while at the same time making choices over possible competing derivations (phases) with the purpose of minimizing the burden on working memory and computational resources. To this purpose, the computational system decides whether to perform ‘local computations’ (those that are evaluated only on portions of a derivation) or ‘reference-set computations’ (those that involve and compare several entire derivations and are aimed at licensing inefficient operations when easier (less costly) derivations are not permitted or do not permit the intended interpretation). Put differently, reference-set computation is a costly operation that becomes necessary when no other simpler and licit derivation licenses what speakers mean to say. In Chapter 4 of her book, Reinhart (2006) describes how – in anaphora resolution of Sentence (17) – when binding the pronoun he to Max, the subject of the embedded sentence (the easier solution), as in Sentence (18), cannot be licensed, an alternative, more difficult derivation (19) is possible. 
(17) Only he still thinks that Max is a genius
(18)
(19)

The alternative solution (19) is licit, but it is more difficult than the illicit derivation (18) because (18) requires only that both he and Max are interpreted equally as open variables over individuals (the interpretation being ‘Max is the only one among all other individuals who may think of being a genius’), while (19) requires variable comparison over extra-sentential referents (the interpretation being that Max in the matrix sentence is contrasted with any other individuals who may or may not think that the precise person called Max is a genius). It is possible to say that while an illicit interpretation like (18) could be – at least in theory – self-licensed automatically by the core computational system alone (if spell-out did not exist), it is nevertheless ruled out and substituted by its (more costly) competitor (19) after being checked against the conceptual-intentional interface requiring that Max must not be a variable but a well-identified individual or entity. The example quoted above shows us that – at least at a linguistic architecture-theory level – the fact that processing can be more or less costly for speakers depends on whether some more efficient, least-effortful computations can

be performed or not. If the outcomes of default (and ‘energy-saving’) computations cannot be licensed, then the computation must be widened and new values must be encompassed and mapped at the interfaces. At an acquisitional level, L2 learners’ capacities (either representational or processing) may be overburdened by the fact that mapping at interfaces becomes more complicated in order to accommodate licensed interpretations. Since learners have to learn and automatize the rules of correspondence between different layers of representations at the interfaces, a source of difficulty arises when the number of rules increases and the correspondences are not straightforward but must be computed over binary values which represent extra-grammatical information (such as ± topic shift). This precise source of difficulty may turn out not to be eliminable, regardless of learners’ level of proficiency. The implications of interface problems for first and second language acquisition are accounted for in the interface hypothesis.

6.9.2 The interface hypothesis

The interface hypothesis (Serratrice et al., 2001; Sorace, 2011a, 2012; Sorace & Filiaci, 2006; Sorace & Serratrice, 2009) proposes that residual problems and optionality in L2 acquisition are due to learners’ difficulty in coping with the processing demands inherent in interface-conditioned properties. Such difficulty brings to the surface a learner’s working memory limitations and decreased processing efficiency, which are the consequences of different manifestations of both early and late bilingualism (adult or early L2 acquisition, L1 and L2 attrition). Most research on the interface hypothesis has focused on the syntax-pragmatics interface, and especially on null subjects and anaphora resolution. In a number of experimental studies, it was observed that L2 learners overextended overt pronouns in contexts requiring null subjects (in null subject languages such as Italian or Spanish). Non-target-like overt subject pronouns are also claimed to be accepted by near-native learners in the presence of a topical antecedent, as in (20):

(20) ??La vecchietta_i saluta la ragazza_j quando lei_i attraversa la strada
     The old woman greets the girl when she crosses the road

In Belletti et al. (2007) it is shown that advanced learners of Italian acquired at a native-like level the formal mechanisms of the syntax of subjects and of subject-verb agreement, but failed to map overt pronouns onto the pragmatic principles determining their contextual appropriateness. The hypothesis was then advanced that overt pronouns are the default forms (the ‘learner’s default’; Tsimpli, 2011) when learners cannot efficiently

process the mapping of syntactic (verb-subject agreement) and pragmatic (± topic shift) values. In the interface hypothesis, the word ‘interface’ does not mean that interfaces must be learned (as underlined by Pérez-Leroux, 2011), but it stresses the computational difficulties stemming from the fact that syntactic realizations are conditioned by information coming from other, external modules which needs to be accommodated into syntax (Sorace, 2012). Computational difficulty comes from the integration of these different sources of information. A hierarchy of computational difficulties has been hypothesized as well. Structures requiring on-time integration of contextual information at the syntax-pragmatics interface are held to be more difficult to acquire and process than structures that require so-called ‘internal mapping’ (e.g. the syntax-semantics interface). In Reinhart’s (2006) terms, only the former imply that reference-set computations must be performed by learners. Sorace and Serratrice (2009: 197) explain that ‘the syntax-semantics interface involves formal features and operation within syntax and logical form, whereas the syntax-discourse interface involves pragmatic conditions that determine appropriateness in context’. Finally, with respect to L1–L2 transfer, some experiments suggest that there is little facilitation from L1 properties that require the integration of contextual information (Gabriele & Canales, 2011: 686) and that phenomena at the syntax-discourse interface are also vulnerable in sign languages (Lillo-Martin & de Quadros, 2011). For an example of selective vulnerability at the syntax-discourse interface, Tsimpli and Sorace (2006) compare the acquisition of structures relevant to the syntax-semantics interface (focusing and topicalization) with syntax discourse phenomena (null subjects) in L1 Russian, L2 Greek learners. 
The authors found that even beginner learners of Greek can distinguish between focusing (which involves verb-raising) and topicalization (with resumptive object clitic). In order to show this distinction, learners are held to be capable of performing a native-like semantic interpretation of the operator status of the focus element versus the topicalized element. Unlike the phenomena at the syntax-semantics interface, overuse of subject pronouns – which indicates a difficulty at the syntax-discourse interface – is attested even in very advanced learners. Criticism of the interface hypothesis is directed at its circularity (e.g. Duffield, 2011), its theoretical underpinnings (what features belong to ‘core syntax’ and what features are at the syntax-discourse interface; see Slabakova, 2011a; Sorace, 2012: 213, for a response) and its acquisitional implications. For instance, it was observed that the term ‘interface’ cannot be associated with structures sensitive to conditions of a varying nature because an interface does not correspond to any particular structure type (Pérez-Leroux, 2011: 72).

The fact that core or narrow syntax may ‘care about’ pragmatics or discourse conditions has been contested as well (Tsoulas & Gil, 2011). But the main criticism is that it would be a mistake to assume that the syntax-discourse interface as a whole is problematic: acquisitional problems are rather construction specific (White, 2011a, 2011b). The same observation could in fact be made for the syntax-discourse interface as well, for certain phenomena at the syntax-semantics interface (Pires & Rothman, 2011). For instance, White (2011b: 585) quotes the example of the problematic acquisition of English articles by speakers of languages without articles (Mandarin, Russian, Turkish). She also observes that any linguistic phenomenon involves multiple interfaces and that interfaces themselves are not ‘monolithic’ because there are linguistic properties which are harder or easier to acquire across all modules and interfaces (see also Montrul, 2011; Slabakova & Ivanov, 2011). Another criticism is that difficulties at the syntax-discourse interface do not mean that structures at this interface are not learnable. Slabakova et al. (2012) focuses on the acquisition of clitic left-dislocation (CLLD) and focus fronting (FF) in L1 English, L2 Spanish learners. These learners were presented with a short dialogue which contained two possible answers (the test sentences). Each answer had to be evaluated in felicity on a scale of 4 (perfect) to 1 (very strange). For instance, for the CLLD conditions, the correct answer (21) displayed the clitic, while the incorrect answer (22) did not: (21) Bueno, las sillas las puse en la cocina [. . .] Well, the chairs them I put in the kitchen (22) *Bueno, las sillas puse en la cocina Well, the chairs I put in the kitchen The results showed that the acceptability of correct CLLD constructions was recognized with high reliability by natives and learners alike. 
Moreover, the existence of a developmental trajectory is confirmed: intermediate and advanced learners performed differently in a significant manner. Slabakova et al. (2012: 339) concluded that ‘knowledge at the syntax-discourse interface can be attested as early as intermediate levels of proficiency, but is more prevalent at advanced levels’. Ivanov (2012) also tested the acquisition of clitic doubling in L1 English, L2 Bulgarian learners. Clitic doubling in Bulgarian occurs when a DP and a co-referential clitic co-occur in the same sentence. This construction serves to mark the topical object, as in (23): (23) Na Ivan mu pisah tri pâti tasi sedmica To Ivan him-clitic I wrote three times this week ‘I wrote three times to Ivan this week’

The results showed that advanced learners of Bulgarian performed like the native controls in the sentence evaluation task, which is interpreted as successful acquisition of the pragmatic meaning of clitic doubling and as evidence that the claims of the interface hypothesis for permanent L2 deficiency in syntax-discourse interface coordination are not supported (Slabakova et al., 2012: 365). The discontinuity hypothesis can provide a unified framework to explain the difficulties at the syntax-discourse interface and at the other interfaces as well. We have already seen that frequency cannot help learners decide when to drop unnecessary pronominal subjects in L2 Italian. By contrast, we have seen that frequency is a factor in the acquisition of auxiliary selection in L2 Italian. BTP in fact supports the passage from SL of mere auxiliary + past participle associations to the acquisition of labeled (headed) structures where syntactic and semantic (aspectual) values are projected. The discontinuity hypothesis predicts difficulty at the interfaces when frequency (and its algorithms) does not help and structures cannot be learned statistically before being over-generalized and generated by the rules of combinatorial grammar. In this precise sense, the general divide between what can and cannot be learned statistically includes phenomena at the interfaces and even cuts across the distinction between the syntax-discourse and the syntax-semantics interfaces. One possible difficulty in integrating the interface hypothesis into the discontinuity hypothesis is that – according to the interface hypothesis – interface problems also arise in early bilinguals. This is not expected in the discontinuity hypothesis, because GL is claimed to be limited to combinatorial grammar only in adulthood. In childhood the passage from SL to GL is constrained by neuroanatomical development, not by the nature of the items that are learned. 
So, why should bilingual children experience troubles at the interfaces only in their L2 and not in their L1?

6.9.3 Are difficulties at the interfaces age-related?

It is an open issue whether or not the interface hypothesis is a developmental theory in all respects, one that makes predictions about what can and cannot be learned. It is even more controversial whether residual optionality at the interfaces concerns early or just late bilinguals (their level of proficiency being equal). If null pronouns cannot be learned even by early bilinguals, then the differences between the scope of early and late discontinuity and between the learnability of combinatorial and non-combinatorial grammar are challenged. According to Slabakova et al. (2012: 320), the interface hypothesis is ‘technically agnostic on whether or not UG is fully accessible in adulthood’.

By contrast, Sprouse (2011: 99) observes that the notion of ‘residuality’ itself invokes the idea that grammatical development in adulthood is UG constrained and that residual optionality resides in the fact that learners cannot integrate UG-compatible representations with other cognitive domains for which UG is not directly relevant. Sorace (2011a) states that the interface hypothesis is not a developmental theory and does not predict that core syntactic features – such as A-movement and binding relations – are acquired before properties at the interfaces. It only concerns ultimate near-native attainment. Lardiere (2011) argues against the interface hypothesis’s suggestion that structures at the syntax-discourse interface are left over after everything else in L2 grammar has been acquired. Lardiere claims that functional morphology is an area which exhibits persistent optionality as well. Whether Lardiere is right or not, the key issue in the interface hypothesis is that residual problems are peculiar to very advanced, near-native speakers, even though this does not exclude the possibility that these problems emerge in intermediate learners (White, 2011a). Residuality entails that processing resources may not be available for all kinds of L2 computations, maybe because they are consumed by syntactic operations that are not proceduralized in the L2 (Sorace, 2012: 210). If the term ‘proceduralized’ is used by Sorace to recall the DPM findings and predictions, then not only bilingualism in general, but specifically late bilingualism is held to be the cause of difficulties at the interfaces. This is because access to procedural memory circuits is strongly moderated by age. In this chapter we have suggested that – if difficulties at the interfaces are age-related – it is not because adult learners do not access UG. 
It is because some phenomena at the interfaces are impermeable to frequency effects, whereas adult learners need to rely on a property of the input (frequency) in order to set in motion the rules of those parts of L2 grammar which we named combinatorial. But again, if GL is claimed to be limited to combinatorial grammar only in adulthood, why should bilingual children experience troubles at the syntax-pragmatics interface? And why do they experience these difficulties only in their L2 and not in their L1? In childhood non-combinatorial grammar should not represent a problem either for the L1 or the L2 once interhemispheric connections are established. Sorace (2011b) approaches the problem from a processing perspective and hypothesizes that optionality could be due to the integration of information from different domains (syntax and pragmatics). This integration in real time could be particularly difficult for bilingual speakers, but it is sometimes difficult for monolingual speakers as well. This very fact excludes the presence of representational deficits. Where exactly is the bilingual (early and late)

processing problem located with respect to monolinguals? One explanation is that the bilingual is by definition slower than the monolingual in assessing the referent’s status (± topic) because she has to inhibit the inappropriate one both within and across the two languages. Bilinguals in fact need to exercise executive control to avoid interference from the unwanted language. This operation considerably increases the cognitive load. Therefore problems with null pronouns and anaphora resolution in early bilinguals are not counter-examples to the hypothesis that early discontinuity – unlike late discontinuity – allows the acquisition of non-combinatorial grammar and of interface phenomena. Residual difficulties in this area arise because learners have to update in real time the interface values in two languages according to the sentence context. This has more to do with an individual’s cognitive capacity than with the nature (combinatorial or not) of L2 grammar.

6.10 Non-combinatorial Grammar and Uninterpretable Features

The interpretability hypothesis (Hawkins & Hattori, 2006; Tsimpli, 2003; Tsimpli & Dimitrakopoulou, 2007; Tsimpli & Mastropavlou, 2008; Tsimpli & Papadopoulou, 2009) predicts that adult learners will experience difficulties in learning the morphosyntactic features of the TL which are not interpretable at the logical form interface. In the minimalist program, these features are called ‘uninterpretable features’. Abstract case and agreement are examples of uninterpretable features, whereas person, number, gender (ϕ-features) and pronominal reference are interpretable because they are semantically motivated (Adger, 2003: 40–45). Uninterpretable features are not semantically motivated and their role is restricted to syntactic derivations (but the classification of features into one or another type is problematic; see Liceras et al., 2010: 10). The study in Tsimpli and Dimitrakopoulou (2007) focuses on the differences between Greek and English wh- interrogatives. Greek is a null subject language. Unlike English, it allows the filling of the extracted subject gap position with a complementizer such as oti ‘that’ (as in Sentence 24) and also, optionally, resumptive pronouns such as ton ‘him’ (as in Sentence 25):

(24) Pji ipe oti efighan?
     who-NOM-PL said-3sg that left-3pl
     *who did he say that left?

(25) Pjion ipes oti (ton) prosealan xoris logo?
     whom said-2sg that him-insulted-3sg without reason
     who did you say that they insulted (*him) without reason?

Both the insensitivity to that-trace effects (Rizzi, 1986) and the existence of a resumptive-pronouns strategy are held to involve the uninterpretable feature of agreement (on Infl head in the former case and subject-verb agreement in the latter). The interpretability hypothesis adopts clear assumptions regarding the critical period hypothesis. While logical form-interpretable features are still accessible to adult L2ers, uninterpretable features such as agreement are not. Parametric L1 values associated with these features – once settled in infancy – resist re-setting. Uninterpretable features which are not instantiated during the critical period are no longer available and fade out (Tsimpli & Dimitrakopoulou, 2007: 224). Thus it is assumed that agreement causes learnability problems even at advanced stages of acquisition. To test this assumption, two groups of L1 Greek, L2 English learners were presented with the English version of sentences such as (24) and (25) and they were requested to rate their acceptability. While English native controls, as expected, rejected sentences with a complementizer and with resumptive pronouns, more advanced learners accepted sentences like (25), whereas less advanced learners accepted both (24) and (25). The authors conclude that the abstract properties of subject-verb agreement in Greek are transferred to Greek/English interlanguage and that target-like abstract specifications of properties cannot be represented and established in processing routines (Tsimpli & Dimitrakopoulou, 2007: 237). Tsimpli and Papadopoulou (2009) focus on verbal aspect. Verbal aspect is a functional category with two interpretable feature values: perfectivity and imperfectivity. Perfective predicates are [+bounded], while imperfective predicates are [−bounded] and [±iterative]. Perfectivity and imperfectivity play a role in the construction of telic versus atelic events. 
Telicity in Greek depends in fact on both the aspectual form of the verb (perfective or imperfective) and on the specificity of the DP-complement (what Verkuyl, 1999, 2005 called the [±specified quantity] and van Valin, 1990 calls the [±delimited quantity] opposition rules). In the experiment, Tsimpli and Papadopoulou (2009) use a subclass of activity predicates, namely manner of motion verbs, which can be followed by a PP that can be either a complement (GOAL of motion) as in Sentence (26) or an adjunct (PATH) as in Sentence (27): (26) I ghata etrekse ston kipo the cat ranperf in the garden → PP complement (endpoint interpretation)

Par t s of L2 Grammar That Resist St at ist ical Lear ning

215

(27) I ghata etrehe ston kipo the cat ranimperf in the garden → PP adjunct (locative interpretation) The authors presented sentences like (26) and (27) (with different kinds of manner of motion verbs) to both monolingual speakers and to learners of Greek of intermediate level. In the first part of experiment, participants were asked to match an auditory presented sentence with a picture in a booklet. In the second part, they were asked to watch short videos and to provide a description of what was going on. The results show that learners of Greek are capable of associating imperfective forms with atelic readings (PP adjuncts) and perfective forms with telic readings (PP complements), which in turn might mean that the interpretable feature of aspect is not problematic. It is also worth noticing that – unlike native speakers – L2 learners are not likely to admit a directional (atelic) interpretation of perfective predicates with PP complements, which in Greek is also possible (the cat of Sentence (26) may have already reached the garden or it may be still on her way to the garden). The telic/directional interpretations are not allowed at a syntax-semantics interface level, but at a syntax-discourse interface level, where the elements of the derivation are mapped onto values of the extra sentential context. On this basis, Tsimpli and Papadopoulou (2009) conclude that the interpretability hypothesis and the interface hypothesis are fully compatible because they identify the loci of vulnerability in L2 acquisition in external rather than in internal interfaces. Still, as in the last paragraph, it is possible to conclude that operations involving the mapping of heterogeneous (syntactic versus pragmatic) values pose processing difficulties which are harder to deal with for adult, very advanced learners of a second language.

6.11 Non-combinatorial Grammar and Functional Morphology: The ‘Bottleneck’ Hypothesis

The source of learners’ difficulties could reside not only at the interfaces. Slabakova (2006: 10) claims that semantics is not sensitive to age-related effects: many neurophysiological studies have in fact concluded that ‘there is evidence of considerable plasticity in the network that mediates language comprehension’ as opposed to morphosyntax. The real acquisitional challenge which confronts adult L2ers concerns inflectional morphology encoded in the functional lexicon, while compositional (phrasal) semantics and syntax would come ‘for free’ from UG. The ‘bottleneck hypothesis’ predicts that functional morphology in fact represents the ‘tight spot’ in the mapping of syntactic values onto meaning at the syntax-semantics interface. While the same semantic primitives – such as the distinction between ongoing and finished events – are part of a universal conceptual structure, in different languages they are distributed over different pieces of functional morphology. Thus the functional lexicon is where language variation is encoded and where learning difficulties reside, while meanings are universal. Learning a new language in adulthood entails learning how interpretable and uninterpretable features are mapped onto inflectional morphology (Slabakova, 2009: 282, 2010, 2011b). There might be cases where inflections are seemingly similar in L1 and L2 (such as the morphemes in the English past progressive and the Spanish imperfect) but encode very different meanings, and there will be cases where the opposite holds (syntax is complex because sentences involve less frequent constructions, but their interpretation at the syntax-semantics interface is not problematic). Aspectual mismatches of the first kind are difficult to acquire not because aspectual meanings are unclear to learners, but because it is not clear how they are mapped onto functional morphology. Once the inflectional morphology is learned, learners would become immediately aware of all its semantic consequences.

The contrast between the acquisition of aspectual morphology (perfective versus imperfective) in L2 Italian and the acquisition of the semantic content (Aktionsart) of predicates in L2 Italian illustrates the point and is very telling for the discontinuity hypothesis. Giacalone Ramat and Rastelli (2013) argued that adult learners can gradually learn verb actionality because the frequency of verbs in the input paves the way to the acquisition of even subtle semantic distinctions. By contrast, the distinction between the perfective and the imperfective in the past cannot be learned statistically: it involves the computation of the perspective taken by the speaker on the event and implies the correct form-to-function mapping of this perspective onto verb morphology, expressions of time and space and verb arguments.

6.12 Is Non-combinatorial Grammar Important for SLA?

The non-combinatorial aspects of grammar have attracted substantial attention, especially in the UG tradition. This does not necessarily mean that these aspects are core aspects in SLA. It may well be argued that in the whole range of what constitutes language, these aspects are peripheral. Therefore, the real question is not only what aspects of a language can or cannot be learned statistically, but to what extent the non-statistically learnable aspects are important for SLA. It would be problematic to put two kinds of grammar on an equal footing if one of them (non-combinatorial grammar) were only relevant for a tiny part of the language to be acquired. Is non-combinatorial grammar important for SLA? The answer is positive: the non-combinatorial aspects of the L2 grammar may not be that numerous, but they are developmentally crucial, as a cursory overview of SLA research in the last 40 years can easily demonstrate. From the dawn of SLA studies in the 1970s and early 1980s, the attention of non-UG-oriented researchers, too, was immediately attracted to the acquisition of interrogative sentences (wh- question formation in L2 English) and of relative clauses (see Mitchell & Myles, 1998: 33–34, for references). Moving to the present day, the Handbook of Cognitive Linguistics and Second Language Acquisition (Robinson & Ellis, 2008) is one of the least UG-oriented books ever published. In this handbook, six pages are dedicated to the acquisition of interrogative sentences, three pages to wh- extraction and eight pages to the acquisition of relative clauses by L2 learners. The developmental perspectives adopted to describe how these phenomena are acquired are the cognitive-oriented approach and the usage-based approach to language acquisition. This could be enough to conclude at the very least that some non-combinatorial phenomena have always attracted – and continue to attract – the attention of developmental linguists, not only for their theoretical importance, but also by virtue of their functional and communicative value. Knowing how to construct questions, how to bridge pronouns and antecedents and how to embed sentences into other sentences are not only ‘core-computational’ operations. These are among the most important skills that allow learners to communicate in their everyday lives. Therefore non-combinatorial grammar is at least as important for SLA research and for second language teaching as it is for generative theory.

A related objection is that non-combinatorial phenomena are difficult to learn just because they are less frequent in the TL input. On this view, wh-dependencies, null pronouns and island constraints are not learned statistically because they are both infrequent and typically display some of the most complex properties of a language in many dimensions (sentence length, semantic, referential, communicative, etc.). Null pronouns are obvious counter-examples to this frequency objection. Null subjects in native Italian are absolutely pervasive in the TL input. Nevertheless, they are not learned even by very advanced, near-native learners of Italian. Object relatives are pervasive as well, in both oral and written Italian and, again, very advanced learners still experience trouble in both comprehending and producing object relatives.

6.13 To Sum Up: Whether Something Can Be Learned or Not Depends on How it Can Be Learned

In this chapter we have tried to demonstrate that the expression ‘grammatical learning’ in language acquisition in adulthood refers to two different mental operations: counting and computing. The former operation targets combinatorial grammar, while the latter targets non-combinatorial grammar. Combinatorial grammar is categorization over presences and, as such, is supported by statistics. Non-combinatorial grammar is categorization over absences and is not supported by statistics. Features belonging to the former type are countable and their co-occurrence can be tracked implicitly by learners, relatively regardless of the linear distance among the items in a sequence: adjacent items are learned first; non-adjacent patterns are eventually learned. Features belonging to the latter kind, instead, cannot be counted, simply because their computation also involves items that are not there or that have been displaced somewhere else in the sentence. In this chapter, we have seen some domains in which frequency and TP help and some domains in which they do not. For instance, auxiliary selection in L2 Italian is a construction in which BTP and coherently built variation sets provide the ideal environment for grammatical knowledge of the auxiliary to grow. As a consequence, the auxiliary and the past participle first just ‘concatenate’ and then also ‘merge’ in a headed structure under an abstract LABEL. We have also seen that the benefits of frequency do not extend to many linguistic phenomena, e.g. empty categories (null subjects), phenomena at the interfaces, movement across islands, extracted wh-, object relatives, functional morphology and maybe some uninterpretable features. When we tried to establish what these phenomena have in common, we said that: (a) they contain items that must be computed, not counted, by learners; and (b) analogy or previous experience cannot help their computation.

In the last part of the chapter, we dealt with theories of SLA which identify areas in which late L2 acquisition is more difficult. The discontinuity hypothesis provides a unified account for all phenomena in those domains. These phenomena contain items over which statistics is ineffective. All these items in fact must be computed and ‘cannot be learned as a song’. Finally, we also acknowledged that the discontinuity hypothesis has a lot in common – besides the DPM – with both the LAST hypothesis and the SSH. One important difference among these models is that – while in the LAST and the SSH learners are claimed to understand sentences twice, as native speakers do – in the discontinuity hypothesis learners (but not native speakers) are claimed to learn sentences twice over time, once statistically and then grammatically, the availability of this discontinuous pattern depending on whether the grammar to be learned is combinatorial or not.

The overall conclusion is the following. What adult L2 learners are more or less likely to learn depends on how it can be learned. What can be learned discontinuously is likely to be learned eventually, when L2 representations and processing routes ‘geminate’ in a learner’s competence. Instead, what involves absent or displaced features, which cannot be counted, cannot undergo SL and is less likely to be learned. It is the way of learning (not the lexicon-grammar distinction) that determines what is eventually and successfully learned by adult L2 learners. This is the discontinuity hypothesis that has been outlined in this book.
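The ‘counting’ operation attributed to statistical learning in this chapter can be made concrete with a small computational sketch. The Python fragment below is only an illustration, not the procedure used in the studies discussed: it computes forward transitional probability (how strongly a word predicts the next one) and backward transitional probability (BTP, how strongly a word predicts the preceding one) over adjacent word pairs. The function name and the toy Italian corpus are assumptions introduced for the example, and the corpus is treated as one continuous stream, which ignores utterance boundaries.

```python
from collections import Counter


def transitional_probabilities(tokens):
    """Return (forward, backward) transitional probabilities for adjacent pairs.

    forward[(x, y)]  = count(x y) / count(x as first member of a pair)
    backward[(x, y)] = count(x y) / count(y as second member of a pair)
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    first_counts = Counter(tokens[:-1])   # words that open a pair
    second_counts = Counter(tokens[1:])   # words that close a pair
    forward = {pair: n / first_counts[pair[0]] for pair, n in bigram_counts.items()}
    backward = {pair: n / second_counts[pair[1]] for pair, n in bigram_counts.items()}
    return forward, backward


# Hypothetical toy input: auxiliary + past participle sequences in Italian.
corpus = "ho mangiato sono andato ho dormito sono andato ho mangiato".split()
forward_tp, backward_tp = transitional_probabilities(corpus)

# 'ho' is followed by 'mangiato' in two of its three occurrences,
# while 'mangiato' is always preceded by 'ho' in this toy sample.
print(forward_tp[("ho", "mangiato")])
print(backward_tp[("ho", "mangiato")])
```

In this toy sample the participle predicts its auxiliary better than the auxiliary predicts the participle, which is the kind of countable regularity that the text describes as available to SL. Nothing comparable can be counted for a null subject or a displaced wh- element, because the relevant item is not present in the string where it is interpreted.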

Conclusions

The discontinuity hypothesis described in this book can be summed up in nine points:

(a) Over time, adult L2ers develop two different ways to represent and to process a part of L2 grammar which is called ‘combinatorial’. Combinatorial grammar is likely to be learned twice by adult L2 learners.
(b) Combinatorial grammar includes predictable form-function pairs (regardless of adjacency). These pairs can be both memorized and retrieved as wholes (SL operating via TP and bottom-up category formation) and decomposed into their components in order to be computed by a rule (GL operating via abstract rules).
(c) SL developmentally precedes and paves the way for GL. The former is a domain-general, not language-specific capacity. It is spared by age. That is why it is the default option for adult L2ers.
(d) SL and GL are rooted in different – although highly interconnected – parts of the brain. The corresponding representations are stored and can be accessed separately. SL and GL are developmentally independent from one another, but interact in a learner’s competence. Statistics provides the L2 grammar with the ‘environment’ to grow and develop.
(e) ‘Discontinuity’ is opposed to ‘continuity theories’. Adult L2 acquisition does not progress by stages, but by permanent fractures arising from qualitative leaps due to a shift in the neural sources. These leaps cause mental representations of L2 grammar and the processing routes to duplicate in a learner’s competence (gemination).
(f) The existence of geminations in L2 representations and processing is manifested both by within- and across-individual variability in neural responses to morphosyntactic violations and by statistically informed analysis of learner corpora.
(g) Predictions about L2 acquisition and learnability criteria can be assessed: only a part of L2 grammar can be learned statistically. Non-combinatorial grammar cannot. Incomplete L2 attainment in adult SLA directly stems from here.
(h) SLA is quantized: one cannot expect that the amount of exposure to the TL input and interaction determines second language acquisition straightforwardly. When acquisition takes a ‘quantum leap’, a learner becomes immediately capable of using two different representations and processing routes, in exactly the same manner in which a child – maybe at the end of a long series of attempts – finally learns to ride a bicycle or to swim.
(i) Theoretical background: The discontinuity hypothesis is framed in the ‘semi-modular’ approach to SLA. Storage and computation in the Language Faculty do not exclude each other (the UG needs statistics, and the other way round).

SL and GL are based on two interconnected capacities of a learner’s mind which are differently wired in the brain and which may apply to a subset of grammatical items of the TL. When SL and GL apply to the same items of the TL, the acquisition of those items is said to be discontinuous because (a) it does not directly depend on the time of exposure to TL input and (b) the outcomes of the two sequenced processes are qualitatively different. When discontinuity occurs, SLA in adulthood turns out to be also redundant because L2 items are learned twice over time and represented in two different ways in a learner’s competence. These two ways cohabit (gemination). Discontinuity characterizes both early and late language acquisition. Early and late discontinuities differ as to the scope and effectiveness of GL. While the scope of SL remains roughly the same across a learner’s lifespan, GL is less effective in adulthood (see below). Early and late discontinuities also differ as to their cause: early discontinuity is neuroanatomically motivated. Since the areas that process complex syntax are not yet connected and functioning before the age of 6–7, children might rely on associative learning, which is centered on the temporal cortices and on the hippocampus. Late discontinuity is instead also cognitively motivated: SL is faster than GL and more effective for adults to resort to under communicative pressure. When resorting to discontinuity, adult SLA recaps and mimics the developmental sequence from SL to GL that was so effective in childhood. Like the causes, the outcomes of discontinuity in adulthood are different too.

The hypothesis of discontinuity is couched in the semi-modular perspective on language acquisition. According to this perspective, some general principles of the mind – which are sensitive to input distributional patterns and to the memory constraints of individuals – interact with modular principles of the Language Faculty which are less informationally encapsulated than was assumed in UG theories of the recent past.

SL and GL are similar in some respects and different in others. They are both implicit, because they do not depend on the automatization of explicit rules of grammar but on the implicit categorization of items. SL and GL have different neural supports, however. SL is centered in the hippocampus and in the temporal lobe, while GL is centered in the basal ganglia and in the left prefrontal cortex. These neural supports are independent but interconnected. The neural support for SL – but crucially not that for GL – is substantially spared from age effects (at least under conditions of normal language development). Moreover, some areas of the left frontal cortex seem to be less available for post-pubertal language acquisition. When adults are to learn a second language, they must rely on SL not because the brain areas that support syntactic processing are not already in place (as in early discontinuity), but because learning by patches is more viable, more effective and faster for adults. SL paves the way for GL when first chunks and then – in a gradient of abstractness – constructions feed into L2 grammar over time. This happens provided that they are frequently met by learners in the input of the TL.

SL and GL divide the labor in adult SLA. In the early phases of language acquisition, SL (subserved by the declarative memory system) takes the floor. In this phase, very frequent chunks are stored and processed as wholes. Then alternation patterns, which occur frequently, consolidate in a learner’s memory in the form of constructions. These constructions are built and kept together by the force of analogy. Analogy permits similar items of the TL to fill the available slots in a pattern, which is a form-function pairing consolidated by language use. Eventually, these constructions are learned grammatically. It is proposed that learners step up from constructions to grammar rules. Learners shift from the criterion of ‘category membership’ – belonging to an unordered set on the basis of formal, surface resemblance – to thinking in terms of abstract LABELS. A LABEL is a projection of the peculiar properties of the head of a concatenation. This passage could be mirrored by an increased activation of sections of Broca’s area and by decreased hippocampal activity, which is instead associated with similarity-based learning. In more general terms, the passage from constructions to more abstract grammar rules is made possible because statistics provides adult learners with the necessary number of relevant instances, while the grammar provides those instances with the necessary abstract concepts (labels) that gather instances together regardless of surface similarities or resemblances. The result is that learners can (a) categorize over encountered instances and (b) gradually learn to generalize over not yet encountered instances.

It must be stressed that the division of labor between SL and GL concerns the processes, not the products of learning. SL is not the learning of lexicon, as GL is not the learning of grammar. Which items belong to the grammar and which items belong to the lexicon in a learner’s interlanguage can be defined on the basis of how they are acquired. Any a priori definition commits the comparative fallacy. The switch between SL and GL is possible also after the critical period because of the brain’s plasticity and of its adaptive response to environmental demands. Factors such as length of exposure to the input, length of residence and of immersion in the country where the language is spoken and also the kind of instruction (explicit versus implicit) may reduce the gap between early and late learners to the point that they are even indistinguishable on measures based on performance or on accuracy rates.

The discontinuity hypothesis posits that not all difficult items that are learned statistically by adults can also be learned grammatically. Learnability of L2 items depends on whether or not frequency may play a role in their acquisition. If frequency helps because a learner is likely to meet those items in the form of chunks and constructions, as is the case with auxiliary selection, then GL is likely to join and accompany SL over time. If frequency cannot help because a learner must abstract away over absences and gaps (as is the case with null subjects and wh- extraction), GL of these items by adults is unlikely. Therefore discontinuity in adulthood ensures that adults can fully learn L2 combinatorial grammar. Non-combinatorial grammar, instead, is out of reach for discontinuity in adulthood.

Finally, as we stated from the very beginning of this book, no direct neurophysiological evidence is available to date to support the explanation of the discontinuity hypothesis that is proposed here. In fact, no experimental study has so far factored in the difference between combinatorial and non-combinatorial L2 grammar, that is, between the grammar that can and the grammar that cannot be pretreated statistically. This book directly advocates the need for such experiments to be carried out by trying to provide some falsifiability criteria for a new interpretation of developmental phenomena that can be linked to the discontinuity hypothesis. This book is meant to suggest that the explanation of discontinuity based on the lexicon-grammar divide misses the point and that an alternative explanation is worth exploring through experimental research in the framework of L2 neurocognition.

There are of course many important issues that are not discussed. The first one is the impact of discontinuity on language instruction. It is possible that classroom teaching/learning could be aimed at guiding the learning process to the elements in language which can be learned statistically and then grammatically. Some possible effects of language instruction on discontinuity in SLA could be put on the research agenda in the future and are currently addressed in some studies.

The second one seems to be as obvious as it is fundamental. Why should L2 acquisition in adulthood be discontinuous? Why should GL join SL and geminate L2 representations in a learner’s mind? Why are even simple, statistical and local rules (re)learned by rule once GL becomes available? The fact that discontinuity imposes a re-learning or re-instantiation of grammatical rules that were initially learned statistically could and perhaps should be theoretically addressed and possibly explained. This book refrains from answering these why questions, however. The aim of the book was to describe the converging (neural and corpus-driven) evidence that adult SLA could be discontinuous and quantized and that it allows gemination in L2 representations. This book has focused entirely on how questions and has completely disregarded why questions. Unlike Haspelmath (1999: 225), I do not believe that the how and the why questions should coincide or necessarily entail each other in language research. Where they are allowed and encouraged to coincide, it is usually because the only explanatory criteria that are invoked are historical, or relate to the functioning of language in real communicative contexts and uses. These are, in general, the most frequently invoked ‘why criteria’ in linguistics. It should be acknowledged that it would not be that simple to account for gemination and discontinuity in functional, communicative terms. The very idea of ‘neural redundancy’ would be even harder to accommodate where only criteria of computational efficiency reign. I should therefore be permitted to give up answering the why question, if only because of my radical ignorance: I do not have functional, historical, communicative, pragmatic or computational reasons to put forward. The last, posthumous book of the Italian astrophysicist Margherita Hack is entitled – in Galileo’s style – Il perché non lo so ‘I do not know why’. At the end of her life as a world-renowned, leading scientist, Margherita Hack seemed to be satisfied with having tried to answer a very few, albeit fundamental, how questions about the origin of some distant galaxies. She used to add that any scientist, whatever the field of investigation, should be happy with trying to answer how questions. Should a linguist not?

References

Abney, S.P. (1991) Parsing by chunks. In R.C. Berwick et al. (eds) Principle-Based Parsing: Computation and Psycholinguistics (pp. 257–278). Dordrecht: Kluwer.
Abrahamsson, N. (2012) Age of onset and nativelike L2 ultimate attainment of morphosyntactic and phonetic intuition. Studies in Second Language Acquisition 34, 187–214.
Abutalebi, J. (2008) Neural aspects of second language representation and language control. Acta Psychologica 128 (3), 466–478.
Abutalebi, J. and Green, D. (2007) Bilingual language production: The neurocognition of language representation and control. Journal of Neurolinguistics 20, 242–275.
Abutalebi, J., Cappa, S. and Perani, D. (2005) What can functional neuroimaging tell us about the bilingual brain? In J.F. Kroll and A. De Groot (eds) Handbook of Bilingualism – Psycholinguistic Approaches (pp. 497–515). Oxford and New York: Oxford University Press.
Abutalebi, J., Della Rosa, P.A., Green, D.W., Hernandez, M., Scifo, P., Keim, R., Cappa, S.F. and Costa, A. (2011) Bilingualism tunes the anterior cingulate cortex for conflict monitoring. Cerebral Cortex 21 (10); doi:10.1093/cercor/bhr287.
Adger, D. (2003) Core Syntax. A Minimalist Approach. Oxford: Oxford University Press.
Aldwayan, S., Fiorentino, R. and Gabriele, A. (2010) Evidence of syntactic constraints in the processing of wh-movements. A study of Najdi Arabic learners of English. In B. VanPatten and J. Jegerski (eds) Research in Second Language Processing and Parsing (pp. 65–86). Amsterdam and Philadelphia, PA: John Benjamins.
Alexiadou, A., Kiss, T. and Müller, G. (eds) (2013) Local Modelling of Non-Local Dependencies in Syntax. Berlin and New York: De Gruyter Mouton.
Andorno, C. and Bernini, G. (2003) Premesse teoriche e metodologiche. In A. Giacalone Ramat (ed.) Verso l’Italiano. Percorsi e Strategie di Acquisizione (pp. 27–36). Rome: Carocci.
Andrew, P. (2012) The Social Construction of Age. Adult Foreign Language Learners. Bristol: Multilingual Matters.
Arnon, I. and Ramscar, M. (2012) Granularity and the acquisition of grammatical gender: How order-of-acquisition affects what gets learned. Cognition 122, 292–305.
Ashby, F.G. and O’Brien, J.B. (2005) Category learning and multiple memory systems. Trends in Cognitive Sciences 9 (2), 83–89.
Baayen, R.H. (2010) Demythologizing the word frequency effect: A discriminative learning perspective. The Mental Lexicon 5, 436–461.
Baayen, R.H., Davidson, D.J. and Bates, D.M. (2008) Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language 59 (4), 390–412.
Babcock, L., Stowe, J., Maloof, C., Brovetto, C. and Ullman, M.T. (2012) The storage and composition of inflected forms in adult-learned second language: A study of the influence of length of residence, age of arrival, sex and other factors. Bilingualism: Language & Cognition 15 (4), 820–840; doi:10.1017/S1366728912000053.
Baggott, J. (2011) The Quantum Story. A History in 40 Moments. Oxford: Oxford University Press.
Bahlmann, J., Gunter, T.C. and Friederici, A.D. (2006) Hierarchical and linear sequence processing: An electrophysiological exploration of two different grammar types. Journal of Cognitive Neuroscience 18 (11), 1829–1842.
Bahlmann, J., Schubotz, R. and Friederici, A.D. (2008) Hierarchical artificial language processing engages Broca’s area. NeuroImage 42 (2), 525–534.
Bannard, C., Lieven, E. and Tomasello, M. (2009) Modeling children’s early grammatical knowledge. Proceedings of the National Academy of Sciences 106 (41), 17284–17289.
Bardovi-Harlig, K. (2006) On the role of formulas in the acquisition of L2 pragmatics. In K. Bardovi-Harlig, C. Félix-Brasdefer and A.S. Omar (eds) Pragmatics and Language Learning (special issue) 11, 1–28.
Bardovi-Harlig, K. (2009) Conventional expressions as a pragmalinguistic resource: Recognition and production of conventional expressions in L2 pragmatics. Language Learning 59 (4), 755–795.
Bartning, I., Forsberg Lundell, F. and Hancock, V. (2012) On the role of linguistic contextual factors for morphosyntactic stabilization in high-level L2 French. Studies in Second Language Acquisition 34, 243–267.
Belletti, A. and Rizzi, L. (2013) Ways of avoiding intervention: Some thoughts on the development of object relatives, passive, and control. In M.P. Palmarini and R.C. Berwick (eds) Rich Languages from Poor Inputs (pp. 115–126). Oxford and New York: Oxford University Press.
Belletti, A., Bennati, E. and Sorace, A. (2007) Theoretical and developmental issues in the syntax of subjects: Evidence from near-native Italian. Natural Language and Linguistic Theory 25, 657–689.
Bentley, D. (2006) Split Intransitivity in Italian. Amsterdam and New York: De Gruyter Mouton.
Berwick, R.C. (1997) Syntax facit saltum: Computation and the genotype and phenotype of language. Journal of Neurolinguistics 10 (2–3), 231–249.
Berwick, R.C. (2011) Syntax facit saltum redux: Biolinguistics and the leap to syntax. In A.M. Di Sciullo and C. Boeckx (eds) The Biolinguistics Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty (pp. 65–99). Oxford and New York: Oxford University Press.
Bettoni, C. and Di Biase, B. (eds) (forthcoming) Processability Theory: Current Issues in Theory and Application. Eurosla Monographs. Amsterdam: John Benjamins.
Bialystok, E. (2010) Global-local and trail-making tasks by monolingual and bilingual children: Beyond inhibition. Developmental Psychology 46 (1), 93–115.
Bialystok, E. (2011) How analysis and control lead to advantages and disadvantages in bilingual processing. In C. Sanz and R.P. Leow (eds) Implicit and Explicit Language Learning. Conditions, Processes and Knowledge in SLA and Bilingualism (pp. 49–58). Washington, DC: Georgetown University Press.
Bialystok, E., Craik, F.I.M., Klein, R. and Viswanathan, M. (2004) Bilingualism, aging and cognitive control: Evidence from the Simon task. Psychology and Aging 1, 290–303.
Bialystok, E., Craik, F.I.M., Grady, C., Chau, W., Ishii, R., Gunji, A. and Pantev, C. (2005) Effect of bilingualism on cognitive control in the Simon task: Evidence from MEG. NeuroImage 24, 40–49.

Birdsong, D. (2005) Interpreting age effects in second language acquisition. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 109–127). Oxford and New York: Oxford University Press.
Birdsong, D. (2006a) Age and L2 acquisition and processing. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 9–49). Malden, MA and Oxford: Blackwell.
Birdsong, D. (2006b) Dominance, proficiency, and second language grammatical processing. Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 46–49.
Birdsong, D. and Flege, J.E. (2001) Regular-irregular dissociations in the acquisition of English as a second language. In A.H.J. Do, L. Dominguez and A. Johansen (eds) BUCLD 25, Proceedings of the 25th Annual Boston University Conference on Language Development (pp. 123–132). Somerville: Cascadilla Press.
Bley-Vroman, R. (1983) The comparative fallacy in interlanguage studies: The case of systematicity. Language Learning 33, 1–17.
Bley-Vroman, R. (1990) The logical problem of foreign language learning. Linguistic Analysis 20, 3–49.
Bley-Vroman, R. (2009) The evolving context of the fundamental difference hypothesis. Studies in Second Language Acquisition 31, 175–198.
Bloch, C., Kaiser, A., Kuenzli, E., Zappatore, D., Haller, S., Franceschini, R., Luedi, G., Radue, E. and Nitsch, C. (2009) The age of second language acquisition determines the variability in activation elicited by narration in three languages in Broca’s and Wernicke’s area. Neuropsychologia 47, 625–633.
Bod, R., Hay, J. and Jannedy, S. (2003) Probabilistic linguistics: Introduction. In R. Bod, J. Hay and S. Jannedy (eds) Probabilistic Linguistics (pp. 1–10). Cambridge, MA and London: MIT Press.
Bonatti, L., Peña, M., Nespor, M. and Mehler, J. (2005) Linguistic constraints on statistical computations: The role of consonants and vowels in continuous speech processing. Psychological Science 16, 451–459.
Bowden, H. (under review) Processing of syntax in college foreign language learners. Bilingualism: Language and Cognition.
Bowden, H., Sanz, C., Steinhauer, K. and Ullman, M.T. (2007) An ERP study of proficiency in second language. Journal of Cognitive Neuroscience (suppl.), 170.
Bowden, H., Gelfand, M.P., Sanz, C. and Ullman, M.T. (2010) Verbal inflectional morphology in L1 and L2 Spanish: A frequency effects study examining storage versus composition. Language Learning 60 (1), 44–87.
Boyd, J.K. and Goldberg, A.E. (2009) Input effects within a constructionist framework. Modern Language Journal 93 (3), 418–429.
Brauer, J., Anwander, A. and Friederici, A.D. (2011) Neuroanatomical prerequisites for language functions in the maturing brain. Cerebral Cortex 21 (2), 459–466.
Brovetto, C. and Ullman, M.T. (2001) First vs. second language: A differential reliance on grammatical computations and lexical memory. Paper presented at the CUNY 2001 Conference on Sentence Processing, Philadelphia, PA.
Buchanan, M. (2011) Differentiating the discontinuous. Nature Physics 7, 589.
Bunt, H. (1996) Discontinuous constituency. Introduction. In H. Bunt and A. van Horck (eds) Discontinuous Constituency (pp. 1–10). Berlin and New York: De Gruyter Mouton.
Bybee, J. (2002) Sequentiality as the basis of constituent structure. In T. Givón and B.F. Malle (eds) The Evolution of Language Out of Pre-language (pp. 109–134). Amsterdam and Philadelphia, PA: John Benjamins.

Bybee, J. (2003) Mechanisms of change in grammaticalization: The role of frequency. In B.D. Joseph and R.D. Janda (eds) Handbook of Historical Linguistics (pp. 602–623). Oxford: Blackwell.
Bybee, J. (2008) Usage-based grammar and second language acquisition. In P. Robinson and N. Ellis (eds) Handbook of Cognitive Linguistics and Second Language Acquisition (pp. 217–236). New York and London: Routledge.
Bybee, J. (2010) Language, Usage and Cognition. Cambridge: Cambridge University Press.
Caldwell-Harris, C., Berant, J. and Edelman, S. (2012) Measuring mental entrenchment of phrases with perceptual identification, familiarity ratings and corpus frequency statistics. In D. Divjak and S.T. Gries (eds) Frequency Effects in Language Representation (pp. 130–165). Amsterdam and Philadelphia, PA: De Gruyter Mouton.
Cameli, L., Phillips, N.A., Kousaie, S. and Panisset, M. (2005) Memory and language in bilingual Alzheimer and Parkinson patients: Insights from verb inflection. In J. Cohen, K.T. McAlister, K. Rolstad and J. MacSwan (eds) ISB4: Proceedings of the 4th International Symposium on Bilingualism (pp. 452–476). Somerville: Cascadilla Press.
Cappa, S. (2012) Imaging semantics and syntax. Neuroimage 61, 427–431.
Chan, A.H.D., Luk, K.K., Li, P., Yip, V., Li, G. and Weekes, B. (2008) Neural correlates of nouns and verbs in early bilinguals. Annals of the New York Academy of Sciences 1145, 30–40.
Chater, N. and Christiansen, M.H. (2010) Language acquisition meets language evolution. Cognitive Science 34, 1131–1157.
Chee, M.W.L., Soon, C.S., Lee, H.L. and Pallier, C. (2004) Left insula activation: A marker for language attainments in bilinguals. Proceedings of the National Academy of Sciences 101 (42), 15265–15270.
Chesi, C. (2012) Competence and Computation: Towards a Processing-friendly Minimalist Grammar. Padova: Unipress.
Chesi, C. (in press) On directionality of phrase structure building. Journal of Psycholinguistic Research.
Chiswick, B.R. and Miller, P.W. (2008) A test of the critical period hypothesis for language learning. Journal of Multilingual and Multicultural Development 29 (1), 16–29.
Chomsky, N. (1995) The Minimalist Program. Cambridge: MIT Press.
Chomsky, N. (2000a) The Architecture of Language. New Delhi: Oxford University Press.
Chomsky, N. (2000b) New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.
Chomsky, N. (2002) On Nature and Language. Cambridge: Cambridge University Press.
Chomsky, N. (2007a) The Biology of the Language Faculty: Its Perfection, Past and Future. Conference held at MIT, 19 October 2007. See http://mitworld.mit.edu/video/517.
Chomsky, N. (2007b) Approaching UG from below. In U. Sauerland and H.M. Gärtner (eds) Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics (pp. 1–30). Berlin: De Gruyter Mouton.
Chomsky, N. (2009) Opening remarks. In M. Piattelli-Palmarini, J. Uriagereka and P. Salaburu (eds) Of Minds and Language. A Dialogue with Noam Chomsky in the Basque Country (pp. 13–43). Oxford: Oxford University Press.
Chomsky, N. (2012) The Science of Language. Interviews with James McGilvray. Cambridge: Cambridge University Press.
Chomsky, N. (2013) Problems of projection. Lingua 130, 33–49. See http://dx.doi.org/10.1016/j.lingua.2012.12.003.
Christiansen, M.H. and Chater, N. (2008) Languages as shaped by the brain. Behavioral and Brain Sciences 31, 489–558.

Christianson, K., Hollingworth, A., Halliwell, J.F. and Ferreira, F. (2001) Thematic roles assigned along the garden path linger. Cognitive Psychology 42, 368–407.
Christianson, K., Williams, C.C., Zacks, R.T. and Ferreira, F. (2006) Younger and older adults’ ‘good enough’ interpretations of garden-path sentences. Discourse Processes 42 (2), 205–238.
Christiansen, M.H., Conway, C.M. and Onnis, L. (2012) Similar neural correlates for language and sequential learning: Evidence from event-related brain potentials. Language and Cognitive Processes 27 (2), 231–256.
Clahsen, H. and Felser, C. (2006a) How native-like is non-native language processing? Trends in Cognitive Sciences 10, 564–570.
Clahsen, H. and Felser, C. (2006b) Grammatical processing in language learners. Applied Psycholinguistics 27, 3–42.
Clahsen, H. and Felser, C. (2006c) Continuity and shallow structures in language processing: A reply to our commentaries. Applied Psycholinguistics 27, 107–126.
Clahsen, H., Felser, C., Neubauer, K. and Sato, M. (2010) Morphological structures in native and non-native language processing. Language Learning 60, 21–43.
Coggins, P.E., Kennedy, T.J. and Armstrong, T.A. (2004) Bilingual corpus callosum variability. Brain and Language 89, 69–75.
Corballis, M.C. (2009) The evolution and genetics of cerebral asymmetry. Philosophical Transactions of the Royal Society B: Biological Sciences 364, 867–879.
Craik, F.I.M., Bialystok, E. and Freedman, M. (2010) Delaying the onset of Alzheimer disease: Bilingualism as a form of cognitive reserve. Neurology 75, 1726–1729.
Cruse, D.A. (1997) Lexical Semantics. Cambridge: Cambridge University Press.
Davidson, D. (2006) Strategies for longitudinal neurophysiology. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 231–234). Malden, MA and Oxford: Blackwell.
Davidson, D. (2010) Short-term grammatical plasticity in adult language learners. Language Learning 60, 109–122.
Davidson, D. and Indefrey, P. (2009a) An ERP study on changes of violation and error responses during morphosyntactic learning. Journal of Cognitive Neuroscience 21, 433–446.
Davidson, D. and Indefrey, P. (2009b) Plasticity of grammatical recursion in German learners of Dutch. Language and Cognitive Processes 24, 1335–1369.
Davidson, D. and Indefrey, P. (2011) Error-related activity and correlates of grammatical plasticity. Frontiers in Psychology 2, 219; doi:10.3389/fpsyg.2011.00219.
de Bot, K. (2006) The plastic bilingual brain: Synaptic pruning or growth? Commentary on Green et al. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 127–132). Malden, MA and Oxford: Blackwell.
de Bot, K. (2008) Review article: The imaging of what in the multilingual mind? Second Language Research 24 (1), 111–133.
de Diego-Balaguer, R. and Lopez-Barroso, D. (2010) Cognitive and neural mechanisms sustaining rule learning from speech. In M. Gullberg and P. Indefrey (eds) The Earliest Stages of Language Learning (pp. 151–187). Malden: Wiley-Blackwell.
de Diego-Balaguer, R., Couette, M., Dolbeau, G., Durr, A., Youssov, K. and Bachoud-Lévi, A.C. (2008) Striatal degeneration impairs language learning: Evidence from Huntington’s disease. Brain 131, 2870–2881.
de Diego-Balaguer, R., Fuentemilla, L. and Rodriguez-Fornells, A. (2011) Brain dynamics sustaining rapid rule extraction from speech. Journal of Cognitive Neuroscience 23 (10), 3105–3120.

DeKeyser, R. (2007) Skill acquisition theory. In B. VanPatten and J. Williams (eds) Theories in Second Language Acquisition (pp. 97–114). Mahwah: Lawrence Erlbaum.
DeKeyser, R. (2009) Cognitive-psychological processes in second language learning. In C. Doughty and M. Long (eds) The Handbook of Language Teaching (pp. 119–137). London: Blackwell.
DeKeyser, R. and Larson-Hall, J. (2005) What does the critical period really mean? In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 88–108). Oxford and New York: Oxford University Press.
Dekydtspotter, L. (2009) Second language epistemology. Studies in Second Language Acquisition 31, 291–321.
Dekydtspotter, L., Schwartz, B.D. and Sprouse, R.A. (2006) The comparative fallacy in L2 processing research. In M.G. O’Brien, C. Shea and J. Archibald (eds) Proceedings of the 8th Generative Approaches to Second Language Acquisition Conference (GASLA 2006) (pp. 33–40). Somerville: Cascadilla Press.
Devine, A.M. and Stephens, L.D. (2000) Discontinuous Syntax: Hyperbaton in Greek. New York: Oxford University Press.
Dewaele, J.-M. (1992) L’omission du ne dans deux styles oraux d’interlangue française. Journal of Applied Linguistics 7, 3–17.
Dewaele, J.-M. and Regan, V. (2000) The use of colloquial words in advanced French interlanguage. Paper presented at Eurosla 2000 Conference, Krakow, September.
Dewaele, J.-M. and Regan, V. (2001) Maîtriser la norme sociolinguistique en interlangue française: le cas de l’omission variable de ne. Journal of French Language Studies 12 (2), 123–148.
Doeller, C.F., Opitz, B., Krick, C.M., Mecklinger, A. and Reith, W. (2006) Differential hippocampal and prefrontal-striatal contributions to instance-based and rule-based learning. NeuroImage 31, 1802–1816.
Dörnyei, Z. (2007) Research Methods in Applied Linguistics. Oxford: Oxford University Press.
Draganski, B. and May, A. (2008) Training-induced structural changes in the adult human brain. Behavioural Brain Research 192, 137–142.
Dubois, J., Dehaene-Lambertz, G., Perrin, M., Mangin, J.F. and Cointepas, Y. (2008) Asynchrony of the early maturation of white matter bundles in healthy infants: Quantitative landmarks revealed noninvasively by diffusion tensor imaging. Human Brain Mapping 29, 14–27.
Duffield, N. (2011) Loose ends? Commentary on Sorace. Linguistic Approaches to Bilingualism 1 (1), 35–38.
Durrant, P. and Schmitt, N. (2010) Adult learners’ retention of collocations from exposure. Second Language Research 26, 163–188.
Edelman, S. (2011) On look-ahead in language: Navigating a multitude of familiar paths. In M. Bar (ed.) Predictions in the Brain. Using Our Past to Generate a Future (pp. 170–189). Oxford and New York: Oxford University Press.
Eichenbaum, H. (2004) Hippocampus: Cognitive processes and neural representations that underlie declarative memory. Neuron 44, 109–120.
Ellis, N. (1996) Sequencing in SLA: Phonological memory, chunking and points of order. Studies in Second Language Acquisition 18, 91–126.
Ellis, N. (2001) Memory for language. In P. Robinson (ed.) Cognition and Second Language Instruction (pp. 33–68). Cambridge: Cambridge University Press.
Ellis, N. (2002) Frequency effects in language processing. Studies in Second Language Acquisition 24, 143–188.

Ellis, N. (2005) At the interface: Dynamic interactions of explicit and implicit language knowledge. Studies in Second Language Acquisition 27, 305–352.
Ellis, N. (2008) The dynamics of second language emergence: Cycles of language use, language change, and language acquisition. Modern Language Journal 92 (2), 232–249.
Ellis, N. (2009) Optimizing the input: Frequency and sampling in usage-based and form-focused learning. In C. Doughty and M. Long (eds) The Handbook of Language Teaching (pp. 139–158). London: Blackwell.
Ellis, N. and Cadierno, T. (2009) Constructing a second language. Introduction to the special section. Annual Review of Cognitive Linguistics 7, 111–139.
Ellis, N. and Collins, L. (2009) Input and second language acquisition: The roles of frequency, form and function. Modern Language Journal 93 (3), 329–335.
Ellis, N. and Ferreira, F. (2009) Construction learning as a function of frequency, frequency distribution, and function. Modern Language Journal 93 (3), 370–385.
Embick, D. and Marantz, A. (2005) Cognitive neuroscience and the English past tense: Comments on the paper by Ullman et al. Brain and Language 93, 243–247.
Erman, B. (2009) Formulaic language from a learner perspective. What the learner needs to know. In R. Corrigan, E.A. Moravcsik, H. Ouali and K.M. Wheatley (eds) Formulaic Language, Vol. 2 (pp. 323–346). Amsterdam and Philadelphia, PA: John Benjamins.
Eskildsen, S.W. and Cadierno, T. (2007) Are recurring multi-word expressions really syntactic freezes? Second language acquisition from the perspective of usage-based linguistics. In M. Nenonen and S. Niemi (eds) Collocations and Idioms 1. Papers from the First Nordic Conference on Syntactic Freezes (pp. 89–99). Joensuu: Joensuu University Press.
Eubank, L. and Gregg, K. (1999) Critical periods and second language acquisition. In D. Birdsong (ed.) Second Language Acquisition and the Critical Period Hypothesis (pp. 65–100). Mahwah: Lawrence Erlbaum.
Fabbro, F. (2001) The bilingual brain: Cerebral representations of languages. Brain and Language 79, 211–222.
Federici, S., Montemagni, S. and Pirrelli, V. (1996) Shallow parsing and text chunking: A view on underspecification in syntax. In J. Carroll (ed.) Proceedings of the Workshop on Robust Parsing, 12–16 August. Prague: ESSLLI.
Fell, J., Fernandez, G., Klaver, P., Axmacher, N., Mormann, F., Haupt, S. and Elger, C.E. (2006) Rhinal-hippocampal coupling during declarative memory formation: Dependence on item characteristics. Neuroscience Letters 407, 37–41.
Felser, C. and Roberts, L. (2007) Processing wh-dependencies in a second language: A cross-modal priming study. Second Language Research 21, 9–36.
Felser, C., Cunnings, I., Batterham, C. and Clahsen, H. (2012) The timing of island effects in nonnative sentence processing. Studies in Second Language Acquisition 34, 67–98.
Ferman, S., Olshtain, E., Schechtmann, E. and Karni, A. (2009) The acquisition of linguistic skills by adults: Procedural and declarative memory interact in the learning of an artificial morphological rule. Journal of Neurolinguistics 22, 384–412.
Ferreira, F. (2003) The misinterpretation of noncanonical sentences. Cognitive Psychology 47, 164–203.
Ferreira, F. and Patson, N.D. (2007) The ‘good enough’ approach to language comprehension. Language and Linguistics Compass 1 (1–2), 71–83.
Ferreira, F., Bailey, K.G.D. and Ferraro, V. (2002) Good enough representations in language processing. Current Directions in Psychological Science 2 (1), 11–15.
Fitch, T. and Friederici, A. (2012) Artificial grammar learning meets formal language theory: An overview. Philosophical Transactions of the Royal Society B: Biological Sciences 367, 1933–1955.

Fitch, T. and Hauser, M. (2004) Computational constraints on syntactic processing in a nonhuman primate. Science 303, 377–380.
Fitch, T., Friederici, A. and Hagoort, P. (2012) Pattern perception and computational complexity: Introduction to the special issue. Philosophical Transactions of the Royal Society B: Biological Sciences 367, 1925–1932.
Frank, M.C. and Tenenbaum, J.B. (2011) Three ideal observer models for rule learning in simple languages. Cognition 120, 360–371.
Franco, A. and Destrebecqz, A. (2012) Chunking or not chunking? How do we find words in artificial language learning? Advances in Cognitive Psychology 8 (2), 144–154.
Franco, A., Cleeremans, A. and Destrebecqz, A. (2011) Statistical learning of two artificial languages presented successively: how conscious? Frontiers in Psychology 11 (2), 1–12.
Frenck-Mestre, C. (2006) Commentary on Clahsen and Felser. Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 64–65.
Frenck-Mestre, C., Anton, J.L., Roth, M., Vaid, J. and Viallet, F. (2005) Articulation in early and late bilinguals’ two languages: Evidence from functional magnetic resonance imaging. Brain Imaging 16 (7), 761–765.
Friederici, A.D. (2002) Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences 6, 78–84.
Friederici, A.D. (2009) The brain differentiates hierarchical and probabilistic grammars. In M. Piattelli-Palmarini, J. Uriagereka and P. Salaburu (eds) Of Minds and Language. A Dialogue with Noam Chomsky in the Basque Country. Oxford: Oxford University Press.
Friederici, A. (2011) Towards a neural basis of auditory sentence processing. Trends in Cognitive Sciences 6, 78–84.
Friederici, A.D., Steinhauer, K. and Pfeifer, E. (2002) Brain signatures of artificial language processing: Evidence challenging the critical period hypothesis. Proceedings of the National Academy of Sciences 99, 529–534.
Friederici, A.D., Bahlmann, J., Heim, S., Schubotz, R.I. and Anwander, A. (2006) The brain differentiates human and non-human grammars: Functional localization and structural connectivity. Proceedings of the National Academy of Sciences 103 (7), 2458–2463.
Friederici, A.D., Brauer, J. and Lohmann, G. (2011) Maturation of the language network: From inter- to intrahemispheric connectivities. PLoS One 6 (6), 1–7.
Friederici, A., Oberecker, R. and Brauer, J. (2012) Neurophysiological preconditions of syntax acquisition. Psychological Research 76, 204–211.
Friedmann, N., Belletti, A. and Rizzi, L. (2009) Relativized relatives. Types of intervention in the acquisition of A-bar dependencies. Lingua 119, 67–88.
Gabriele, A. and Canales, A. (2011) No time like the present: Examining transfer at the interfaces in second language acquisition. Lingua 121, 670–687.
Giacalone Ramat, A. (ed.) (2003) Verso l’italiano. Percorsi e strategie di acquisizione. Rome: Carocci.
Giacalone Ramat, A. and Rastelli, S. (2013) The qualitative analysis of actionality in learner language. In L. Comajoan and R. Salaberry (eds) Research Design and Methodology in Studies in L2 Tense and Aspect (pp. 391–422). New York and Berlin: De Gruyter Mouton.
Gillon Dowens, M. and Carreiras, M. (2006) The shallow structure hypothesis of second language sentence processing: What is restricted and why? Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 49–52.

Goldberg, A. (2003) Constructions: A new theoretical approach to language. Trends in Cognitive Sciences 7 (5), 219–223.
Goldberg, A. (2006) Constructions at Work. Oxford: Oxford University Press.
Goldberg, A. (2008) Universal grammar? Or prerequisites for natural languages. Behavioral and Brain Sciences 31, 522–523.
Gomez, R.L. (2002) Variability and detection of invariant structure. Psychological Science 13 (5), 431–436.
Green, D.W. (2003) The neural basis of the lexicon and the grammar in L2 acquisition. In R. van Hout, A. Hulk, F. Kuiken and R. Towell (eds) The Interface Between Syntax and the Lexicon in Second Language Acquisition. Amsterdam: John Benjamins.
Green, D.W., Crinion, J. and Price, C.J. (2006) Convergence, degeneracy and control. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 99–125). Malden, MA and Oxford: Blackwell.
Grodzinsky, Y. (2000) The neurology of syntax: Language use without Broca’s area. Behavioral and Brain Sciences 23, 1–71.
Grodzinsky, Y. (2003) Imaging the grammatical brain. In M. Arbib (ed.) Handbook of Brain Theory and Neural Networks (pp. 551–556). Boston: MIT Press.
Grodzinsky, Y. (2005) Syntactic dependencies in memorized sequences in the brain. Canadian Journal of Linguistics 50, 241–266.
Grodzinsky, Y. and Santi, A. (2008) The battle for Broca’s region. Trends in Cognitive Sciences 12 (12), 447–488.
Gullberg, M., Roberts, L. and Dimroth, C. (2012) What word-level knowledge adults and children can acquire after minimal exposure to a new language. International Review of Applied Linguistics in Language Teaching 50 (4), 239–276.
Gullberg, M., Roberts, L., Dimroth, C., Veroude, K. and Indefrey, P. (2010) Adult language learning after minimal exposure to an unknown natural language. In M. Gullberg and P. Indefrey (eds) The Earliest Stages of Language Learning (pp. 5–24). Malden: Wiley-Blackwell.
Hagoort, P. (2005) Broca’s complex as the unification space for language. In A. Cutler (ed.) Twenty-first Century Psycholinguistics: Four Cornerstones (pp. 157–172). London: Lawrence Erlbaum.
Hagoort, P. (2007) The memory, unification, and control (MUC) model of language. In T. Sakamoto (ed.) Communicating Skills of Intention (pp. 259–291). Tokyo: Hituzi Syobo.
Hakuta, K., Bialystok, E. and Wiley, E. (2003) Critical evidence: A test of the critical-period hypothesis for second language acquisition. Psychological Science 14, 31–38.
Hall, T. (2010) L2 learner-made formulaic expressions and constructions. Columbia University Working Papers in TESOL and Applied Linguistics 10 (2), 1–18.
Halle, M. and Marantz, A. (1993) Distributed morphology and the pieces of inflection. In S.J. Keyser and K. Hale (eds) The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger (pp. 110–176). Cambridge, MA and London: MIT Press.
Hartshorne, J.K. and Ullman, M.T. (2007) Why girls say ‘holded’ more than boys. Developmental Science 9 (1), 21–32.
Haspelmath, M. (1999) Optimality and diachronic adaptation. Zeitschrift für Sprachwissenschaft 18 (2), 180–205.
Hastie, T., Tibshirani, R. and Friedman, J. (2009) The Elements of Statistical Learning. New York: Springer.
Hauptmann, B. and Karni, A. (2002) From primed to learn: The saturation of repetition priming and the induction of long-term memory. Brain Research 13, 313–322.

Hauser, M.D., Chomsky, N. and Fitch, T. (2002) The faculty of language: What it is, who has it, and how did it evolve? Science 298, 1569–1579.
Havik, E., Roberts, L., van Hout, R., Schreuder, R. and Haverkort, M. (2009) Processing subject-object ambiguities in the L2: A self-paced reading study with German L2 learners of Dutch. Language Learning 59, 73–112.
Hawkins, R. (2001) Second Language Syntax. London: Blackwell.
Hawkins, R. (2005) Revisiting wh-movement: The availability of an uninterpretable [wh] feature in interlanguage grammars. In L. Dekydtspotter (ed.) Proceedings of the 7th Generative Approaches to Second Language Acquisition Conference (GASLA) (pp. 124–137). Somerville: Cascadilla Press.
Hawkins, R. and Hattori, H. (2006) Interpretation of English multiple wh-questions by Japanese speakers: A missing uninterpretable feature account. Second Language Research 22, 269–301.
Hedenius, M., Persson, J., Tremblay, A., et al. (2011) Grammar predicts procedural learning and consolidation deficits in children with specific language impairment. Research in Developmental Disabilities 32, 2362–2375.
Hellige, J.B. (2008) Interhemispheric interaction in the lateralized brain. In B. Stemmer and H.A. Whitaker (eds) Handbook of the Neuroscience of Language (pp. 257–266). Amsterdam: Elsevier.
Hernández, A.E., Fernández, E.M. and Aznar-Besé, N. (2007) Bilingual sentence processing. In M.G. Gaskell (ed.) The Oxford Handbook of Psycholinguistics (pp. 371–384). Oxford: Oxford University Press.
Herschensohn, J. (2007) Language Development and Age. New York: Cambridge University Press.
Herschensohn, J. (2009) Fundamental and gradient differences in language development. Studies in Second Language Acquisition 31, 259–289.
Hiscock, M. and Kinsbourne, M. (2008) Lateralization of language across the life span. In B. Stemmer and H.A. Whitaker (eds) Handbook of the Neuroscience of Language (pp. 247–255). Amsterdam: Elsevier.
Hochmann, J.R., Azadpour, M. and Mehler, J. (2008) Do humans really learn AⁿBⁿ artificial grammars from exemplars? Cognitive Science 32, 1021–1036.
Hofmeister, P. and Sag, I. (2010) Cognitive constraints and island effects. Language 86, 366–415.
Hornstein, N. (2009) A Theory of Syntax. Minimal Operations and Universal Grammar. Cambridge: Cambridge University Press.
Hornstein, N. and Pietroski, P. (2009) Basic operations: Minimal syntax-semantics. Catalan Journal of Linguistics 8, 113–139.
Hull, R. and Vaid, J. (2005) Clearing the cobwebs from the study of the bilingual brain: Converging evidence from laterality and electrophysiological research. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 480–496). Oxford and New York: Oxford University Press.
Hummel, J.E. (2010) Symbolic versus associative learning. Cognitive Science 34, 958–965.
Hyltenstam, K. and Abrahamsson, N. (2003) Maturational constraints in SLA. In C. Doughty and M. Long (eds) The Handbook of Second Language Acquisition (pp. 539–588). Malden, MA and Oxford: Blackwell.
Indefrey, P. (2006) A meta-analysis of hemodynamic studies on first and second language processing: Which suggested differences can we trust and what do they mean? In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 279–304). Malden, MA and Oxford: Blackwell.

Isel, F., Baumgaertner, A., Thrän, J., Meisel, J.M. and Büchel, C. (2010) Neural circuitry of the bilingual mental lexicon: Effects of age of second language acquisition. Brain and Cognition 72, 169–180.
Ivanov, I. (2012) L2 acquisition of Bulgarian clitic doubling: A test case for the interface hypothesis. Second Language Research 28 (3), 345–368.
Jackendoff, R. (1992) Languages of the Mind. Cambridge: MIT Press.
Jackendoff, R. (2002a) Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.
Jackendoff, R. (2002b) What’s in the lexicon? In S. Nooteboom, F. Weerman and F. Wijnen (eds) Storage and Computation in the Language Faculty (pp. 23–58). Dordrecht: Kluwer Academic Publishers.
Jackendoff, R. and Lerdahl, F. (2006) The capacity for music: What is it, and what’s special about it? Cognition 100, 33–72.
Jackson, C.N. and Bobb, S.C. (2009) The processing and comprehension of wh-questions among second language speakers of German. Applied Psycholinguistics 30, 603–636.
Jäger, G. and Rogers, J. (2012) Formal language theory: Refining the Chomsky hierarchy. Philosophical Transactions of the Royal Society B: Biological Sciences 367, 1956–1970.
Jiang, N. (2012) Conducting Reaction Time Research in Second Language Studies. New York and London: Routledge.
Jordens, P. (1997) Introducing the basic variety. Second Language Research 13 (4), 289–300.
Josse, G., Kherif, F., Flandin, G., Seghier, M.L. and Price, C.J. (2011) Predicting language lateralization from gray matter. Journal of Neuroscience 29 (43), 13516–13523.
Juffs, A. (2006) Grammar and parsing and a transition theory. Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 69–71.
Kaplan, R.M. (2003) Syntax. In R. Mitkov (ed.) The Oxford Handbook of Computational Linguistics (pp. 70–90). Oxford: Oxford University Press.
Kariaeva, N. (2009) Radical discontinuity: Syntax at the interface. Unpublished PhD dissertation, Rutgers, State University of New Jersey.
Kaufman, S.B., DeYoung, C.G., Gray, J.R., Jiménez, L., Brown, J. and Mackintosh, N. (2010) Implicit learning as an ability. Cognition 116, 321–340.
Kidd, E. and Kirjavainen, M. (2011) Investigating the contribution of procedural and declarative memory to the acquisition of past tense morphology: Evidence from Finnish. Language and Cognitive Processes 26 (4–6), 794–829.
Kikuchi, M., Shitamichi, K., Yoshimure, Y., et al. (2011) Lateralized theta wave connectivity and language performance in 2- to 5-year-old children. Journal of Neuroscience 31 (42), 14984–14988.
Kim, B. and Goodall, G. (2011) Age-related effects on constraints on wh-movement. In J. Herschensohn and D. Tanner (eds) Proceedings of the 11th Generative Approaches to Second Language Acquisition Conference (GASLA 2011) (pp. 54–62). Somerville: Cascadilla Press.
Kim, R., Seitz, A., Feenstra, H. and Shams, L. (2009) Testing assumptions on statistical learning: Is it long-term and implicit? Neuroscience Letters 461, 145–149.
Klein, W. and Perdue, C. (1992) Utterance structure. In C. Perdue (ed.) Adult Language Acquisition: Cross-linguistic Perspectives, Vol. 2: The Results. Cambridge: Cambridge University Press.
Klein, W. and Perdue, C. (1997) The basic variety (or: Couldn’t natural languages be much simpler?). Second Language Research 13 (4), 301–347.
Knudsen, E.I. (2004) Sensitive periods in the development of the brain and behavior. Journal of Cognitive Neuroscience 16 (8), 1412–1425.

Kotz, S.A. (2009) A critical review of ERP and fMRI evidence on L2 syntactic processing. Brain and Language 109, 68–74. Kotz, S.A., Holcomb, P.J. and Osterhout, L. (2007) ERPs reveal comparable syntactic sentence processing in native and non-native readers of English. Acta Psychologica 128, 514–527. Kousaie, S. and Phillips, N.A. (2012) Conflict monitoring and resolution: Are two languages better than one? Evidence from reaction time and event-related brain potentials. Brain Research 1146, 71–90. Kroll, J.F. and Tokowicz, N. (2005) Models of bilingual representation and processing. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 531–553). Oxford and New York: Oxford University Press. Kumar, M. (2008) Quantum. Einstein, Bohr and the Great Debate About the Nature of Reality. New York: W.W. Norton. Kuperberg, G.R. (2007) Neural mechanisms of language comprehension: Challenge to syntax. Brain Research 146, 23–49. Kutas, M., DeLong, K.A. and Smith, N.J. (2011) A look around at what lies ahead: Prediction and predictability in language processing. In M. Bar (ed.) Predictions in the Brain: Using Our Past to Generate a Future (pp. 190–207). Oxford and New York: Oxford University Press. Kweon, S. and Bley-Vroman, R. (2011) Acquisition of the constraints on wanna contraction by advanced second language learners: Universal grammar and imperfect knowledge. Second Language Research 27 (2), 207–228. Lakshmanan, U. and Selinker, L. (2001) Analysing interlanguage: How do we know what learners know? Second Language Research 17, 393–420. Lamers, L.J.A. (2006) Cracking the nutshell differently: Commentary on Mueller. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 271–277). Malden, MA and Oxford: Blackwell. Langendoen, D.T. (2003) Finite state languages. In W.J. Frawley (ed.) Oxford International Encyclopedia of Linguistics, Vol. 3 (pp. 26–28). Oxford: Oxford University Press. Lardiere, D. 
(2011) Who is the interface hypothesis about? Linguistic Approaches to Bilingualism 1 (1), 48–53. Lee, D. and Schachter, J. (1997) Sensitive period effects in binding theory. Language Acquisition 6, 333–362. Lenci, A., Montemagni, S. and Pirrelli, V. (2001) Chunk-it. An Italian shallow parser for robust syntactic annotation. In A. Lenci, S. Montemagni and V. Pirrelli (eds) Linguistica computazionale. Pisa and Rome: Istituti editoriali e poligrafici internazionali. Lenet, A.E., Sanz, C., Lado, B., Howard Jr, J.H. and Howard, D.V. (2011) Aging, pedagogical conditions and differential success in SLA: An empirical study. In C. Sanz and R.P. Leow (eds) Implicit and Explicit Language Learning. Conditions, Processes and Knowledge in SLA and Bilingualism (pp. 73–84). Washington, DC: Georgetown University Press. Lenneberg, E. (1967) Biological Foundations of Language. New York: John Wiley. Leonard, M.K., Torres, C., Travis, K.E., Brown, T., Hagler, D.J., Dale, A.M., Elman, J.L. and Halgren, E. (2011) Language proficiency modulates the recruitment of nonclassical language areas in bilinguals. PLoS One 6 (3), 1–10. Libben, G. (2006) How do language learners comprehend and produce language in real time? Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 46–49. Liceras, J.M., Zobl, H. and Goodluck, H. (2010) Formal features in linguistic theory and learnability: The view from second language acquisition. In J.M. Liceras, H. Zobl and

H. Goodluck (eds) The Role of Formal Features in Second Language Acquisition (pp. 1–21). Mahwah: Lawrence Erlbaum. Lillo-Martin, D. and de Quadros, R.M. (2011) Acquisition at the syntax-discourse interface: The expression of point of view. Lingua 121, 623–636. Logan, G. (1992) Shape of reaction time distributions and shapes of learning curves: A test of the instance theory of automaticity. Journal of Experimental Psychology: Learning, Memory and Cognition 18, 883–914. Long, M.H. (2007) Problems in SLA. Mahwah, NJ: Lawrence Erlbaum Associates. Longworth, C.E., Keenan, S.E., Barker, R.A., Marslen-Wilson, W.D. and Tyler, L.K. (2005) The basal ganglia and rule-governed language-use: Evidence from vascular and degenerative conditions. Brain 128, 584–596. López, L. (2009) A Derivational Syntax for Information Structure. Oxford: Oxford University Press. Lucas, T.H., McKhann, G.M. and Ojemann, G.A. (2004) Functional separation of languages in the bilingual brain: A comparison of electrical stimulation language mapping in 25 bilingual patients and 117 monolingual control patients. Journal of Neurosurgery 101, 449–457. Luck, S.J. (2005) An introduction to the event-related potential technique. Boston: MIT Press. Ludden, D. and Gupta, P. (2000) Zen in the art of language acquisition. Statistical learning and the less is more hypothesis. In L. Gleitman and A.K. Joshi (eds) Proceedings of the 22nd Annual Conference of the Cognitive Science Society (pp. 812–817). Mahwah: Lawrence Erlbaum. Luders, E., Thompson, P.M. and Toga, A.W. (2010) The development of the corpus callosum in the healthy human brain. Journal of Neuroscience 30 (33), 10985–10990. Luk, G., De Sa, E. and Bialystok, E. (2011) Is there a relation between onset age of bilingualism and enhancement in cognitive control? Bilingualism: Language and Cognition 14 (4), 588–595. Lum, J.A.G., Conti-Ramsden, G., Page, D. and Ullman, M.T. (2011) Working, declarative and procedural memory in specific language impairment. 
Cortex 48 (9), 1138–1154. MacWhinney, B. (2005a) A unified model of language acquisition. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 49–67). Oxford and New York: Oxford University Press. MacWhinney, B. (2005b) Commentary on Ullman et al. Brain and Language 93, 239–242. Magnusson, E.J. and Stroud, C. (2012) High proficiency in markets of performance. Studies in Second Language Acquisition 34, 321–345. Malec, W. (2010) On the asymmetry of verb-noun collocations. In J. Arabski and A. Wojtaszek (eds) Neurolinguistic and Psycholinguistic Perspectives on SLA (pp. 126–143). Bristol: Multilingual Matters. Manning, C.D. (2003) Probabilistic syntax. In R. Bod, J. Hay and S. Jannedy (eds) Probabilistic Linguistics (pp. 289–341). Cambridge, MA and London: MIT Press. Marinis, T., Roberts, L., Felser, C. and Clahsen, H. (2005) Gaps in second language sentence processing. Studies in Second Language Acquisition 27, 53–78. Martin-Rhee, M.M. and Bialystok, E. (2008) The development of two types of inhibitory control in monolingual and bilingual children. Bilingualism: Language and Cognition 11 (1), 81–93. Martín-Vide, C. (2003) Formal grammars and languages. In R. Mitkov (ed.) The Oxford Handbook of Computational Linguistics (pp. 157–177). Oxford: Oxford University Press.

May, A. (2011) Experience-dependent structural plasticity in the adult human brain. Trends in Cognitive Sciences 15 (10), 475–482. McClelland, J.L. and Patterson, K. (2002) Rules or connections in past-tense inflections: What does the evidence rule out? Trends in Cognitive Sciences 6 (11), 465–472. McLaughlin, B. (1990) Restructuring. Applied Linguistics 11, 113–128. McLaughlin, B. and Heredia, R. (1996) Information processing approaches to research on second language acquisition and use. In W. Ritchie and T. Bathia (eds) Handbook of Second Language Acquisition (pp. 213–228). San Diego: Academic Press. McLaughlin, J., Osterhout, L. and Kim, A. (2004) Neural correlates of second-language word learning: Minimal instruction produces rapid change. Nature Neuroscience 7 (7), 703–704. McLaughlin, J., Tanner, D., Pitkänen, I., Frenck-Mestre, C., Inoue, K., Valentine, G. and Osterhout, L. (2010) Brain potentials reveal discrete stages of L2 grammatical learning. Language Learning 60, 123–150. McNealy, K., Mazziotta, J.C. and Dapretto, M. (2010) The neural basis of speech parsing in children and adults. Developmental Science 13 (2), 385–406. McNealy, K., Mazziotta, J.C. and Dapretto, M. (2011) Age and experience shape developmental changes in the neural basis of language related learning. Developmental Science 14 (6), 1261–1282. Mechelli, A., Crinion, J.T., Noppeney, U., O’Doherty, J., Ashburner, J. and Frackowiak, R.S (2004) Structural plasticity in the bilingual brain: Proficiency in a second language and age at acquisition affect grey-matter density. Nature 431, 757. Mechelli, A., Price, C.J., Friston, K.J. and Ashburner, J. (2005) Voxel-based morphometry of the human brain: Method and applications. Current Medical Images Review 1, 1–9. Mehler, J., Peña, M., Nespor, M. and Bonatti, L. (2006) The soul of the language does not use statistics: Reflections on vowels and consonants. Cortex 42, 846–854. Meisel, J. (2009) Second language acquisition in early childhood. 
Zeitschrift Für Sprachwissenschaft 28, 5–34. Mintz, T.H. (2002) Category induction from distributional cues in an artificial language. Memory and Cognition 30, 678–686. Mintz, T.H. (2003) Frequent frames as a cue for grammatical categories in child directed speech. Cognition 90, 91–117. Misyak, J.B. and Christiansen, M.H. (2012) Statistical learning and language: An individual differences study. Language Learning 62 (1), 302–331. Misyak, J.B., Christiansen, M.H. and Tomblin, J.B. (2010) Sequential expectations: The role of prediction-based learning in language. Topics in Cognitive Science 2, 138–153. Mitchell, R. and Myles, F. (1998) Second Language Learning Theories. London: Arnold. Montrul, S. (2009) Reexamining the fundamental difference hypothesis: What can early bilinguals tell us? Studies in Second Language Acquisition 31, 225–257. Montrul, S. (2011) Multiple interfaces and complete acquisition. Lingua 121, 591–604. Morgan-Short, K. and Ullman, M.T. (2011) The neurocognition of second language. In A. Mackey and S. Gass (eds) The Routledge Handbook of Second Language Acquisition (pp. 282–299). London: Routledge. Morgan-Short, K., Sanz, C., Steinhauer, K. and Ullman, M.T. (2010) Second language acquisition of gender agreement in explicit and implicit training conditions: An event-related potential study. Language Learning 60 (1), 154–193. Morgan-Short, K., Steinhauer, K., Sanz, C. and Ullman, M.T. (in press) Explicit and implicit second language training differentially affect the achievement of native-like brain activation patterns. Journal of Cognitive Neuroscience 24 (4), 933–947.

Mueller, J.L. (2005) Electrophysiological correlates of second language processing. Second Language Research 21 (2), 152–172. Mueller, J.L. (2006) L2 in a nutshell: The investigation of second language processing in the miniature language model. Language Learning 56, 235–270. Mueller, J.L., Hahne, A., Fujii, Y. and Friederici, A.D. (2005) Native and nonnative speakers’ processing of a miniature version of Japanese as revealed by ERPs. Journal of Cognitive Neuroscience 17, 1229–1244. Mueller, J.L., Oberecker, R. and Friederici, A. (2009) Syntactic learning by mere exposure – an ERP study in adult learners. BMC Neuroscience 10 (89); doi:10.1186/1471-2202-10-89. Myles, F. (2004) From data to theory: The over-representation of linguistic knowledge in SLA. Transactions of the Philological Society 102 (2), 139–168. Myles, F., Hooper, J. and Mitchell, R. (1998) Rote or rule? Exploring the role of formulaic language in classroom second language learning. Language Learning 48 (3), 323–363. Nespor, M., Peña, M. and Mehler, J. (2003) On the different roles of vowels and consonants in language processing and language acquisition. Lingue e Linguaggio 2, 221–247. Neville, H. and Sur, M. (2009) Plasticity: Introduction. In M.S. Gazzaniga (ed.) The New Cognitive Neuroscience (4th edn) (pp. 89–90). Cambridge: MIT Press. Newman, A.J., Ullman, M.T., Pancheva, R., Waligura, D.L. and Neville, H.J. (2007) An ERP study of regular and irregular English past tense inflection. NeuroImage 34, 435–445. Newport, E.L. (1990) Maturational constraints on language learning. Cognitive Science 14, 11–28. Newport, E.L. (2002) Critical period in language development. In L. Nadel (ed.) Encyclopedia of Cognitive Science (pp. 737–740). London: Macmillan/Nature Publishing Group. Newport, E.L. (2011) The modularity issue in language acquisition: A rapprochement? Comments on Gallistel and Chomsky. Language Learning and Development 7 (4), 279–286. Newport, E.L. and Aslin, R.N. 
(2000) Innately constrained learning: Blending old and new approaches to language acquisition. In S.C. Hoell, S.A. Fish and T. Keith-Lucas (eds) Proceedings of the 24th Annual Boston University Conference on Language Development (pp. 1–21). Somerville: Cascadilla Press. Newport, E.L. and Aslin, R.N. (2004) Learning at a distance. Statistical learning of nonadjacent dependencies. Cognitive Psychology 48, 127–162. Nooteboom, C., Weerman, F. and Wijnen, F. (eds) (2002) Storage and Computation in the Language Faculty. Dordrecht: Kluwer Academic Publishers. O’Grady, W. (2008) Language without grammar. In P. Robinson and N. Ellis (eds) Handbook of Cognitive Linguistics and Second Language Acquisition (pp. 139–167). New York and London: Routledge. Ohlrogge, A. (2009) Formulaic expressions in intermediate EFL writing assessment. In R. Corrigan, E.A. Moravcsik, H. Ouali and K.M. Wheatly (eds) Formulaic Language, Vol. 2 (pp. 375–386). Amsterdam and Philadelphia, PA: John Benjamins. Ojemann, S.G., Berger, M.S., Lettich, E. and Ojemann, G.A. (2003) Localization of language functions in children: Results of electrical stimulation mapping. Journal of Neurosurgery 98, 465–470. Onnis, L. (2012) The potential contribution of statistical learning to second language acquisition. In J. Williams and P. Rebuschat (eds) Statistical Learning and Second Language Acquisition. Berlin: De Gruyter Mouton.

Onnis, L., Waterfall, H.R. and Edelman, S. (2008a) Learn locally, act globally: Learning language from variation set cues. Cognition 109 (3), 423–430. Onnis, L., Waterfall, H. and Edelman, S. (2008b) Variation sets facilitate artificial language learning. Proceedings of the 30th Annual Meeting of the Cognitive Science Society. See http://kybele.psych.cornell.edu/~edelman/OnnisWaterfallEdelman-variation-setsCogSci08.pdf. Opitz, B. and Friederici, A. (2004) Brain correlates of language learning: The neural dissociation of rule-based versus similarity-based learning. Journal of Neuroscience 24 (39), 8436–8440. Opitz, B. and Kotz, S. (2012) Ventral premotor cortex lesions disrupt learning of sequential grammatical structures. Cortex 48 (6), 664–673. Ortega, L. (2011) Sequences and processes in language learning. In C. Doughty and M. Long (eds) The Handbook of Language Teaching. Malden: Wiley-Blackwell. Osterhout, L., McLaughlin, J., Kim, A., Greenwald, R. and Inoue, K. (2004) Sentences in the brain: Event-related potentials as real-time reflections of sentence comprehension and language learning. In M. Carreiras and C. Clifton (eds) The On-line Study of Sentence Comprehension (pp. 271–308). New York: Psychology Press. Osterhout, L., McLaughlin, J., Pitkanen, I., Frenck-Mestre, C. and Molinaro, N. (2006) Novice learners, longitudinal designs and event-related potentials: A means for exploring the neurocognition of second language processing. In P. Indefrey and M. Gullberg (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 199–230). Malden, MA and Oxford: Blackwell. Osterhout, L., Poliakov, A., Inoue, K., McLaughlin, J., Valentine, G., Pitkanen, I., Frenck-Mestre, C. and Herschensohn, J. (2008) Second-language learning and changes in the brain. Journal of Neurolinguistics 21, 509–521. Osterhout, L., Kim, A. and Kuperberg, G.R. (2012) The neurobiology of sentence comprehension. In M. Spivey, M. Joannisse and K.
McCrae (eds) The Cambridge Handbook of Psycholinguistics (pp. 365–389). Cambridge: Cambridge University Press. Oyama, S. (1976) A sensitive period for the acquisition of a nonnative phonological system. Journal of Psycholinguistic Research 5, 261–285. Pacton, S. and Perruchet, P. (2008) An attention-based associative account of adjacent and nonadjacent dependency learning. Journal of Experimental Psychology 34 (1), 80–96. Pakulak, E. and Neville, E.J. (2011) Maturation constraints on the recruitment of early processes for syntactic processing. Journal of Cognitive Neuroscience 23 (10), 2752–2765. Paradis, M. (2002) Neurolinguistics of bilingualism and the teaching of languages. Colloquium on the Multimodality of Human Communication: Theories, Problems and Applications. University of Toronto. See http://www.semioticon.com. Paradis, M. (2004) A Neurolinguistic Theory of Bilingualism. Amsterdam and Philadelphia, PA: John Benjamins. Paradis, M. (2005) Introduction to Part IV: Aspects and implications of bilingualism. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism: Psycholinguistic Approaches (pp. 411–415). Oxford and New York: Oxford University Press. Paradis, M. (2008) Language and communication disorders in multilinguals. In B. Stemmer and H.A. Whitaker (eds) Handbook of the Neuroscience of Language (pp. 341– 349). Amsterdam: Elsevier. Paradis, M. (2009) Declarative and Procedural Determinants of Second Languages. Amsterdam and Philadelphia, PA: John Benjamins.

Paradis, M. (2013) Late-L2 increased reliance on L1 neurocognitive substrates: A comment on Babcock, Stowe, Maloof, Brovetto & Ullman (2012). Bilingualism: Language and Cognition 16 (3), 704–707. Park, D.C., Lautenschlager, G., Hedden, T., Davidson, N., Smith, A.D. and Smith, P. (2002) Models of visuospatial and verbal memory across the adult life span. Psychology and Aging 16, 299–320. Patel, A.D. (2012) Language, music, and the brain: A resource-sharing framework. In P. Rebuschat, M. Rohrmeier, J. Hawkins and I. Cross (eds) Language and Music as Cognitive Systems. Oxford: Oxford University Press. Patel, A.D., Gibson, E., Ratner, J., Besson, M. and Holcomb, P.J. (1998) Processing syntactic relations in language and music: An event-related potential study. Journal of Cognitive Neuroscience 10, 717–733. Peña, M., Bonatti, L., Nespor, M. and Mehler, J. (2002) Signal-driven computations in speech processing. Science 298, 604–607. Penfield, W. and Roberts, L. (1959) Speech and Brain Mechanism. Princeton: Princeton University Press. Perani, D. (2005) The neural basis of language talent in bilinguals. Trends in Cognitive Science 9 (5), 211–213. Perani, D. and Abutalebi, J. (2005) The neural basis of first and second language processing. Current Opinion in Neurobiology 15, 202–205. Perani, D., Abutalebi, J., Paulesu, E., Brambati, S., Scifo, B., Cappa, S. and Fazio, F. (2003) The role of age of acquisition and language usage in early, high-proficient bilinguals: An fMRI study during verbal fluency. Human Brain Mapping 19, 70–82. Pérez-Leroux, A.T. (2011) What I don’t understand about interfaces. Linguistic Approaches to Bilingualism 1 (1), 71–73. Perruchet, P. and Pacton, S. (2006) Implicit learning and statistical learning: One phenomenon, two approaches. Trends in Cognitive Sciences 10 (5), 233–238. Perruchet, P and Tillman, T. (2010) Exploiting multiple sources of information in learning an artificial language: Human data and modeling. 
Cognitive Science 34 (2), 255–285. Phillips, C. (1996) Order and structure. PhD thesis, Massachusetts Institute of Technology, Boston, MA. Phillips, C. (2003) Linear order and constituency. Linguistic Inquiry 34 (1), 37–90. Phillips, C. and Sakai, K.L. (2005) Language and the brain. McGraw-Hill Yearbook of Science and Technology (pp. 166–169). New York: McGraw-Hill. Pienemann, M. (1998) Language Processing and Second Language Development: Processability Theory. Amsterdam: John Benjamins. Pienemann, M. (2007) Processability theory. In B. VanPatten and J. Williams (eds) Theories in Second Language Acquisition (pp. 137–154). Mahwah: Lawrence Erlbaum. Pienemann, M., Di Biase, B. and Kawaguchi, S. (2005) Extending processability theory. In M. Pienemann (ed.) Cross-linguistic Aspects of Processability Theory (pp. 199–251). Amsterdam: John Benjamins. Pinker, S. (1998) Words and rules. Lingua 106, 219–242. Pinker, S. (1999) Words and Rules. The Ingredients of Language. New York: Basic Books. Pinker, S. and Ullman, M.T. (2002a) The past and future of the past tense. Trends in Cognitive Sciences 6 (11), 456–463. Pinker, S. and Ullman, M.T. (2002b) Combination and structure, not gradedness, is the issue. Trends in Cognitive Sciences 6 (11), 472–474. Pires, A. and Rothman, J. (2011) An integrated perspective on comparative bilingual differences. Linguistic Approaches to Bilingualism 1 (1), 74–78.

Pliatsikas, C. and Marinis, T. (2012) Processing empty categories in second language: When naturalist exposure fills the (intermediate) gap. Bilingualism: Language & Cognition 16 (1), 167–182; doi:10.1017/S136672891200017X. Poldrack, R.A. and Foerde, K. (2008) Category learning and the memory systems debate. Neuroscience and Biobehavioral Reviews 32, 197–205. Poldrack, R.A. and Kéri, S. (2008) The cognitive neuroscience of category learning. Neuroscience and Biobehavioral Reviews 32, 193–196. Poldrack, R.A. and Packard, M.G. (2003) Competition among multiple memory systems: Converging evidence from animal and human brain studies. Neuropsychologia 1497, 1–7. Poldrack, R.A. and Rodriguez, P. (2004) How do memory systems interact? Evidence from human classification learning. Neurobiology of Learning and Memory 82, 324–332. Poldrack, R.A., Clarck, J., Pare-Blagoev, E.J., Shohamy, D., Moyano, J.C., Myers, C. and Gluck, M.A. (2001) Interactive memory systems in the human brain. Nature 414, 245–251. Porter, J.N., Collins, P.F., Muetzel, R.L., Lim, K.O. and Luciana, M. (2011) Associations between cortical thickness and verbal fluency in childhood, adolescence and young adulthood. NeuroImage 55, 1865–1877. Prado, E.L. and Ullman, M.T. (2009) Can imageability help us draw the line between storage and composition? Journal of Experimental Psychology 35 (4), 849–866. Pustejovsky, J. (1995) The Generative Lexicon. Cambridge, MA: MIT Press. Rah, A. and Adone, D. (2010) Processing of the reduced relative clause versus main verb ambiguity in L2 learners at different proficiency levels. Studies in Second Language Acquisition 32, 79–109. Ramscar, M., Yarlett, D., Dye, M., Denny, K. and Thorpe, K. (2010) The effect of featurelabel order and their implications for symbolic learning. Cognitive Science 32, 909–957. Reber, P.J. and Squire, L.R. (1999) Intact learning of artificial grammar and intact category learning by patients with Parkinson’s disease. 
Behavioral Neuroscience 113, 235–242. Reeder, P.A., Newport, E.L. and Aslin, R.N. (2010) Novel words in novel contexts: The role of distributional information in form-class category learning. In S. Ohlsson and R. Catrambone (eds) Proceedings of the 32nd Annual Meeting of the Cognitive Science Society (pp. 2063–2068). Austin, TX: Cognitive Science Society. Regan, V. (1995) The acquisition of sociolinguistic native speech norms: Effects of a year abroad on second language learners of French. In B.F. Freed (ed.) Second Language Acquisition in a Study Abroad Context (pp. 245–267). Amsterdam and Philadelphia, PA: John Benjamins. Regan, V. (1996) Variation in French interlanguage: A longitudinal study of sociolinguistic competence. In R. Bayley and D.R. Preston (eds) Second Language Acquisition and Linguistic Variation (pp. 177–201). Amsterdam and Philadelphia, PA: John Benjamins. Reichle, R.V. (2010) The critical period hypothesis: Evidence from information structural processing in French. In J. Arabski and A. Wojtaszek (eds) Neurolinguistic and Psycholinguistic Perspectives on SLA (pp. 17–29). Bristol: Multilingual Matters. Reinhart, T. (2006) Interface Strategies: Reference-set Computations. Cambridge: MIT Press. Reiterer, S., Pereda, E. and Bhattacharya, J. (2009) Measuring second language proficiency with EEG synchronization: How functional cortical networks and hemispheric involvement differ as a function of proficiency level in second language speakers. Second Language Research 25 (1), 77–106.

Reiterer, S., Pereda, E. and Bhattacharya, J. (2011) On a possible relationship between linguistic expertise and EEG gamma band phase synchrony. Frontiers in Psychology 2, 1–11. Rizzi, L. (1986) Null objects in Italian and the theory of pro. Linguistic Inquiry 17 (3), 501–557. Rizzi, L. (2010) Movements and concepts of locality. In M. Piattelli-Palmarini, J. Uriagereka and P. Salaburu (eds) Of Minds and Language: A Dialogue with Noam Chomsky in the Basque Country (pp. 155–168). Oxford and New York: Oxford University Press. Robinson, P. (2002) Effects of individual differences in intelligence, aptitude and working memory on adult incidental SLA: A replication and extension of Reber, Alkenfeld and Hernstadt, 1991. In P. Robinson (ed.) Individual Differences and Instructed Language Learning (pp. 211–266). Amsterdam and Philadelphia, PA: John Benjamins. Robinson, P. (2005) Cognitive abilities, chunk-strength and frequency effects in implicit artificial grammar and incidental L2 learning. Studies in Second Language Acquisition 27, 235–268. Robinson, P. (2010) Implicit artificial grammar and incidental natural second language learning: How comparable are they? In M. Gullberg and P. Indefrey (eds) The Earliest Stages of Language Learning (pp. 245–263). Malden: Wiley-Blackwell. Robinson, P. and Ellis, N. (2008) The Handbook of Cognitive Linguistics and Second Language Acquisition. London: Routledge. Rodgers, D.M. (2011) The automatization of verbal morphology in instructed second language acquisition. International Review of Applied Linguistics and Language Teaching 49, 295–319. Rodríguez-Fornells, A., Cunillera, T., Mestres-Missé, A. and de Diego-Balaguer, R. (2009) Neurophysiological mechanisms involved in language learning in adults. Philosophical Transactions of the Royal Society of Biological Sciences 364, 3711–3735. Romberg, A.R. and Saffran, J.R. (2010) Statistical learning and language acquisition. Cognitive Science 1 (6), 906–914. 
Rossi, S., Gugler, M.F., Friederici, A.D. and Hahne, A. (2006) The impact of proficiency on syntactic second-language processing of German and Italian: Evidence from event-related potentials. Journal of Cognitive Neuroscience 18 (12), 2030–2048. Rothman, J. and Slabakova, R. (2011) The mind-context divide: On acquisition at the linguistic interfaces. Lingua 121, 568–576. Runnqvist, E., Strijkers, K., Sadat, J. and Costa, A. (2011) On the temporal and functional origin of L2 disadvantage in speech production: A critical review. Frontiers in Psychology 2, 1–8. Sabourin, L. (2009) Neuroimaging research into second language acquisition. Second Language Research 25 (1), 5–12. Sabourin, L. and Haverkort, M. (2003) Neural substrates of representation and processing of a second language. In R. van Hout, A. Hulk, F. Kuiken and R. Towell (eds) The Lexico–Syntax Interface in Second Language Acquisition (pp. 175–195). Amsterdam and Philadelphia, PA: John Benjamins. Sabourin, L. and Stowe, L.A. (2008) Second language processing: When are first and second languages processed similarly? Second Language Research 24, 397–430. Saffran, J.R. (2002) Constraints on statistical language learning. Journal of Memory and Language 47, 172–196. Saffran, J.R., Aslin, R.N. and Newport, E.L. (1996a) Statistical learning by 8-month-old infants. Science 274, 1926–1928.

Saffran, J.R., Newport, E.L. and Aslin, R.N. (1996b) Word segmentation: The role of distributional cues. Journal of Memory and Language 35, 606–621. Sakai, K. (2005) Language acquisition and brain development. Science 310, 815–819. Sakai, K.L., Miura, K., Narafu, N. and Muraishi, Y. (2004) Correlated functional changes of the prefrontal cortex in twins induced by classroom education of second language. Cerebral Cortex 14, 1233–1239. Sakai, K.L., Tatsuno, Y., Suzuki, K., Kimura, H. and Ichida, Y. (2005) Sign and speech: Amodal commonality in left hemisphere dominance for comprehension of sentences. Brain 128, 1407–1417. Sakai, K.L., Nauchi, A., Tatsuno, Y., Hirano, K., Muraishi, Y., Kimura, M., Bostwick, M. and Noriaki, Y. (2009) Distinct roles of the left inferior frontal regions that explain individual differences in second language acquisition. Human Brain Mapping 30, 2440–2452. Santi, A. and Grodzinsky, Y. (2012) Broca’s area and sentence comprehension: A relationship parasitic on dependency, displacement or predictability? Neuropsychologia 50, 821–832. Schlaug, G. (2001) The brain of musicians: A model for functional and structural adaptation. Annals of the New York Academy of Sciences 930 (1), 281–299. Schwartz, B.D. (2003) Child L2 acquisition: Paving the way. In B. Beachley, A. Brown and F. Collins (eds) Proceedings of the 27th Annual Boston University Conference on Language Development (pp. 26–50). Somerville: Cascadilla Press. Schwartz, B.D. and Sprouse, R.A. (1996) L2 cognitive states and the full transfer/full access model. Second Language Research 12, 40–72. Segalowitz, N. (2003) Automaticity and second languages. In C. Doughty and M. Long (eds) The Handbook of Second Language Acquisition (pp. 382–408). Malden, MA and Oxford: Blackwell. Segalowitz, N. and Hulstijn, J. (2005) Automaticity in bilingualism and second language learning. In J.F. Kroll and A.M.B. de Groot (eds) Handbook of Bilingualism (pp. 371– 388). 
Oxford and New York: Oxford University Press. Segalowitz, N. and Segalowitz, S.J. (1993) Skilled performance, practice and the differentiation of speed-up from automatization effects: Evidence from second language word recognition. Applied Psycholinguistics 14, 369–385. Serratrice, L., Sorace, A., Filiaci, F. and Baldo, M. (2011) Pronominal objects in English– Italian and Spanish–Italian bilingual children. Applied Psycholinguistics 33 (4), 725– 751; doi:10.1017/S0142716411000543. Sharwood Smith, M. (2011) Crossing interfaces in theory and practice. Linguistic Approaches to Bilingualism 1, 94–96. Shtyrov, Y., Nikulin, V. and Pülvermuller, F. (2010) Rapid cortical plasticity underlying novel word learning. Journal of Neuroscience 30, 16864–16867. Sinclair, J. (1991) Corpus, Concordance, Collocation. Oxford and New York: Oxford University Press. Singleton, D. (2001) Age and second language acquisition. Annual Review of Applied Linguistics 21, 77–89. Singleton, D. (2005) The critical period hypothesis: A coat of many colours. International Review of Applied Linguistics 43, 269–286. Singleton, D. (2007) The critical period hypothesis: Some problems. Interlingüìstica 17, 48–56. Singleton, D. and Ryan, L. (2004) Language Acquisition: The Age Factor (2nd edn). Clevedon: Multilingual Matters.

Slabakova, R. (2006) Is there a critical period for semantics? Second Language Research 22 (3), 1–37. Slabakova, R. (2009) What is easy and what is hard to acquire in a second language? In M. Bowles, T. Ionin, S. Montrul and A. Tremblay (eds) Proceedings of the 10th Conference on Generative Approaches to Second Language Acquisition (GASLA 10) (pp. 280–294). Somerville: Cascadilla Press. Slabakova, R. (2010) Semantic theory and second language acquisition. Annual Review of Applied Linguistics 30, 249–265. Slabakova, R. (2011a) Which features are at the syntax–pragmatics interface? Linguistic Approaches to Bilingualism 1 (1), 89–93. Slabakova, R. (2011b) L2 semantics. In S. Gass and A. Mackey (eds) The Routledge Handbook of Second Language Acquisition. London: Routledge/Taylor Francis. Slabakova, R. and Ivanov, I. (2011) A more careful look at the syntax–discourse interface. Lingua 121, 637–651. Slabakova, R., Kempchinsky, P. and Rothman, J. (2012) Clitic doubled left-dislocation and focus fronting in L2 Spanish: A case of successful acquisition at the syntax–discourse interface. Second Language Research 28 (3), 319–343. Song, H.S. and Schwartz, B.D. (2009) Testing the fundamental difference hypothesis. Studies in Second Language Acquisition 31, 323–361. Song, S., Marks, B., Howard Jr., J.H. and Howard, D.V. (2009) Evidence for parallel explicit and implicit sequence learning systems in older adults. Behavioural Brain Research 196, 328–332. Sorace, A. (2006) Possible manifestations of shallow processing in advanced second language speakers. Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 89–91. Sorace, A. (2011a) Pinning down the concept of ‘interface’ in bilingualism. Linguistic Approaches to Bilingualism 1, 1–33. Sorace, A. (2011b) Cognitive advantages in bilingualism: Is there a ‘bilingual paradox’? In P. Valore (ed.) Multilingualism, Language, Power and Knowledge. Pisa: Edistudio. Sorace, A. 
(2012) Pinning down the concept of ‘interface’ in bilingual development. Linguistic Approaches to Bilingualism 2 (2), 209–216. Sorace, A. and Filiaci, F. (2006) Anaphora resolution in near-native speakers of Italian. Second Language Research 22 (3), 339–368. Sorace, A. and Serratrice, L. (2009) Internal and external interfaces in bilingual language development: Beyond structural overlap. International Journal of Bilingualism 13, 195–210. Sprouse, R. (2011) The interface hypothesis and the full transfer/full access/full parse. A brief comparison. Linguistic Approaches to Bilingualism 1 (1), 97–100. Squire, L.R. and Knowlton, B.J. (2000) The medial temporal lobe, the hippocampus, and the memory systems of the brain. In M.S. Gazzaniga (ed.) The New Cognitive Neurosciences (pp. 765–780). Cambridge: MIT Press. St Clair, M.C., Monaghan, P. and Christiansen, M.H. (2010) Learning grammatical categories from distributional cues. Flexible frames for language acquisition. Cognition 116, 341–360. Steedman, M. (2007) On ‘the computation’. In G. Ramchand and C. Reiss (eds) The Oxford Handbook of Linguistic Interfaces (pp. 575–611). Oxford: Oxford University Press. Stein, M., Federspiel, A., Koening, T., Wirth, M., Lehmann, C., Wiest, R., Strik, W., Brandeis, D. and Dierks, T. (2009) Reduced frontal activation with increasing 2nd language proficiency. Neuropsychologia 47, 2712–2710.

Stein, M., Federspiel, A., Koenig, T., Wirth, M., Lehmann, C., Strik, W., Wiest, R., Brandeis, D. and Dierks, T. (2012) Structural plasticity in the language system related to increased second language proficiency. Cortex 48, 458–465.
Steinel, M.P., Hulstijn, J.H. and Steinel, W. (2007) Second language idiom learning in a paired-associative paradigm. Studies in Second Language Acquisition 29, 449–484.
Steinhauer, K. (2006) How dynamic is second language acquisition? Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 92–95.
Steinhauer, K. (2011) The temporal dynamics of second language acquisition: The role of AoA, L2 proficiency, L1-transfer and training environment as reflected by ERPs. Presentation given at Workshop on Bilingualism: Neurolinguistic and Psycholinguistic Perspectives, Aix-en-Provence, 12–14 September.
Steinhauer, K., White, E., Cornell, S., Genesee, F. and White, L. (2006) The neural dynamics of second language acquisition: Evidence from event-related potentials. Journal of Cognitive Neuroscience (supplement), 99.
Steinhauer, K., White, E.J. and Drury, J.E. (2009) Temporal dynamics of late second language acquisition: Evidence from event-related potentials. Second Language Research 25 (1), 13–41.
Stern, Y. (2002) What is cognitive reserve? Theory and research application of the reserve concept. Journal of the International Neuropsychological Society 8, 448–460.
Stevens, C. and Neville, H. (2009) Profiles of development and plasticity in human neurocognition. In M.S. Gazzaniga (ed.) The New Cognitive Neurosciences. Cambridge, MA and London: MIT Press.
Stowe, L. (2006) When does the neurological basis of first and second language processing differ? Commentary on Indefrey. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 305–311). Malden, MA and Oxford: Blackwell.
Stowe, L.A. and Sabourin, L. (2006) Imaging the processing of a second language: Effects of maturation and proficiency on the neuronal processes involved. International Review of Applied Linguistics and Language Teaching 43, 329–353.
Tanner, D. (2011) Agreement mechanisms in native and nonnative language processing: Electrophysiological correlates of complexity and interference. Unpublished doctoral dissertation, University of Washington.
Tanner, D. (2013) Individual differences and stream of processing. Linguistic Approaches to Bilingualism 3 (3), 350–356.
Tanner, D. and Van Hell, J.G. (2012) ERPs reveal individual differences in syntactic processing strategies. Poster presented at the Psychonomics Society Conference, Minneapolis, MN.
Tanner, D., Osterhout, L. and Herschensohn, J. (2009) Snapshots of grammaticalization: Differential electrophysiological responses to grammatical anomalies with increasing L2 exposure. In Proceedings of the Boston University Language Development Conference. Somerville, MA: Cascadilla Press.
Tanner, D., McLaughlin, J., Herschensohn, J. and Osterhout, L. (2013a) Individual differences reveal stages of L2 grammatical acquisition. ERP evidence. Bilingualism: Language and Cognition 16 (2), 367–382.
Tanner, D., Inoue, K. and Osterhout, L. (2013b) Brain-based individual differences in online L2 grammatical comprehension. Bilingualism: Language and Cognition 17 (2), 277–293; doi:10.1017/S1366728913000370.
Tatsuno, Y. and Sakai, K.L. (2005) Language-related activations in the left pre-frontal regions are differentially modulated by age, proficiency and task demands. Journal of Neuroscience 25, 1637–1644.

Tettamanti, M., Rotondi, I., Perani, D., Scotti, G., Fazio, F., Cappa, S. and Moro, A. (2009) Syntax without language: Neurobiological evidence for cross-domain syntactic computations. Cortex 45, 825–838.
Thompson, S.P. and Newport, E.L. (2007) Statistical learning of syntax: The role of transitional probability. Language Learning and Development 3 (1), 1–42.
Timmann, D., Drepper, J., Frings, M., Maschke, M., Richter, S., Gerwig, M. and Kolb, F.P. (2010) The human cerebellum contributes to motor, emotional and cognitive associative learning. Cortex 46, 845–857.
Tokowicz, N. and MacWhinney, B. (2005) Implicit and explicit measures of sensitivity to violations in second language grammar: An event-related potentials investigation. Studies in Second Language Acquisition 27, 173–204.
Tomasello, M. (2003) Constructing a Language. A Usage-Based Theory of Language Acquisition. Cambridge, MA and London: Harvard University Press.
Townsend, D.J. and Bever, T.G. (2001) Sentence Comprehension. The Integration of Habits and Rules. Cambridge: MIT Press.
Travis, L. (2010a) The role of features in syntactic theory and language variation. In J.M. Liceras, H. Zobl and H. Goodluck (eds) The Role of Formal Features in Second Language Acquisition (pp. 22–47). London and New York: Routledge-Taylor Francis.
Travis, L. (2010b) Inner Aspect. The Articulation of VP. Dordrecht: Springer.
Tsimpli, I. (2003) Clitics and determiners in L2 Greek. In J. Liceras, H. Zobl and H. Goodluck (eds) Proceedings of the 6th Conference on Generative Approaches to Second Language Acquisition (pp. 331–339). Somerville: Cascadilla Press.
Tsimpli, I. (2011) External interfaces and the notion of ‘default’. Linguistic Approaches to Bilingualism 1 (1), 101–103.
Tsimpli, I. and Dimitrakopoulou, M. (2007) The interpretability hypothesis: Evidence from wh-interrogatives in second language acquisition. Second Language Research 23 (2), 215–242.
Tsimpli, I. and Mastropavlou, M. (2008) Feature interpretability in L2 acquisition and SLI: Greek clitics and determiners. In J. Liceras, H. Zobl and H. Goodluck (eds) The Role of Formal Features in Second Language Acquisition (pp. 142–183). London: Routledge.
Tsimpli, I. and Papadopoulou, D. (2009) Aspect and the interpretation of motion verbs in L2 Greek. In N. Snape, Y.I. Leung and M. Sharwood-Smith (eds) Representational Deficits in SLA (pp. 187–227). Amsterdam and Philadelphia, PA: John Benjamins.
Tsimpli, I. and Sorace, A. (2006) Differentiating interfaces: L2 performance in syntax-semantics and syntax-discourse phenomena. In D. Bamman, T. Magnitskaia and C. Zaller (eds) BUCLD 30: Proceedings of the 30th Annual Boston University Conference on Language Development (pp. 653–664). Somerville: Cascadilla Press.
Tsoulas, G. and Gil, K.H. (2011) Elucidating the notion of syntax-pragmatics interface. Linguistic Approaches to Bilingualism 1 (1), 104–107.
Uddén, J. and Bahlmann, J. (2012) A rostro-caudal gradient of structured sequence processing in the left inferior frontal gyrus. Philosophical Transactions of the Royal Society B: Biological Sciences 367, 2023–2032.
Uddén, J., Ingvar, M., Hagoort, P. and Petersson, K.M. (2012) Implicit acquisition of grammars with crossed and nested dependencies: Investigating the push-down stack model. Cognitive Science 36 (6), 1078–1101; doi:10.1111/j.1551-6709.2012.01235.x.
Ullman, M.T. (2004) Contributions of memory circuits to language: The declarative/procedural model. Cognition 92, 231–270.
Ullman, M.T. (2005a) A cognitive neuroscience perspective on second language acquisition: The declarative/procedural model. In C. Sanz (ed.) Mind and Context in Second Language Acquisition (pp. 141–178). Washington, DC: Georgetown University Press.
Ullman, M.T. (2005b) More is sometimes more. American Psychological Society 18 (12), 45–46.
Ullman, M.T. (2006a) The declarative/procedural model and the shallow structure hypothesis. Applied Psycholinguistics (special issue on the Shallow Structure Hypothesis) 27 (1), 97–105.
Ullman, M.T. (2006b) Is Broca’s area part of a basal ganglia thalamocortical circuit? Cortex 42, 480–485.
Ullman, M.T. (2008) The role of memory systems in disorders of language. In B. Stemmer and H.A. Whitaker (eds) Handbook of the Neuroscience of Language (pp. 189–198). Amsterdam: Elsevier.
Ullman, M.T. and Pierpont, E.R. (2005) Specific language impairment is not specific to language: The procedural deficit hypothesis. Cortex 41, 399–433.
Ullman, M.T., Pancheva, R., Love, T., Yee, E., Swinney, D. and Hickok, G. (2005) Neural correlates of lexicon and grammar: Evidence from the production, reading and judgment of inflection in aphasia. Brain and Language 93, 185–238.
Unsworth, S., Argyri, F., Cornips, L., Hulk, A., Sorace, A. and Tsimpli, I. (2011) On the role of age and input in early child bilingualism in Greek and Dutch. In M. Pirvulescu, M.C. Cuervo, A.T. Pérez-Leroux, J. Steele and N. Strik (eds) Selected Proceedings of the 4th Conference on Generative Approaches to Language Acquisition North America (GALANA 2010) (pp. 249–265). Somerville: Cascadilla Press.
Uylings, H.B.M. (2006) Development of the human cortex and the concepts of ‘critical’ or ‘sensitive’ periods. In M. Gullberg and P. Indefrey (eds) The Cognitive Neuroscience of Second Language Acquisition (pp. 59–90). Malden, MA and Oxford: Blackwell.
Van Den Noort, M., Bosch, P. and Hugdahl, K. (2006) Looking at second language acquisition from a functional and structural MRI background. In R. Sun and N. Miyake (eds) Proceedings of the 28th Annual Meeting of the Cognitive Science Society (pp. 2293–2298). Mahwah: Lawrence Erlbaum.
Van Den Noort, M., Bosch, P., Hadzibeganovic, T., Mondt, K., Haverkort, M. and Hugdahl, K. (2010) Identifying the neural substrates of second language acquisition: What is the contribution from functional and structural MRI? In J. Arabski and A. Wojtaszek (eds) Neurolinguistic and Psycholinguistic Perspectives on SLA (pp. 3–16). Bristol: Multilingual Matters.
van Heuven, W.J.B. and Dijkstra, T. (2010) Language comprehension in the bilingual brain: fMRI and ERP support for psycholinguistic models. Brain Research Reviews 64, 104–122.
Vanlancker-Sidtis, D. (2003) Auditory recognition of idioms by native and nonnative speakers of English: It takes one to know one. Applied Psycholinguistics 24, 45–57.
VanPatten, B. and Benati, A. (2010) Key Terms in Second Language Acquisition. London and New York: Continuum.
van Valin, R.D. (1990) Semantic parameters of split intransitivity. Language 66, 221–260.
van Valin, R.D. and LaPolla, R.J. (1997) Syntax: Structure Meaning and Function. Cambridge: Cambridge University Press.
Verkuyl, H.J. (1999) Aspectual Issues: Studies on Time and Quantity. Stanford, CA: CSLI Publications.
Verkuyl, H.J. (2005) Aspectual composition: Surveying the ingredients. In H. Verkuyl, H. De Swart and A. van Hout (eds) Perspectives on Aspect (pp. 19–39). Dordrecht: Springer.

Vingerhoets, G., Borsel, J.V., Tesink, C., van den Noort, M., Deblaere, K. and Seurink, R. (2003) Multilingualism: An fMRI study. Neuroimage 20, 2181–2196.
Wagner, A.D., Schacter, D.L., Rotte, M., Koutstaal, W., Maril, A. and Dale, A.M. (1998) Building memories: Remembering and forgetting of verbal experiences as predicted by brain activity. Science 281 (5380), 1188–1191.
Wartenburger, I., Heekeren, H.R., Abutalebi, J., Cappa, S.F., Villringer, A. and Perani, D. (2003) Early settings of grammatical processing in the bilingual brain. Neuron 37, 159–170.
Weber-Fox, C. and Neville, H.J. (1996) Maturational constraints of functional specializations for language processing: ERP and behavioral evidence in bilingual speakers. Journal of Cognitive Neuroscience 8 (3), 231–256.
Weber-Fox, C. and Neville, H.J. (2001) Sensitive periods differentiate processing of open- and closed-class words: An ERP study on bilinguals. Journal of Speech, Language and Hearing Research 44, 1338–1353.
White, L. (2003) Second Language Acquisition and Universal Grammar. Cambridge: Cambridge University Press.
White, L. (2011a) The interface hypothesis. How far does it extend? Linguistic Approaches to Bilingualism 1 (1), 108–110.
White, L. (2011b) Second language acquisition at the interfaces. Lingua 121, 577–590.
Williams, J.N. (2010) Initial incidental acquisition of word order regularities: Is it just sequence learning? In M. Gullberg and P. Indefrey (eds) The Earliest Stages of Language Learning (pp. 221–244). Malden: Wiley-Blackwell.
Williams, J.N. and Kuribara, C. (2008) Comparing a nativist and an emergentist approach to the initial stage of SLA: An investigation on Japanese scrambling. Lingua 118, 522–553.
Wonnacott, E. and Newport, E.L. (2005) Novelty and regularization: The effects of novel instances on rule formation. In A. Brugos, M.R. Clark-Cotton and S. Ha (eds) Proceedings of the 29th Annual Boston University Conference on Language Development. Somerville: Cascadilla Press.
Wray, A. (2000) Formulaic sequences in second language teaching. Applied Linguistics 21 (4), 463–489.
Wray, A. (2002) Formulaic Language and the Lexicon. Cambridge: Cambridge University Press.
Wray, A. (2008) The puzzle of language learning: From child’s play to ‘linguaphobia’. Language Teaching 41 (2), 253–271.
Wray, A. (2009) Future directions in formulaic language research. Journal of Foreign Languages 32 (6), 2–17.
Wulff, S., Ellis, N., Bardovi-Harlig, K., Leblanc, C.J. and Römer, U. (2009) The acquisition of tense-aspect: Converging evidence from corpora and telicity ratings. Modern Language Journal 93, 354–369.
Yang, C. (2002) Knowledge and Learning in a Natural Language. Oxford and New York: Oxford University Press.
Yang, C. (2004) Universal grammar, statistics or both? Trends in Cognitive Science 8 (10), 451–456.
Yang, C. (2008) The great number crunch. Journal of Linguistics 44, 205–228.
Yang, C. (2010) Who’s afraid of George Kingsley Zipf? Unpublished manuscript. See http://www.ling.upenn.edu/~ycharles/papers.html.
Yang, C. (2011a) Computational models of syntactic acquisition. WIREs Cognitive Science; doi: 10.1002/wcs.1154.

Yang, C. (2011b) A statistical test for grammar. Proceedings of the 2nd Workshop on Cognitive Modeling and Computational Linguistics (pp. 30–38). Stroudsburg, PA: Association for Computational Linguistics.
Yang, C. and Roeper, T. (2011) Minimalism and language acquisition. In C. Boeckx (ed.) The Oxford Handbook of Minimalism (pp. 551–573). Oxford: Oxford University Press.
Yang, J., Tan, L.H. and Li, P. (2011) Lexical representations of nouns and verbs in the late bilingual brain. Journal of Neurolinguistics 24, 674–682.
Zou, L., Ding, G., Abutalebi, J., Shu, H. and Peng, D. (2011) Structural plasticity of the left caudate in bimodal bilinguals. Cortex 48 (9), 1197–1206; doi:10.1016/j.cortex.2011.05.0.

Index

absences, 26, 187, 200, 213, 228 abstract features, 12, 73, 118, 122, 192, 200, 211, 216 abstract knowledge, 166, 180 abstract properties, 16, 18, 63, 73, 156, 169, 189, 190, 197, 206, 224 abstract rules, 12, 59, 164, 170, 230 abstraction, 115, 145, 174 acceptability judgments, 29, 62, 77, 93, 131, 160, 180 accuracy percentages, 16, 77, 94 activation, 28, 36, 37, 45, 85, 91, 97, 99, 101, 103, 104, 106, 107, 108, 113, 117, 160, 163, 174, 175, 177, 179, 232, 237, 238, 248, 255 adaptation, 42, 50, 79, 82, 86, 87, 90, 94, 103, 107, 109, 243, 254 adapting brain, 74, 81 adaptive, 42, 74, 79, 94, 102, 104, 233 additional brain areas, 104, 106 adjacency, 17, 22, 33, 49, 53, 57, 58, 59, 60, 72, 126, 132, 136, 139, 146, 148, 154, 156, 157, 159, 160, 168, 169, 170, 171, 172, 173, 174, 175, 176, 178, 180, 182, 183, 184, 185, 186, 187, 188, 189, 192, 195, 202, 203, 207, 209, 228, 250 adjectival declension, 129, 135 adolescence, 86, 103, 147, 252 age of arrival, 29, 40, 205, 236 age of exposure, 99 aging, 13, 27, 44, 78, 85, 104, 109, 116, 146, 236 aging brain, 13, 27, 44, 78, 109, 146

agreement, 23, 24, 34, 40, 51, 62, 96, 100, 118, 124, 125, 126, 129, 130, 131, 132, 133, 134, 135, 136, 140, 142, 143, 159, 174, 175, 202, 208, 209, 211, 218, 219, 223, 224 algorithm, 24, 139, 156, 185, 195, 198, 199, 221 Alzheimer disease, 90, 111, 119, 144 amount of exposure, 20, 25, 26, 28, 29, 42, 76, 231 analogy, 16, 17, 24, 43, 62, 72, 196, 201, 215, 228, 232 anaphora resolution, 213, 217, 218 aphasia, 48, 49, 111, 208 argument, 22, 69, 151, 226 artificial grammar, 97, 114, 122, 153, 154, 156, 157, 160, 166, 170, 171, 172, 173, 176, 180, 181, 182, 183, 184, 185, 186, 189, 252, 253 artificial language, 85, 97, 115, 124, 125, 126, 127, 153, 154, 158, 161, 168, 170, 171, 177, 178, 236, 242, 248, 250, 251 association, 123, 152, 221 associative learning, 32, 35, 43, 61, 110, 122, 123, 134, 135, 138, 140, 150, 152, 157, 163, 213, 231, 244, 250, 256, 257 associative memory, 35, 110, 123 asymmetry, 82, 157, 191, 239 attention, 29, 37, 38, 46, 53, 54, 76, 78, 87, 90, 226, 227, 250 automata, 150, 176, 177 automatization, 27, 28, 35, 94, 232, 253, 254

auxiliaries, 13, 18, 22, 23, 24, 57, 165, 166, 180, 192, 193, 194, 195, 196, 198, 199, 221, 228, 233 auxiliary selection, 13, 18, 22, 23, 24, 165, 194, 195, 199, 221, 228, 233 awareness, 93, 114, 154, 155, 181, 214 axon rewiring, 83, 86, 90 basal ganglia, 16, 93, 111, 112, 113, 122, 125, 144, 179, 232, 247, 258 behavioral measures, 35, 77, 94, 95 behavioral studies, 38, 93, 98 bilingual aphasia, 119 bilingual brain, 89, 91, 102, 103, 235, 239, 241, 244, 247, 248, 259, 260 bilingualism, 88, 89, 92, 105, 218, 222, 236, 247, 250, 254, 255, 258 bilinguals, 26, 38, 40, 78, 88, 89, 90, 91, 93, 95, 96, 98, 99, 100, 101, 102, 103, 107, 108, 140, 141, 221, 223, 238, 242, 246, 248, 251, 259, 260 binding, 43, 208, 213, 214, 217, 222, 246 biological loss, 104 biphasic pattern, 17, 19, 36, 39, 40, 126, 180 biphasic response, 97 blueprint, 75 bottleneck hypothesis, 225 bottom-up, 12, 189, 192, 200, 230 bracketing, 155, 191, 192 brain maturation, 13, 50, 80, 84, 85, 86, 87, 88, 90, 93, 103 brain plasticity, 50, 81, 90, 92, 94, 98, 100, 101, 104, 109 Broca, 48, 49, 84, 85, 111, 112, 122, 177, 178, 181, 232, 236, 237, 243, 254, 258 BROCANTO, 177, 178, 179 Brodmann Areas, 110, 111 category, 17, 44, 47, 59, 61, 62, 63, 104, 114, 136, 145, 152, 155, 156, 158, 160, 161, 164, 165, 166, 169, 170, 172, 181, 182, 183, 184, 186, 189, 197, 201, 202, 248, 255 categorization, 142, 151, 152, 169, 183, 184, 186, 200, 209, 228, 232 category learning, 114, 155, 156, 252 category members, 12, 155 category privileges, 155

caudate nucleus, 89, 111, 113 cerebellum, 99, 110, 111, 257 Chinese, 89, 97, 99, 101, 164 chunk, 12, 17, 18, 21, 25, 26, 28, 34, 49, 51, 53, 54, 55, 57, 58, 59, 60, 61, 62, 63, 64, 66, 67, 68, 69, 70, 71, 72, 73, 78, 95, 118, 122, 130, 136, 142, 143, 145, 147, 148, 165, 167, 171, 175, 182, 184, 185, 186, 187, 188, 197, 198, 212, 232, 233, 235 class, 96, 97, 130, 149, 154, 155, 156, 252, 259 class variables, 149 classification, 16, 113, 127, 149, 176, 223, 252 classroom, 52, 58, 60, 63, 98, 99, 124, 128, 130, 156, 204, 233, 249, 254 cluster, 149, 184 cognitive, 12, 13, 14, 17, 19, 22, 27, 32, 36, 37, 44, 45, 46, 48, 54, 55, 61, 72, 74, 75, 77, 78, 80, 81, 83, 87, 88, 89, 90, 94, 101, 104, 107, 110, 115, 127, 141, 150, 152, 158, 163, 164, 167, 171, 174, 191, 206, 207, 209, 214, 215, 216, 222, 223, 227, 236, 239, 247, 252, 254, 256, 257 collocation, 56, 57, 165, 192, 198, 199 combination, 26, 32, 84, 89, 97, 100, 120, 121, 122, 131, 132, 136, 138, 142, 148, 162, 165, 175, 176 combinatorial grammar, 12, 22, 24, 25, 31, 32, 34, 38, 41, 44, 49, 50, 55, 70, 121, 122, 123, 124, 133, 134, 135, 136, 137, 138, 140, 141, 143, 145, 146, 147, 148, 156, 163, 168, 171, 175, 184, 185, 187, 188, 191, 197, 198, 200, 202, 207, 208, 209, 211, 213, 214, 221, 222, 223, 225, 226, 227, 228, 229, 230, 233 COMBINE, 191 compensatory mechanisms, 95, 104 competence, 10, 11, 12, 13, 14, 18, 21, 28, 42, 47, 56, 66, 68, 71, 83, 88, 95, 99, 100, 117, 119, 139, 140, 185, 229, 230, 231, 252 computation, 12, 13, 17, 18, 23, 24, 26, 27, 29, 31, 32, 43, 45, 46, 47, 48, 70, 71, 72, 93, 97, 113, 114, 115, 117, 121, 122, 123, 127, 130, 135, 137, 138, 142, 148, 152, 165, 170, 174, 181, 182, 184, 185, 186, 187, 188, 189, 190, 191, 192, 195, 197, 200, 201, 202, 204, 207, 214, 215, 217, 218, 226, 228, 231, 255 concatenation, 32, 73, 120, 121, 122, 136, 165, 188, 191, 195, 197, 198, 199, 201, 211, 232 conceptual-intentional system, 45, 216, 217 concrete instance, 21, 142, 145, 153 connectionist, 23, 32, 46, 121 consistency, 149 constructions, 17, 26, 49, 53, 54, 58, 59, 61, 62, 63, 68, 69, 70, 71, 73, 76, 78, 95, 116, 122, 123, 128, 135, 136, 142, 143, 148, 160, 170, 182, 184, 185, 186, 188, 206, 214, 220, 226, 232, 233, 243 continuity, 25, 33, 54, 134, 230 convergence model, 37, 107, 108 co-occurrence, 137 copy, 191, 202, 203, 207 corpora, 63, 71, 73, 149, 165, 230, 234, 259 corpus callosum, 85, 86, 90, 239, 247 cortical stimulation, 102 counting strategy, 174 critical period, 74, 78, 80, 81, 82, 86, 90, 91, 92, 93, 95, 97, 98, 100, 102, 116, 119, 141, 146, 206, 211, 224, 233, 238, 240, 242, 252, 254, 255 critical period hypothesis, 78, 80, 91, 93, 97, 224, 238, 242, 252, 254 cues, 17, 23, 28, 86, 131, 141, 166, 180, 248 declarative memory, 21, 29, 31, 34, 35, 69, 94, 95, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 121, 122, 123, 126, 127, 132, 134, 137, 143, 144, 146, 212, 232, 240, 241, 245 dendrite migration, 83, 87 dependency, 48, 49, 55, 57, 59, 129, 167, 172, 175, 178, 181, 202, 204, 210, 250, 254 derivation, 43, 190, 200, 216, 217, 225 determiner, 129, 166 developmental path, 14, 15, 25, 33, 43, 54, 63, 73, 105, 115, 123, 134, 139, 185, 213

diffusion tensor imaging, 84, 240 discrete stages, 134, 138 displaced items, 13, 23, 26, 49, 50, 72, 135, 187, 200 displacement, 207 distribution, 16, 18, 19, 40, 72, 102, 140, 148, 159, 162, 167, 186, 241 distributional information, 157, 183, 252 distributional learning, 44 distributional properties, 18, 149, 155, 169, 183, 186, 189, 192 dopamine, 93, 111, 144 dorsal pathway, 84 dorsal region, 99 double-dissociation, 29, 111 DPM, 21, 24, 25, 29, 31, 47, 48, 50, 51, 95, 106, 107, 109, 111, 115, 118, 119, 120, 121, 122, 123, 124, 134, 135, 136, 137, 143, 146, 147, 179, 187, 188, 212, 222, 228 dual-route mechanisms, 37 Dutch, 128, 129, 130, 133, 134, 164, 203, 239, 244, 258 early language acquisition, 13, 75, 79, 84, 97, 206 ELAN, 96, 97, 128 electrical cortical activity, 35 electrodes, 50, 102, 131 electrophysiological shift, 17 electrophysiological studies, 63, 181 elsewhere principle, 115, 116 embedded sentence, 31, 59, 71, 84, 129, 161, 170, 171, 177, 180, 201, 209, 214, 215, 217 emergentism, 46, 259 empty categories, 13, 49, 72, 135, 204, 207, 209, 228, 252 energy levels, 15, 19, 20 English regular past, 31 environmental factors, 25, 98, 106, 108, 127, 142 ERP, 17, 34, 35, 36, 37, 38, 39, 40, 41, 95, 96, 97, 101, 105, 115, 117, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 140, 146, 158, 162, 173, 175, 179, 180, 210, 211, 237, 239, 246, 249, 256, 258, 259 estrogen, 110, 111, 115, 118

event-related potentials. see ERP executive control, 37, 89, 106, 107, 108, 223 exemplar, 26, 55, 62, 182, 183, 244 expectancy, 158, 161 expectation, 55, 158, 159, 163, 164, 180, 181, 199, 248 explicit instruction, 36 explicit knowledge, 28, 56, 61, 114, 116, 127 explicitness, 109 external Merge, 190, 191 extraction, 24, 31, 137, 157, 170, 189, 196, 205, 206, 208, 210, 211, 227, 233, 239 falsifiability criteria, 22, 233 feedback, 36, 79, 113, 129, 152, 180 fiber connections, 74 filler-gap dependencies, 24, 204, 207, 210 finite state, 176, 177, 178, 179, 181, 183 first language, 19, 44, 47, 76, 84, 92, 102, 118, 120, 133 fMRI, 34, 48, 84, 85, 86, 87, 89, 95, 96, 97, 98, 99, 101, 103, 113, 117, 173, 174, 177, 178, 179, 211, 246, 251, 258, 259 formulaic language, 53, 54, 56, 57, 58, 62, 78, 259 formulaic uses. see formulas formulas, 17, 28, 33, 34, 47, 53, 56, 57, 62, 236 French, 17, 60, 62, 70, 71, 72, 97, 130, 131, 236, 240, 252 frequency, 17, 18, 19, 21, 22, 23, 26, 29, 30, 35, 40, 42, 43, 44, 45, 46, 47, 49, 51, 53, 54, 55, 56, 57, 58, 61, 62, 71, 72, 73, 77, 78, 85, 101, 110, 121, 122, 123, 132, 137, 138, 141, 142, 143, 145, 147, 148, 151, 156, 165, 170, 182, 188, 189, 192, 195, 196, 197, 198, 199, 201, 205, 206, 207, 211, 212, 215, 221, 222, 226, 227, 228, 233, 235, 237, 238, 241, 253 frontal lobe, 96, 99, 104, 105 frontoparietal, 100, 103, 174 full access/full transfer/full parse hypothesis, 76

function, 13, 14, 18, 20, 21, 24, 25, 27, 33, 34, 42, 60, 62, 82, 83, 85, 86, 87, 94, 98, 99, 101, 102, 103, 111, 120, 123, 124, 135, 147, 154, 157, 179, 182, 192, 215, 226, 230, 232, 241, 252 fundamental difference hypothesis, 76, 78, 189, 237, 248, 255 gamma band, 103 gaps, 71, 155, 200, 203, 204, 205, 209, 210, 211, 233 garden-path sentences, 139, 163, 239 gemination, 10, 13, 14, 18, 19, 20, 24, 32, 34, 42, 67, 68, 73, 124, 139, 141, 143, 188, 198, 231, 234 gender agreement, 124, 129, 133, 134, 171, 211, 248 general domain, 13, 77, 167, 175 generalization, 38, 116, 123, 151, 155 generative theory, 16, 45, 75, 215, 227 GL (Grammatical Learning), 12, 15, 16, 17, 21, 22, 24, 25, 27, 28, 29, 30, 32, 34, 42, 44, 50, 52, 55, 73, 74, 75, 77, 79, 84, 86, 94, 97, 104, 105, 109, 118, 124, 140, 141, 143, 153, 163, 175, 181, 182, 188, 190, 192, 194, 195, 197, 198, 199, 200, 206, 221, 222, 230, 231, 232, 233, 234 good-enough, 19, 43, 138, 144, 203 gradualism, 42 grammar rules, 21, 27, 28, 49, 79, 94 grammatical categories, 156 grammatical computation, 181 grammatical features, 13, 22, 25, 59, 119, 135, 138, 142 grammatical item, 14, 70 grammatical learning, 12, 175, 188, 248 grammaticality judgments, 158 grand averages, 38 gray matter, 82, 86, 87, 89, 90, 91, 100, 104, 245 habits, 43, 110, 112 heuristics, 41, 139, 140 hierarchical structure, 121, 162 hierarchies, 41, 176, 177, 181, 191, 192, 194, 200, 219, 245 hippocampal, 110, 179, 181, 232, 240, 241

hippocampus, 74, 105, 110, 113, 123, 144, 179, 231, 232, 255 hormones, 119 Huntington, 111, 179, 239 identity matching, 184 idiom, 53, 56, 57, 69, 174, 258 imageability, 29, 252 immersion, 25, 98, 101, 124, 126, 127, 128, 133, 204, 205, 206, 233 implicit competence, 94 implicit knowledge, 28, 50, 157 implicit learning, 79, 127 implicitness, 109, 114 in presence, 152 inaccusative, 192 incidental, 79, 160, 253, 259 incremental, 15, 33, 172, 199, 200 incrementalism, 32, 42 individual differences, 19, 38, 39, 40, 41, 119, 140, 171, 253, 254, 256 inferior frontal gyrus, 37, 48, 84, 87, 91, 99, 103, 107, 108, 173, 174, 175, 177, 257 inferior temporal gyrus, 84 inflectional morphology, 33, 93, 156, 225, 226, 237 inhibitory control, 88 innatist, 13, 46, 150, 152 input, 20, 21, 25, 26, 34, 37, 42, 44, 57, 68, 84, 107, 113, 145, 149, 150, 153, 155, 164, 167, 185, 197, 227 instruction, 17, 25, 28, 79, 98, 100, 101, 104, 125, 129, 130, 131, 133, 143, 204, 233, 248 interface hypothesis, 218, 219, 221, 222, 225, 246, 255, 259 interfaces, 123, 135, 148, 149, 181, 189, 200, 207, 211, 214, 215, 216, 218, 219, 220, 221, 222, 225, 228, 242, 248, 251, 253, 254, 255, 257, 259 interiorization, 35 interlanguage, 30, 32, 48, 57, 60, 63, 104, 124, 127, 142, 224, 233, 237, 240, 244, 246, 252 internal merge, 42, 191, 200, 201, 207, 214 interpretability hypothesis, 223, 224, 225, 257

intervening words, 49, 157, 171, 189 irregular verbs, 112, 122 island conditions, 45 island constraints, 205, 210, 227 island phenomena, 43 Kendall’s rank correlation, 66 L1 English, 17, 100, 128, 130, 131, 220 L1–L2 divergences, 106 L1–L2 similarities, 101 L2 Italian, 13, 18, 22, 26, 33, 62, 64, 73, 139, 142, 193, 195, 196, 198, 199, 202, 221, 226, 228 L2 Spanish, 35, 100, 128, 220, 237, 255 label, 30, 32, 54, 121, 188, 191, 199, 207, 208 labeled, 120, 191, 192, 211, 221, see label LAN, 51, 96, 97, 112, 125, 128, 158, 159 language exposure, 122 language faculty, 45, 46, 47, 77, 149, 214, 215, 216 language functions, 82, 84, 87, 109, 112, 121, 237 language network, 82, 83, 85, 103, 242 LAST, 43, 44, 213, 228 LAN, 126, 128 late language acquisition, 74, 75, 76, 79, 88, 103, 104, 105, 118, 231 lateral sulcus, 100 lateralization, 82, 83, 98, 245 learnability, 77, 104, 146, 152, 153, 188, 212, 221, 224, 230, 246 learning curve, 15, 27 left hemisphere, 74, 82, 83, 84, 85, 107, 112, 126, 159, 254 left inferior frontal gyrus, 48, 91, 99, 103, 108, 129, 173, 174, 175, 177, 179, 254, 257 left temporal cortex, 16, 105 left temporal lobe, 16, 87 length of exposure, 36, 89, 98, 233 length of residency, 40 levels of energy, 15 lexical access, 38, 51, 133 lexical categories, 166, 167 lexical factors, 19 lexical knowledge, 110, 124, 169 lexical learning, 29

lexicon, 21, 22, 24, 25, 29, 30, 33, 34, 48, 69, 93, 96, 102, 104, 106, 107, 108, 109, 114, 120, 121, 122, 123, 124, 134, 136, 137, 138, 142, 143, 146, 147, 148, 170, 187, 188, 214, 225, 226, 229, 232, 233, 243, 245, 258 lexicon-grammar divide, 24, 233 lexico-syntax interface, 194 linear relations, 49 linearization, 41, 45, 67, 71, 154, 189, 200 locative, 160, 225 long-distance dependencies, 117, 118, 123, 168, 170, 171, 176, 178, 179, 212 longitudinal study, 91, 128, 252 maturation, 34, 74, 79, 81, 82, 83, 85, 86, 87, 88, 94, 240, 256 maximality, 59 MEG, 36, 83, 100, 128, 236 membership, 62, 155, 156, 157, 167, 185, 186, 191, 192, 197, 232 memory, 23, 27, 28, 32, 34, 35, 37, 41, 45, 47, 48, 49, 51, 56, 58, 61, 69, 71, 72, 87, 93, 109, 110, 111, 112, 113, 114, 115, 116, 118, 119, 120, 122, 123, 124, 125, 127, 134, 140, 146, 160, 169, 171, 172, 173, 175, 176, 177, 181, 184, 200, 231, 232, 235, 237, 240, 243, 251, 252, 253, 255, 257, 258 mental lexicon, 93, 110, 115, 120, 123, 245 merge, 59, 120, 121, 122, 190, 191, 192, 195, 197, 199, 200, 202, 203, 207, 216, 228 meta-analysis, 98, 107, 244 metalinguistic, 93, 117, 125 minimalist program, 45, 77, 78, 149, 189, 200, 216, 223 modular, 13, 46, 47, 231 monitor, 89, 117 monolinguals, 88, 89, 90, 91, 98, 102, 140, 141, 223 motivation, 40, 169 motor skills, 28, 94, 110, 112 movement, 41, 45, 48, 71, 202, 205, 207, 208, 209, 213, 222, 228, 244, 245 MRI, 90, 91, 100, 258 multiword expression, 56 music, 46, 112, 148, 161, 162, 163, 164, 173, 245, 251

mutual information score, 160 myelination, 83, 86, 87, 90 N400, 17, 19, 24, 36, 38, 39, 40, 41, 51, 96, 100, 124, 125, 126, 130, 131, 132, 133, 135, 136, 137, 138, 139, 140, 141, 142, 143, 175, 180, 211 narrow syntax, 220 native-like attainment, 97, 116 n-back task, 48, 49 ne . . . pas negation, 70, 71 nested structure, 172 network, 76, 87, 101, 103, 143, 163, 252 neural network, 89, 100, 103 neural underpinnings, 29, 94, 109, 121 neurocognition, 11, 17, 25, 26, 35, 36, 38, 50, 51, 78, 97, 106, 115, 119, 124, 127, 135, 136, 137, 140, 141, 142, 233, 235, 248, 250, 256 neuroimaging, 35, 36, 253 non-combinatorial grammar, 24, 31, 32, 49, 228 non-discontinuous aspects, 13 non-linear relations, 49 null pronouns, 189, 201, 221, 223, 227 null subjects, 22, 24, 26, 121, 202, 211, 218, 219, 228, 233 number agreement, 156, 158 onset of acquisition, 36, 89, 90, 91, 96, 97, 98, 99, 100, 101, 103, 104, 107, 118, 127, 128, 135 optimal period, 94 optionality, 154, 166, 218, 221, 222 order of acquisition, 33 oscillatory neural activity, 170, 171 overlapping, 46, 55, 104, 109, 155, 159, 160, 162, 196, 255 P600, 17, 19, 24, 36, 38, 39, 40, 41, 96, 97, 98, 100, 125, 126, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, 143, 158, 159, 161, 162, 163, 175, 179, 180, 211 Parkinson, 111, 114, 122, 144, 238, 252 pars opercularis, 111 pars triangularis, 97, 103 parsing, 59, 97, 211, 212, 213, 241, 245, 248

passive sentence, 43 past tenses, 99, 111, 112, 114, 121, 165, 212, 215, 241, 245, 249, 251 Patches, 78, 79, 232 pattern sensitive device, 137 patterns, 16, 17, 30, 36, 37, 38, 51, 53, 54, 55, 56, 61, 62, 71, 72, 73, 80, 83, 84, 97, 98, 99, 101, 103, 104, 107, 108, 111, 119, 121, 122, 137, 141, 148, 149, 151, 154, 157, 164, 165, 166, 167, 168, 169, 171, 174, 175, 176, 178, 180, 181, 182, 184, 185, 187, 188, 211, 213, 228, 231, 232, 248 perceptual system, 45, 216 performance, 14, 28, 47, 63, 77, 78, 83, 87, 93, 99, 100, 110, 111, 116, 117, 127, 129, 169, 172, 173, 179, 185, 205, 233, 245, 247, 254, 257 perisylvian, 83, 102 PET, 34, 117 phrasal head, 18, 57, 89, 120, 165, 188, 189, 191, 192, 194, 195, 198, 199, 203, 224, 232 phrase boundaries, 54, 154, 155, 160 phrase structure, 84, 96, 151, 153, 154, 155, 160, 166, 178, 179, 180, 181, 209, 238 picture naming, 108 plasticity, 81, 82, 85, 88, 89, 90, 91, 92, 99, 104, 142, 225, 233, 239, 248, 254, 256, 260 plausibility, 144, 210 position, 35, 48, 49, 58, 59, 69, 70, 76, 82, 128, 140, 148, 150, 157, 176, 178, 179, 181, 185, 186, 189, 190, 192, 197, 202, 213, 214, 223 post-pubertal, 75, 85, 99, 103, 232 poverty of stimulus, 76, 152 practice, 27, 28, 58, 61, 62, 88, 89, 94, 95, 98, 100, 104, 115, 116, 117, 119, 127, 146, 147, 213, 254 predictability, 120, 246, 254 predictive, 24, 55, 72, 153, 154, 158, 161, 163, 165, 166, 167, 170 prefrontal cortex, 88, 89, 98, 99, 110, 113, 174, 232, 254 pre-motor cortex, 16 primary language, 87, 104, 105 privileges of category, 145

probabilistic properties, 44, 46, 47, 113, 114, 120, 121, 127, 132, 150, 151, 152, 166, 170, 178, 188, 190, 242 probabilities distribution, 153 probabilities, 55, 151, 153, 154, 165, 182, 187, 199, 257 probability distribution, 151 procedural knowledge, 112, 114, 116, 257 procedural memory, 29, 32, 35, 69, 110, 111, 112, 113, 114, 115, 116, 118, 119, 120, 122, 123, 124, 125, 127, 132, 135, 142, 143, 144, 147, 179, 184, 209, 211, 212, 222, 247 proceduralization, 117 processability theory, 33, 34, 251 processing, 14, 15, 16, 17, 19, 20, 24, 25, 27, 28, 32, 33, 34, 38, 41, 42, 43, 46, 47, 49, 51, 53, 54, 55, 57, 58, 60, 62, 63, 70, 72, 73, 74, 76, 78, 87, 88, 90, 94, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 112, 115, 116, 117, 121, 124, 125, 126, 127, 128, 129, 130, 132, 133, 134, 135, 138, 139, 140, 141, 144, 146, 150, 157, 158, 160, 161, 163, 170, 171, 172, 173, 174, 175, 177, 178, 179, 180, 182, 198, 199, 200, 203, 204, 205, 206, 209, 210, 211, 212, 213, 214, 217, 218, 222, 223, 224, 225, 229, 230, 231, 232, 235, 236, 237, 239, 240, 241, 242, 244, 245, 246, 247, 248, 249, 250, 251, 252, 253, 255, 256, 257, 259 processor, 33, 34, 43, 51, 78, 172, 213 prodrop, 22, 67, 68 proficiency, 14, 25, 30, 36, 37, 38, 39, 40, 56, 75, 77, 85, 89, 91, 92, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 106, 107, 108, 116, 117, 118, 119, 124, 125, 126, 127, 128, 131, 132, 134, 135, 138, 140, 141, 142, 146, 149, 205, 218, 220, 221, 237, 246, 247, 252, 253, 255, 256 projections, 81, 83, 121, 178, 195, 198, 232, 238 pronominal subjects, 71, 72, 137, 200, 207, 221 property-based categorization, 16 pseudo words, 131, 132, 136, 167

puberty, 37, 78, 81, 82, 85, 86, 87, 92, 105, 107, 115, 119, 138, 141
push-down stack model, 172, 173
putamen, 111, 113, 179
quantization, 15
quantized, 15, 19, 20, 21, 25, 141, 231, 234
quantum, 15, 19, 20, 25, 231
Quantum Theory, 19
queues model, 172
reaction times, 27, 77, 169, 205, 210
readability condition, 181
receptors, 81, 93
recruitment, 37, 102, 103, 104, 246, 250
redundant, 15, 113, 231
reflexive pronoun, 213
regularity, 32, 59, 63, 68, 73, 85, 121, 141, 150, 151, 153, 157, 164, 165, 166, 167, 168, 170, 180, 187, 259
relative clause, 71, 162, 203, 208, 210, 252
repetition, 45, 53, 55, 56, 108, 116, 154, 173, 211, 243
resource sharing paradigm, 162
restructuring, 27, 28, 248
rhinal cortex, 110
right hemisphere, 85
rostro-caudal, 174, 175, 257
routes, 11, 15, 25, 76, 107, 140, 141, 212, 213, 229, 230
routines, 19, 33, 54, 56, 57, 58, 133, 134, 173, 177, 180, 224
rule learning, 29, 171, 179, 239, 242
schema, 61
scrambling, 177, 208, 259
segmentation, 159, 254
self-paced reading, 203, 204, 244
sensitive period, 80, 82, 250
sensorimotor system, 45, 110, 200, 217
sentence building, 200
sequence learning, 120
sequenced structure, 161
sequencing privileges, 155
sequential, 58, 88, 114, 120, 150, 157, 158, 159, 165, 174, 175, 184, 239, 250
sequential learning, 150, 157, 158, 239
sex differences, 111

shallow, 19, 78, 139, 141, 202, 208, 212, 213, 214, 239, 242, 246, 255, 258
signal-to-noise ratio, 38, 39
similarity metrics, 150, 215
single-network hypothesis, 106, 107, 108
skill learning, 46
SL (Statistical Learning), 12, 15, 16, 17, 19, 21, 22, 24, 25, 27, 28, 29, 30, 31, 32, 34, 42, 44, 50, 51, 52, 53, 54, 55, 57, 58, 60, 70, 71, 73, 74, 75, 77, 79, 84, 85, 86, 94, 95, 97, 104, 105, 109, 110, 118, 124, 134, 140, 141, 142, 143, 145, 147, 148, 149, 150, 151, 152, 153, 156, 157, 160, 161, 163, 164, 165, 166, 167, 168, 169, 170, 171, 175, 181, 182, 183, 184, 185, 186, 187, 188, 190, 191, 192, 195, 197, 198, 199, 200, 201, 202, 206, 207, 208, 209, 212, 214, 215, 221, 229, 230, 231, 232, 233, 234
sociolinguistic variants, 71, 72
specialization, 82, 83, 99
specific language impairment, 112, 244, 247
spell-out, 191, 216, 217
split representations, 14, 18, 139
Shallow Structure Hypothesis, 208, 209, 210, 211, 212, 213, 214, 228
statistical cues, 85
statistical distribution, 164
statistical learning, 12, 249, 251
statistical pretreatment, 13, 30
statistical properties, 156
statistical sensitiveness, 56
stochastic, 152
storage, 27, 29, 45, 47, 51, 55, 61, 71, 72, 109, 110, 117, 200, 235, 237, 252
stream, 51, 85, 140, 148, 152, 157, 160, 168, 170, 186, 197, 200, 256
striatum, 113
structurally integrated, 163
structured sequences, 157, 158, 161, 163, 257
subcategorization frames, 151
subcortical structures, 83, 86, 105, 111
superior temporal gyrus, 84, 85, 87
superior temporal sulcus, 85, 96
supervised learning, 149, 183
supra-regular grammar, 175, 176, 177
surface similarities, 16, 232

syllable, 85, 164, 168, 173, 178, 179
symbolic, 12, 32, 43, 46, 68, 151, 152, 212, 252
symbol, 68, 137, 145, 148, 152, 164, 174, 176, 177, 181, 186, 197
synapsis, 81
synaptic pruning, 83, 86, 87
synchronization, 171, 252
syntactic dependency, 49, 163
syntactic freezes, 28, 59
syntactic movement, 49
syntactic violations, 39, 51, 96, 97, 159, 161
syntax-discourse interface, 88, 219, 220, 221, 225, 247
syntax-pragmatics interface, 219, 257
temporal cortex, 84
temporal region, 16, 38, 74, 83, 84, 85, 87, 91, 96, 100, 102, 105, 106, 110, 111, 113, 121, 123, 124, 129, 146, 160, 171, 179, 231, 232, 253, 255, 256
token, 57, 61, 158, 159, 160, 161
topic shift, 23, 214, 219
TP (Transition Probabilities), 12, 17, 18, 31, 35, 44, 46, 51, 57, 58, 72, 84, 121, 123, 136, 148, 153, 154, 155, 156, 165, 166, 167, 168, 169, 176, 182, 183, 184, 185, 186, 187, 189, 190, 192, 195, 197, 198, 202, 206, 211, 215, 228, 230
traces, 48, 49, 173, 175, 205, 206, 214, 224
training, 36, 95, 115, 116, 124, 125, 126, 127, 129, 130, 148, 156, 158, 159, 160, 161, 166, 169, 178, 248, 256
twin representations, 18, 139
type of instruction, 36
typological differences, 98, 107
UG (Universal Grammar), 44, 45, 46, 76, 77, 78, 118, 150, 189, 221, 222, 225, 226, 227, 231, 238
ultimate attainment, 77, 93, 212, 235

unanalyzed whole, 18, 34, 69
unergative, 192, 193, 195
ungrammatical, 40, 47
uninterpretable features, 135, 199, 211, 223, 224
units, 12, 27, 28, 49, 54, 56, 57, 58, 59, 60, 61, 63, 78, 162, 166, 167
universal principles, 46
unsupervised, 149
usage-based approaches, 26
U-shaped, 14, 67
variation sets, 159, 160, 196, 197, 250
verb inflection, 135, 238
verb morphology, 17, 64, 66, 67, 68, 98, 115, 226
verb subcategorization, 51, 97, 151
verbal aspect, 224
verb-subject, 135, 142, 143, 219
vocabulary, 69, 83, 93, 94, 99, 108, 109, 143
voxel-based morphometry, 89, 90
weather prediction task, 120
Wernicke’s area, 84, 237
wh- constituents extraction, 12
wh- movement, 177
white matter, 91, 240
window of opportunity, 87, 91, 98, 206
word boundaries, 168
words distribution, 156, 207
word order, 125, 135, 136, 155, 161, 207, 208, 259
word segmentation, 157
word sequence, 17, 54, 131, 143
word-order, 51, 126, 128, 132, 208
working memory (WM), 48, 49, 99, 115, 118, 139, 172, 209, 217, 218
Zipfian laws Zipf, 206
ϕ-features, 223