Meaning in the Second Language 9783110211511, 9783110203226

This book reviews recent research on the second language acquisition of meaning with a view of establishing whether ther

English Pages 337 [340] Year 2008

Table of contents :
Frontmatter
Table of contents
Chapter 1. Introduction
Chapter 2. Architecture of the language faculty
Chapter 3. Psycholinguistic models of sentence comprehension
Chapter 4. What are imaging and ERP studies of bilinguals really testing?
Chapter 5. The Bottleneck Hypothesis
Chapter 6. Evidence from behavioral studies: Simple Syntax–Complex Semantics
Chapter 7. Evidence from behavioral studies: Complex Syntax–Simple Semantics
Chapter 8. Implications
Backmatter

Meaning in the Second Language



Studies on Language Acquisition 34

Editor Peter Jordens

Mouton de Gruyter Berlin · New York

Meaning in the Second Language by Roumyana Slabakova

Mouton de Gruyter (formerly Mouton, The Hague) is a Division of Walter de Gruyter GmbH & Co. KG, Berlin.

Printed on acid-free paper which falls within the guidelines of the ANSI to ensure permanence and durability.

Library of Congress Cataloging-in-Publication Data

Slabakova, Roumyana. Meaning in the second language / by Roumyana Slabakova. p. cm. - (Studies on language acquisition ; 34) Includes bibliographical references and index. ISBN 978-3-11-020322-6 (cloth) 1. Second language acquisition. 2. Semantics. 3. Grammar, Comparative and general. I. Title. P118.2.S578 2008 401'.93-dc22 2008037454

Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.d-nb.de.

ISBN 978-3-11-020322-6 ISSN 1861-4248 © Copyright 2008 by Walter de Gruyter GmbH & Co. KG, D-10785 Berlin. All rights reserved, including those of translation into foreign languages. No part of this book may be reproduced in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Cover design: Sigurd Wendland, Berlin. Printed in Germany.

“One of my all-time favorite jazz pianists is Thelonious Monk. Once, when someone asked him how he managed to get a certain special sound out of the piano, Monk pointed to the keyboard and said: ‘It can’t be any new note. When you look at the keyboard, all the notes are there already. But if you mean a note enough, it will sound different. You got to pick the notes you really mean!’ I often recall these words when I am writing, and I think to myself, “It’s true. There aren’t any new words. Our job is to give new meanings and special overtones to absolutely ordinary words.” I find the thought reassuring. It means that vast, unknown stretches still lie before us, fertile territories just waiting for us to cultivate them.” From Jazz Messenger, an essay by Haruki Murakami, translated by Jay Rubin, published in The New York Times Book Review on July 8, 2007

Acknowledgements

Many people have made invaluable contributions to this project, and all of them deserve my profound gratitude. First of all, I am indebted to the participants in the experiments described in the book, the adult second language learners, whose mental grammar we, the researchers, are striving to capture. Without their generosity and time (and if they didn’t take us seriously), there would be no research.

A book on generative second language acquisition cannot appear in a vacuum, and I have benefited from ideas and discussion with numerous colleagues in this vibrant field of linguistics, most notably Lydia White, Rex Sprouse, Laurent Dekydtspotter, and Donna Lardiere. In addition to those colleagues (and others too numerous to mention), I would like to thank Silvina Montrul for countless, and ongoing, discussions of all matters to do with second language acquisition, and in particular, where our field is heading. I have summarized and commented on many experimental studies in this book, and, by adjusting their findings in trying to make a theoretical point, I have certainly diminished their complexity and richness. I encourage readers interested in specific experimental work to go to the original publications.

I want to specially acknowledge my students in the Generative Second Language Acquisition class at the University of Iowa (given every fall), for asking tough questions and generally displaying enthusiasm for this line of linguistic inquiry. The University of Iowa Department of Linguistics has provided a stimulating and friendly environment these last ten years (and my colleagues cannot be held responsible for the weather!). I am also grateful to the College of Liberal Arts and Sciences, the Graduate College, the Office of the Vice Provost for Research, International Programs, and the Obermann Center, all at the University of Iowa, for funding my various research projects, developmental leave, and conference travel.
I am thankful to William Davies, Alice Davison, and Elena Gavruseva for reading and commenting on chapters of this book. I am especially obliged to Silvina Montrul, Paula Kempchinsky, and Vladimir Kulikov, who selflessly read the whole manuscript and provided sterling suggestions for improvement. Thanks also go to the graduate research assistants Alisa ChenYang Chen, Lalita Dareshwar, and Kum-Young Lee, who checked all the references (twice) and googled out some of those first names. I’d like to acknowledge Professor Peter Jordens, Editor of the Studies on Language Acquisition series, and Ursula Kleinhenz at Mouton de Gruyter for encouraging me with this project every step of the way.

Finally, I owe a world of gratitude to my husband Dr. Zlatko Anguelov, who has helped me in a double capacity. First, with his professional input: laying aside projects of his own, he edited every chapter of this book and improved its clarity. In the process, he probably learned more about linguistics and cognitive science than he needed to know. Secondly, and much more importantly, with his unstinting love and support throughout the years.

February, 2008 Iowa City

Table of contents

Acknowledgements

Chapter 1: Introduction
1. The critical period in language acquisition: Observations and claims
2. Critical periods: Possible biological explanations
3. Refining the quest: Multiple critical periods
4. A look ahead

Chapter 2: Architecture of the language faculty
1. The Minimalist Program
2. Syntax-semantics equivalence?
3. Syntax-semantics mismatches
4. What’s wrong with “syntactocentrism” (Jackendoff, 2002)?
5. There is more to the syntax-semantics interface than meets the syntax
6. Relevant theoretical assumptions

Chapter 3: Psycholinguistic models of sentence comprehension
1. Why look at psycholinguistics?
2. What is the relationship between the parser and the grammar?
3. Evidence for modularity of “syntax” and “semantics”
4. Interaction between syntax and semantics in sentence processing
5. Neurophysiology and electrophysiology of L2 processing
6. Processing of closed-class versus open-class lexical items

Chapter 4: What are imaging and ERP studies of bilinguals really testing?
1. Linguistic tests
2. Can neuroscience inform theories of linguistic development, just yet?
3. Studies of semantic processing
4. What exactly should we test when we are testing semantics?
5. Stimuli in studies of syntax

Chapter 5: The Bottleneck Hypothesis
1. What must a theory of second language acquisition explain?
2. Current transition theories
2.1. Autonomous Induction Theory
2.2. Acquisition by Processing Theory
2.3. Shallow Structure Hypothesis
2.4. Constructionism
3. Functional morphology is the bottleneck, syntax and semantics flow smoothly
3.1. Syntax is easier than functional morphology
3.2. Semantics also depends on the functional morphology
3.3. Processing of morphology in the L2: Neuroimaging and ERP findings
3.4. Why is morphology so difficult to learn and to process?
4. Another transition theory proposal: Variational Learning
5. Conclusions

Chapter 6: Evidence from behavioral studies: Simple Syntax–Complex Semantics
1. Some methodological considerations
2. Interpretive dependencies of binding
2.1. Binding of reflexives
2.2. The Overt Pronoun Constraint
3. Aspectual challenges
3.1. Preliminaries
3.2. Telicity marking in Bulgarian-English interlanguage
3.3. Telicity marking in English-Russian interlanguage
3.4. Telicity and nominal interpretation
3.5. Acquisition of grammatical aspect (English and Japanese L2)
3.6. Acquisition of grammatical aspect (Spanish L2)
3.7. Uninstructed aspect-related semantic properties
4. Acquisition of article interpretation
5. Acquisition of subjunctive mood
6. Conclusions

Chapter 7: Evidence from behavioral studies: Complex Syntax–Simple Semantics
1. Process and result nominals
2. Wh-quantifiers and tense distinctions
3. Quantification at a distance
4. Combien extractions
5. Quantifier scope
6. Scrambling
7. Acquisition of wh-movement
8. Interpretation of subject and object questions at the initial state
9. Overview and conclusions

Chapter 8: Implications
1. What is difficult and what is easy in L2 Acquisition?
2. Where should researchers be looking next?
3. Implications for teaching
4. Last words

References
Index

Chapter 1 Introduction

Few people start learning a second language for the exotic sounds, or for the elegant sentence structure that they detect in it. Meaning is what we are all after. We would all like to understand and to be able to convey thoughts and feelings and observations. In this book, I will try to examine the road to meaning, that is, how we get to understand and convey meaning in a second language, and where the pitfalls to that may lie. This book is about acquiring the ability to comprehend and produce sentences and discourse in a second language.

I start by examining the differences and similarities between child language acquisition and adult second language acquisition. I shall introduce and explore one of the most important ideas in the study of language development: that of a critical period for language acquisition. Next, I shall look at some biological explanations for critical periods and the idea that there may be multiple critical periods for many functions. Linguists have discussed various dimensions along which critical periods in language acquisition may differ. Exploring several of those approaches, I conclude that it is best to conceive of multiple critical periods based on the current views of language architecture. At the close of the chapter, I offer an overview of the ideas proposed in the book.

1. The critical period in language acquisition: Observations and claims

The outcome of first language acquisition is a success: before they go to elementary school, normally developed children have a complex command of the grammar of the language that surrounds them. Adult second language acquisition, on the other hand, results in varying degrees of success. Even after many years in a country where the new language is spoken, second language speakers are perceived as different from native speakers. Failure to acquire some aspects of the target language grammar is common; failure to sound like a native speaker is typical.
The well-known contrast in attainment – universal success in the case of child language (L1) acquisition, variable success in second language (L2) acquisition – has been amply documented (e.g., Johnson and Newport, 1989; Sorace, 1993, among many others). Striving to explain this difference in ultimate attainment has been rightfully promoted to the forefront of L2 acquisition research and has engendered much debate.

This contrast in ultimate attainment has usually been related to the influence of age. It has been widely documented that the earlier the onset of L2 acquisition, the more native-like the ultimate attainment is. Thus, a reasonable explanation for the facts of L1 and L2 acquisition is given by the Critical Period Hypothesis (Penfield and Roberts, 1959; Lenneberg, 1967). In its most succinct formulation, it states that there is a limited developmental period during which it is possible to learn a new language (be it L1 or L2) to normal, native-like levels. Once this window of opportunity has closed, however, the ability to acquire languages declines (for recent surveys, see Hyltenstam and Abrahamsson, 2003; DeKeyser and Larson-Hall, 2005; Birdsong, 2005).

Still, the issues raised by the influence of age, and the explanations proposed for it, differ between L1 and L2 acquisition. The effects of the timing of language exposure are more dramatic in L1 than in L2 acquisition. It is relatively well established that failure to engage the “language acquisition device” in children through exposure to meaningful input (due to deprivation or isolation) results in severe linguistic deficits that cannot be overcome by subsequent exposure to language. In the well-known case of Genie (Curtiss, 1977, 1988), this exposure to language came at around 13 years of age. Despite intensive educational and therapeutic interventions after that, her syntax lagged significantly behind her lexical growth, and she had particular difficulty with verb tense, word order, prepositions, and pronouns. There was also an unusually large discrepancy between her production and comprehension.1 The case of E.M. (Grimshaw et al., 1998), a deaf adolescent from rural Mexico who came to Canada at age 15, was fitted with a hearing aid and began learning Spanish, shows more or less the same linguistic development but without Genie’s severe social isolation. “Chelsea” (Curtiss, 1988), another late-diagnosed deaf individual (age 31), did not develop even rudimentary language abilities. Data from deaf children who started acquiring American Sign Language between the ages of 4 and 6 showed slightly less than native fluency, while those who started their L1 acquisition after puberty never reached native proficiency (Newport, 1990). These data point to the conclusion that the human brain is particularly adapted for language acquisition during an early period of life and that, if this window of time is not utilized, slight to severe divergence from the native norm ensues.

However, as Lenneberg himself acknowledges (Lenneberg, 1967: 176), it is not entirely obvious how the Critical Period Hypothesis relates to L2 acquisition, since L2 acquirers already have a native language and the language centers in the brain have presumably been activated in the opportune window. Thus, it is more appropriate to consider age-related effects in L2 acquisition, for example, a downward but even slope in proficiency scores, rather than a critical cut-off point after which it becomes impossible to achieve native-like proficiency (see Birdsong, 2005, for many explicit arguments in favor of this idea). The age variable examined in L2 acquisition studies is usually the age of first exposure to the L2 (or age of acquisition). In studies of immigrant populations, this is typically indexed by the participant’s age of arrival in the host country.

Nowadays, researchers of the critical period within (global) L2 acquisition fall roughly into two camps: those arguing for the “robustness” of critical period effects (DeKeyser, 2000; Johnson and Newport, 1989; MacDonald, 2000; Newport, 1991) and those who claim that while there is no real cut-off point for language acquisition abilities, there are age effects that persist throughout the life span (Birdsong, 2005; Birdsong and Molis, 2001; Flege, 1999; Hakuta, Bialystok, and Wiley, 2003). For example, Johnson and Newport (1989), the classic (and frequently replicated) study in this field of inquiry, found that age of arrival correlates negatively with near-native performance: the older the participants on arrival in the host country, the less successful language acquirers they are. The correlation between age of acquisition and scores on their 276-item test was –.87 for those learners who started before age 16, while it was only –.16 for those who started after age 16. On the other hand, Hakuta et al. (2003) examined 1990 US census reports that included self-reported English proficiency ratings for 324,444 Chinese-speaking and 2,016,317 Spanish-speaking immigrants to the US and found no abrupt discontinuity but rather a downward linear trend in the proficiency ratings.
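
The disagreement between the two camps is, at bottom, a claim about the shape of the relationship between age of arrival and attainment. As a minimal sketch (it is not Johnson and Newport's analysis), the Python snippet below computes Pearson's r separately for early and late arrivals. The scores are invented purely for illustration, the split at age 16 follows the text, and the names `pearson_r`, `early`, and `late` are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Invented (age of arrival, test score) pairs; NOT data from any study.
early = [(3, 270), (5, 268), (7, 262), (9, 255), (11, 250), (13, 240), (15, 236)]
late = [(17, 215), (20, 200), (23, 222), (26, 205),
        (29, 218), (32, 198), (35, 220), (38, 204)]

# zip(*pairs) splits the pairs into a tuple of ages and a tuple of scores.
r_early = pearson_r(*zip(*early))
r_late = pearson_r(*zip(*late))

print(f"early arrivals: r = {r_early:.2f}")
print(f"late arrivals:  r = {r_late:.2f}")
```

With these invented numbers the early group shows a strong negative correlation and the late group only a weak one, which is the kind of pattern the first camp reads as a discontinuity. A single linear decline across all ages, of the sort reported by Hakuta et al. (2003), would show no such break between the groups.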
Much previous research has suggested that age of acquisition is an important determinant of overall degree of success, but certainly not the only one. In fact, many researchers who observe that variables like type of input, language use and various social-psychological as well as cognitive factors play a role in ultimate attainment also claim that these variables confound the effects of age. Among these additional variables, a lot of attention has been paid to verbal aptitude. DeKeyser (2000) points out that among his Hungarian adult learners of English, those with higher aptitude scores also obtained more native-like ratings in English, while for the child L2 learners this was not the case. He argued that there was no age effect for high-aptitude learners, while there was such an effect for low-aptitude learners. (However, see Bialystok (2002) for strong arguments against DeKeyser’s interpretation of his results.) Social and psychological factors like integrative motivation and positive attitude to the L2 community have not been proven to be sufficient, nor even necessary, for linguistic success.

Cognitive differences between children and adults have also been suggested as a possible explanation for critical period effects. For example, the “less is more” hypothesis (Newport, 1990) proposes that limitations of cognitive capacity allow children to focus on and store the main features of the input, while adults overanalyze the input and memorize big chunks of it, because they can. While children compute simpler form-meaning relationships online due to memory constraints, the adult strategy of memorizing proves inefficient and disastrous in the long run. However, formulaic L2 learning, which purportedly speeds initial acquisition but is detrimental later on, has been documented in child L2 learning as well (Wong-Fillmore, 1976). In addition, Long (1990) provides strong arguments against cognitive explanations, noting that general problem-solving abilities do not correlate with language proficiency. All things considered, it seems relatively safe to accept that, if there is a critical period at all (and not just linear age effects), it would not be crucially dependent on motivation, verbal aptitude, or cognitive abilities.

2. Critical periods: Possible biological explanations

In the previous section, we focused on the visible effects of the timing of language acquisition onset: what happens when L1 or L2 acquisition starts at birth, or after puberty. In this section, we will look at some biological explanations for these visible effects. The Critical Period Hypothesis rests on the assumption that the qualitative change in the language-learning abilities of children and adults is the result of maturational changes in the brain structures that are used to learn and/or process language.
Lenneberg’s original idea was to attribute the critical period for first language acquisition to progressive brain lateralization. It has also been hypothesized that, as the brain matures, it becomes “less plastic”, in the sense that neurons are less likely to make new connections, and that reduced connectivity impedes L2 learning (Long, 1990; Patkowski, 1980). Some researchers have argued that we should see this brain maturation as the process of myelination of cortical neurons (Pulvermüller and Schumann, 1994). However, Eubank and Gregg (1999) submit that most of the evidence for a critical period in natural development comes from non-human species and that this evidence is far from complete.

Critical periods in biological development have been recognized by biologists for over a century. Beginning with the behavioral observations of Konrad Lorenz (1958), this concept has profoundly influenced psychologists, philosophers, parents, and educators. Hensch (2004), in a recent review of critical period regulation in biological development, describes well-studied and less well-studied critical period effects. Examples include the classical case (Lorenz, 1958) of the filial imprinting of certain precocial birds on a parental figure hours after hatching. It is now known to be a two-step process consisting of predisposition followed by actual learning (Bolhuis and Honey, 1998) and integrating many sensory modalities: vision, taste, olfaction, and audition. Another example is the primary visual cortex, where converging inputs from the two eyes typically compete for connectivity (ocular dominance), with a peak sensitivity to monocular deprivation around one month after birth in cats and rodents (Hubel and Wiesel, 1970). In humans, intense auditory training in musicians before the age of seven leads to an increased cortical representation of piano tones, not attested for other tones (Pantev et al., 1998). In musicians who develop absolute pitch, there is a significantly greater left hemisphere asymmetry in planum temporale activation (Schlaug et al., 1995). Age of initial musical training has been found to be crucial for these developmental changes.

Hensch (2004) offers nine common traits of all such effects observed and described in the sensory and cognitive systems of animals and humans, with a view to formulating a general theory of critical periods. The fourth and fifth traits are particularly relevant to linguistic critical periods:

“Fourth is the regulation of critical period onset and duration not simply by age, but rather by experience. If appropriate neural activation is not provided at all, then developing circuits remain in a waiting state until such input is available. Alternatively, enriched environments may prolong plasticity. In other words, the critical period is itself use-dependent. Understanding the cellular mechanisms of this effect will greatly influence strategies for life-long learning. Fifth is the unique timing and duration of critical periods across systems. (…) Cascades of critical periods and their cumulative sequence at different ages and levels of processing shape each brain function as the relevant neural pathways develop to a point where they can support plasticity.” (p. 550-2)

Critical periods are described for phenomena where genetic hardwiring and exposure to signals from outside the organism conspire to form the mature state. The current level of knowledge in the biological sciences does not allow any conclusion more definitive than that a critical period is an extreme form of a more general sensitivity, when neuronal properties are particularly susceptible to modification from experience; nonetheless, this transient level of brain plasticity does not preclude the possibility of life-long learning. Two views have been defended in the literature, depending on what specific critical period effects were being studied. In one view, the potential for plasticity is never lost, but merely tempered by evolving dynamic neural activation (Fagiolini and Hensch, 2000). On this view, it is not impossible to “coax” neural networks out of one stable state into another through training regimens. Alternatively, a class of factors appears in the development of other biological systems (e.g., myelinization and perineuronal nets in visual cortex) that are inhibitory to further plasticity, eventually preventing large-scale circuit reorganization and thereby structurally closing off the critical period (Schoop et al., 1997). Evidence has been provided for both views. As is common in biology, both views are likely to be correct, for different critical period effects.

3. Refining the quest: Multiple critical periods

Whatever the correct view of critical periods in biological development turns out to be, another idea seems to be eminently applicable to language acquisition: the existence of multiple critical periods for each brain function. For example, the three-step process through which some birds acquire a single, stereotyped song highlights the serial nature of multiple critical periods underlying complex brain development.
Young birds first memorize the song of a tutor during a sensory acquisition phase; that is followed by a sensorimotor practice phase when the bird actively matches its vocal output to the remembered template. These two phases may overlap (as in the zebra finch) or be separated by several hundred days (as in the swamp sparrow). Learning ends in song crystallization, when note structure and sequence become highly stereotyped. Doupe and Kuhl (1999) have drawn multiple comparisons with child language development stages.

Among linguists, this important idea has gained a lot of support recently: there is not just one but multiple critical periods for language acquisition (see Lamendella, 1977, and Seliger, 1978, for previous suggestions along these lines, and more recently Eubank and Gregg, 1999; Lee and Schachter, 1997). If one makes the common assumption that the language modules are phonology, morpho-syntax, and semantics, dealing with sound, word and morpheme order, and meaning, respectively, it is conceivable that phonology, morpho-syntax and semantics could have different critical periods. Put another way, there could be differential age-related effects for the different parts of grammatical competence.

However, this differentiation has been envisaged not only along language modules but also along other dimensions. The famous study of Coppieters (1987) asked whether near-native speakers who have reached a level of apparent surface equivalence with native speakers in language use and proficiency (i.e., performance) also have the same underlying competence as native speakers. He divided the linguistic properties he tested into structures falling under the UG umbrella (complex syntax) and so-called cognitive or functional aspects of language such as the imparfait–passé composé distinction. Coppieters found near-native speakers to be more accurate on the UG than on the non-UG properties. He documented the most divergence with these latter test items, concluding that native speakers and near-native speakers did not interpret sentences in the same way. Although lacking appropriate theoretical and methodological tools, Coppieters at that time implicitly differentiated formal properties of grammar (acquirable) from interpretive properties (which may be indeterminate, in the sense of Sorace 1999, 2000).

Birdsong (1992) offered an extensive criticism of Coppieters’s (1987) study, mostly based on methodological grounds. In replicating it, Birdsong designed a more appropriate test instrument and applied different subject-selection criteria.
He found 15 out of 20 individuals whose intuitions of French syntax were not substantially different from those of native speakers. No evidence for a UG vs. non-UG distinction was found. Johnson and Newport (1991) found the same high correlation (–.63) between age of acquisition and proficiency for UG properties, such as subjacency, that they had found in Johnson and Newport (1989) for more general morphosyntactic properties such as subject-verb agreement.2

Rejecting the idea that differential age-related effects can be linked to UG vs. non-UG structures, researchers have examined more closely the different modules of the language faculty. For example, Eubank and Gregg (1999) proposed that critical or sensitive periods affect various areas of linguistic knowledge differently (i.e., phonology, morpho-syntax, lexicon, etc.) and even subcomponents of these modules (e.g., lexical items, inflections, syntactic effects of abstract features). Lee and Schachter (1997) suggested that principles of UG (e.g., binding, subjacency) and parameters have different age-related cut-off points for successful triggering and acquisition. Proposals of this type can be united under the label Multiple Critical Period Hypothesis.

In recent years, efforts have turned to isolating precisely which linguistic modules, submodules, features, or interface areas are affected, how, and why. A clear example of those efforts is Beck’s (1998) proposal that there exist localized critical periods specifically affecting the feature strength of functional categories. She used this idea to explain persistent fossilization in the morpho-syntactic domain. More specifically, Beck studied verb movement in English-German interlanguage and found a dissociation between knowledge of verbal inflection (as measured by a translation task) and knowledge of verb movement (as measured by a psycholinguistic sentence matching task). She argued that learners’ grammars demonstrate a local impairment as to the strength of the feature that regulates verb movement from V to I to C and results in the V2 effect in German. In the absence of this knowledge, learners produced and accepted sentences with raised and non-raised verbs, optionally and indiscriminately. She speculated that the recent theoretical treatment of language variation in terms of the lexical properties of functional morphemes can explain her acquisition results. Given that the grammatical information that drives movement (the weak versus strong feature of a particular piece of morphology) resides in a highly differentiated lexicon, it is possible to posit a selective deficit in one facet of a functional lexical item but not in another.
Thus, Beck’s (more advanced) learners had acquired the phonological form of the verbal inflectional morphemes but not the strength features that are part and parcel of the lexical entry of that verbal inflection. The dissociation of correct morphology and correct syntax follows. Similar views were also defended in Eubank and Grace (1996) and in Eubank and Gregg (1999). As we shall see later in this book (in chapter 5), however, the majority of L2 acquisition findings are not in agreement with Beck’s findings. Although correct functional morphology and correct syntactic reflexes of that morphology can be dissociated, the advantage (in terms of higher accuracy) goes to the syntax, not to the morphology.

Another example of differential critical period effects for different linguistic properties is Hawkins and Chan’s (1997) Failed Functional Features Hypothesis. Its updated and more precise version (Hawkins, 2005; Hawkins and Hattori, 2006: 128), following Tsimpli (2003) and Tsimpli and Dimitrakopoulou (2007), claims that uninterpretable features not selected from the UG inventory of features during the critical period disappear. Let me briefly define interpretable and uninterpretable features here, while postponing a closer look at these definitions to the next chapter. Features that make an essential contribution to meaning (e.g., plural, human, gender, or aspect) are interpretable, whereas those that are purely grammatical and only relevant to the morpho-syntax (e.g., case or agreement) are uninterpretable. Only uninterpretable features can trigger movement of constituents. Now, Hawkins and Hattori’s (2006) view implies that L2 learners may be able to map features from functional categories in their L1 to new L2 morpho-phonological material, but they will not have access to the uninterpretable features of the L2 not present in their L1. The result is that L2 learners may use the morphology of the L2 with the feature specifications of their L1.

Hawkins and Hattori (2006) tested the predictions of this hypothesis, which they dubbed the Interpretability Hypothesis, following Tsimpli and Dimitrakopoulou (2007), on the L2 acquisition of English wh-questions by Japanese native speakers. Hawkins and Hattori start from the assumption that an uninterpretable strong feature drives English wh-phrases to sentence-initial position, while the equivalent feature in Japanese is weak.
Observing that a number of studies (using both production and grammaticality judgment tasks) have attested Japanese learners’ apparent knowledge of the fact that in English wh-words and phrases move sentence-initially, while in their native language they remain in situ, Hawkins and Hattori attempt to find out whether their learners obey constraints on wh-movement triggered by an uninterpretable strong wh-feature, such as subjacency and superiority.3 They find that their Japanese participants do not obey these constraints while the natives do. Hawkins and Hattori speculate that the apparently high accuracy of Japanese learners on grammaticality judgment tasks containing English wh-questions can be explained by a grammar that uses a strong focus feature instead of a strong wh-feature. This strong focus feature, which is also uninterpretable, is selected by Japanese and thus can be re-used in the L2 English grammar of these learners.4 Note that this hypothesis can only be tested in the acquisition of properties which involve the operation of uninterpretable features, and not any other features, including interpretable syntactic and semantic features, since, by hypothesis, the latter are learnable in the second language.

Sorace (2000, 2003; see also Felser et al., 2003; Marinis et al., 2005; Tsimpli et al., 2004) advances another hypothesis. Aspects of grammar that require the integration of syntactic knowledge with other types of information (e.g., pragmatic, semantic, prosodic) are more problematic for L2 learners than properties that require only syntactic knowledge. The former may present residual difficulties even at the near-native level. In other words, the vulnerability resides at the syntax-semantics or the syntax-pragmatics interface. This proposal implies terminal inability for near-native speakers to retreat from optionality in production or indeterminacy in their comprehension judgments for properties located at the interfaces. Sorace dubs this the Interface Hypothesis.

Sorace and Filiaci (2006) offer an interesting clarification and specification of these claims. In testing the syntactic and pragmatic knowledge of English-speaking learners of Italian at the near-native level with respect to pronoun-antecedent ambiguity resolution, Sorace and Filiaci find that the syntactic constraints on pronoun antecedents are indeed observed by their participants. What is non-native-like in their performance is a processing principle called Position of Antecedent Strategy (Carminati, 2002, 2005), which regulates pronoun ambiguity resolution by postulating that null pronouns refer to higher subjects, while overt pronouns refer to objects or adjuncts (provided Principle B is not violated). The Position of Antecedent Strategy is a processing strategy, not a syntactic constraint, because its violations do not lead to ungrammaticality and because the strategy can be overcome by pragmatic and contextual clues.
Since Sorace and Filiaci found that their near-native speakers’ preferences differed from those of native speakers in contexts where extra processing resources had to be brought to the interpretation process for the overt pronouns’ ambiguity resolution, they concluded that the near-native speaker–native speaker difference lies in the exact processing strategies for pronoun resolution. Near-native speakers may lack the processing resources to obey fully the Position of Antecedent Strategy processing principle. Thus Sorace’s earlier representational-deficit account has evolved into a processing-load account, in light of more data.

The considerable merits of Beck’s and Eubank’s selective deficit proposal, Tsimpli and Hawkins’s Interpretability Hypothesis and Sorace’s Interface Hypothesis stem from the fact that all of these proposals are firmly grounded in linguistic theory and use legitimate distinctions postulated by the architecture of the language faculty. We should expect refinements of the Critical Period Hypothesis and further progress in research on controlling linguistic critical period effects to come from contemporary linguistic or psycholinguistic theory. Their other merit is that these proposals make testable predictions. For example, the Interpretability Hypothesis makes a claim about a permanent inability of adult learners, after an (unspecified) critical period, to acquire a certain type of linguistic features. Furthermore, it is not enough to show that some learners have difficulty with these features; one must show that near-native learners, who are perceived to be as close as possible to native speakers’ competence, do. My own proposal will also use these linguistic distinctions, but I will argue that the bulk of data point to another conclusion. The prediction, of course, would be of a different nature.

4. A look ahead

This book continues the vein of research arguing for multiple critical periods by promoting the positive side of the argument in terms of possibility of acquisition, or linguistic potentials. In this respect, it meets Hyltenstam and Abrahamsson’s (2003) requirement for future research on the Critical Period Hypothesis that “the most fruitful way to research maturational constraints is to focus explicitly on ultimate L2 learning potentials – in late as well as in early starters” (p. 567). I will seek to demonstrate not which parts of the grammar are subject to age-related effects but rather which are not subject to such effects. As mentioned in the previous section, it is a very common assumption, articulated in, for example, Paradis (2004: 119, but see also Jackendoff, 2002; Samuels, 2000), that the language modules are phonology, morpho-syntax, and semantics. The Multiple Critical Period Hypothesis will be refined by arguing that “morpho-syntax” is not a monolithic module in terms of processing and development and should be divided into morpho-phonology and syntax, with possibly quite different neural pathways.
Semantics should also be viewed as comprising two types of linguistic operations on two levels: lexical semantics and phrasal semantics (Jackendoff, 2002). Support for this claim will come from experimental evidence that functional morphology and other syntactic effects are differentially affected in the behavior of L2 learners, and from a re-interpretation of (the bulk of) neurolinguistic studies of L2 comprehension. Thus, when we are told that “syntax” is processed differently from “semantics”, what we actually have is inflectional morphology encoded in the functional lexicon being treated differently from a universal process of semantic composition that comes to the learners for free from their L1 or UG. Furthermore, I will look at acquisition of phrasal semantics, which, similarly to the acquisition of the more subtle syntactic properties that come from UG, does not present insurmountable difficulty to the L2 learner. Behavioral studies of learners at all levels of proficiency will be reviewed. Copious evidence from behavioral studies will be provided in chapters 6 and 7, indicating that abstract syntactic properties and mismatches at the syntax-semantics interface are successfully acquired by adult L2 learners. Based on their success in acquiring interpretive properties, I will argue that there is no critical period for (phrasal) semantics.

In a nutshell, I will propose the Bottleneck Hypothesis, which will argue that there are “tight places” in the flow of L2 development and there are more “fluid” domains. To start with the already established facts, one such tight spot is functional morphology. It is processed by declarative memory, has to be learned by rote, and its forms present difficulty for L2 learners not only at beginning stages of acquisition but at later stages, too (Lardiere, 2005). It is a stumbling block in linguistic production but it is also crucial in comprehension. Most recent research in generative L2 acquisition focuses on morphology-syntax mismatches. Evidence comes from several studies of child and adult L2 production (Haznedar and Schwartz, 1997; Haznedar, 2001; Ionin and Wexler, 2002; Lardiere, 1998a,b). What is especially striking in the data is the clear dissociation between the incidence of verbal inflection in production (e.g., -ed for past tense ranging between 46.5% and 4.5%) and the various syntactic phenomena related to it, like overt subjects, nominative case on the subject, and word order (above 98% accuracy).
Prévost and White (2000a,b) explain this learning situation with the Missing Surface Inflection Hypothesis. The variable production of the morphology, according to this hypothesis, is due to learners’ imperfect mapping of specific morphological forms to abstract syntactic categories. In such cases (which could be due to processing difficulties and more often attested under communication pressures) learners resort to defaults, forms that are underspecified in some features. However, the L2 learners have acquired the relevant features of the terminal nodes in the syntax (from the L1, or on the basis of L2 input constrained by UG).


In the same way that abstract syntax is a relatively “easier” domain to acquire, the relative ease of semantics is documented but not widely acknowledged. The Bottleneck Hypothesis finds empirical support in a number of recent studies to be reviewed and critiqued in chapters 6 and 7. We will identify two types of learning situations that L2 acquirers are faced with. In one type of learning situation, illustrated in Montrul and Slabakova (2002), the challenge for the learners lies at the syntax-semantics interface. A number of meaning primitives (habitual action, ongoing action, etc.) are subsumed in one aspectual tense morpheme, while another array of meanings pertains to its apparent equivalent in the target language. In this learning situation, which I call Simple Syntax–Complex Semantics, both initial transfer from the native language and subsequent incremental development reaching native levels are attested.

In the other type of learning situation, exemplified by Dekydtspotter and Sprouse (2001), movement of a constituent in one construction (the continuous wh-phrase qui de célèbre in French) but not in another (the discontinuous wh-phrase qui … de célèbre) creates scope effects reflected in the presence or absence of present (speech-time) interpretation. I will argue that in such cases it may be the syntax that presents more difficulty to learners and native speakers, while the semantics is fairly straightforward. That is why I call this learning situation Complex Syntax–Simple Semantics. Learners’ performance is characterized by lower acceptance rates altogether, but both proficient and less proficient learners demonstrate that they have established the semantic contrast in their interlanguage grammars. Once learners are capable of understanding the test sentences by being able to parse their complex syntax, they have no trouble with the available interpretations, since there is no syntax-semantics mismatch and they have recourse to the universal semantic calculation procedures. In both types of learning situations, then, adult learners demonstrate that they are perfectly capable of acquiring mismatches, constituent movements, and the resulting changes of meaning. There is no barrier to success or a cut-off age after which acquisition efforts will be futile.

The Bottleneck model is a theory about L2 development and ultimate attainment. It will identify areas of the grammar where learners can expect to encounter enhanced difficulty and other areas where they can expect to swim (relatively) free. Since it is a model looking at linguistic potentials, evidence for it can come from learners of all proficiency levels, from beginners to near-native speakers. It predicts that in any learning situation, the difficulty will come from the inflectional morphology but not from compositional semantics. There are clear implications of this model for the Critical Period Hypothesis. In light of recent neurolinguistic, psycholinguistic, and behavioral findings, the strong version of the Critical Period Hypothesis has to be reconsidered. As demonstrated in the previous discussion, there is not just one but multiple critical periods for the different linguistic modules, and there may be no critical period for the acquisition of meaning. A hypothesis which tries to account for a complex and as yet poorly understood cognitive process such as language comprehension will of necessity be wide-ranging, using a “big brush”. Its main significance is that it makes testable and falsifiable predictions for L2 acquisition, thus engendering future research agendas.
Its main advantage is that it combines neurocognitive and behavioral findings with linguistic sophistication. Although largely of theoretical import, the model will have practical implications for L2 instruction. For example, in order to boost the (relatively easier) acquisition of meaning, the model will suggest that learners should be trained systematically on functional morphology, while at the same time being exposed to a wealth of naturalistic, meaningful input, which will allow them to fully engage the linguistic structures involved in language comprehension.

In chapter 2, I will present two current models of the architecture of the language faculty: Chomsky’s Minimalist view, which places all linguistic generation squarely in the syntax, and Jackendoff’s (1997, 2002) view, which attributes generative abilities to all three linguistic modules: phonology, morpho-syntax, and semantics. I will also spell out the theoretical assumptions that I will make in the book. In chapter 3, experimental fMRI and ERP findings on native and L2 comprehension will be reviewed and some recent neuroscience models of language use (Friederici, Hagoort) will be briefly introduced. Our focus will be on the differential, modular processing of syntax and semantics, both in natives and L2 acquirers. In chapter 4, I will critique some of the assumptions made by researchers using functional imaging and electrophysiological techniques and the test items they are using. In chapter 5, I will present four recent proposals for a transition theory of L2 acquisition (Carroll, Sharwood Smith and Truscott, Clahsen and Felser, Herschensohn) as well as Yang’s L1A proposal that marries UG principles and parameters available to the child innately with frequency of the relevant data for each parameter available in the input. I will suggest that, although yet untested for L2A, this latter proposal is most compatible with the existing L2A findings and integrates seamlessly with the Bottleneck Hypothesis. Chapters 6 and 7 will submit the evidence supporting the hypothesis. Chapter 8 discusses ease and difficulty of acquisition more generally, and provides conclusions, implications, and directions for future research.

Notes

1. However, Genie’s childhood was atypical in so many ways that it is impossible to know if her language deficits reflect a critical period mechanism, or impaired cognitive development as a result of her tragic background.

2. Thinking more seriously about UG versus non-UG structures, we can see the absurdity of this differentiation of structures. It is not the case that determiners, word order in the simple sentence, negation or question formation are any less UG-regulated than subjacency and island effects and “wanna” contraction. The difference may be seen more appropriately in whether one type of structure is taught explicitly in language classrooms or learned implicitly from positive evidence only and without negative evidence. Although this distinction was common at the end of the eighties, more recent conceptualization of the human language faculty (Chomsky 1998) suggests that the principles of syntactic composition (how we build up phrases) and semantic composition (how we interpret them) are part of the genetic endowment and do not need exposure to language to be activated. The differences between languages lie in the features (both interpretable and uninterpretable) encoded in the functional lexicon, which bring about movement of heads and phrases. Thus, not even the simplest phrase in an L2 can be produced or comprehended without recourse to the general compositional principles and to a few parametric differences. Therefore, there cannot be any structure outside of the purview of UG.

3. They use a modified truth value judgment task, comprising a story, a wh-question, and three answers. All of the answers are pragmatically possible in the context of the story, but only some of them are answers to legitimate interpretations of the question. We look at this experiment in detail in chapter 7.

4. An appropriate continuation of this experiment would compare a group of Chinese with a group of Japanese learners. Neither Chinese nor Japanese has the uninterpretable strong wh-feature that English has, but only Japanese has a strong focus feature that is purportedly used to compensate for the lack of the wh-feature. Thus, one would expect the Japanese to be able to compensate better than the Chinese. Of course, this effect should obtain only if the groups are at comparable levels of proficiency, specifically on wh-questions.

Chapter 2 Architecture of the language faculty

The answer to the question “What is the architecture of the language faculty?” is important, because it directly bears on what has to be learned or not, and what can come for free in acquiring a second language. In answering the question, I will consider both the Minimalist Program, one of the latest developments in the generative research tradition, and the maverick anti-syntactocentric position of Ray Jackendoff, more widely assumed by psycho- and neurolinguists in their work on processing language. While looking at current theoretical assumptions, I will pay particular attention to how the different types of meaning (lexical, phrasal) are accessed and computed compositionally.

1. The Minimalist Program

The Minimalist research endeavor (Chomsky 1995, 2000, 2001, 2004, 2005) maintains the traditional characterization of language, since Aristotle at least, as a system that links sound and meaning. Thus, the expressions generated by a language must satisfy two interface conditions: those imposed by the articulatory-perceptual (AP) system and those imposed by the conceptual-intentional (CI) system. The language faculty is the optimal realization of the interface conditions. What that means in practical terms is that there is not much redundancy in the system and the notion of economy of derivation applies: if you have to choose between two converging derivations, choose the one with fewer steps, while still obeying local movement. Earlier concepts of the language faculty architecture included the Extended Standard Theory model in the 1970s, where there were different levels of representation: D-structure, S-structure, etc. The Government and Binding model in the 1980s retained those levels. Levels were characterized by certain operations that happened at them. For instance, the verb moved from V to I at S-structure (overtly) in French but at LF (covertly) in English (at least according to some analyses of Pollock’s verb movement parameter).
A general thrust of the Minimalist Program is to keep only those levels that are minimally necessary, which has culminated in abolishing D-structure and S-structure. Chomsky (2001, 2005) defends the idea that there should be only one level of computation, that between Numeration and Spell-Out, where Merge and Agree will apply, thus disposing of LF and PF as levels of computation. After the point of Spell-Out, there are only the AP and the CI interfaces.1 They provide the morpho-syntactic information needed to linearize the linguistic signs (arrange them one after the other in time) and produce a sentence, as well as assign a semantic interpretation to the same sentence.

Movement of syntactic phrases, or the fact that some phrases surface or are interpreted displaced from where they originate, is an undeniable property of human language. In the minimalist framework, movement is explained as necessary for the checking of formal lexical features. Most lexical items come into the numeration pre-specified with their features in the lexicon, and these features are checked in the syntactic derivation.2 Lexical items including inflectional morphemes minimally combine phonological features, semantic features, and formal syntactic features. Two types of features are relevant to the syntax-semantics interface: interpretable and uninterpretable ones. Interpretable (semantic) features are legible at LF and contribute to the interpretation, so they cannot be eliminated. Uninterpretable, or formal features, on the other hand, should be eliminated before Spell-Out, since they do not contribute to meaning. For example, in the sentence in (1), the interpretable feature [plural] on the subject phrase nominal head survives into CI, while the phonetic feature [pronounce the affix as /-s/] is useful at AP. The uninterpretable [plural] feature on the verb, which ensures agreement, on the other hand, is eliminated before CI, though it may survive until AP to be pronounced. Which feature is interpretable and which is not is subject to language variation, so it is predicted to pose a problem for second language learners.
(See Adger 2003, chapter 2, for more examples of features across languages.)

(1) The student-s are arriving.
        [pl]      [upl]

Earlier minimalist analyses also distinguished between checking strong features (the so-called overt movement, or movement by Spell-Out) and checking weak features (or covert movement at LF). Interpretability of features survives into the latest minimalist efforts; however, the strong-weak distinction does not survive in the same form. There are two recent ideas of how overt and covert movement should be reconsidered so that Spell-Out does not feel like a level of representation, and so that movement is not divided into pre- and post-Spell-Out. One idea, Chomsky’s (1995) Move-F, is that it is possible for features to move with or without pied-piping of the whole phrase they are part of. Thus, overt movement takes place if Move-F is accompanied by a morphological condition on the whole set of features pertaining to the phrase to be moved as well. Covert movement is Move-F where no such additional condition exists. In principle, though, there is no difference in the level at which the two types of movement occur.3

Another idea is Chomsky’s (2001, 2004) Agree. According to this approach, only interpretable features are encoded in the lexicon, while uninterpretable ones acquire their values, or are valued, in the course of the derivation. A probe is a head with some uninterpretable feature, while a goal is a phrase or head with a matching interpretable feature. The probe searches for a goal with a matching feature in its c-command domain, so that it can value its feature for correct morphological expression, and then check and delete its uninterpretable feature before LF. I reproduce here a slightly modified diagram of the general language architecture from Adger (2003: 146).

[Diagram labels: Numeration; Select; Syntactic Objects; Merge, Move, Agree; CI system; AP system]

Figure 1. General architecture of the language faculty
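
For readers who find a procedural rendering helpful, the probe-goal relation of Agree can be sketched as a small program. What follows is only an illustration under simplifying assumptions, not part of Chomsky's or Adger's formalism: the names Node, find_goal, and agree are hypothetical, the c-command domain is approximated by a list of nodes searched depth-first, and features are reduced to dictionary entries. The sketch shows an unvalued uninterpretable number feature on T being valued by the interpretable [pl] of the subject in (1) and then deleted, so that only the subject's feature remains for interpretation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A syntactic object (hypothetical toy representation)."""
    label: str
    # Interpretable features come valued from the lexicon, e.g. {"number": "pl"}.
    interpretable: Dict[str, str] = field(default_factory=dict)
    # Uninterpretable features start out unvalued (None).
    uninterpretable: Dict[str, Optional[str]] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


def find_goal(domain: List[Node], feature: str) -> Optional[Node]:
    """Search the probe's (approximated) c-command domain, depth-first,
    for the first node carrying a matching interpretable feature."""
    for node in domain:
        if feature in node.interpretable:
            return node
        found = find_goal(node.children, feature)
        if found is not None:
            return found
    return None


def agree(probe: Node, domain: List[Node]) -> Dict[str, str]:
    """Value each uninterpretable feature on the probe from a goal, then
    delete it from the probe. The returned values stand in for what
    morphology needs in order to pronounce agreement."""
    spelled_out: Dict[str, str] = {}
    for feature in list(probe.uninterpretable):
        goal = find_goal(domain, feature)
        if goal is None:
            # Assumption: an unchecked feature makes the derivation fail.
            raise ValueError(f"no goal found for [{feature}]")
        spelled_out[feature] = goal.interpretable[feature]
        del probe.uninterpretable[feature]  # checked and deleted
    return spelled_out


# Sentence (1): T probes its sister and finds the plural subject.
subject = Node("DP the students", interpretable={"number": "pl"})
verb_phrase = Node("vP", children=[subject, Node("V arrive")])
tense = Node("T", uninterpretable={"number": None})

morphology = agree(tense, [verb_phrase])
print(morphology)             # {'number': 'pl'}
print(tense.uninterpretable)  # {}
```

After agree has applied, nothing uninterpretable is left on T, while the interpretable [pl] on the subject is untouched. A probe that finds no goal raises an error, which corresponds only loosely to a derivation that fails to converge.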

From the point of view of the predictions these new minimalist proposals make about language acquisition of features and their values, not much has changed. What is to be learned and what comes for free, keeping in mind the language architecture in Figure 1 above? Lexical items are drawn from the lexicon into the Numeration. The Syntactic Objects box can be imagined as a working space where operations like Select and Merge combine lexical items into phrases, and then into bigger phrases. Syntactic operations continue until all of the lexical items in the numeration are exhausted and all uninterpretable features are checked and deleted. Both visible and invisible movements take place here. Principles and language-specific parameters reside in the Syntactic Objects box. The complete syntactic object (a tree) is then passed on by means of Spell-Out to the AP and the CI systems. The sentence interpretation is read off the syntactic object at the CI interface (see the semantic operations ensuring this, below); it is linearized and pronunciation instructions are sent to the AP interface.

The set of functional categories constitutes a sub-module of syntax, namely, the functional lexicon. Each functional category is associated with a lexical item, or items, specified for the relevant interpretable features. Parameterization is a blueprint made up of a finite set of features, feature values, and properties (e.g., whether a certain feature will induce phrasal movement or will move on its own, or what we call “strength of features”). Acquisition of L2 functional categories involves the functional properties of a set of lexical entries, but is manifested in syntactic reflexes superficially unrelated to the lexical entries, like displacement. For example, presence of overt agreement morphology on the verb and noun (functional morphology) has been argued to be causally related to verb and noun movement in Romance languages (a syntactic word order effect).

Another recent idea is the distributed processing of the linguistic signal. Following work by Epstein (1999) and Uriagereka (1999), Chomsky (2001) suggests that syntactic structures can be encapsulated into phases and sent to the interfaces well before the complete sentence is exhaustively analyzed.
This is a strongly derivational approach presupposing Multiple Spell-Outs. CP and vP are proposed to be phases; DP may be a phase, too. This idea is actually based on considerations of psychological plausibility as well as theory-internal considerations. If sentences are broken into smaller meaningful chunks (CP being a complete functional complex and vP being a complete thematic complex), then working memory is not unduly taxed. Note that phase theory fits neatly into Adger’s language architecture in Figure 1 above and does not change the division of labor between functional lexicon entries and visible movement.


2. Syntax-semantics equivalence?

After the syntactic object is complete and all its uninterpretable features are checked and deleted, it is sent out to the syntax-semantics interface for interpretation. If the task of the syntax-semantics interface is to capture the meaning of a sentence, we must give some consideration to what “meaning” is. Utterance meaning is only defined in relationship to a situation. Simplifying drastically, attributing a meaning to a string of words entails knowing whether the sentence is True or False in a particular situation, or knowing the truth value (0 or 1) of a sentence. For example, utterance (1) would be judged as True by speakers of English who are witnessing a number of students entering a classroom, but False if only one student is involved in the situation. Since syntax employs recursive procedures to create phrases from words, and then sentences from phrases, semanticists have developed recursive procedures to assign a meaning to sentences based on the meaning of their parts (see definition of compositionality below). The idea is that for every syntactic node and movement, there is a semantic type of object and mode of composition that would ensure that the end result, the sentence, has a truth value. The syntax-semantics interface establishes the relationship between these two recursive procedures.

Of course, it only makes sense to think of an interface if syntax and semantics constitute relatively autonomous levels. We can demonstrate the autonomy by pointing to sentences that make perfect sense semantically but are ill formed due to lack of agreement, for example:

(3) *The student are arriving.

Linguists mark such syntactic violations with an asterisk, and the inherent aim of the syntactic component is to eliminate uninterpretable features such as agreement before the sentence is handed over to interpretation. On the other hand, Chomsky’s famous example in (4) is perfectly well-formed syntactically, but makes no sense semantically, which is indicated by a number sign.

(4) #Colorless green ideas sleep furiously.

Even though the two components of syntax and semantics are autonomous, the operations at the interface can access units and operations at both.

Thus, a syntax-semantics functional equivalence is not impossible, at least in principle. The principle of Compositionality is attributed to Frege (1892) and roughly states that the meaning of a linguistic expression is a function of the meanings of its parts and the way they are combined in the syntax. What are the semantic operations used to assemble sentence meaning from individual word (and morpheme) meaning? As an aside, we should note that a necessary condition of compositionality is that semantic constituency roughly coincides with syntactic constituency. For example, we assemble VP meaning from combining verb and object meaning, just as in the syntax we construct the VP from combining the verb and the object, and not the subject and verb first, to the exclusion of the object (see (5d) below). In the next paragraphs, following Heim and Kratzer (1998), I will informally present the three basic operations of compositional interpretation, which are taken to be absolutely basic (and relatively uncontroversial).

Functional Application starts from the idea that the verb has a special relationship with its arguments. In the syntax, the familiar Theta Criterion ensures that all the argument positions in a verb subcategorization frame are occupied by only one NP (or PP, as the case may be):

(5) a. John [VP met Mary].
    b. *met Mary.
    c. *John met.
    d. [S [NP [N John]] [VP [V met] [NP [N Mary]]]]

In the semantics, this idea translates into conceiving of the verb as a predicate, that is, an unsaturated proposition, which entails that it has “gaps” in its meaning such that, when these gaps are filled by the arguments, the whole will denote a proposition. Thus, the meaning of meet has two gaps in it so that when we supply the two necessary entities, the agent and the theme, we can get a proposition with a truth value. A transitive verb like meet is viewed as a function that requires one individual as its input and produces another function that must take another individual as its argument before resulting in a truth value. The VP on its own does not have a truth value: meet Mary cannot be mapped onto a situation. One further application of Functional Application ensures that the whole sentence John met Mary can be evaluated and attributed a truth value.4 Functional Application is also used in other cases of predication, for example, an adjective and a nominal joined by a copula:

(6) Mary is late.
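
The idea that a transitive verb denotes a function taking one individual at a time can be made concrete in a short sketch. It is an illustration only, under simplifying assumptions: a situation is modeled as a set of facts, individuals as strings, and the names situation, meet, late, and functional_application are hypothetical. The verb takes its object first and returns a one-place function, which yields a truth value only once the subject is supplied; the adjective in (6) is treated as a function directly from individuals to truth values.

```python
# Toy situation: the set of facts that hold (an assumption for illustration).
situation = {("met", "John", "Mary"), ("late", "Mary")}


def meet(theme):
    """Transitive verb: takes the object and returns a one-place predicate."""
    def vp(agent):
        return ("met", agent, theme) in situation
    return vp


def late(individual):
    """Adjective: a function from individuals to truth values."""
    return ("late", individual) in situation


def functional_application(function, argument):
    """Saturate a function with the denotation of its sister node."""
    return function(argument)


# VP level: 'met Mary' is still a function, not yet a truth value.
met_mary = functional_application(meet, "Mary")
print(callable(met_mary))                        # True

# Sentence level: a second application yields a truth value.
print(functional_application(met_mary, "John"))  # True
print(functional_application(met_mary, "Bill"))  # False

# Copular predication as in (6); the copula contributes nothing here.
print(functional_application(late, "Mary"))      # True
```

The order of application mirrors the constituency in (5d): the verb combines with the object before the resulting VP combines with the subject.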

The copula is assumed to be semantically empty and the adjective is a function from individuals to truth values, mapping Mary and the other individuals who are late to True, and all other individuals to False. In sum, the mode of semantic composition is Functional Application in all cases in which the relationship between two syntactic elements is one of subcategorization, or selection, such that a predicate works like a function being saturated with an argument. It is important to note that syntactic heads (verbs and nouns) map one-to-one into conceptual functions, while subjects and objects map into arguments. Thus, this is the simplest and most basic mode of composition, in which a syntax-semantics equivalence is manifested.

A second way to interpret a binary branching constituent is Predicate Modification. It is typically used when an adjective modifies a noun, as in sweet apple.

(7) a. Mary ate a sweet apple.
b. [DP [Det a] [NP [AP [A sweet]] [N apple]]]
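For an adjective like sweet in (7), Predicate Modification can be sketched as the conjunction of two one-place predicates. The sets and names below are hypothetical toy data, and the sketch covers only cases where the adjective and the noun each pick out a set of individuals.

```python
# Toy model: the individuals and facts are invented for illustration.
APPLES = {"apple1", "apple2"}
SWEET_THINGS = {"apple1", "honey"}
EATEN_BY_MARY = {"apple1"}
DOMAIN = APPLES | SWEET_THINGS

def apple(x):
    return x in APPLES

def sweet(x):
    return x in SWEET_THINGS

def predicate_modification(first, second):
    """Combine two one-place predicates into one predicate of the same type."""
    return lambda x: first(x) and second(x)

sweet_apple = predicate_modification(sweet, apple)

# (7a) holds if Mary ate at least one individual that is both sweet and an apple.
mary_ate_a_sweet_apple = any(
    sweet_apple(x) and x in EATEN_BY_MARY for x in DOMAIN
)
```

Because sweet_apple has the same type as apple, it can combine with a verb in exactly the way the bare noun could.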

For (7) to be true, it is not enough for Mary to have eaten just any apple; she must have eaten an individual at the intersection between the set of apples and the set of sweet things. Predicate Modification is a rule of composition that intersects two predicates and produces a result of the same semantic type as its daughters: the nominal sweet apple can enter into Functional Application with eat in the same way as apple can.5 However, not all modification is intersective; thus, not all predication can be reduced to predicate conjunction. An alleged criminal is not a criminal who is also alleged; a small elephant is relatively big compared to a butterfly. The latter adjective is context dependent: the individual animal must be small for an elephant, or compared to all contextually salient other elephants in the discourse. The former type of adjective (alleged, former) is a true example of nonintersective modification (Higginbotham, 1985). Even more interesting are the cases in which the modifier does not modify the host as a whole, but only one of its semantic features. For example, a good knife is not a knife which is good, but a knife that cuts well, or is good for cutting. Thus the modifier good refers to the purpose of the host noun only. Pustejovsky (1995) uses this type of evidence to argue for the qualia structure of nouns.6

A third basic interpretation rule, Predicate Abstraction, is motivated by considering relative clauses. Take the sentences in (8) and (9).

(8) Mary is a girl who likes apples.
(9) Mary is a girl whom John met.

Syntacticians argue that the relative pronouns who in (8) and whom in (9) are related to argument positions within the relative clause, an agent of the verb like and a patient of meet. These relative pronouns have moved from their original positions next to their respective verbs.

(10) a. who_x x likes apples
b. whom_x John met x
c. [CP whom_x [C’ [C λx] [S [DP John] [VP [V met] [DP x]]]]]
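As a rough illustration of the structure in (10), binding the variable x turns an open sentence into a predicate. The sketch below is hypothetical: it models the open sentence as a function of a variable assignment (a dictionary), and it assumes that the noun girl and the relative clause combine by conjunction; the facts and names are invented.

```python
# Toy model: facts and names are invented for illustration.
MET = {("John", "Mary")}
GIRLS = {"Mary", "Sue"}

def john_met_x(assignment):
    """Open sentence 'John met x': its truth value depends on what x is assigned."""
    return ("John", assignment["x"]) in MET

def predicate_abstraction(variable, open_sentence):
    """Bind a free variable, turning an open sentence into a one-place predicate."""
    def predicate(individual):
        return open_sentence({variable: individual})
    return predicate

whom_john_met = predicate_abstraction("x", john_met_x)

def girl(individual):
    return individual in GIRLS

def girl_whom_john_met(individual):
    # Assumption: noun and relative clause are combined by conjunction.
    return girl(individual) and whom_john_met(individual)

mary_is_a_girl_whom_john_met = girl_whom_john_met("Mary")
```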

Looking at the representations in (10), we see that the semantic contribution of the relative pronoun is to produce a predicate (an unsaturated proposition) out of a complete clause with a potential truth value. We derive this meaning if we use the rule of Predicate Abstraction, or Lambda Abstraction. It converts a sentence into a predicate by binding a free variable within the relative clause with a lambda operator (λ). The relative pronouns, which originated as the subject of like and the object of meet, respectively, move and leave behind a trace, represented as a variable (x). In their displaced positions, the relative pronouns are interpreted as the lambda abstractors: They “unsaturate” the proposition temporarily saturated by the trace, making a predicate out of a sentence. The next step in the interpretation involves applying the unsaturated predicate to Mary, which saturates it by Functional Application. Strictly speaking, Predicate Abstraction is not a compositional operation but an operation of merging a variable-binding operator into a proposition. Heim and Kratzer (1998) argue that the variable-binding mechanism employed in any syntactic movement involves Predicate Abstraction, which makes it a rule used in the interpretation of almost all clauses in natural language.

Functional Application, Predicate Modification and Predicate Abstraction are widely considered to be the basic semantic rules, minimally needed to interpret syntactic structures. They are universal, that is, they are used for the comprehension of any human language.

Continuing to elaborate on the syntax-semantics mapping, we just saw that constituency is one area where a number of correspondences between syntax and semantics are attested. Another such area is semantic scope. To borrow an example from Rodman (1976), compare the two sentences in (11).

(11) a. There is a ball that every boy is playing with.
b. Every boy is playing with a ball.

(11a) is not true when every boy is playing with a different ball (e.g., there are three boys and three balls in the gym); it is only true when every boy is playing with the same unique ball. On the other hand, (11b) can have both meanings (three boys play with three balls, one each; and also three boys play with the same ball). The semantic difference between (11a) and (11b) corresponds to a difference in syntactic structure. This observation was captured by Reinhart (1976) with the notion of c-command, the special relationship between a node and its sister node (and everything that the sister dominates). She proposed that the domain of c-command is the domain of semantic rule application, or what semanticists call scope. Note that in (11a), the constituent a ball is not in the c-command domain of the quantifier every, hence it is not in the scope of every. The relative clause that_x every boy is playing with t_x is a predicate (an unsaturated proposition), which is saturated by one unique individual x and mapped to True if every boy is playing with x.

3. Syntax-semantics mismatches

I will now exemplify syntax-semantics mismatches with quantifier interpretation. In this section, I will be interested in mismatches that can potentially be resolved by positing two syntactic structures for the same surface string. I will delay until section 5 the discussion of some other mismatches, which are not clearly resolvable in the syntax. Quantifiers in subject position challenge the view that predication in syntax and Functional Application in semantics are in complete correspondence. Consider again the sentence in (11b):

(11) b. Every boy was playing with a ball.

We have already assumed that the VP was playing with a ball is a predicate, an unsaturated proposition with one gap. The expectation is that Functional Application can proceed by applying this predicate to an individual denoting “every boy”. However, simple inspection indicates that “every boy” does not pick out an individual in the world, like “John”, for example. What if every boy denotes a group of (contextually relevant) boys, akin to “all the boys”? An example from Sauerland and von Stechow (2001) shows that this denotation will not do.

(12) a. Every boy (*together) weighs fifty kilos.
b. All the boys (together) weigh fifty kilos.

Every boy and all the boys are not equivalent in meaning. Neither Functional Application nor Predicate Modification can serve as appropriate modes of composition for the VP with every boy. Instead, the answer to this puzzle goes back to Frege (1879) and lies in the analysis of the subject quantifier as a higher order function, which takes the VP predicate as its argument. If the VP was playing with a ball cannot be saturated with an individual because every boy does not denote an individual, then saturation must happen the other way around: the VP will be the gap-filler for the semantically more complex quantifier. Why is this considered a semantics-syntax mismatch? Because the type of difference between a subject denoting an individual (John) and a subject quantifier (every boy) is not captured in the syntax.7 The verb agrees with both kinds of subjects (in English and many other languages).

Quantifiers in object positions present yet another challenge to the syntax-semantics equivalence (or the view that for every syntactic node and operation there must be a semantic object and a mode of composition). The sentence in (13) is uninterpretable if we maintain that quantifiers are higher order functions that are saturated by their VP.

(13) a. John read every book.
b. [S [DP John] [VP [V read] [DP every book]]]
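The higher-order analysis of quantifiers, and the problem raised by (13), can be sketched in a toy model. Everything here is an assumption made for illustration: the facts, the names, and in particular the runtime check, which stands in for the requirement that a quantifier's argument be a predicate from individuals to truth values.

```python
# Toy model: facts and names are invented for illustration.
BOYS = {"Al", "Bo", "Cy"}
BOOKS = {"book1", "book2"}
PLAYING = {"Al", "Bo", "Cy"}
READ = {("John", "book1"), ("John", "book2")}
DOMAIN = BOYS | BOOKS | {"John"}

def boy(x):
    return x in BOYS

def book(x):
    return x in BOOKS

def is_playing(x):
    return x in PLAYING

def read(theme):
    """Transitive verb: takes the theme and returns a one-place predicate."""
    return lambda agent: (agent, theme) in READ

def every(restrictor):
    """Quantifier: a function that takes a one-place predicate as its argument."""
    def quantifier(scope):
        values = [scope(x) for x in DOMAIN if restrictor(x)]
        if not all(isinstance(value, bool) for value in values):
            raise TypeError("scope must map individuals to truth values")
        return all(values)
    return quantifier

# Subject position: the VP saturates the quantifier.
every_boy_is_playing = every(boy)(is_playing)

def object_quantifier_composes_directly():
    """Try to feed the transitive verb itself to 'every book', as in (13b)."""
    try:
        every(book)(read)
    except TypeError:
        return False
    return True
```

In the sketch, read applied to an individual returns another function rather than a truth value, so the direct combination fails.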

On the one hand, every book needs to compose with a predicate with one argument missing (a predicate from individuals to truth values), a type corresponding to intransitive verbs. On the other hand, read is a transitive verb because it has an object. This situation is also known in semantics as a type mismatch. Any solution to this problem posits some type of readjustment process. Heim and Kratzer (1998) describe two basic solutions proposed in the literature. One solution, the syntactic one, is Quantifier Raising (May, 1985), a syntactic movement which ensures that the quantifier leaves the VP and moves to a higher position, leaving behind a trace. This movement creates a suitable predicate type that will saturate the quantifier.

(13) c. Every book_x [John read t_x].
d. [CP every book_x [C’ [C λx] [S [DP John] [VP [V read] [DP x]]]]]
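The effect of Quantifier Raising in (13c, d) can be pictured as follows: abstracting over the object position yields a one-place predicate that the raised quantifier can take as its argument. This is a minimal sketch under invented toy facts; the helper read_by is a hypothetical name for the abstracted predicate.

```python
# Toy model: facts and names are invented for illustration.
BOOKS = {"book1", "book2"}
READ = {("John", "book1"), ("John", "book2"), ("Mary", "book1")}
DOMAIN = BOOKS | {"John", "Mary"}

def book(x):
    return x in BOOKS

def read(theme):
    """Transitive verb: takes the theme and returns a one-place predicate."""
    return lambda agent: (agent, theme) in READ

def every(restrictor):
    """Quantifier: takes a one-place predicate as its argument."""
    return lambda scope: all(scope(x) for x in DOMAIN if restrictor(x))

def read_by(reader):
    """The predicate left after raising: lambda x. [reader read x]."""
    return lambda x: read(x)(reader)

john_read_every_book = every(book)(read_by("John"))
mary_read_every_book = every(book)(read_by("Mary"))
```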

What is in the brackets in (13c) resembles a relative clause: a sentence from which one part has moved out, or a proposition with one part of its meaning missing, in other words, a predicate. This predicate describes the property of “being read by John” and can felicitously combine with the raised quantifier. The other solution, essentially a semantic one, is known as type-shifting. It proposes that the quantified phrase doesn’t move, but that there is a freely applying type-shifting rule in the grammar. In a sense, quantifiers like every have two lexical meanings, one used when they are in subject position and one when they are in object position. We shall not, of course, dwell on which solution is better, as they both seem to account for a range of data and “the choice seems to be a matter of taste” (Heim and Kratzer, 1998: 193).8 We only need to notice the fact that the surface syntactic position of quantifiers does not correspond to their scope position, at least in English. In contrast, Hungarian may be a language with just such a correspondence (Kiss 1991). The syntax-semantics mismatch is inevitable in sentences with two quantifiers, which can take different scope with respect to each other, like (11b) repeated here:

(11) b. Every boy is playing with a ball.
Meaning 1: There exists an x such that x is a ball and for every y, such that y is a boy, y is playing with x.
Meaning 2: For all y, such that y is a boy, there exists an individual x, such that x is a ball, and y is playing with x.
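The two meanings of (11b) differ only in the order of the two quantifiers, which can be checked against small invented scenarios. The sets and function names below are hypothetical and serve only to show that the two readings come apart.

```python
# Toy scenarios: all individuals and facts are invented for illustration.
BOYS = {"Al", "Bo", "Cy"}
BALLS = {"ball1", "ball2", "ball3"}

def meaning_1(plays_with):
    """There is a ball x such that every boy y is playing with x."""
    return any(all((y, x) in plays_with for y in BOYS) for x in BALLS)

def meaning_2(plays_with):
    """For every boy y there is a ball x such that y is playing with x."""
    return all(any((y, x) in plays_with for x in BALLS) for y in BOYS)

one_ball_each = {("Al", "ball1"), ("Bo", "ball2"), ("Cy", "ball3")}
same_ball = {("Al", "ball1"), ("Bo", "ball1"), ("Cy", "ball1")}

readings = {
    "one ball each": (meaning_1(one_ball_each), meaning_2(one_ball_each)),
    "same ball": (meaning_1(same_ball), meaning_2(same_ball)),
}
```

In the scenario where each boy has his own ball only Meaning 2 holds, while in the shared-ball scenario both hold.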

Since the sentence has two possible meanings and only one word order, there must be two mental representations of the sentence. One (in this case, meaning 2) could be equivalent to the surface syntax, but the other (meaning 1) cannot. More generally speaking, when we have two meanings and only one word order, it follows that the meaning not corresponding to the surface word order is derived by some additional operation. How is the inverse scope computed? May (1977) and Chomsky (1975) propose that quantifiers may move to form a constituent that would include the verb and the other quantifier. Thus, the moved quantifier takes scope over the other one, with a subsequent change in meaning. This movement for scope taking is not visible in the surface syntax, hence, it must take place after spell-out but before interpretation, that is, at the syntax-semantics interface. Before moving on to discuss another theoretical proposal for the language architecture, that of Jackendoff (2002), we should note that quantifier interactions have been discussed for a long time in generative syntax, and their status as an operation at the syntax-semantics interface is not widely disputed. However, they are not considered sufficient ground to suggest that semantics can also be generative. Inverse scope of quantifiers can still be “read off the syntax” if it is syntactic movement that provides two separate representations for the two meanings. In a nutshell, the logic is as follows. If we can deal with a syntax-semantics mismatch at the interface by proposing covert syntactic operations like Quantifier Raising, why complicate the picture unnecessarily by building more complexity, hence more rules and more learning to do, into the language architecture? Syntactic structure is mirrored directly by a coarse semantic structure, and the interface is kept relatively simple. Most work in formal semantics seems to accept the basic logic of this view (e.g., Partee, 1995, Heim and Kratzer, 1998). 
This would be sound logic indeed if there were not even more mismatches, reviewed below in section 5, which lead Jackendoff to propose a generative semantics module.

4. What’s wrong with “syntactocentrism” (Jackendoff, 2002)?

In his 2002 book, Ray Jackendoff argues that the current picture of the language faculty is unfairly skewed towards syntax, as syntax is the only generative component of the grammar, while phonology and semantics are relegated to the interfaces. He criticizes the minimalist framework for not making it excessively clear exactly what happens at the interfaces, and what kinds of operations take place there. In order to make linguistic theory more compatible with findings from neurolinguistics and psycholinguistics, Jackendoff proposes that all three modules of the grammar are generative in the sense that they can generate structure by compositionally “unifying” units dedicated to the particular level. He calls his model the Parallel Architecture.

“Linguistic structure is viewed as a collection of independent but linked levels of structure: phonology, syntax, and semantics. Each level of structure is characterized by its own set of primitives and combinatorial principles. The linking among levels of structure is established by a set of interface constraints—among which are the words of the language. Thus a wellformed sentence has well-formed structures in each component, connected in a well-formed fashion by linking constraints. Within this organization, syntax is but one among several “cooperating” generative components.” (p. 198)

At each level of the language architecture, a number of rules and constraints operate, which allow the formation of fully-specified structure at that level. These are called integrative processes. The classical example of these is the syntactic parser, which integrates a Determiner and Noun into a Determiner Phrase (DP), a verb and a DP argument as the Verbal Phrase, etc. At the interfaces, we have another kind of process, a process that takes as input one type of linguistic structure and outputs another. These are called interface processes. Note that the interface processes are qualitatively different from the integrative ones. Jackendoff also describes a third kind of operation, which takes as input the structures of a particular level and produces structures of the same type, for example, rules of inference derive new structures of the conceptual type.

A crucial part of this parallel architecture is the working memory. It is a separate facility in the brain, where representations from long-term memory are retrieved and copied in order to undergo processing. The working memory is understood not as a static place for temporary storage, but as a “workbench” or “blackboard” on which processors can cooperate in assembling linguistic structures.

Phonological structure uncontroversially has tiers: prosodic, syllabic, and segmental structure as well as morphophonological structure joining the segments into words and prosodic words. Syntactic structure, the familiar syntactic tree, contains heads and phrases arranged hierarchically in accordance with syntactic constraints. The semantic/conceptual structure also consists of several tiers, each of which conveys a different aspect of sentence meaning. The descriptive tier contains roughly the information that can be encoded in a predicate logic: conceptual functions, arguments, and modifiers. Conceptual constituents, labeled with their own constituent type, may include Situation, Place, Object, Event, State, etc. The possible relations between conceptual constituents are function, modification, and lambda extraction.9 The referential tier organizes the referential claims about the entities of the sentence. It is common in the semantic literature to conflate the two kinds of information, but Jackendoff argues that there are advantages in keeping them apart. The examples that follow are from Jackendoff (2002: 395-6). In the simplest case in (14), the formal representation is given in (14b).

(14) a. A fox ate a grape.
b. ∃x FOX(x) ∃y GRAPE(y) (EAT(x, y))
(15) Syntax: [S [DP a fox]1 [VP ate [DP a grape]2 ]]3
Descriptive tier: [Event EAT ([Object FOX]1, [Object GRAPE]2)]3
Referential tier: 1 2 3

In Jackendoff’s tier representation, indices 1 and 2 at the referential tier correspond to x and y in (14b), while index 3 corresponds to existential quantification over a Davidsonian event variable. To make the division interesting, of course, we need to look at a case of non-correspondence. One situation where the indices are not simply copied off onto the referential tier is the case of predicate nouns as in (16).

(16) Syntax: [S [DP Eva]4 [VP became [DP a doctor]5 ]]6
Descriptive tier: [Event INCH ([State BE ([Object EVA]4, [Object DOCTOR]5)])]6
Referential tier: 4 6

On the referential tier, there is no index corresponding to the predicate nominal a doctor, thus asserting that one individual, Eva, has become describable also as a doctor. There is much more complexity on the referential tier; however, we will not go into the details here and will refer the interested reader to the original publication. This example should suffice to illustrate the main idea of the referential tier. Other linguistic phenomena that Jackendoff proposes to deal with by separating the descriptive and the referential tier are anaphors and quantification.

The last tier of conceptual structure is the information structure tier, capturing the organization of topic and focus (also known as theme/rheme, topic/comment, old/new information, etc.). This part of linguistic structure encodes the function of a particular phrase within the discourse, from the point of view of the speaker-hearer interaction. The quintessential examples are question and answer sequences. For example, in an interaction like (15), some of the information is already mentioned in the question, so the new (focus) information can be stressed in English (marked with all caps). Note that it is not appropriate for the old information to bear focus prosody.

(15) Q: Who ate the cake?
A: LEA ate the cake.
#Lea ate the CAKE.

If we ask a thetic question (out of the blue, or all new information question), the answer will contain only information that is not mentioned before, therefore not known to the questioner, and all constituents should bear stress (but normally don’t). Thus, intonationally marking only one constituent is inappropriate:

(16) Q: What happened?
A: Lea ate the cake.
#LEA ate the cake.

English can also use more complex structures to mark topic and focus, for example, topicalization as in (17) and cleft constructions for focus marking as in (18). Languages like Russian and European Portuguese use word order permutations to reflect information structure relations.

(17) Sushi, I like, but sashimi, I adore.
(18) It is Jane who helped me.


Information structure should not be reflected on the descriptive tier, because it is orthogonal to argument structure and any phrase can be focused. It cannot be reflected on the referential tier either, because non-referential constituents like adjectives and predicate nominals can be focused, too. So it gets a tier of its own. If a phrase is designated as [Focus] on the information structure tier, the interface with phonology can apply high stress to it (at least in languages like English). Other languages can utilize a special syntactic construction or particles such as even, only, etc. for the marking of focus.

Another innovative feature of Jackendoff’s language architecture is his treatment of the lexicon. For him, a lexical item is “a small-scale three-way interface rule” (p. 131): it lists a chunk of syntax, a chunk of phonology and a chunk of semantics as in (19). Inflectional morphology, idioms, as well as small phrasal trees with empty nodes (or treelets, following Janet Fodor’s term) find themselves listed in this interface lexicon.

(19) Phonology: Word_i [s t a r]
Syntax: N_i [sing, count]
Semantics: [Object TYPE: STAR]_i
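The idea of a lexical item as a three-way interface rule can be pictured as a record that links three co-indexed chunks. The class and field names below are hypothetical choices for this sketch, not Jackendoff's notation; only the content of the entry for star is taken from (19).

```python
from dataclasses import dataclass

@dataclass
class LexicalItem:
    """A lexical item as a link between three chunks sharing one index."""
    index: str
    phonology: tuple   # segments of the phonological word
    syntax: dict       # syntactic category and features
    semantics: str     # conceptual constituent

star = LexicalItem(
    index="i",
    phonology=("s", "t", "a", "r"),
    syntax={"category": "N", "number": "sing", "countability": "count"},
    semantics="[Object TYPE: STAR]",
)

def linked_chunks(item):
    """Return the three chunks, each tagged with the item's shared index."""
    return {
        "phonology": ("".join(item.phonology), item.index),
        "syntax": (item.syntax["category"], item.index),
        "semantics": (item.semantics, item.index),
    }

star_links = linked_chunks(star)
```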

Let us look more closely now at Jackendoff’s view of the syntax-semantics interface, which of course is of special interest in this book. Syntactic structure needs to be correlated with semantic structure and that correlation is not always trivial, as we saw in section 3. The syntactic processor works with objects like syntactic trees and their constituents: DP, VP, etc. The parse of a sentence is also made up of such units. In contrast, a semantic processor does not operate with DPs and VPs and case-marking; it operates with events and states, agents and patients, individuals and truth of propositions. The semantic processor wants to know what thematic role a nominal is going to play in the event structure of the verb. The operations at the interface are limited precisely to those structures that need to be correlated and they “do not see” other structures and operations (like case-marking) that would have no relevance to the other module. As the syntax-savvy reader can already see, this view of the syntax-semantics interface is in principle not much different from the Minimalist view, where the same effect of “not seeing” irrelevant structure is accomplished by checking uninterpretable features prior to Spell-out. Although Jackendoff’s model and Minimalism converge in principle on this issue, Jackendoff envisages much more work being done at the interface, thereby relieving syntax of much of its complexity.10 The crucial difference between the two views is that for Jackendoff, there is no operation of sending syntactic structure to the semantic processor like “sending a signal down a wire, or a liquid down a pipe” (p. 223). Going from syntactic structure to semantic structure is a computation in its own right, a non-trivial process, and qualitatively different from semantic and syntactic integrative processes.

When more than one language comes into the game, this computation gets even more complicated. That is why it is crucial to identify the locus of language variation. The sixty-four thousand dollar question for L2 researchers then is: How much of semantic/conceptual structure is part of UG and how much of it may be parameterized? Jackendoff is very careful to answer this question at several places in the book.

“[I]t is clear that all these aspects of phrasal meaning are available in all the languages of the world. [L]anguages differ in their syntactic strategies for expressing phrasal semantics; but the organization of what is to be expressed seems universal. The elements of the descriptive, referential, and information structure tiers seem the same across languages, and the principles of combination, especially on the descriptive tier, seem universally available. At least on preliminary reflection, the possibility of learning any of this would seem severely limited by the poverty of the stimulus. Thus, I find it plausible that the basic architecture of conceptual structure is innate.” (Jackendoff 2002: 417)

While the content of meaning is the same (concepts and relations between them), different linguistic forms map different natural groupings of meanings. Let me illustrate a mismatch at this interface with Spanish and English aspectual tenses. While the English past progressive tense signifies an ongoing event in the past, Spanish Imperfect can have both an ongoing and a habitual interpretation. The English simple past tense, on the other hand, has a one-time finished event interpretation and a habitual interpretation while the Spanish Preterit has only the former.

(20) a. Guillermo robaba en la calle. (habitual event)
Guillermo rob-IMP in the street
‘Guillermo habitually robbed (people) in the street.’
b. Guillermo robó en la calle. (one-time finished event)
Guillermo rob-PRET in the street
‘Guillermo robbed (someone) in the street.’
c. Guillermo robaba a alguien en la calle (cuando llegó la policía). (ongoing event)
Guillermo rob-IMP someone in the street (when arrived the police)
‘Guillermo was robbing someone in the street when the police arrived.’
(21) a. Felix robbed (people) in the street. (habitual event)
b. Felix robbed a person in the street. (one-time finished event)
c. Felix was robbing a person in the street (when the police arrived). (ongoing event)
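The distribution of the three interpretations over the four forms in (20) and (21) can be laid out as a small lookup table. The table encodes only what the examples and the preceding paragraph state; the dictionary layout and the function name are illustrative choices.

```python
# Form-to-meaning mappings as described for (20) and (21).
READINGS = {
    ("Spanish", "Imperfect"): {"ongoing", "habitual"},
    ("Spanish", "Preterit"): {"one-time finished"},
    ("English", "past progressive"): {"ongoing"},
    ("English", "simple past"): {"one-time finished", "habitual"},
}

def forms_for(language, reading):
    """List the forms of a language that can express a given reading."""
    return sorted(
        form
        for (lang, form), readings in READINGS.items()
        if lang == language and reading in readings
    )

habitual_in_spanish = forms_for("Spanish", "habitual")
habitual_in_english = forms_for("English", "habitual")
ongoing_in_spanish = forms_for("Spanish", "ongoing")
ongoing_in_english = forms_for("English", "ongoing")
```

The habitual reading is carried by the form that also marks ongoing events in Spanish, but by the form that also marks one-time finished events in English.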

Thus, the same semantic primitives (ongoing, habitual, and one-time finished event), arguably part of universal conceptual structure, are distributed over different pieces of functional morphology. (An L2 acquisition study of this mismatch, Montrul and Slabakova (2002), is described in chapter 6.) Another example would be the grammaticalization of semantic concepts in inflectional morphology. Compare the marking of tense in English and Chinese. While English has a separate, productive piece of morphology to indicate that an event or state obtained in the past, Chinese does not. That of course does not mean that Chinese speakers have no concept of past, it is just differently expressed. The interpretation of a past event or state is based on the overt aspectual marking and some other interpretive principles (see Smith and Erbaugh, 2005). A third example comes from marking of politeness in languages like German, Bulgarian, and Russian, as opposed to English. German reserves the pronoun Sie, formally a third person plural, for situations when the addressee is singular but the speaker wants to be polite, and uses du (2p. sg.) for all other cases. English does not reflect this distinction in the morphology of personal pronouns. Again, that does not mean that English speakers have no concept of politeness. We find most of the language variation in meaning, then, at the syntax-semantics interface. Linguistic semantics is the study of the interface between conceptual form and linguistic form. It is the study of that part of our thoughts which can be expressed or invoked by language.

To recapitulate this section, Jackendoff’s (2002) alternative proposal for parallel language architecture offers a more detailed and articulate view of the syntax-semantics interface. An important insight of his proposal is that meaning has its own combinatorial structure and is not simply “read off the syntax”, as the syntactocentric Minimalist program would have it. The operations at the interface are non-trivial computations. When learning a second language, a speaker may be confronted with different mappings between units of meaning on the conceptual level and units of syntactic structure. In the next section, we will look at some other syntax-semantics mismatches, whose resolution cannot be attributed solely to the syntax. They constitute evidence for “generative semantics” within the parallel architecture.

5. There is more to the syntax-semantics interface than meets the syntax

Let us reiterate why mainstream generative grammar and, more particularly, Minimalism would not like to attribute too much structure and complexity to the syntax-semantics interface. It makes sense that if we have a syntax which is articulated enough, we will be spared from building complex structure of a different kind elsewhere, thus minimizing complexity, and the ease and speed of child language acquisition will be explained. A notable attempt at capturing semantic structure in syntax is Hale and Keyser’s (1993) proposal, according to which each theta role (Agent, Patient, Location/Goal) is uniquely identified by its origin in syntactic structure: the Agent originates in the Spec of vP, the Theme originates in Spec, VP and Goal/Location originates as complement of VP. In this spirit, Travis (1992), Borer (1994), and Ramchand (1997), among much other work, are indicative of a growing trend to articulate event structure in the VP shell. But of course, these attempts are not new.
Generative semantics in the sixties sought to present aspects of lexical meaning in terms of syntactic combination, and Perlmutter and Postal’s Universal Alignment Hypothesis (Perlmutter and Postal, 1984) as well as Baker’s Uniformity of Theta Assignment Hypothesis (Baker, 1988)11 proposed a one-to-one correspondence between syntactic configurations and thematic roles, even before Hale and Keyser. Although I am in favor of this endeavor, having modestly contributed to it myself (Slabakova, 1997b), looking at data from other parts of the grammar indicates that the strong thesis of one-to-one correspondence between syntax and semantics is perhaps a tad too strong. In this section, I will briefly go over examples of syntax-semantics mismatches, which are hard to deal with by throwing more highly articulated syntactic structure at them. Most of these examples are to be found in the work of Jackendoff and Pustejovsky, but see also Egg (2003).

First of all, one should consider cases of transfer of reference and metonymy. The classical example in (22) is from Nunberg’s (1979) exchange between two waitresses in a restaurant:

(22) The ham sandwich over there in the corner wants more coffee.

The subject of this sentence does not refer to the inanimate object “a ham sandwich” but to the person who has recently ordered a ham sandwich and is possibly eating it now. The intended interpretation can be paraphrased as:

(22’) The person contextually associated with a ham sandwich sitting over there in the corner wants more coffee.

Note that this paraphrase contains more information, given in italics, which is omitted from (22) (although understood), and cannot by any stretch of the imagination be represented in the syntactic structure. How is this productive transfer of reference accomplished? Jackendoff (2002: 389) proposes to deal with it by postulating a conventionalized piece of meaning that has no overt expression. His structure in (23) shows an “operator” that means “person associated with” relying on perfectly compositional operations.

(23) [Object PERSON [Object λx [Situation ASSOCIATE [Object x] [Object ham sandwich]]]]

Having this conventionalized unit in the lexicon and applying it as an interface operation has to be supported by context, as in the case of (22), or by general world knowledge, as in the metonymy exemplified in (24).

(24) Iraq has turned into another Vietnam.

It could be argued that transfer of reference is actually supplied by pragmatics, not semantics, in the sense that it is part of contextualized interpretation and relies heavily on world knowledge. The difference between semantics and pragmatics is notoriously murky, and is definitely an empirical issue. The point is, however, that even if semantics and/or pragmatics are involved, this productive operation relies on structure created at an interface (either the syntax-semantics or the syntax-pragmatics interface), but not in the syntax proper.

Another characteristic case of what Jackendoff terms “enriched composition” as opposed to “simple composition” is aspectual coercion (Jackendoff, 1996; de Swart, 1998). The adverbial until dawn has to combine with an atelic predicate, providing a telos (endpoint) for it. In (25), however, the verb flash, an achievement, is inherently telic since it encodes an instantaneous event. The adverbial, then, coerces the verb meaning into a series of repetitive events, taking place until dawn. The atelic verb sing in (26), on the other hand, does not employ the semantic operation of aspectual coercion but simple composition. Note that if we apply a similar type of adverbial to a predicate that resists aspectual coercion, for example recognize, the sentence is anomalous since we cannot recognize an individual many times over.

(25) The light flashed until dawn. (telic, coerced into atelic)
(26) He sang until dawn. (truly atelic)
(27) #John recognized the man in the picture until dawn/for an hour.
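The three-way contrast in (25)-(27) can be made concrete with a small sketch. The Python fragment below is only an illustration under stated assumptions: the feature names `telic` and `repeatable`, the toy lexicon, and the function `combine_with_until` are hypothetical labels introduced here, not part of Jackendoff's or de Swart's formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Verb:
    lemma: str
    telic: bool        # does the verb encode an inherent endpoint?
    repeatable: bool   # hypothetical lexical feature: can the event be iterated?

def combine_with_until(verb: Verb) -> str:
    """Describe how a durative adverbial like 'until dawn' composes with the verb."""
    if not verb.telic:
        return "simple composition"        # atelic predicate: the adverbial supplies the endpoint
    if verb.repeatable:
        return "coerced: iterated events"  # telic but repeatable: coerced into a series of events
    return "anomalous"                     # telic and non-repeatable: coercion is not available

# Toy lexicon (assumed feature values, for illustration only).
LEXICON = {
    "sing": Verb("sing", telic=False, repeatable=True),
    "flash": Verb("flash", telic=True, repeatable=True),
    "recognize": Verb("recognize", telic=True, repeatable=False),
}

for lemma, verb in LEXICON.items():
    print(lemma, "->", combine_with_until(verb))
```

Note that flash and recognize differ only in a lexical feature here, not in any structural property, which is the sense in which the contrast is located outside the syntax proper.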

Even if we employ very articulated VP structure as in the work of Travis, Borer, and Ramchand cited previously, we cannot capture the difference between (25) and (27) in the syntax. Both verbs flash and recognize are achievement verbs (instantaneous telic events), so they will project the same verbal template and their arguments will originate in the same syntactic configurations. The distinction between a repeatable instantaneous event and a non-repeatable one comes from the items’ lexical semantic features, which permit or do not permit the legitimate operation of aspectual coercion at the syntax-semantics interface.

The third example of a syntax-semantics mismatch, complement coercion, also has to do with lexical features that come into play in the composition of meaning. Verbs like finish and begin, by virtue of their lexical meaning, select for a complement denoting an event:

(28) John finished reading/writing the book.

When the complement denotes an activity, simple composition suffices for interpretation. However, these verbs also happily take objects, not events, as their complements:

(29) John finished the book.

Comprehension of these sentences has been argued (Jackendoff, 1997; Pustejovsky, 1995) to involve a process of complement coercion, a nonsyntactic operation that converts the object into an event description compatible with the object (and the context). For this purpose, the interface process has to examine the internal semantic structure of the complement noun, and more specifically, its qualia structure. The qualia structure of a noun consists of an inventory of features and properties of the object, including its physical shape, appearance, its purpose, how it comes into being and is used, etc. In this case, either knowledge that a book comes to exist by being written or knowledge that a book is used for reading can be evoked. Thus, the selectional restrictions of the verb are satisfied. Note that context knowledge is also involved in the interpretation of (29): if we know that John is a writer, it is much more likely that we will interpret the sentence as denoting a writing event, not a reading event. If we hear (30), we might coerce the object into an eating-object event.

(30) The goat finished the book.
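The interplay of qualia structure and context in (29) and (30) can be sketched as a lookup. This is a minimal illustration, not Pustejovsky's formalism: the dictionaries `QUALIA` and `SUBJECT_BIAS`, the function `coerce`, and the choice of the telic quale as the default reading are all assumptions made for the example.

```python
# Hypothetical qualia entries: how an object comes into being ("agentive")
# and what it is used for ("telic"). Values are illustrative only.
QUALIA = {
    "book": {"agentive": "writing", "telic": "reading"},
    "meal": {"agentive": "cooking", "telic": "eating"},
}

# Hypothetical context knowledge about subjects. A value is either the name
# of a quale to prefer or an event supplied directly by world knowledge.
SUBJECT_BIAS = {
    "the writer": "agentive",  # a writer finishing a book is probably writing it
    "the goat": "eating",      # goats eat books; no quale of "book" says so
}

def coerce(subject: str, noun: str) -> str:
    """Return the event that 'finish' + noun is coerced into for this subject."""
    qualia = QUALIA[noun]
    bias = SUBJECT_BIAS.get(subject)
    if bias in qualia:         # the context selects one of the noun's qualia
        return qualia[bias]
    if bias is not None:       # the context supplies an event directly
        return bias
    return qualia["telic"]     # assumed default: the object's purpose

print(coerce("John", "book"))        # default reading
print(coerce("the writer", "book"))  # context prefers the agentive quale
print(coerce("the goat", "book"))    # world knowledge overrides the qualia
```

The point of the sketch is only that the coerced event is computed from lexical and contextual information about the noun and the subject, with no reference to syntactic structure.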

It is conceivable to argue that the syntactic structure of (29) contains an empty verb node as in (31). Contemporary syntax is full of null verbs that carry meaning, so this will not be an outrageous proposal.

(31) [S [DP John] [VP [V finished] [VP [V ∅] [DP the book]]]]

Then, the interpretation of (29) can be achieved by simple composition. However, there are syntactic arguments for and against this V-insertion in the syntax, discussed at length by Pylkkänen and McElree (2006). For example, evidence for V-insertion (and against a semantic explanation) is the fact that coerced nominals do not take part in the causative-inchoative alternation (*The book finished), which would be unexplained by a semantic account. However, evidence against a syntactic explanation comes from adverbial modification and passivization. For example, (32) can be interpreted to mean that a slow meal comes to an end quickly, but (33) is false under this interpretation. If there is an empty V node in the syntactic structure representing another event, why can’t it be selectively modified by an adverb?

(32) We finished eating the meal slowly. (= there was a slow eating event, and it finished)
(33) We finished the meal slowly. (= there was a slow finish to the eating event)

The authors conclude that “collectively the evidence suggests complement coercion constitutes a strong candidate for a purely semantic interpretive process” (p. 35). In addition, even if the V-insertion solution were not found to make the wrong predictions syntactically, there would remain the question of how the interpreter of a sentence like (29) supplies the lexical meaning of the silent verb (reading? writing? eating?) without looking into the qualia structure of book.

In this section, three instances of syntax-semantics mismatches were described, which arguably cannot be resolved by postulating either articulated or silent syntactic structure. They can be accounted for by phase-by-phase interpretive processes at the syntax-semantics interface. These three cases, among others, suggest that simple compositionality is not always sufficient for interpretability. In other words, the meanings of sentences are not always fully and exhaustively determined by the meanings of their component words and the syntactic structure. We are led to believe that an alternative concept of the grammar could be more appropriate, one in which each syntactic step still corresponds to a semantic step, but in addition, there is a repertoire of purely semantic rules that can change the meanings of constituents to “fit” better into the sentence and into the discourse context. Such rules include type-shifting, postulating semantic operators, and looking into qualia structure. Arguments of this type support Jackendoff’s claim that semantics, as well as syntax, can be generative.12

6. Relevant theoretical assumptions

In what follows, I will recapitulate the theoretical postulates I am going to assume in this book. I will summarize the language architecture from the point of view of the second language learner who has a native grammar already firmly established and is acquiring a second, target grammar. Since I am mainly interested in the acquisition of meaning as revealed in comprehension, I will be concentrating on the view of the syntax-semantics interface, and on the functional lexicon. Jackendoff’s Parallel Architecture and Minimalism converge on the view that conceptual structure, the content of thought, is universal, and is outside the premises of language. While the content of meaning is the same, different linguistic forms map different natural groupings of meanings. Both models relegate meaning differences between languages to the syntax-semantics interface.13 But how are these differences encoded? Let us consider once again the examples of such differences reviewed in section 4.
First, a Spanish-English contrast in aspectual tense morphology, whereby the aspectual meaning of habitual action is reflected in the Imperfect tense in Spanish but in the Simple Past Tense in English. If an English-native learner is superficially mapping Simple Past onto Spanish Preterit, and Past Progressive onto Spanish Imperfect (see examples 20 and 21) based on similarity of their other meanings, she will be wrong about which inflection encodes the habitual activity meaning. Secondly, a Chinese learner of English has to remember that the meaning of past event is encoded by a designated piece of inflectional morphology that she has to attach to every regular verb if she wants to encode that meaning. Thirdly, the English learner of German has to learn that an interlocutor can be addressed in two different ways, either by Sie or by du, depending on whether she wants to express deference and politeness or not. To add a fourth example (coming from the work of Talmy (1980)), Romance languages typically conflate path and motion together in a verb (John entered the room running), while English typically conflates manner and motion (John ran into the room). There are many other examples. The crucial thing to notice is that, in the majority of cases, what has to be learned in the target language is a piece of morphology or a class of lexical items, which has either no analogue in the native language, or if it does, the properties and/or meaning of the target items are different. The purely semantic rules discussed in section 2 are all arguably universal, so we expect them not to present any difficulty to learners.

All the roads lead to The Lexical Item! Or more precisely, they pass through the lexical item. Therefore, an assumption that I will adopt from Jackendoff (2002) is the notion of the lexical item as interfacing between all the three modules of structure; i.e., as being composed of syntactic, phonological and semantic features. In this particular respect, open-class words (nouns, verbs) and closed-class words (bound and free productive inflectional morphology) are the same (although they differ in the types of meanings, referential versus grammatical, that they can encode). For example, the –ed affix in English has a specific pronunciation dependent on its environment, attaches to a verb, and means that the state/event reflected by the verb obtained in the past. There are “defective” lexical items like case endings (e.g., Nominative, Accusative, Genitive) in German, which comprise a chunk of phonology and a chunk of syntax, but have no semantics.
On the other hand, the causative morpheme in English (as in I broke_CAUSE the vase) has semantics and syntax but no phonology. Since all lexical items have their feet on three linguistic levels, it is potentially possible for knowledge of these three sides to be differentiated, or to come at different times. Form-meaning differentiations are well-known in the L2 acquisition of functional words and morphemes. The three-pronged view of lexical items can adequately account for such findings.

Although open-class words (notional lexical items) and closed-class words and morphemes (the functional lexicon) have in common their being three-way interface rules, they differ essentially in the types of meanings they can encode. Current generative linguistic theory argues that not only overt syntactic properties but also properties computable at the syntax-semantics interface depend crucially on features encoded in the functional lexicon. For example, and simplifying dramatically, overt versus covert wh-movement is explained by a combination of a universal requirement on interpretation and a parameterized property. In order for sentences containing wh-words to be interpreted as questions, these words need to take (high) scope position at the interface, that is, they need to be in the CP projection. This is a universal requirement dictated by the semantic type of wh-operators. The visible versus invisible movement has been taken to depend parametrically on features encoded in the wh-words in a language. Thus, interpretive properties encoded at the syntax-semantics interface like wh-movement do not seem to be qualitatively different from purely syntactic properties like verb movement which do not give rise to interpretive differences between languages. Put differently, in the familiar verb movement case (think English versus French), if the verb moves, it is not for interpretive reasons. In wh-movement, the wh-word does move for scope-taking reasons. However, both movements are triggered by properties encoded in the functional lexicon. In principle, then, we should expect a similar pattern of processing and similar behavioral patterns of acquisition for abstract, subtle syntactic and semantic properties.

In this chapter, I presented two contemporary views on how the language faculty is organized, what its components are, and how they interact. I will assume an advantageous mix of Minimalist language architecture with the more articulated view of the syntax-semantics interface offered by Jackendoff’s (2002) Parallel Architecture model.14 I will further assume a universal and generative (linguistic) semantic module, lexical items as interfaces between the three modules, and an open-class vs. closed-class lexical item distinction, where closed-class functional lexicon items trigger movement for interpretive or purely syntactic reasons.
These assumptions are going to be important throughout the book. In chapter 3, I discuss psycholinguistic models of language use that are partially based on the underlying language architecture. In chapter 4, I critique some ERP-based studies purporting to investigate differential processing of syntax and semantics. I will argue that these studies do not really address phrasal semantics and its processing, but lexical semantic compatibility only. Finally, the Bottleneck Hypothesis, a view of what is hard and what is easy to acquire in a second language, crucially depends on the language architecture assumed in this chapter.

Notes

1. The current literature largely maintains the labels PF and LF, for the AP and CI systems, respectively.
2. Exceptions of this pre-specification include case features on N, which get valued by the inflectional and the verbal heads.
3. See, however, Pesetsky (2000) for arguments that covert phrasal movement is indeed necessary in the grammar.
4. I am ignoring here the interpretation of aspect, tense, and the higher functional heads, for ease of exposition.
5. Again, we ignore the indefinite article here, considering it semantically empty.
6. Since the qualia structure of nouns is not a syntax-semantics equivalence, we will postpone its discussion to section 5. For the time being, suffice it to say that Pustejovsky (1995) groups lexical features of concepts into qualia, which correspond to (groups of) noun properties. For example, knife has the telic quale of purpose.
7. This same combination of quantified subject and a VP would not be a mismatch if one uses generalized quantifiers to represent DPs. Only the classic predicate logic has it as a syntax-semantics mismatch. I am grateful to Alice Davison for this clarification.
8. However, as Heim and Kratzer (1998) proceed to show, there may be some other syntax-semantics mismatches like Antecedent Contained Deletion and inverse binding which independently need a rule of Quantifier Raising.
9. Corresponding to Functional Application, Predicate Modification, and Predicate Abstraction described in section 3.
10. For example, he proposes that the wh-movement parameter can be dealt with at the interface.
11. The Universal Alignment Hypothesis (Perlmutter and Postal 1984: 97) states: “There exist principles of UG which predict the initial relation borne by each nominal in a given clause from the meaning of the clause.” The Uniformity of Theta Assignment Hypothesis (Baker 1988: 46) states: “Identical thematic relationships between items are represented by identical structural relationships between those items at the level of D-structure.”
12. I am grateful to Alice Davison for discussion of the ideas in this section, although she may not completely agree with my conclusions here.
13. Prosody (e.g., question intonation in languages without wh-movement) also contributes to the meaning calculation. This is reflected in Jackendoff’s (2002) model by a special phonological structure-conceptual structure interface (p. 125). The contribution of prosody to sentence information structure seems to be unstatable within the Minimalist Program.
14. Although it may seem that the syntactocentric Minimalist language architecture and the Parallel Architecture cannot be truly reconciled, I believe that their treatment of the syntax-semantics interface is largely mutually compatible. It is this interface that is the focus of our attention in this book. Therefore, I will not presume to solve the bigger incompatibility issues between the two views of language architecture here.

Chapter 3 Psycholinguistic models of sentence comprehension

1. Why look at psycholinguistics?

In this book, I wish to articulate a view of the grammatical representations that mediate second language performance, and in particular, comprehension. In a nutshell, I am arguing that the bottleneck for the adult language learner is inflectional morphology; since the semantic operations utilized in sentence comprehension are universal, they should come for free, once the morphology with all its attending interpretable and uninterpretable features is acquired. In this line of argument, it is crucial to demonstrate the modularity of morphosyntax on the one hand, and semantics on the other, as reflected in time-sensitive, possibly ordered parsing phases. Since at this stage of our knowledge we cannot reliably tease apart which linguistic performance reflects underlying linguistic representations and which reflects processing effects (see Duffield quotation below), I shall look for evidence of modularity in both behavioral and processing studies, starting with the latter. Furthermore, I am interested in evidence for differential acquisition of semantic knowledge and morphosyntactic knowledge. A considerable number of recent psycholinguistic studies purport to do just that: tease apart “syntax” and “semantics” in processing language. Clearly, a critical review of such studies is amply justified.

In proposing explanations for second language acquisition of comprehension, it is impossible these days to disregard the findings of on-line techniques for investigating sentence comprehension: the domain of psycholinguistics and, to some extent, neurolinguistics. The ability to process linguistic input in real time may be indicative of the way representations of linguistic knowledge are structured and stored in the mind/brain of the hearer/reader.
To take a very recent example of integrating processing data and linguistic representation in a single theory, Clahsen and Felser (2006) used evidence from the on-line processing of filler-gap dependencies (e.g., Marinis, Roberts, Felser and Clahsen, 2005) to postulate the Shallow Structure Hypothesis, a model of second language mental representations of sentences. According to this hypothesis, L2 learners use lexical-semantic and pragmatic information as well as unmarked templates of predicate argument structure (Agent-Verb-Patient) to interpret incoming strings of

words in a minimal (shallow) semantic representation, without mapping detailed and complete syntactic representations onto semantic representations. Because of that, learners do not process filler-gap dependencies, especially the intermediate traces in examples like (1) from Marinis et al., as native speakers do.

(1) [DP The nurse [CP [who_i] the doctor argued [CP [e2] that the rude patient had angered [e1]]]] … is refusing to work late.

In processing sentences like (1) in a self-paced reading task, learners of English did not demonstrate the same reading profile with respect to the intermediate trace spot (e2) and the original gap (e1) as the natives did, therefore indicating that the intermediate trace was not activated in the learners’ comprehension process. While the native speakers utilized e2 to break the long wh-movement dependency into two shorter ones, learners appeared to link the wh-word who (referring to the nurse) directly to the predicate angered which had selected and licensed it. Clahsen and Felser argue that “L2 processing is different because of inadequacies of the L2 grammar” (Clahsen and Felser, 2006: 120). I mention the Shallow Structure Hypothesis here just to illustrate the influential view that theories of language acquisition are incomplete unless they also incorporate findings of grammatical processing.1 On the other hand, I agree with Duffield’s commentary on the Clahsen and Felser (2006) article (p. 57) that no one can say “that a particular piece of linguistic performance—whether it comes from traditional behavioral methods such as response latencies, or from more modern techniques such as event-related potentials—is uniquely due to the grammar or to the processing system. [… ] grammatical competence is always mediated by the processing system (in virtue of being part of it) …” Without much argument, I will assume here a view of linguistic processing which postulates that there is a considerable overlap between the grammar and the parser (following the seminal work of Janet Fodor, Lynn Frazier, and many others), or that the parser’s operations are based on grammatical representations since “the grammar proposes, the parser disposes”. Therefore, in this chapter I will examine findings from processing studies using different on-line methodologies to investigate native and L2 processing. 
A complete survey of the contemporary psycholinguistics of sentence comprehension will be beyond our needs here;2 thus, I will concentrate on a few research questions that are relevant to my thesis. The take-home message will be that morphosyntax and semantics are separate

linguistic modules; the evidence will come from dissociation of processing patterns. In section 2, I will briefly review recent assumptions about the relationship between the parser and the grammar, following Townsend and Bever (2001). I will review literature that digs into the biological substrate of language and documents activation of different brain regions and differential timing of syntactic and semantic integration in the course of language processing (section 3). Next, I will examine some evidence for syntax-semantics interaction in processing (section 4). I will present two recent comprehensive models of native language processing, those of Friederici and Hagoort, which take this interaction into account although one model proposes serial processing while the other argues for parallel processing. In section 5, we will turn to the neurophysiology and electrophysiology of L2 acquisition. Our attention will focus on studies that specifically document differences between processing of syntax and semantics in the L1 and L2, as well as differences in how age of acquisition and proficiency affect that processing. Finally, I will also review findings on the processing of closed-class words and morphemes as opposed to open-class lexical items by native speakers and L2 learners. In the next chapter, I will look critically at how far the electrophysiological and neurological findings can go on their own toward explaining native and L2 linguistic representations.

2. What is the relationship between the parser and the grammar?

The question in which psycholinguistic descriptions of language processing have been interested is the following: How does it happen that manipulations of symbols (words and morphemes and phrases) produce meaningful interpretations in our minds? Sentence comprehension is fundamentally a computation of meaning.
This question was first addressed more than forty years ago by the Derivational Theory of Complexity (Miller and Chomsky, 1963; Miller and McKean, 1964; Miller, 1962). On the wings of excitement about the new generative transformational framework of linguistics, Miller proposed that transformational distance between sentences determines the processing difficulty, reflected in the time it takes to comprehend or produce a sentence given one of its forms. For example, it is harder to produce a passive-negative sentence than a question, given a declarative base sentence, because the comprehender has to compute two transformations instead of one. The linking hypothesis was one of a direct mapping between the grammar and the parser. Disappointment hit early and split the field of psycholinguistics into two large camps. When several studies at the end of the sixties showed that the derivational theory failed to predict perceptual difficulty corresponding to the number of transformations, the conclusion (at least on the part of some) was that the mentally represented grammar is not correlated with the processor, and that linguistic theory is somehow “not psychologically real”.

The tension in the psycholinguistic literature in the years after the demise of the Derivational Theory of Complexity has been between “syntax” and “semantics.” I follow Townsend and Bever (2001) in describing meaning-based versus syntax-based comprehension. The first approach would take it that we comprehend sentences largely based on meaning, helped by contextual pragmatics, knowledge of the world and internal semantic constraints. Syntax may come in at times to resolve ambiguous or unclear utterances, but the majority of sentences do not need syntactic intervention. On the other hand, the syntax-first approach would have it that syntactic structure is assigned prior to meaning; it is automatic and obligatory. It is probably obvious at this point that neither extreme view can be true, and we should look at a combination of the two approaches. What kind of data can decide on the issue? For one, the Jabberwocky sentence in (2) from Lewis Carroll’s Through the Looking-Glass and What Alice Found There (1872) has a lot of morphosyntax which helps the reader get some interpretation, albeit vague. There is no pragmatic information and knowledge of the world helping the reader here, since all the content words are nonce words. Obviously, the syntax is crucial for the assignment of meaning in such examples.

(2) ’Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe;
    All mimsy were the borogoves,
    And the mome raths outgrabe.

On the other hand, syntactic assignment may rely on knowledge of the world. The examples in (3) are from Townsend and Bever (2001: 85):

(3) a. The jockey raced the horse with a big smile.
    b. The jockey raced the horse with a big mane.

The adjunct PP in (3a) is attached high and modifies the whole event and in particular, the subject. The adjunct PP in (3b) is attached low to the object NP. Both attachments are grammatically permitted. However, the sentences are not semantically ambiguous between high and low attachment, because we know that jockeys do not have manes and horses do not have big smiles. Townsend and Bever (2001: 86) conclude that “comprehension involves a combination of knowledge about the likelihood of conceptual combinations with knowledge of possible syntactic structures. It is not reasonable to argue that comprehension is based entirely on either conceptual or syntactic knowledge alone, but it is possible to specify an architecture in which one or the other kind of knowledge has logical and temporal priority of application.”

An example of theories that emphasize the priority and independence of rule-governed syntax would be Frazier’s Garden Path Theory (Frazier, 1987; Frazier and Clifton, 1996). It proposes that the parser uses only syntactic category information (N, V, P, A) to generate a candidate structure for a sentence. Each incoming word is accommodated into the structure with the least amount of change. Later on, plausibility information may reject a structure that has been proposed on purely syntactic grounds. According to these proposals, a rapid influence of non-syntactic knowledge on sentence parsing can be explained in terms of re-analysis. For example, in the temporary garden path sentence in (4), the processing difficulty at was is suggestive of an initial analysis in which the patient as object of believed was pursued, and then rapidly revised.

(4) The doctor believed the patient was lying.
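The attach-then-revise pattern described for (4) can be sketched as a toy incremental procedure. The fragment below is an illustration only, not Frazier's parser: the coarse tags `NP` and `V`, the pre-chunked input, and the function `parse` are simplifying assumptions made here.

```python
def parse(tagged_words):
    """Attach (word, category) pairs one at a time; return (analysis, log).

    Categories are hypothetical coarse tags: 'NP' and 'V'.
    """
    subject = verb = obj = embedded = None
    log = []
    for word, cat in tagged_words:
        if cat == "NP" and verb is None:
            subject = word
        elif cat == "V" and verb is None:
            verb = word
        elif cat == "NP" and obj is None:
            obj = word  # least change to the structure: treat the NP as direct object
            log.append(f"attach '{word}' as object of '{verb}'")
        elif cat == "V" and obj is not None:
            # A second verb needs a subject: revise the earlier attachment.
            embedded = {"subject": obj, "verb": word}
            log.append(f"reanalysis at '{word}': '{obj}' becomes embedded subject")
            obj = None
    analysis = {"subject": subject, "verb": verb, "object": obj, "clause": embedded}
    return analysis, log

sentence = [("the doctor", "NP"), ("believed", "V"),
            ("the patient", "NP"), ("was lying", "V")]
analysis, log = parse(sentence)
for step in log:
    print(step)
```

The log records two steps for (4): the initial object attachment and its revision at the second verb, which is where the processing difficulty is reported.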

On the other side of the divide would be non-symbolic models that use associative information at the earliest point of sentence comprehension (e.g., MacDonald, Pearlmutter and Seidenberg, 1994; Trueswell and Tanenhaus, 1994). Performance is largely determined by lexical knowledge. Even temporary syntactic ambiguities (garden paths) are resolved on the basis of local lexical knowledge sensitive to context and frequency information. These models posit a probabilistic constraint-satisfaction process in which syntax is only one of a number of constraints on interpretation. A consensus has emerged concerning the rapid nature of interaction between syntactic and semantic knowledge, but without resolving the fundamental disagreements about the degree to which syntactic processing precedes and determines other aspects of language processing.

A model that combines features from both approaches is the Late Assignment of Syntax Theory (LAST) of Townsend and Bever (2001). It proposes that all syntactic processing proceeds along two lines in parallel: a “pseudosyntactic” analysis based on canonical templates and semantic roles, which is immediately linked to semantic representations; and a true syntactic analysis that takes place independently of the semantics. The pseudosyntactic analysis is a “quick and dirty” parse dependent on associations and frequencies in the input. It moves the wh-words back to their original positions and then applies canonical template strategies to provide an initial meaning analysis. The syntactic analysis computes a cyclical derivation up to the end of the clause and offers a completely specified syntactic representation. At the end of the two parallel processes, their outputs are compared and if there is a match, the computed meaning is stored. If there is no match (due to garden path, reanalysis, etc.), the computation can start again at various points. The two lines of computation are independent up to the comparison point. The pseudosyntax is appropriately modeled within connectionist frameworks. Clahsen and Felser’s (2006) Shallow Structure Hypothesis and Ferreira’s Good Enough initial representations are reminiscent of the LAST model, although they may differ in the details.

In conclusion, there is quite a lot of consensus among psycholinguists these days that the linguistic input to comprehension may initially be organized in part by principles outside of the syntax narrowly construed, like predicate argument structure, canonical templates, and frequency associations among concepts. These extra-grammatical computations employ an inductive associative activation mechanism. Nevertheless, complete syntactic representations are indispensable for full comprehension, either in comparison with the “shallow” representations, or in resolving ambiguities later on. Since in this book we are mostly concerned with linguistic representations, we shall let the parser fade from our attention at this point.
In section 4, we return to the interaction of syntax and semantics in online comprehension.

3. Evidence for modularity of “syntax” and “semantics”

As mentioned in the introductory chapter, there seems to be a growing consensus that language is represented as an integrated system of neurofunctional modules, with at least phonetics/phonology, morphosyntax and semantics being such modules (Jackendoff, 2002; Paradis, 2004; Samuels, 2000, a. o.). Why would we find such proposals credible? These modules may be represented in dedicated networks of neurons involving interconnected areas. I follow Paradis (2004) in accepting that neurofunctional modules are isolable (selectively susceptible to lesions and inhibition) and computationally largely autonomous (each module has access only to the output of another, while their internal processes do not interact) (Paradis, 2004: 119).

The initial evidence for modularity came from neurolinguistics. Broca’s and Wernicke’s aphasias, for example, were thought to reflect the loss of a specific type of linguistic information (Caramazza and Zurif, 1976). Broca’s aphasics were thought to manifest, among other things, an impairment in syntactic function. Wernicke’s aphasia was associated with loss of semantics, and in particular, lexical semantics. This influential account, however, has recently been found to be too simplistic. Closer inspection of the relationship between lesions and aphasic behavior suggests that this relationship is weak (Caplan, Hildebrandt, and Makris, 1996; Dronkers, 2000).

Neuroimaging methods provide another technique of mapping linguistically modular processes to brain parts. For example, fMRI measures changes in blood flow, the blood oxygenation level dependent (BOLD) response to behavioral tasks like listening to sentences or stories. It is assumed that the BOLD response indirectly reflects activation of those parts of the brain which are involved in a certain cognitive process. We will concentrate here on the findings of a handful of recent studies which try to map syntactic and semantic/pragmatic processing by looking at both grammatical and anomalous sentences as processed by English native speakers (Friederici et al., 2003; Kang et al., 1999; Kuperberg et al., 2000; Kuperberg et al., 2003; Newman et al., 2001; Ni et al., 2000).
The common research question of these studies may be summarized as follows: Does the language processing system recognize syntactic and semantic information as different?3 The assumption is that anomaly will engender a more active response at the linguistic level which is being violated and thus increased BOLD response will map this level onto brain structures. The most striking aspect of these collective results is the variability of the findings: each study comes to a different conclusion as to which brain regions are specifically involved in processing semantic/pragmatic versus syntactic information. Friederici et al. (2003) find that left inferior frontal and anterior temporal regions are involved in syntactic processing while semantic violations spur bilateral, mid-temporal BOLD activity. Kuperberg et al. (2000) conclude that the left and right temporal lobes are involved in semantic processing but do not identify any area that is specifically modulated by syntactic anomalies. Ni et al. (2000) and Kang et al. (1999) argue that the left inferior frontal lobe is activated by syntactic violations while


posterior temporal lobes are lit up by semantic violations. Newman et al. (2001) find that syntax is processed in superior frontal regions while semantics is processed in left inferior frontal, medial temporal and right temporal regions. Finally, Kuperberg et al. (2003) conclude that overlapping networks (including inferior frontal and left temporal regions) are modulated in opposite directions by the two types of anomaly, thus suggesting that morphosyntactic and semantic/pragmatic information may be processed by the same neural system but in qualitatively and quantitatively different ways.4 Discrepancies among the findings of the six studies can be due to differences in the type of conceptual and syntactic violations in the test sentences and to limitations and differences in the experimental design (e.g., statistical procedures, separate scanning sessions, etc). What is important for the purposes of our investigation here, however, is the consistent agreement that syntax and semantics are processed differently (and may be processed in different brain regions). There is an inherent and at the time of this writing unresolved problem in using fMRI to locate specific cognitive processes in the brain, as reflected in the variance in outcomes. Human sentence comprehension is an integrative activity which proceeds very fast, over the course of milliseconds, while the haemodynamic response evolves over 10 to 15 seconds. For example, the processing of a single linguistic unit (say, a grammatical morpheme) involves a number of cognitive processes such as retrieval from the lexicon and the ensuing phonological, syntactic and semantic operations this morpheme activates and participates in. It is probably the case that specific types of linguistic processes are difficult to isolate using fMRI and even event-related fMRI (Kuperberg et al., 2003) because they are transient and very closely sequenced in time. 
Neurofunctional imaging is not sufficiently detailed at present to give us a picture differentiating processes down to groups of neurons. This is the reason why the syntax-semantics divide is more frequently addressed using event-related brain potentials (ERPs). Unlike fMRI, ERPs have a temporal resolution exceeding that of the processes underlying language comprehension. This technique is the better option when trying to isolate individual cognitive processes.

ERPs are small changes in the spontaneous electrical activity of the brain, occurring in response to cognitive or sensory stimuli. They are measured noninvasively by applying electrodes to the scalp. ERP patterns, also called components, are characterized by the following parameters: polarity (positive or negative), topography (i.e., at which electrode sites an effect is visible), latency (time measured after the onset of the critical stimulus), and amplitude (the strength of an effect). The ERP methodology only provides relative measures, that is, an effect always results from a comparison of a base (control) condition with a minimally differing critical condition. Best results are obtained if the control and critical stimuli differ in one word only, and that word is in the same position in the sentence.

Electrophysiological studies of the time course of language processing have contributed substantially to upholding the functional modularity of syntax and semantics. One of the most frequently cited findings is the existence of “signature” ERP effects. The main effect of a semantic incongruity in a sentence is a negative wave with an onset at about 250 ms after the critical word and a peak at about 400 ms, the so-called N400 (Kutas and Hillyard, 1980). The N400 was originally discovered to be sensitive to semantic integration of a word into a sentence:

(5) a. He spread his warm bread with socks. (semantically incongruent)
    b. He spread his warm bread with butter. (semantically congruent)

Hagoort and Brown (1994) later found that expectancy (as reflected in the cloze probability that the supplied word is the expected word) plays a role in the N400 effect. The amplitude of this effect is most sensitive to the semantic relations between individual words, or between word and sentential context, or word and discourse context, or, generally speaking, to the processing costs of integrating the meaning of a word into the semantic representation that is built up on the basis of the preceding language input (Brown and Hagoort, 1993; Osterhout and Holcomb, 1992). We shall come back to a discussion of this effect later on.

Syntactic processes are correlated with two other, qualitatively different ERP effects: an early left-anterior negativity, known as LAN, and a late centro-parietal positivity labeled P600.
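The relative nature of the ERP measure described in this section (an effect is always a comparison between a control and a critical condition) can be illustrated with a small sketch. It is not taken from any of the studies cited here: the voltages are invented toy numbers, and the function names are hypothetical. It only shows how a difference wave is obtained by averaging trials per condition and subtracting, and how polarity, latency and amplitude are then read off within a time window.

```python
# Illustrative sketch only: toy voltages and hypothetical function names,
# not data or code from any study cited in this chapter.
from statistics import mean


def average_waveform(trials):
    """Average single-trial voltages (in microvolts) sample by sample."""
    return [mean(samples) for samples in zip(*trials)]


def difference_wave(critical_trials, control_trials):
    """ERP effect = averaged critical condition minus averaged control condition."""
    critical = average_waveform(critical_trials)
    control = average_waveform(control_trials)
    return [c - b for c, b in zip(critical, control)]


def peak(effect, times_ms, window, polarity):
    """Return (latency_ms, amplitude) of the most extreme point of the
    requested polarity ('negative' or 'positive') inside the time window."""
    start, end = window
    points = [(t, v) for t, v in zip(times_ms, effect) if start <= t <= end]
    pick = min if polarity == "negative" else max
    return pick(points, key=lambda point: point[1])


# Sample times relative to the onset of the critical word (assumed sampling).
times_ms = [0, 100, 200, 300, 400, 500, 600]

# Two toy trials per condition, differing only after the critical word.
control_trials = [
    [0.0, 0.5, 1.0, 0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 1.0, 0.5, 0.0, 0.5, 0.0],
]
critical_trials = [
    [0.0, 0.5, 1.0, -1.0, -4.0, -1.0, 0.0],
    [0.0, 0.5, 1.0, -2.0, -6.0, -2.0, 0.0],
]

effect = difference_wave(critical_trials, control_trials)
latency, amplitude = peak(effect, times_ms, (300, 500), "negative")
print(latency, amplitude)  # 400 -5.0
```

In this toy case the difference wave is flat until 200 ms and shows a negative deflection peaking at 400 ms, which is the shape one would describe as an N400-like effect. Real ERP analysis involves many more trials, filtering, baseline correction and statistics across electrodes, none of which is modeled here.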
Within the early time window, a very early LAN, or ELAN, may correlate with word category errors (Friederici 1995, 2002; Friederici et al. 1996). LANs within a 300-500ms range correspond to number, gender, case and tense mismatches (Münte and Heinze, 1994; Münte et al, 1993, Friederici 2002). LAN effects have also been linked to tasks taxing verbal working memory (Kluender and Kutas, 1993). The second syntactic ERP effect, the P600 (a.k.a. a syntactic positive shift, or SPS) emerges if a syntactic requirement like agreement is violated, or in garden path sentences where going back and reconsidering the whole tree is required (Osterhout and Holcomb 1992; Osterhout et al. 1994, Hagoort et


al., 1993; Kaan et al. 2000). An argument for the independence of this effect from potentially confounding semantic factors is that the P600 also occurs with semantic garbage like the examples in (6), where one sentence has an agreement violation but the other does not (Hagoort and Brown, 1994): (6)

a. The boiled watering can smokes the telephone in the cat. b. The boiled watering can smoke the telephone in the cat.

P600 effects have been reported for phrase structure violations (Hagoort et al., 1993; Neville et al., 1991; Osterhout and Holcomb, 1992), subcategorization violations (Ainsworth-Darnell et al., 1998; Osterhout et al., 1997), subjacency violations (McKinnon and Osterhout, 1996; Neville et al., 1991), and violations of the Empty Category Principle (McKinnon and Osterhout, 1996). The positive shifts occur not only when there is a syntactic violation, but also when sentences are relatively complex (Kaan et al., 2000) or when there is syntactic ambiguity (input compatible with two or more syntactic trees) (Van Berkum et al., 1999).

In summary, the three ERP effects described above for native speakers vary in latency, polarity, and topographic distribution. The N400 is qualitatively dissociated from the LAN and the P600, thereby suggesting that syntactic and semantic processing of language are indeed modular. The ERP evidence is quite consistent in showing that separable semantic and syntactic processes exist, and are manifested by different electrical brain activity. To an extent, the fMRI, lesion and ERP techniques provide converging evidence for this conclusion.

4. Interaction between syntax and semantics in sentence processing

Another avenue of investigating syntax-semantics modularity has been to look at the timing of syntax-semantics interactions, as well as at which type of information influences the processing of the other. The gist of the findings of this large literature was reviewed in section 2, where syntax-based and meaning-based approaches were exemplified. Although there is relative consensus about the rapid interaction between syntax and semantics in processing, the predominant view5 still is that syntactic processing precedes and drives semantic interpretation. To make these ideas more concrete, in this section we will look at two recent models of language comprehension:

Friederici’s (2002) neurocognitive model of auditory sentence processing and Hagoort’s (2003) Unification Model.

Friederici’s proposal is based on ERP findings and neurotopographical specifications of brain imaging. Comprehension of a sentence consists of three phases. In Phase 1, which comes 100 to 300 ms after onset of the signal, the initial syntactic structure is formed based on word category information. In this purely syntactic phase, an ELAN effect can be detected. During Phase 2, from 300 to 500 ms, lexical-semantic processes and morphosyntax are active, with the purpose of finding “who” does “what” in the sentence, that is, thematic role assignment. If violations occur at this stage, the N400 and the LAN effects obtain. In Phase 3, from 500 to 1000 ms, the different types of morphological, syntactic and thematic/semantic information are integrated, with processes of reanalysis and repair taking place, signaled by a P600 effect.6 Thus Friederici argues that both the autonomous view of syntactic processing and the interactive view of semantics influencing syntax are correct, but they describe different processing phases. Building syntactic structure is autonomous of the semantic information, but only at the earlier stages; in the late time window these two types of information can interact. Thus, this model is essentially serial in nature. (See Figure 1 in Friederici, 2002, for details on functional processes and their underlying neural correlates.)

Hagoort (2003) is inspired by Jackendoff’s (2002) language architecture and Vosse and Kempen’s (2000) computational model of syntactic processing. Hagoort proposes a parallel, interactive model of a lexicalist nature. According to the Unification Model, each word form in the mental lexicon is associated with a three-tiered structural frame (following Jackendoff, 2002). Lexical items are retrieved sequentially, driven by the time course of the input.
Each word’s structural frame (Jackendoff’s treelet) enters the unification workspace one after the other, combining incrementally. In this “unification workspace”, constituent structures spanning the whole utterance are formed by a unification operation. This operation consists of linking lexical items (e.g., the NPs, called Root nodes in the model) with empty positions in syntactic templates (e.g., ones provided by verbs based on their argument structure, here called Foot nodes), and checking agreement features (see example in (7) from Hagoort 2003: S25, his Figure 7).


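(7) [Tree diagram not reproduced. Recoverable labels: a lexical frame for the verb sees, with root S and branches subj NP, head V, dobj NP and mod PP; a lexical frame for the noun woman, with root NP and branches det DP, head N and mod PP; arrows marking a Foot node and a Root node.]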

These unification links are formed dynamically, with their strength varying over time until a state of equilibrium is reached. Because of inherent (syntactic, lexical) ambiguity in natural language, alternative linking candidates may be available. Typically, only one linking will remain active. The suppression of the alternative candidates is achieved through a process of lateral inhibition between two or more alternative unification links. Unification takes place at the syntactic as well as at the phonological and the semantic levels. An example of semantic unification is the integration of word meaning into an unfolding discourse representation of the preceding context. If a word is ambiguous (e.g., bank), one of its meanings is chosen based on context constraints. Importantly, this model postulates that in language comprehension, syntactic, phonological, and semantic unification processes operate concurrently and interact to some extent. Thus Hagoort’s Unification Model is in line with previously proposed parallel processing models (Marslen-Wilson and Tyler, 1980; MacDonald et al., 1994), which argue that semantic information is immediately used when it becomes available. Some recent studies have provided evidence for a weak independence of the semantics from the syntax in processing sentences (Kolk, Chwilla, van Herten and Oor, 2003; Kuperberg, Sitnikova, Caplan and Holcomb, 2003; Kim and Osterhout, 2005). This independence gives the semantics the option to pursue an independent analysis of the input even if it contradicts the output of the syntax. If this conjecture is indeed true, then syntax and semantics must be processed in a parallel fashion and undergo a constant interaction. We shall look at an example from Kim and Osterhout (2005),

which examined semantic influence on sentence comprehension in syntactically unambiguous sentences. The goal was to test the assumption that if there is no syntactic ambiguity, the syntax produces an output which serves as an input to the semantics, and thus “controls” the semantics. Kim and Osterhout recorded ERPs while participants read strings containing violations as in (8a) and well-formed sentences as in (8b,c).

(8) a. The hearty meal was devouring the kids (semantic attraction violation)
    b. The hearty meal was devoured by the kids (passive control)
    c. The hungry boy was devouring the cookies (active control)

In (8a), the hearty meal is a highly plausible Theme for the verb devour but anomalous as the verb’s Agent. Thus the sentence constitutes a violation of the semantic attraction. The syntactic cues in this sentence directly contradict the semantic ones. The –ing inflection on the verb is consistent with the anomalous Agent interpretation but not with the Theme interpretation, unlike the –ed inflection in (8b). The logic was as follows: if syntactic processing controls semantic processing, then an Agent interpretation of the first NP should be pursued. The implausibility of this interpretation should elicit an N400 effect for the verb in (8a), compared to (8b,c). By contrast, if semantic processing operates with some independence from the control of syntax, such that the plausible Theme interpretation of the first NP is pursued, then the offending –ing inflection may be perceived as anomalous and elicit a P600 effect. Because the latter interpretation is in contradiction to the powerful semantic cue, the readers may experience it as a syntactic difficulty. Violation verbs elicited a strong P600 effect, but no increase in N400 amplitude (Kim and Osterhout, 2005).
These results are compatible with the claim that the powerful semantic association between NP the hearty meal and the verb devour has caused a well-formed string to be perceived as syntactically anomalous. It seems that in this case, semantics wins over syntax in the sense that it can operate independently and perhaps even control syntactic analysis. Of course, an obvious objection to this interpretation of the results from the point of view of syntax-first models (Frazier and Clifton, 1996) is that the P600 effect may be due to a syntactic reanalysis which has begun so quickly that the anomalous Agent interpretation is cancelled before it elicits an N400. Kim and Osterhout anticipate this objection by conducting a second experiment with an added condition as in (8d).


(8) d. The dusty tabletops were devouring the kids. (no attraction violation)

The rationale is that if there is no semantic attraction between the subject NP and any of the verb’s potential theta roles, no commitment to any thematic association will ensue. Hence, the sentence will be perceived as semantically anomalous. This particular sentence type did elicit an N400 effect, as the authors expected. The striking difference between the ERP effects in the Attraction Violation Condition and the No Attraction Violation Condition, they argue, cannot be explained without some account of the interaction between the syntax and the meaning of the subject and verb in all the stimuli. The data appear to be consistent with “a system of parallel, independent syntactic and semantic processing. Semantic processing commits to highly attractive predicate-argument combinations, even when they contradict syntactic cues.” (p. 214).

These findings are compatible with a syntax-last account along the lines of Townsend and Bever (2001) as well as with accounts like Ferreira, Bailey and Ferraro (2002), who argue for “good-enough” semantic and syntactic representations as an initial processing stage in canonical sentences. One heuristic strategy treats all NVN sequences as Agent-Verb-Theme canonical templates. Another heuristic strategy combines the words in a sequence in a manner most consistent with pragmatic world knowledge. These heuristics may sometimes determine interpretation instead of syntactic analysis and may even have a predictive effect on the incoming linguistic string (Kamide et al., 2003). Furthermore, they constitute a major comprehension strategy when syntax is missing, as in agrammatic aphasics (Grodzinsky, 2000).

In this section, we have seen that the interaction between syntax and semantics in sentence processing provides further evidence for their status as separate, potentially generative, linguistic processes or families of processes.
Very often they co-operate in providing an interpretation for an incoming string, but when they send conflicting cues to the parser, one or the other may take the upper hand, depending on the strength of the cues. Syntax and semantics may be independent of each other, but only weakly so.

5. Neurophysiology and electrophysiology of L2 processing

If, as we saw in the previous sections, neural processes for syntax and semantics are functionally specialized, it is then conceivable that they should adapt differently to maturation and experience. By now, it is well established in the literature that delays in language experience do not affect language processes uniformly. We turn to the neurophysiology of L2 acquisition (for recent reviews, see Abutalebi, Cappa and Perani, 2001, 2005; Müller, 2005; Paradis, 2004, Ch. 6). As Paradis cautions, the results of the existing studies are often contradictory and they should be compared with extreme caution. Still, some limited consensus emerges. Most studies find that early bilinguals activate the same cortical areas when processing their native and their second language (Kim et al., 1997; Chee et al., 1999, 2000; Urbanik et al., 2001). When late bilinguals are involved, proficiency level becomes the important variable: less proficient L2-ers are reported to have different patterns of activation (Kim et al., 1997; Perani et al., 1996; Dehaene et al., 1997; Urbanik et al., 2001) while more proficient bilinguals activate the same areas as in their L1 (Perani et al., 1998; Illes et al., 1999; Price et al., 1999; Pu et al., 2001; Tan et al., 2001, 2003). In a nutshell, high attained proficiency can compensate for a late start.

Three studies are of particular importance for us at this point, because they used a similar task, listening to stories in the L1 and the L2 while being scanned (Perani et al., 1996, 1998; Dehaene et al., 1997). Dehaene and colleagues scanned low proficiency French learners of English and found a lot of individual variation among the brain substrates engaged. Perani et al. (1998) controlled for the age and proficiency factors, studying two groups of highly proficient learners: one group started L2A later than the other (after age 10 as compared to after age 4). Among these proficient learners, age of acquisition (AoA) was not a significant factor predicting neurofunctional differences.
The authors concluded that there is evidence of considerable plasticity in the network that mediates language comprehension. Wartenburger et al. (2003) used a two by two by two design (AoA, proficiency and L1-L2 as factors) in investigating semantic versus syntactic violations. One group of Italian-German bilinguals acquired the L2 since birth and achieved high proficiency.7 These were labeled the early acquisition high proficiency group. A second group acquired the L2 late but were equally proficient: the late acquisition high proficiency group. There was a third, late acquisition low proficiency group. The subjects had to perform grammatical and semantic judgments in their native language and in their second language while they were being scanned. The effect of individual differences is thus minimized since we are looking at cerebral representation of language in the same individual. Stimuli included 180 short sentences, 90 in German and 90 in Italian, half of which were anomalous and


the rest well formed. Half of the anomalous sentences were syntactic violations of gender, number and case agreement as in (9a) and the other half were semantic violations as in (9b).

(9) a. Der Hund laufen über die Wiese.
       the dog-sg run-pl over the meadow
       ‘The dog run over the meadow’
    b. Das Reh erschießt den Jäger.
       the deer shoot-3sg the hunter
       ‘The deer shoots the hunter.’

The researchers found that the age of acquisition factor mainly influences the activation pattern elicited by the grammatical judgments. Subjects with an age of acquisition later than 6 (mean of 19) showed greater activation as compared to early learners of high proficiency during grammatical, but not during semantic processing. While age is a factor only for grammatical processing, proficiency level seems to be an important factor for both grammatical and semantic processing.

The electrophysiological studies on L2 learners also support this claim of higher neural plasticity for semantics as opposed to morphosyntax. Weber-Fox and Neville (1996) is the first study to point to differential sensitivity of syntactic processing to AoA effects. According to these researchers, syntactic processing is more dependent on AoA (affected by an AoA of 1-3 years of age) than semantic processing (affected by an AoA of more than 11 years). Hahne (2001) and Hahne and Friederici (2001) found that semantic integration is processed similarly by natives and L2 learners (although the N400 effect had a lower amplitude and was somewhat delayed), while phrase structure violations did not elicit the expected LAN and P600 in the learners. The same pattern emerged as in the neuroimaging studies. While processing semantics, L1 and L2 speakers do not differ qualitatively. While processing morphosyntax, however, both age and proficiency lead to significant differences. The development of automatic implicit processing seems possible, but is influenced by the complexity of the system to be acquired.
A clever little study, Friederici, Steinhauer and Pfeifer (2002), explicitly argues against the Critical Period Hypothesis using acquisition of an artificial language. They showed that if it is a small grammatical system (in the sense of small number of lexical items and rules), L2 learners exhibit exactly the same event-related brain potential components that are related to early automatic processing and to late controlled syntactic

processing in native speakers. Pointing in the same direction is another study, Weber-Fox and Neville (2001), who found age effects in processing closed-class words (inflectional morphology) only, but not open-class words (see more on that study in section 6). They argued that neural subsystems responsible for grammatical processing appear to be more sensitive to the age of L2 immersion.

Two tables on the next page summarize the main findings of the current literature. Table 1 presents a summary of the few ERP studies that look at L2 processing of the semantics, while Table 2 summarizes the studies dealing with syntactic processing. If a study has compared the two types of processing, it appears in both tables. In most of the studies, the task of the participants is to read grammatical and ungrammatical sentences for comprehension.

In summary, recent cognitive imaging evidence (Perani et al., 1998; Chee et al., 1999, 2001; Wartenburger et al., 2003) points to the conclusion that both AoA and proficiency are critical determinants of brain organization of language processing in adult L2 learners. However, AoA is a more important determinant for syntactic than for semantic processing. Using electrophysiological measures, researchers have established that semantic processing in L2 learners is qualitatively the same as in native speakers, although very often delayed. Of the syntactic ERP effects, the P600 is frequently present in learners, but ELAN and LAN effects are strikingly absent. These findings suggest that early automatic processing is difficult to acquire if the language is learned late, while more controlled syntactic parsing processes are comparatively easier to acquire.
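As a compact restatement of the component properties reviewed in this chapter, the following sketch encodes the four signature effects as a small lookup table. It is only an illustration under stated assumptions: the time windows are coarse values borrowed from the phases of Friederici's (2002) model and from the ranges mentioned above, real components overlap and vary across studies, and the class and function names are hypothetical.

```python
# Illustrative sketch only. Time windows are coarse assumptions based on the
# ranges mentioned in this chapter; names are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    polarity: str          # "negative" or "positive"
    window_ms: tuple       # approximate (start, end) after critical word onset
    linked_to: str         # type of processing difficulty reported
    in_late_learners: str  # summary of the L2 findings reviewed above


COMPONENTS = (
    Component("ELAN", "negative", (100, 300),
              "word category errors", "typically absent"),
    Component("LAN", "negative", (300, 500),
              "number, gender, case and tense mismatches", "typically absent"),
    Component("N400", "negative", (300, 500),
              "semantic integration difficulty",
              "present, often delayed or reduced"),
    Component("P600", "positive", (500, 1000),
              "syntactic violation, reanalysis and repair",
              "frequently present"),
)


def candidates(polarity, latency_ms):
    """Names of components whose polarity and coarse window match an effect."""
    return [
        c.name
        for c in COMPONENTS
        if c.polarity == polarity
        and c.window_ms[0] <= latency_ms <= c.window_ms[1]
    ]


print(candidates("negative", 400))  # ['LAN', 'N400']
print(candidates("positive", 600))  # ['P600']
```

The first query returns two candidates, which reflects a point made earlier in the chapter: polarity and latency alone do not separate the LAN from the N400, and topographic distribution is also needed to tell them apart.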


6. Processing of closed-class versus open-class lexical items

Another research question examined in the literature using ERPs to study language comprehension is whether closed-class and open-class lexical items are processed in the same way. Although open-class words (notional lexical items conveying referential meaning) and closed-class words and morphemes (the Functional Lexicon) have in common their being three-way interface rules (Jackendoff, 2002), they differ essentially in the types of meanings they can encode. Current generative linguistic theory argues that not only overt syntactic properties but also properties computable at the syntax-semantics interface depend crucially on features encoded in the Functional Lexicon.

In normal native speaker adults, the ERP response to open-class words is characterized by a negative component that peaks at 350 ms post word onset (N350) (Neville, Mills and Lawson, 1992). The distribution of this component is bilateral and largest over posterior areas. In contrast, closed-class words are characterized by a negative peak that occurs earlier, 280 ms post onset (N280), and is lateralized differently: over anterior temporal regions of the left hemisphere. These findings are of course compatible with the hypothesized different roles of the word classes in language processing: open-class words reflect lexical semantics and closed-class words reflect grammatical functions. Studies of deaf individuals and of hearing children provide further evidence for the distinctness of the two neural subsystems mediating open- and closed-class word processing (Neville, 1994; Neville et al., 1992; Neville, Coffey, Holcomb and Tallal, 1993). Whereas ERP indices of semantic processing were virtually identical in deaf and hearing subjects, those linked to grammatical processes were markedly different in deaf and hearing subjects.
Taken together, the results of the various studies by Neville and colleagues suggest that non-identical neural systems with different developmental vulnerabilities mediate these different aspects of language. More generally, these results provide further neurobiological support for the distinction between lexicosemantic and grammatical functions. Weber-Fox and Neville (2001) is a study that looks at closed-class and open-class item processing in bilinguals. Their participants include 53 Chinese-English bilingual individuals, divided in groups based on age of acquisition: they started acquiring English at 1-3, 4-6, 7-10, 11-13, and after 15 years of age. As in the other study involving similar age groups of Chinese-English bilinguals (Weber-Fox and Neville, 1996), age of acquisition was confounded with proficiency, as measured by various proficiency tests

the researchers administered. The latencies and distributions of the N350 effect that signals processing of open-class lexical items were the same across all learner groups as well as the native speakers. However, the N280 effect for closed-class words peaked 35 ms later than in native speakers in all groups who had started acquiring English after 7 years of age. In contrast to the same authors’ earlier findings (Weber-Fox and Neville, 1996), all the groups displayed a similar left anterior distribution of the N280, suggesting that a left anterior neural network is active in comprehending a second language, even if learned after the first years of life. These findings are one more confirmation that age of acquisition differentially affects different linguistic structures: while semantic processing is largely undisturbed by later onset of acquisition, inflectional morphology is sensitive to it.

We started this chapter by noting that a recent model (Clahsen and Felser, 2006) integrates processing and linguistic representation and compares child and adult language acquirers, arguing for a shallow linguistic representation in L2 learners based on parsing data. That is why I suggested that it is difficult these days to talk about (symbolic) linguistic representations without mentioning data from language processing. The relevant point to retain from this chapter is the division of labor between the lexicosemantic and semantic-pragmatic modules on one side, and the morphosyntactic module on the other. These two types of linguistic information are processed in different parts of the brain, presumably subserved by different neural networks, and they give rise to separate signature ERP effects. The same localization and generally the same effects are observed in second language speakers as well.
However, most of the data points to differential processing of syntax and semantics, as well as differential age of acquisition effects on those types of processing. Most ERP components are qualitatively similar in bilinguals as compared to native speakers, although their latencies are often delayed and less sharply distributed. In the next chapter, we turn to a critical examination of electrophysiological and imaging studies.


Notes

1. The theoretical proposal itself will be discussed in greater detail in chapter 5.

2. For example, I will not be interested in whether L2 learners obey native attachment preferences when the grammar allows them two attachment possibilities, in sentences like I spoke [to the servant [of the actress]] who was on the balcony. I believe that the processing of filler-gap dependencies and other non-ambiguous sentences is more revealing of the underlying linguistic competence (mediated by parsing laws) than attachment preferences.

3. The even more interesting research question, namely, whether pragmatic information is different from both syntactic and semantic information, is unfortunately not sharply articulated.

4. Furthermore, the areas used to comprehend syntax and semantics are not uniquely dedicated to these tasks. For example, parts of Broca’s area are activated in the processing of music (Maess et al., 2001), perception of the rhythm of motion (Schubotz and von Cramon, 2001) and imagery of motion (Binkofski et al., 2000).

5. This view is predominant among psycholinguists who conceive of language as a symbolic system, and is irrelevant to those of connectionist persuasions.

6. A question immediately arises in the mind of the syntactician, namely, where would agreement and case valuing and checking, as well as locality violations, be recognized? As far as I understand Friederici’s model, phase 2 would seem to be the appropriate phase for these basic syntactic computations. It is possible that long-distance dependencies (binding and wh-movement) spill over into phase 3 as well.

7. By most definitions, these participants would be classified as simultaneous bilinguals, not L2 learners.

Chapter 4 What are imaging and ERP studies of bilinguals really testing?

1. Linguistic tests

In chapter 2, I presented the architecture of the language faculty, elaborating predominantly on the interface between syntax and semantics, the focus of this book. In chapter 3, I discussed the interaction between the parser and the grammar, with the purpose of establishing the modularity of syntax and phrasal semantics. I compared the processing of syntax and semantics and provided evidence for their interaction in online comprehension, as suggested by neuro- and electrophysiological methods of investigation. But before using the ERP and neuroimaging data in support of syntax-semantics modularity, one should take a closer look at how valid the evidence is that these new methods of investigating linguistic performance bring to the table. In this chapter, I turn to critically examining the linguistic tasks speakers perform while their comprehension is being monitored and recorded.

There are several ways of looking at linguistic competence through linguistic performance. Behavioral data are produced by carefully controlled experiments using tasks such as the grammaticality judgment task, the logical entailment judgment task, the elicited production task, the act-out task and the truth value judgment task, among many others. These are the types of data used by researchers of first and second language acquisition but also by generative syntacticians and semanticists.¹ To draw on a metaphor, behavioral tasks look at the black box of linguistic competence through the medium of performance on those tasks. Other behavioral tasks are psycholinguistic, for example, the moving window technique (presenting words on a screen one at a time and measuring the reaction times until each word is accommodated into the structure and another one is called for). Other tasks involve priming, eye-tracking, or interference paradigms, again with the dependent variable being the reaction time.
Such tasks rely more directly on measuring comprehension time at important junctures in the sentence, say, at a trace position (a gap). The main difference between behavioral linguistic and psycholinguistic tasks is that behavioral data are static, whereas psycholinguists are interested in how listeners and speakers comprehend and produce utterances in real time, so their data are dynamic and time-sensitive. A third type of data would be the neuro- and electrophysiological data that I discussed in chapter 3. In a sense, these are the data that aspire to look inside the black box of competence. It is an admirable yet quite challenging goal in cognitive science to be able to articulate a hypothesis that draws on, and can accommodate, all three types of data.

In this chapter I will explore the possibility of combining insights from all three areas (linguistics, psycholinguistics and neuroscience) to support the Bottleneck Hypothesis. I will discuss some more general problems with integrating neuroscientific findings into linguistic models, both in terms of property and transition theories. The discussion will lead to the conclusion that increasing the linguistic sophistication of tests used in neurolinguistic research might bring about a closer integration of neurolinguistic findings into the models of linguistic development. Without a doubt, dynamic psycholinguistic data on how linguistic information is integrated on-line have implications for the architecture of language in the mind. However, in chapters 6 and 7 I will focus mostly on behavioral data to support the Bottleneck Hypothesis, for the simple reason that psycholinguistic and neurolinguistic studies do not really investigate the computation of complex sentence-level meaning in the mind/brain of the speaker. I will start with some more general questions and examine stimuli used in semantic and syntactic processing studies. I will propose some linguistic properties researchers should look at if they want to test the processing of phrasal semantics.

2. Can neuroscience inform theories of linguistic development, just yet?

A recent debate has emerged in the cognitive science literature over the usefulness of fMRI and ERPs as new tools for studying human cognition (two recent articles are Henson (2006) and Poldrack (2006), see references therein; see also Paradis (2004) and Vaid and Hull (2002) for arguments from the point of view of neurolinguistics). In particular, Poldrack (2006) cautions that many studies use the following (faulty) logic, called reverse inference:

“(i) In the present study, when task comparison A was presented, brain area Z was active.
(ii) In other studies, when cognitive process X was putatively engaged, then brain area Z was active.
(iii) Thus, the activity of area Z in the present study demonstrates engagement of cognitive process X by task comparison A.” (Poldrack 2006: 59)
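
One way to see why this logic is weak is to cast it in terms of Bayes' rule: how much the activation of area Z tells us about process X depends on how selectively Z responds to X. The following is a minimal sketch only; the function name and all probabilities are invented for illustration and are not taken from any study.

```python
def reverse_inference_posterior(p_act_given_proc, p_act_given_not_proc, prior_proc):
    """Return P(process X engaged | area Z active) by Bayes' rule.

    p_act_given_proc     -- assumed P(Z active | X engaged)
    p_act_given_not_proc -- assumed P(Z active | X not engaged)
    prior_proc           -- assumed prior probability that the task engages X
    """
    # Total probability of observing activation in area Z.
    evidence = (p_act_given_proc * prior_proc
                + p_act_given_not_proc * (1.0 - prior_proc))
    return p_act_given_proc * prior_proc / evidence


# Hypothetical case 1: area Z rarely responds when X is absent (selective).
selective = reverse_inference_posterior(0.8, 0.1, 0.5)

# Hypothetical case 2: area Z responds under most tasks (unselective).
unselective = reverse_inference_posterior(0.8, 0.7, 0.5)

print(round(selective, 3), round(unselective, 3))
```

With these assumed numbers, a selective area supports the inference reasonably well (a posterior of about 0.89), whereas an area that is active under most tasks leaves the posterior barely above the prior (about 0.53).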

The author points to the fact that cognitive processes (like processing semantics, or processing syntax) cannot be studied directly but only through the reactions/responses to cognitive experimental tasks, as illustrated in Figure 1. Engaging in experimental tasks, speakers perform a number of cognitive processes to finally invoke the mental representation of sentences (or text), in the form of structured phrase markers (syntactic trees). The cognitive processes are purportedly accessed through behavioral psycholinguistic tasks, while mental representations are accessed by judgments. As a fourth step, the cognitive processes lead to variations in blood flow or brain electrical activity.

[Figure 1 diagram; box labels: Neuroscience experimental tasks; Cognitive processes used to invoke mental representations; Behavioral psycholinguistic tasks; Mental representations; Judgments; Variations in blood flow (fMRI) or brain electrical activity (ERP)]

Figure 1. Relationship between experimental manipulations, cognitive processes, and observed variables in linguistics

Although functional neuroimaging techniques are an exciting new tool of cognitive science, haemodynamic and electrophysiological changes in the brain are not direct windows into cognitive processes. Studies must first establish the mental representations that speakers reach when prompted with linguistic stimuli, and then look at the cognitive processes used to invoke these mental representations. In order to establish the correct and detailed ontology of the cognitive processes in linguistics (what task engages what cognitive process and by what mechanism), results from behavioral studies should necessarily precede and be complemented by neurolinguistic studies. As Poldrack (2006: 62) points out, the cognitive ontologies of the existing functional neuroimaging databases list the following cognitive processes under Language: orthography, phonology, semantics, speech, syntax. These distinctions are far too coarse, in the sense that each of them is not “a cognitive process” but a multitude of cognitive processes that function as a system. As things stand right now, the linguistic tasks used to study, say, comprehension (e.g., listening to stories) probably engage too many disparate cognitive processes to be taken as an adequate reflection of semantic processing effects in the brain.

This cautionary attitude on the part of cognitive psychologists is echoed even more strongly on the linguistic side. Poeppel and Embick (2005) offer a pessimistic prognosis for the future of the combined study of language and brain. They point out two major obstacles for the cross-fertilization of these disciplines. The first is the Ontological Incommensurability Problem (similar to Poldrack’s cognitive ontology issue), namely, the units of linguistic computation and the units of neurological computation are incommensurable at present. The problem, as they see it, is not that these two ontologies are very different at this moment in time, but that they are being refined by researchers independently of each other, with no consideration of the results of the other side. The second obstacle is the Granularity Mismatch Problem, namely, that linguistic and neuroscientific studies operate with objects of different granularity. “In particular, linguistic computation involves a number of fine-grained distinctions and explicit computational operations.
Neuroscientific approaches to language operate in terms of broader conceptual distinctions” (p. 105). If the ontologies and computational processes are spelled out clearly, the authors suggest, then explicit interdisciplinary linking hypotheses can be formulated. The suspicion that neuroscience is not ready to inform current linguistic theory and the ensuing side-lining of neuroscientific findings is quite common among linguists in the first decade of the 21st century.

For my specific purposes in this book, I am interested in mental representations in the mind/brain of adult second language learners and how they come to be acquired. In particular, I have been focusing on the question of whether there is a critical period for the acquisition of different syntactic and semantic computations, and whether the functional morphology presents a severe bottleneck for such acquisition. The main message from the studies reviewed in chapter 3 is that morphosyntax and lexical (but not phrasal) semantics are processed differently; they are subsumed by different neural circuits, and they give rise to ERP effects of different latitude and timing. But that is the limit to which neurophysiological data can be useful for our discussion. As noted above, “syntax” and “semantics” are not computational tasks but broad general domains, each of which consists of numerous computations and representations. If we want to probe deeper into detailed representations and how they are acquired, we will have to rely on behavioral data. Where linguistics can inform neuroscience, and consequently gain some insights for itself, is in supplying linguistically sophisticated research questions and test materials. In the next sections, I will look closely at test items and research questions in the study of semantic and syntactic processing.

3. Studies of semantic processing

Paradis (2004) offers a wide-ranging and frequently scathing criticism of neuroimaging studies of bilingualism. We will only echo his concern regarding the stimuli used in imaging studies. As Paradis points out, the majority of studies to date have used single words as stimuli. Linguistic tasks include single word recognition, word repetition, word reading, cued word generation, synonym generation, judging whether a word refers to an animal, and deciding which of two verbs a noun is associated with (Paradis, 2004: 174). Some other tasks are obviously metalinguistic, for example, deciding whether two written words rhyme or not, deciding whether a word is concrete or abstract, or deciding which of two languages a word belongs to. Even if we disregard the metalinguistic tasks, the cognitive processes involved in single-word access to the mental lexicon are hardly indicative of natural language processing.
In the best case scenario, lexical access may be considered as an integral part of language comprehension, but is far from the whole story. Studies utilizing such tasks can only claim that they are comparing the storage of lexical items in isolation in natives and bilinguals, but they cannot claim to generalize those results even to “lexical semantics,” let alone “sentential semantics.” This is because accessing a lexical item in isolation and out of the blue is not the same process as accessing it in the course of language comprehension, where context and syntactic structure contribute to certain expectations of which word should come

next. Words in isolation simply lack the discourse and sentential contextual information that is utilized when processing them in text or speech. We have a problem of another nature with the stimuli used in electrophysiological studies: only one “semantic” effect and too many cognitive processes that it may stand for. Let us take a closer look at what is considered a semantic violation in the ERP literature, at least in the studies investigating bilinguals. All of the studies that I reviewed briefly in chapter 3 bracket together, under the label “semantic violation,” sentences which violate semantic constraints on matching semantic features as well as sentences in which one critical word cannot be easily integrated in the context, or is unexpected. As an example of a selectional features mismatch, (possibly leading to a presupposition violation), take the classical He spread his warm bread with socks. The semantic features of bread (edible stuff) must match those of its modifier, which in the case of socks is violated. On the other hand, Hagoort and Brown (1994) show that subtle differences in the cloze probability of a word in a sentence, such as between mouth and pocket in the sentence Jenny put the sweet in her mouth/pocket after the lesson, can also modulate the N400 amplitude, although no presupposition is violated. Thus, the N400 effect is sensitive not only to semantic violations, but also to integration of words into a preceding discourse context and to cloze probability expectations. This situation is an apt illustration of the frequent (and disastrous) disregard for linguistic factors and distinctions in interpreting neuroimaging and electrophysiological findings. One experimental task may produce the same ERP effect with different stimuli, but until we know exactly what cognitive processes are triggered by this task and the stimuli, the ERP findings will be of limited utility. This observation is reinforced by a recent study of N400 effects. 
Hagoort et al. (2004) recorded magnetic fields while subjects read three types of sentences as in (1): a true sentence, a world knowledge violation and a “semantic violation”.

(1) a. The Dutch trains are yellow and very crowded.    TRUE
    b. The Dutch trains are white and very crowded.     WORLD KNOWLEDGE VIOLATION
    c. The Dutch trains are sour and very crowded.      SEMANTIC VIOLATION

The authors do not find any difference in the timeline or in the way Dutch natives process the world knowledge violation (1b) and the semantic violation (1c). Both sentences evoke an N400 as compared to the true sentence. The authors conclude that the same brain area (left inferior prefrontal cortex) is involved in the integration of both lexical meaning and world knowledge and that “it does not take longer to discover that a sentence is untrue than to detect that it is semantically anomalous” (p. 440). But heeding the caution that the same first author advised ten years earlier, we cannot know whether those results are a reflection of (i) the processing of a semantic violation, (ii) the processing of a world knowledge violation, or simply (iii) an unexpected-word effect. We encounter a variant of the problem identified by Poldrack (2006) as reverse inference: the fact that the same ERP effect is evoked does not necessarily mean that the same cognitive process is involved. Combining such ERP findings with behavioral data from semantic and pragmatic judgments and semantics versus pragmatics processing, both in native and second language speakers, would help to tease the cognitive processes apart and ultimately increase the reliability of the ERP findings.

4. What exactly should we test when we are testing semantics?

What would qualify as a linguistic stimulus that will allow us to study authentic semantic processing, apart from the selection violations mentioned above? For one, semantically ambiguous sentences like the classical quantifier ambiguities as in (2) induce semantic calculations over and above the strictly compositional sentential semantics, since they may involve Quantifier Raising (May, 1985; see chapter 2, section 2.3, for more details).²

(2) Everyone loves someone.
    a. There is one person x that is loved by everyone.
    b. Everyone has tender feelings for someone or other, making pairs of “lover” and “loved”.
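
The truth-conditional difference between the two readings in (2) can be checked against small models. The sketch below is illustrative only: the function names, the individuals, and the two "loves" relations are all invented, and each reading is simply evaluated as nested universal and existential quantification over a finite domain.

```python
def everyone_loves_someone_or_other(people, loves):
    """Reading (2b): for every x there is some y such that x loves y."""
    return all(any((x, y) in loves for y in people) for x in people)


def one_person_loved_by_everyone(people, loves):
    """Reading (2a): there is some y such that every x loves y."""
    return any(all((x, y) in loves for x in people) for y in people)


people = {"Ann", "Ben", "Cleo"}

# Hypothetical model 1: each person loves a different person.
pairs = {("Ann", "Ben"), ("Ben", "Cleo"), ("Cleo", "Ann")}

# Hypothetical model 2: everybody (Cleo included) loves Cleo.
shared = {("Ann", "Cleo"), ("Ben", "Cleo"), ("Cleo", "Cleo")}

for name, model in (("pairs", pairs), ("shared", shared)):
    print(name,
          everyone_loves_someone_or_other(people, model),
          one_person_loved_by_everyone(people, model))
```

In the first model only reading (2b) is true, while in the second both readings are true, since (2a) entails (2b). A truth value judgment task can therefore only separate the readings with scenarios of the first kind.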

Such sentences, involving two quantifiers, or a wh-word and a quantifier, have been studied using behavioral measures by Dekydtspotter and Outcalt (2005) and Marsden (2003, 2004), but they have not yet been investigated in neurophysiological and electrophysiological studies. We postpone discussion of the behavioral results to chapter 7.

Another option (already discussed in chapter 2, section 5) would be to test sentences that exhibit semantic coercion as in (3) (Jackendoff, 1996) and to compare them with sentences where the verb does not need coercion as in (4).³ As we saw in the earlier discussion, the adverbial until dawn has to combine with an atelic predicate, providing a telos (endpoint) for it. In (3), however, the verb flash, an achievement, is inherently telic since it encodes an instantaneous event. In its meaning, there are no preparatory stages leading up to the change of state. The adverbial, then, coerces the verb meaning into a series of repetitive events, until dawn. The atelic verb sing in (4), on the other hand, does not employ the semantic operation of aspectual coercion but only simple composition.

(3) The light flashed until dawn.    TELIC, COERCED INTO ATELIC
(4) The singer sang until dawn.      TRULY ATELIC

Thus, by comparing behavioral and ERP reactions to sentences as in (3) and (4), one could isolate a purely semantic process (not a syntactic process) over and above semantic composition. Semantic structure is violated also when we combine an inherently telic predicate, e.g., the achievement recognize, with a for X time adverbial, which is felicitous with atelic predicates. We expect a similar sort of effect as in (3) above.⁴ The predicate recognize, however, resists aspectual coercion, since the lexical meaning of the verb tells us we cannot recognize the same individual many times over. Therefore a semantic incompatibility arises, marked with #.⁵

(5) #John recognized the man in the picture for an hour.
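
The three-way pattern in (3)-(5) can be summarized as a small decision procedure. This is a deliberately crude sketch: the class, the two lexical flags, and the lexicon entries are hypothetical, and the contribution of the adverbial itself (see note 5) is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verb:
    lemma: str
    telic: bool       # assumed flag: the predicate encodes an inherent endpoint
    repeatable: bool  # assumed flag: the event can recur with the same participants


def combine_with_durative(verb: Verb) -> str:
    """Combine a predicate with a durative adverbial (until dawn, for an hour)."""
    if not verb.telic:
        return "simple composition"      # as in (4)
    if verb.repeatable:
        return "coerced into iteration"  # as in (3)
    return "anomalous (#)"               # as in (5)


# Hypothetical toy lexicon covering the three verbs discussed in the text.
LEXICON = {
    "sing": Verb("sing", telic=False, repeatable=True),
    "flash": Verb("flash", telic=True, repeatable=True),
    "recognize": Verb("recognize", telic=True, repeatable=False),
}

for lemma, verb in LEXICON.items():
    print(lemma, "->", combine_with_durative(verb))
```

The point of the sketch is only that the outcome is fixed by lexical semantic properties of the predicate and not by syntactic structure, which is identical across the three sentences.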

The contrast between aspectually coercible versus aspectually uncoercible predicates presents another testable pair, since the cognitive process of aspectual coercion (a semantic process of re-interpretation that could be compared to re-analysis in the syntax) leads to a legitimate semantic structure in (3) but a violated semantic structure in the case of (5), depending on the lexical properties of the predicates (and probably the adverbials, see note 5) involved. This is another example of a purely semantic process in sentence comprehension that is not confounded with cloze probability. However, it is probably more profitable to test contrasts like those between (3) and (4) and between (3) and (5) by behavioral semantic judgments and behavioral psycholinguistic measures, since all we can expect from an ERP study will be an N400 effect, which is therefore insufficiently informative.

Piñango, Zurif, and Jackendoff (1999), defining aspectual coercion as a combinatorial semantic operation requiring computation over and above that provided by combining lexical items through expected syntactic processes, conducted an experiment to investigate whether or not parsing of a string such as (3) requiring coercion (in addition to syntactic composition) is more computationally costly than parsing its syntactically transparent counterpart as in (4). Their prediction of higher computational cost for coercion was borne out by the results of adult native speakers. The bilingualism inquiry is still awaiting experimental studies using this type of linguistic stimuli.

5. Stimuli in studies of syntax

Let us turn now to an examination of the “syntactic” violations most commonly tested in the neurofunctional studies on bilinguals. I will repeat the most common type of stimulus sentence, using three from Wartenburger et al. (2003).⁶ Italian-German bilinguals’ brains were scanned while they were visually processing stimuli like the ones below.

(6) *Der Hund laufen über die Wiese
    the dog-sg run-pl over the meadow
    ‘The dog runs over the meadow.’

(7) *Das Kalender hängt an der Wand
    the-neuter calendar-masc hangs on the wall
    ‘The calendar is hanging on the wall.’

(8) *I gatti ama cacciare i topi
    the cat-pl like-3p.sg hunt-inf the mice
    ‘Cats like chasing mice.’

All the syntactic violations in (6)-(8) involve incorrect agreement in number, gender, or case (p. 168). Recall that the language faculty architecture we reviewed in chapter 2 posits different status for lexical entries in the functional lexicon and for various other syntactic operations like movement of constituents. In fact, another way of viewing this is as the difference between inflectional morphology and syntax proper, a distinction well known to linguists. Now, lexical entries, be they referential or functional, are stored in declarative memory (a.k.a. explicit memory, see Paradis 1994, 2004, Ullman 2001, 2004) while other syntactic processes are subsumed by procedural memory (an implicit process). It is very likely,

then, that these two qualitatively different routes (the processing of functional morphology and of syntax) are differentially affected by AoA in second language acquisition. In a way, that will correspond to a distinction between the phonological features and the syntactic features encoded in the inflectional morphemes. In fact, a survey of syntactic processing in native speakers, Kaan and Swaab (2002), explicitly argues for syntactic processing recruiting multiple brain areas. “ … we propose that the different parts of the network are recruited for different aspects of syntactic processing. The middle and superior temporal lobes might be involved in lexical processing and activating the syntactic, semantic and phonological information associated with the incoming words; the anterior temporal lobe might be involved in combining the activated information or encoding the information for later use; and Broca’s area might be involved in storing non-integrated material when processing load increases.”(p. 355)

The bilingual imaging literature, however, does not make these finer-grained distinctions between lexicon and syntax just yet. Until bilinguals are scanned while processing functional morphology violations (as in Wartenburger et al., 2003) as well as syntactic violations like phrase structure or word order violations, the claims of syntax and semantics being differentially affected by AoA should be suspended.

There is a sharper distinction made between syntactic processes in the ERP literature. Recall that three effects have been proposed, although opinions differ on whether all three are equally robust and whether they appear together in the processing of one sentence by one and the same speaker (see Osterhout et al., 2004, chapter 14, pp. 293-9, for a critical discussion). An ELAN (early left anterior negativity) is purportedly sensitive to local, first-pass phrase structure violations, based on word categories. A LAN (left anterior negativity) is sensitive to functional morphology violations as well as the hierarchical syntactic structures that involve not only Merge but Move as well. Finally, the P600 effect is supposedly indicative of syntactic integration, when the parser encounters a syntactic element that it cannot accommodate into the existing structure (garden paths), or when processing syntactically complex, scrambled sentences (e.g., object-first sentences in German or English) (Grodzinsky and Friederici, 2006).

A very recent review of ERP studies on bilinguals, Mueller (2005), comes to an interesting conclusion. After summarizing the findings of several studies, the author notes that the two syntactic effects (LAN and P600) are differently affected by AoA. The LAN (or even ELAN) is affected by

AoA (Hahne, 2001; Hahne and Friederici, 2001; Weber-Fox and Neville, 1996). The P600, on the other hand, was strikingly similar to that of native speakers in the most proficient learners and generally dependent on proficiency (Weber-Fox and Neville, 1996; Hahne, 2001; Hahne et al., 2006; Sabourin 2003). However, the author does not relate the two ERP effects to functional morphology and the abstract syntactic processes triggered by that morphology, probably because the real picture is much more confusing than the neat, sensible distinction made in the Mueller article. If one looks even perfunctorily at Table 2 in chapter 3, where ERP studies comparing native speakers with bilinguals are summarized, one will notice that two studies present stimuli with inflectional morphology violations (Hahne, Mueller and Clahsen, 2006; Sabourin, 2003) while the rest present phrase structure and word category violations. All of the studies, however, obtain LANs and P600s (no ELANs). No study looks at very complex syntax, for example, garden paths that involve re-interpretation, or scrambled, noncanonical word order sentences. One conclusion might be that the three ERP effects are fairly well differentiated in response to different types of syntactic violations in native speakers, but not so well differentiated in second language users. In other words, a functional morphology violation not only presents a problem when building syntactic structure, but remains a problem when reanalyzing the syntactic structure. There are two inconsistencies about this conclusion. One is that the native speakers in the reported studies also showed ERP effects similar to the second language learners. The other is that we cannot be certain if the LAN and the P600 appeared sequentially, caused by the same violation in two differentiated cognitive processes. As mentioned above, Inoue and Osterhout (in preparation), (as reported in Osterhout et al, 2004, p. 
296-7), studied case ending violations in Japanese as in (10) as opposed to correct sentences as in (9).

(9) Taro-NOM Hanako-DAT textbook-ACC to-buy said
    ‘Taro told Hanako to buy a textbook.’
(10) *Taro-NOM Hanako-ACC textbook-ACC to-buy said
     ‘Taro told ??? to buy a textbook /Hanako.’

When averaged over all subjects who took the test, the response to the syntactically anomalous second NP in (10) seemed to be biphasic, comprising both a LAN (300-500 ms) and a stronger P600. However, looking at individuals, the results changed dramatically. Those subjects who showed a large LAN tended to demonstrate a much weaker P600, and vice versa,

those who showed a strong P600 showed no anterior negativity. In addition, the anteriority of the effect disappeared when averages were calculated for the two groups of subjects separately; thus, what seems like a LAN may be an N400 elicited by the case ending anomaly for one group of these subjects. Such methodological concerns call into question the suggestion that a LAN effect directly manifests a fast, automatic syntactic analyzer based on functional morphology (Friederici, 2002). Consequently, another conclusion comes to the fore, namely, that it is not certain what exact cognitive processes an ERP is triggered by in a specific individual language user and at a specific test trial. That, of course, does not mean that we should disregard evidence for linguistic processes coming from ERP studies, but rather that we should look for convergence of evidence from ERP, imaging, lesion, and behavioral studies. More informed attention to testing materials and individual performance is needed, of course, before any solid conclusions are drawn at this time. To summarize the main message of this chapter, ERP and BOLD effects may be clearly related to test item manipulations, but the underlying cognitive processes that cause them are not yet known with certainty. The effects may be due to direct manifestations of syntactic and semantic processing, but they may also be due to other cognitive processes that they happen to be correlated with. The situation is particularly striking with the N400 effect, where a number of linguistic effects (e.g., unexpected word, semantic expectation violation, pragmatic knowledge violation) can produce an identical pattern of activity across the scalp. I used several linguistic constructions (quantifier scope interactions, aspectual coercion, complement coercion) to illustrate what may be considered genuine semantic processing. 
In the processing of syntax, the three ERP effects that are currently linked in the literature to separate parts of the linguistic message processing (canonical argument structure building = ELAN; fast, automatic syntactic analysis = LAN, syntactic integration and reanalysis = P600) should be treated with caution due to still unanswered and some unanswerable methodological questions. This situation is true not only of native language syntactic and semantic processing but even more so of second language processing where the variety in test materials is even narrower. The conclusion seems to be that behavioral data in second language acquisition is still the primary source of evidence to which we turn when we want to support or refute a theoretical proposal about mental representations of language and how they are acquired.
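
The averaging concern raised earlier in this chapter in connection with the Japanese case-marking data can be illustrated with a toy calculation. All amplitudes below are invented and the threshold is arbitrary; the sketch only shows how a grand average over two kinds of responders can look biphasic although no individual shows both effects.

```python
from statistics import mean

# Hypothetical per-subject mean amplitudes (microvolts) in two time windows:
# "lan"  = an anterior window around 300-500 ms, "p600" = a later window.
negativity_group = [{"lan": -2.0, "p600": 0.3} for _ in range(6)]
positivity_group = [{"lan": 0.0, "p600": 3.0} for _ in range(6)]
subjects = negativity_group + positivity_group

THRESHOLD = 0.5  # arbitrary cut-off for calling an effect "present"


def window_mean(group, window):
    """Average amplitude of one time window across a group of subjects."""
    return mean(subject[window] for subject in group)


def is_biphasic(lan, p600):
    """True if both a negativity and a later positivity exceed the cut-off."""
    return lan < -THRESHOLD and p600 > THRESHOLD


grand_average = {w: window_mean(subjects, w) for w in ("lan", "p600")}

grand_is_biphasic = is_biphasic(grand_average["lan"], grand_average["p600"])
any_subject_biphasic = any(is_biphasic(s["lan"], s["p600"]) for s in subjects)

print(grand_average, grand_is_biphasic, any_subject_biphasic)
```

Under these assumed values the grand average shows both a negativity and a positivity, yet no single subject does, which is why individual-level inspection matters before an averaged LAN plus P600 pattern is read as two sequential processes.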

Notes

1. As Marantz (2005) points out, generative linguistic theory can only benefit if all theoretical researchers derive and present the syntactic and semantic judgments in their work as behavioral data collected experimentally from a respectable number of native speakers, under conditions of controlled experimentation. Published data in linguistics articles have to mean that “[t]he linguist has made an implicit promise that (i) there is a relevant population of speakers for which the reported judgments hold, (ii) the example sentences provided are representative of a class of sentences as described by the linguist, and (iii) with speakers randomly sampled from the relevant populations and sentences randomly sampled from the relevant class, an experimenter would find more or less the same judgments that the linguist reports” (p. 435). Unfortunately, this implicit promise is not always honored. As Ferreira (2005) points out, today many psycholinguists are disenchanted with generative grammar, one of the reasons being that “generative theories appear to rest on a weak empirical foundation, due to the reliance on informally gathered grammaticality judgments” (p. 365).
2. Quantifiers have to be interpreted relative to a proposition in any case, so there has to be some A-bar movement even in sentences with only one quantifier.
3. Some authors would claim that this process involves aspectual coercion forced by context, in a way, a contribution of the discourse representation that starts after the compositional interpretation has been given (de Swart, 1998). In principle, this makes aspectual coercion a pragmatic process. The border between semantics and pragmatics is fuzzy, and very much an empirical question still (see Kadmon, 2001 for interesting discussion).
4. Compare with the effect that for X time adverbials have on accomplishment predicates like read and write.
   (i) John read the paper for an hour (may not have finished reading it yet)
   (ii) John read the paper in an hour (*may not have finished reading it yet)
   The adverbial brings out the endpoint as in (ii), or indicates that the endpoint is not actually reached as in (i). In the case of achievement verbs that are instantaneous events, the endpoint (change of state) is always reached, so the reinterpretation that the adverbial forces has to do with repetition.
5. The facts are, of course, not as simple as they would seem at first glance. For example, the lexical meaning of the adverbial also contributes to the semantic incompatibility in example (5), because the adverbial for three years in the following example, supplied by Alice Davison, seems to allow coercion into a habitual activity. I am grateful to Alice for this discussion.
   (i) The old person recognized his relatives for three years, then was unable to.


6. Weber-Fox & Neville (1996) is a notable exception to this trend of testing purely morphosyntactic violations such as faulty agreement. The researchers tested truly syntactic violations: phrase structure violations (e.g., *The scientist criticized Max’s of proof the theorem), specificity constraint violations (e.g., *What did the scientist criticize Max’s proof of?), and subjacency violations (e.g., *What was a proof of criticized by the scientist?). However, they completely disregard properties of the native language and therefore do not discuss how Chinese, the L1 of their learners, works with respect to these violations. The reader is left wondering whether there was anything beyond L1 transfer that their subjects had to acquire.

Chapter 5 The Bottleneck Hypothesis

We are now ready to expand on the Bottleneck Hypothesis, my proposal about how second language learners acquire grammatical competence. The proposal combines ideas relatively well established in Minimalist syntactic theory, e.g., language variation is reflected in the lexicon, and more specifically, in the various features of the inflectional morphology, with ideas that are still debated in the field of generative L2 acquisition, e.g., L2 syntactic competence is under-reflected by learners’ production of functional morphology. It also integrates recent results from an explosion of studies investigating L2 knowledge of semantics. The logic of the Bottleneck Hypothesis relies on the following three arguments: 1) recent neurolinguistic, psycholinguistic, and behavioral findings suggest that functional morphology is processed differently from syntax proper and semantics; learners can expect to encounter enhanced difficulty in learning morphology, as compared to learning syntax and semantics (to be elaborated on in sections 3.3 and 3.4); 2) learners are quite accurate on syntactic reflexes of functional categories, while at the same time they fail to consistently produce the overt morphology related to the same functional categories (to be elaborated on in section 3.1, following White, 2003); 3) learners are quite accurate in the acquisition of semantic properties (extensive evidence provided in chapters 6 and 7). The Bottleneck model, then, is a theory about L2 development but it also relates to ultimate attainment. In light of these arguments, the strong version of the Critical Period Hypothesis (see chapter 1) seems untenable. There is not just one but multiple critical periods for the different linguistic modules, and there may be no critical period for the acquisition of meaning (Slabakova, 2006b). 
Even if performance hurdles are responsible for flawed morphosyntactic production, learners can still achieve native-like comprehension of linguistic meaning (both semantic and linguistic-pragmatic).

In this chapter, I will first dwell on some assumptions of what a theory of second language acquisition must explain and the two parts it requires: a property theory and a transition theory (section 1). I will then turn to a discussion of some recently proposed transition theories (Carroll’s Autonomous Induction Theory, Truscott and Sharwood Smith’s Acquisition by Processing Theory, Clahsen and Felser’s Shallow Structure Hypothesis, and Herschensohn’s Constructionist approach). I will argue that these recent proposals are either untestable and thus unsuitable as working hypotheses (the Autonomous Induction Theory, the Acquisition by Processing Theory), or they are testable but (some) acquisition findings disagree with their premises (the Shallow Structure Hypothesis, Constructionism). In section 3, I will use ready-made arguments borrowed from White (2003) for the syntax-before-morphology view, and I will argue that knowledge of phrasal semantics also depends on the successful acquisition of the Functional Lexicon, in the same way that knowledge of syntax does. I will then summarize recent neurolinguistic and, especially relevant, ERP findings, suggesting that L2 morphosyntax is processed differently from syntax and semantics, and shows different Critical Period effects. Next, I will borrow from Lardiere’s (2005, 2007, 2008) arguments to show exactly why the functional morphology is the tight spot in the acquisition process. Finally, I will bring into the mix a new transition theory proposal, the Variational Learning framework of Yang (2002, 2004, 2006). It is a theory about child language acquisition and language change, but it is easily extendable to the second language acquisition process and in fact is a concretization and improvement upon the tired and inaccurate metaphor of “adult learners having access to UG”. The advantage of this theory is that it also puts the morphology at the center of the learning task and makes testable predictions. The conclusion will be that Variational Learning is an exciting new framework that is fully compatible with the Bottleneck Hypothesis and should be given a thorough testing.

1. What must a theory of second language acquisition explain?
It is widely accepted in the field of generative L2 acquisition that at least two theories are required to explain the process of acquiring a second language grammar. Following Gregg (1993) (who cites Cummins, 1983 as the original), I will accept these to be a property theory and a transition theory. The property theory’s job is to answer the question: How is language knowledge represented in the mind/brain of the (adult native) speaker? Gregg argues that the best theory to explain L2 acquisition is UG. Note that UG in its current incarnation, Minimalism, can account not only for the final-state grammar of native speakers but also for the different stages in the development of interlanguage grammars. In fact, the idea that (some subset of) UG-provided linguistic principles and parameters can be used to describe all stages in language development seems to enjoy relative consensus in the field. Interlanguages have been shown to be internally coherent and consistent grammatical systems. Researchers disagree on the extent to which native linguistic competence matches that of a second language speaker; however, many authors accept UG as a theory explaining linguistic competence.1

The transition theory must answer the question: What causal mechanisms bring about the acquisition of the sort of competence explained by the property theory? What triggers the change of mental state in the mind/brain of the speaker? For example, at what point does a learner of Japanese decide that the language she is acquiring has an SOV word order, as opposed to her native English SVO? What kind of input (and in what quantity) is necessary for resetting of a parameter value? Furthermore, as Gregg points out, SLA transition theory has a twofold explanatory burden: it needs to explain the positive changes where parameter values are being correctly reset, but it also needs to explain why in some cases such resetting does not happen, or is impossible.

Gregg (2003: 844) makes a further distinction within the transition theory. Building on terms borrowed from Sterelny and Griffiths (1999), he argues that a general SLA transition theory should aim for a robust process explanation and not for an actual sequence explanation. A robust process explanation would zoom out of individual learners and individual parameters to highlight the ideal process of parameter resetting and the types of triggers that lead to it. An actual sequence explanation zooms in to provide step-by-step causal accounts of changes of state.
An insightful comparison with historical explanation helps to understand the difference: “Thus an explanation of the World War I that appeals to the political divisions of Europe is a robust process explanation, seeking to show that some WWI-like event was very probable. The detailed unraveling of diplomatic and military maneuverings is an actual sequence explanation, showing how we got our actual WWI.” (Sterelny and Griffiths, 1999: 84)

A viable robust process explanation is a necessary prerequisite for an actual sequence explanation, and it is not clear whether we have any viable process explanation of L2 acquisition just yet. I shall review several recent theoretical proposals for such a theory in the next section.

Since I have assumed UG as the property theory, that is, I am looking to explain how the learner gets from one UG-constrained mental representation to another UG-constrained mental representation, at least two other elements are crucial in this process: comprehensible input and a learning mechanism of some sort. By “comprehensible input” I mean language (speech or written language) addressed or available to the learner in a situation whose context matches the intended linguistic interpretation. For example, and simplified dramatically, the utterance The rabbit chases the lizard counts as comprehensible input if it is heard in a situation of a rabbit running after a lizard, and not vice versa. Matching an utterance with a specific syntax and interpretation to an actual truth-value is crucial in the acquisition of meaning.

The learning mechanisms which create mental representations from that input, their very existence, and their exact form and function are the really interesting issues in generative L2 acquisition transition theory today, and I imagine they will come to the fore of L2 acquisition researchers’ attention, ultimately rising to the top of the list of research priorities. As a preliminary condition on L2 learning mechanisms, let us note that they should be spelled out in terms of processing the comprehensible input, and therefore, evidence for L2 learning mechanisms should ideally come from psycholinguistic empirical studies. In a very general (and not yet psycholinguistically operationalized) form, it is commonly assumed that a change of mental state happens when the current interlanguage grammar is faced with input it cannot process (White, 1987). Models (Full Transfer Full Access, Minimal Trees) and researchers (White; Epstein, Flynn and Martohardjono, among others) arguing for access to UG in adulthood would have it that when processing stalls, learners can draw on UG resources to fill in the gap in competence. Thus, this learning mechanism roughly takes the form of hypothesis-testing, with the viable hypotheses innately provided by UG.
As the reader can see, this is a coarse and highly idealized mechanism, but it should suffice as a starting point. In the next section, I will review some recent proposals for more or less comprehensive transition theories that will set the stage for my own proposal. In the discussion, I will try to highlight to what extent we can consider these proposals comprehensive, and what kind of evidence they command.
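The coarse failure-driven mechanism described above (the grammar changes only when the current interlanguage grammar cannot process the input, and the alternatives come from a UG-provided hypothesis space) can be made concrete in a toy sketch. Everything in it is an illustrative assumption rather than part of any model cited here: the hypothesis space is reduced to two word orders, `can_parse` is a hypothetical stand-in for a real parser, and a single unparsable utterance is allowed to trigger resetting, which sidesteps the open question of how much input is needed.

```python
# Hypothetical, drastically reduced hypothesis space standing in for
# "the viable hypotheses innately provided by UG".
UG_WORD_ORDERS = ("SVO", "SOV")


def can_parse(word_order, utterance):
    """Toy parser: an utterance is a tuple of role labels, e.g. ("S", "O", "V")."""
    return "".join(utterance) == word_order


def learn(initial_grammar, inputs):
    """Return the sequence of grammars the learner passes through.

    The grammar is revised only when processing of an input fails
    (failure-driven learning) and only to a UG-provided alternative
    that does parse the offending input (hypothesis-testing).
    """
    grammar = initial_grammar
    history = [grammar]
    for utterance in inputs:
        if can_parse(grammar, utterance):
            continue  # no processing failure, so no change of mental state
        for candidate in UG_WORD_ORDERS:
            if can_parse(candidate, utterance):
                grammar = candidate  # parameter value is reset
                history.append(grammar)
                break
        # If no candidate parses the input, the grammar stays as it was.
    return history


# An English (SVO) learner exposed to SOV input, as in the Japanese example.
print(learn("SVO", [("S", "O", "V"), ("S", "O", "V")]))  # -> ['SVO', 'SOV']
```

The last comment in the loop corresponds to the second half of the explanatory burden mentioned earlier: input that no available hypothesis can parse leaves the grammar unchanged.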

2. Current transition theories

2.1. Autonomous Induction Theory2

Carroll’s (2001) Autonomous Induction Theory is a theory of learning a second language, compatible with UG approaches (although it accepts universal principles but rejects parameters). Its main goal (among many) is to define triggers for acquisition mechanisms and clarify the linguistic and psycholinguistic constraints on how these mechanisms operate. This model is the most articulate attempt to provide substance and detail to the crude hypothesis-testing, failure-driven learning mechanism outlined above. The truly important contribution of this model is to highlight the inevitable necessity of uniting linguistic representations and linguistic processing in one theory. It is another question altogether whether we can do this just yet. In this sense, Carroll’s (2001) study assumes a property theory (the principles of UG and Jackendoff’s architecture of the language faculty) and proposes a transition theory, plus two different but related subtheories that she deems essential to the whole story: a processing theory and a learning theory. The processing theory should explain how the input enters the system from the speech signal (bottom up) and from the conceptual system (top down). It is not the best articulated part of her model. The learning theory specifies how new representations get created based on processing failures (which Carroll rather confusingly calls “errors”). A big part of the learning theory is i-learning, or induction-learning, which Carroll adapts from classical Induction Theory (Holland, Holyoak, Nisbett and Thagard, 1986). I-learning works on symbolic representations and can revise them “so that they are consistent with information currently represented in working memory” (p. 170). Analyzing a novel form involves competition between various alternative representations, possibly belonging to different levels of analysis, the best representation gaining the upper hand. 
I-learning can recategorize items and construct new representations, associations, or analogies. Change can occur only in currently active representations, and then it creates representations only minimally different from the “parent representations.” Thus, learning would cease (in other words, fossilization would ensue) when the learning mechanism stops detecting parsing failures or mismatches between parsed meaning and contextual situation (the detectability requirement). What transfers from the first language is nothing but processing procedures and preferences for categorization and attachment, since on this theory parameters do not exist.


The second part of the book is devoted to feedback/correction issues, trying to understand whether, for example, negotiation of meaning as defined in the Interaction Hypothesis (Long, 1980) or any other metalinguistic teaching (such as, but not limited to, providing negative evidence) can produce reliable i-learning. The short answer is No, because in order to i-learn, the learner has to compare two different mental representations of her own utterances at exactly the appropriate level of analysis (for example, a morphosyntactic failure in the parsing process should feed morphosyntactic i-learning). But as Carroll correctly points out, there is psycholinguistic evidence attesting to the fact that intermediate level representations are not kept in long-term memory, only final meaning representations are, so the learner would not be able to make such a comparison.3

The bigger question still remains whether or not explicit grammar teaching can change a learner’s “psychogrammar.” The empirical studies Carroll reviews in this part of the book seem to have mixed results, suggesting that correction and feedback and modified input (input highlighting some grammatical property to the students) cannot really produce grammar restructuring. By her own admission, the properties that can profitably be influenced by correction/feedback are only aspects of word order and lexical learning.

Carroll has proposed an ambitious and highly complex model of the second language faculty, encompassing the entire range of modules from phonetics to pragmatics. If we accept her tentative conclusions about explicit grammar teaching and feedback/correction being incapable of conjuring up novel mental representations, then we should consider her processing failure-based i-learning procedures as the sole source of grammar restructuring. This might well be true; however, we have no way of verifying it and exploring its constraints. That is why it is not surprising that she formulates the bulk of her claims as possibilities: “X can cause restructuring,” “Y can cause learners to reanalyze stimuli,” “Z might arise on a strictly biological basis,” etc. (Selinker et al., 2004: fn. 5). Even more worrying, as Lardiere (2004: 466) puts it: “Given the model’s complexity, there is little explicit guidance on what sort of specific testable predictions can be developed from it.” Thus, Carroll’s wish that “[empirical] research will hopefully follow in the future when the Autonomous Induction Theory is subjected to rigorous verification,” (p. 177) might be ill-placed indeed. The major effort of this theorist and her followers should be aimed at operationalizing the steps that make up the i-learning process.

2.2. Acquisition by Processing Theory

Truscott and Sharwood Smith have recently proposed another transition theory that is rooted in processing. Their model is even called Acquisition by Processing Theory (APT; Truscott and Sharwood Smith, 2004). Their theoretical assumptions are very similar to Carroll’s in the sense that they also assume Jackendoff’s architecture of the language faculty and modularity. Three independent but interactive modules, Phonological Structure (PS), Syntactic Structure (SS) and Conceptual Structure (CS), are connected through three interfaces. Each module has its own processor as well. Truscott and Sharwood Smith accept that there is sufficient evidence for the existence of independent syntactic and semantic processing mechanisms (e.g., syntactic structure can be primed independently of semantic factors). But at the same time, semantic characteristics of the sentence can affect processing long before the whole sentence has received a syntactic representation. They therefore assume something like the incremental-interactive theory of Crain and Steedman (1985): autonomous (over)generation of competing representations by the syntax and selection among them at the conceptual level. This is where another important feature of this model comes into play: they define this competition among alternative representations as levels of activation, borrowing a crucial facet from connectionist models. Processing is characterized as the interaction of the three processors, which modulates the activation level of the current representations. The resulting levels of activation determine which particular representation is retained in processing the message.

How does learning occur? Truscott and Sharwood Smith reject Carroll’s i-learning mechanism as unnecessary and use a simpler alternative instead.
In the competition between alternative representations, the one that is ultimately chosen will have “a small lasting increase in its resting level [of activation], the effect of which is that it becomes more readily available for future processing (because, again, processors use the currently most active items and current activation is determined in part by resting activation levels)” (p. 6). This is how acquisition happens through processing.

I shall illustrate this general procedure with the authors’ detailed description of how learning the functional category Infl occurs. “Infl is innately present in that the syntactic processor by nature writes it on SS [syntactic structure] whenever possible” (p. 9). This claim, of course, entails that the syntactic processor possesses innate knowledge of functional categories, presumably from UG. Integrative syntactic processes build SS within the syntactic module, using structures which are in syntactic working memory. In the early stages of learning, the processor cannot determine Infl’s relevance (since it has not yet encountered the core Infl element, the copula), so it is not included in the SS. Once ‘be’ is detected as the quintessential embodiment of inflectional properties (it is a free morpheme and has minimal selectional restrictions), the processor is ready to analyze it. The authors take the parsing of the sentence The glass is empty as an example, assuming for the sake of the argument that all the other lexical items except the copula are already known to the hearer. “At the syntactic level, Art-N?-A is initially written, where ‘?’ represents the new item. The syntax module then produces any syntactically acceptable representations it can from this incomplete information. At least one of these is likely to treat the new element as I[nfl], because valid representations will result” (p. 9). Thus in essence Truscott and Sharwood Smith’s is a structure-building model in the spirit of the Minimal Trees hypothesis (Vainikka and Young-Scholten, 1994; also accepted by Hawkins, 2001). There is an initial stage of acquisition, at which the functional category Inflection Phrase is not present. Conceptual Structure will not reject ‘is’ as Infl because its meaning is compatible with the rest of the words in the sentence and it can be integrated in the semantic composition.4 After the resting activation level of ‘be’ as Infl increases incrementally with each encounter and each successful acceptance at the conceptual level, its meaning will be installed in Conceptual Structure, its phonetic form in Phonological Structure, and all three modular representations will be coindexed.
Of course, since learning depends on activation levels, many potentially useful occurrences of the appropriate input may be necessary before the functional category is firmly acquired.

In a sense, Truscott and Sharwood Smith’s is the simplest model possible, an application of Occam’s Razor by brute force. This of course is not a bad thing in itself, since the simplest model is preferable until more complexity is demonstrated to be necessary. However, this simplicity is only apparent since a lot of “power” is packed into the working of the processor, making it highly unconstrained. As illustrated above, Truscott and Sharwood Smith’s processor possesses innate knowledge, can create novel representations, can change features on already existing representations, can coindex items within the PS, SS, and CS modules and so on. The standard assumption in psycholinguistics is that the parser works with already acquired representations. It is no wonder, then, that Truscott and Sharwood Smith can dispose of the learning mechanism: they have simply repackaged its functions as those of a powerful parser, capable of extracting knowledge from the input and with all the UG principles and parameters at its disposal.

Another problem of the model is that it brings together features from two competing and, some would argue, incompatible models explaining language acquisition (innate grammar from UG, activation levels from connectionism) and puts them to work in concert. However, the all-or-nothing categorical nature of UG principles and parameters does not mix well with the idea of gradedness pertaining to different activation levels. This particular criticism recurs in many of the commentaries on the article (Bilingualism: Language and Cognition, 7, 1, 21-41). If there is only one mechanism of learning, altering levels of activation, and it depends on frequency of the specific element in the input, why do we need UG to tell us which representations are legitimate and which are not? On the other hand, if the L1 and L2 learner has access to all UG options, and uses them to generate the competing representations, why would engaging one option depend on the frequency of the trigger? How do we develop probabilistic cue strengths for innately specified strengths of features? I am very sympathetic to approaches that want to eat their cake and have it too, but much more clarity on detailed specific mechanisms is necessary before the integration of UG principles and parameters with activation levels is justified (see, for example, Yang, 2002).

A final consideration echoes my issue with the Carroll model: there is scant evidence that learning a grammar proceeds in exactly this way, and no testable predictions are offered by the authors. The only prediction that the Acquisition by Processing Theory makes is a processing prediction, not a learning prediction (p. 6).5

Considering both Carroll’s and Truscott and Sharwood Smith’s proposals, we can sadly conclude that the admirable goal of formulating a transition theory that is feasible, comprehensive, and above all testable, appears to be unattainable in the short term. What then?

2.3. Shallow Structure Hypothesis

One way forward may be to work deductively from psycholinguistic data towards theory, exactly the opposite of what Carroll as well as Truscott and Sharwood Smith have opted for. This approach has recently been taken by Clahsen and Felser (2006), who propose the Shallow Structure Hypothesis for L2 acquisition (see also discussion in chapter 3). Because they go from data to explanation, their theory has empirical support and has the added advantage of making testable predictions. I will review the main claim of the proposal concerning L2 acquisition, and then I will suggest that their explanation cannot be the whole story of second language learning.

The main idea is that in adult language processing, most of the time language users employ simpler sentence representations that do not utilize the whole spectrum of grammatical mechanisms like movement of NPs, leaving traces at the original and each intermediate position where they land. Instead, language users rely more on lexical, semantic and pragmatic information to get them a fast rough parse that saves time as well as psycholinguistic resources. Proposals documenting this “shallow processing”, based on less detailed syntactic representations, include Fodor’s (1995) depth of processing hypothesis, Ferreira, Bailey and Ferraro’s (2002) “good-enough” representations for language comprehension, the late assignment of syntax theory (LAST) of Townsend and Bever (2001), and Sanford and Sturt’s (2002) underspecification account (cf. chapter 3). Evidence for these proposals comes from a variety of areas, but the strongest support is from studies showing that the meaning which people obtain for a sentence is often not a true result of compositional processes; therefore, semantic representations are usually incomplete. For example, Tunstall (1998) (a UMass PhD dissertation cited in Sanford and Sturt, 2002) takes up the case of underspecified representations for quantifier scope. The idea is that, if representations of quantifiers are underspecified, then “people should be able to integrate the meaning of a multiply quantified sentence without committing to any one scope ordering” (Sanford and Sturt, 2002: 383). Tunstall studied the reading times of two possible continuations to experimental sentences such as in (1):

(1) Kelly showed every photo to a critic last month.
(2) a. The critic was from a major gallery.
    b. The critics were from a major gallery.

In the reading where there was only one critic to whom every photo was shown, continuation (2a) should be read more easily; in the reading where every photo was shown to a different critic, continuation (2b) should be easier to read. The reading times in the Tunstall experiment showed no preference for one or the other, suggesting to Sanford and Sturt that readers are using underspecified representations.6
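The two scope readings of sentence (1) that make continuations (2a) and (2b) natural can be spelled out as truth conditions over a small model. The sketch below is only an illustration: the function names, the labels "surface" and "inverse" scope, and the miniature sets of photos and critics are assumptions introduced here, not material from Tunstall or from Sanford and Sturt.

```python
def surface_scope(shown, photos, critics):
    """every > a: each photo was shown to some critic, possibly a different one."""
    return all(any((p, c) in shown for c in critics) for p in photos)


def inverse_scope(shown, photos, critics):
    """a > every: there is a single critic to whom every photo was shown."""
    return any(all((p, c) in shown for p in photos) for c in critics)


# Hypothetical miniature situations; each pair (photo, critic) means
# "Kelly showed this photo to this critic".
photos = {"photo1", "photo2"}
critics = {"critic1", "critic2"}
one_critic = {("photo1", "critic1"), ("photo2", "critic1")}
different_critics = {("photo1", "critic1"), ("photo2", "critic2")}

for name, situation in [("one critic", one_critic),
                        ("different critics", different_critics)]:
    print(name,
          surface_scope(situation, photos, critics),
          inverse_scope(situation, photos, critics))
```

In the one-critic situation both readings come out true, whereas in the different-critics situation only the surface-scope reading does. A reader who has committed to one scope ordering has in effect chosen one of these two functions; an underspecified representation, in the sense discussed above, leaves that choice open until the continuation forces it.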

Another example comes from the interpretation of garden-path stimuli. Christianson, Hollingworth, Halliwell, and Ferreira (2001) had their subjects read sentences like (3) presented without commas.

(3) While Anna dressed the baby played in the crib.
(4) a. Did the baby play in the crib?
    b. Did Anna dress the baby?

Subjects spent a long time at the disambiguating word ‘played’. It is generally assumed that if comprehenders restructure their initial analysis of (3) so as to make ‘the baby’ the subject of ‘played’ and not the object of ‘dressed’, they will end up with the correct meaning of sentence (3). Subjects in this experiment correctly responded with ‘yes’ to question (4a), indicating that ‘the baby’ had eventually been taken from object position to subject of the main clause. However, when asked (4b), the subjects also answered ‘yes’, suggesting that they had not readjusted the meaning of the sentence to correspond to the syntactic reanalysis. This result is interpreted in Ferreira et al. (2002) to indicate that the initial misrepresentation lingered and so the final meaning people obtained for this sentence was not coherent; it was in fact impossible.

To account for such findings, it has been argued that the language processor sometimes computes representations for comprehension that are shallower (that is, less detailed) than might be necessary, but they are “good enough”, or get the job done, most of the time. Townsend and Bever (2001) call these representations “pseudosyntax”, a “quick-and-dirty parse”, and argue that high-frequency templates with the canonical thematic pattern NVN = actor-action-patient play a major role in forming them as well as in the fast recognition of function words. Pseudosyntax includes the minimal syntactic operations that yield a superficially well-formed sentence. Major phrases are assigned conceptual relations using verb argument structure and control information. Only later is the complete syntactic analysis accomplished and finally, the two representations are checked against each other.

Coming back to L2 acquisition, then, Clahsen and Felser (2006) propose that contrary to what happens in native speakers, the shallow processing available to the human processing system is the only type of processing that L2 learners can engage in. However, this is not a claim about processing only; it is a claim about linguistic representations. “In nonnative (adult L2) language processing, some striking differences were observed between adult L2 learners and native speakers in the domain of sentence processing. By way of accounting for these differences, we proposed the shallow structure hypothesis according to which the sentential representations adult L2 learners compute for comprehension contain less syntactic detail than those of native speakers” (p. 35).

Let us look at the evidence for these “striking differences.” Most studies supporting Clahsen and Felser’s tentative (by their own admission) conclusions come from their own lab and test L2 filler-gap dependencies. Marinis, Roberts, Felser, and Clahsen (2005) carried out a self-paced reading task on sentences containing long-distance wh-dependencies:

(5) a. The nurse who the doctor argued _______ that the rude patient had angered _____ is refusing to work late. (intermediate gap)
    b. The nurse who the doctor’s argument about the rude patient had angered _______ is refusing to work late. (no intermediate gap)

For sentences such as (5a), where the extracted wh-phrase comes from the complement clause, an intermediate landing site is postulated and, hence, an intermediate trace should be present. No such intermediate landing site and trace are assumed for examples like (5b), where the extraction is over (but not out of) a complex NP. The native speakers in the Marinis et al. (2005) experiment showed a significant interaction between extraction and phrase type on the crucial segment, indicating that the presence of the intermediate trace facilitated the wh-phrase integration into the sentence structure. Advanced learners of English with Greek, German, Chinese and Japanese as native languages did not show such an interaction, indicating that they are not sensitive to the intermediate trace. Clahsen and Felser proceed to show how exactly the processing of the two stimulus sentences (5a) and (5b) can be done utilizing only lexical and pragmatic knowledge.

As Indefrey (2006: 67) suggests in his commentary on the Clahsen and Felser proposal, the strong conclusions may seem premature indeed. He draws attention to another experiment (Roberts, Marinis, Felser and Clahsen, 2004) which shows that low memory-span native speakers also did not show sensitivity to traces in sentences like (6):

(6) John saw the peacock to which the small penguin gave the nice birthday present ______ in the garden last weekend.

Indefrey suggests that “if we were to classify types of listeners, it would be high working-memory-span native speakers on one side and low-span native speakers with L2 speakers on the other. Obviously, what the latter groups have in common is not their language background but much more likely some limitation of their processing resources” (p. 67).

Another type of evidence that would address the question of whether or not L2 speakers utilize detailed and complete syntactic representations in their parsing comes from comprehension experiments. In particular, there is a wide range of studies on scope and long-distance dependencies that demonstrate successful L2 comprehension, and these are the particular province of this book, of course. I will illustrate this point with the tense-dependent interpretations from Dekydtspotter and Sprouse (2001). Consider the data in (7).

(7) Qui de célèbre fumait au bistro dans les années 60?
    Who of famous smoked in the bar in the 60’s?
    ‘Which famous person smoked in bars in the 60’s?’

A possible answer to this question may involve a present and a past celebrity. On the other hand, it is impossible to answer the discontinuous interrogative constituent as in (8) with a present celebrity. Only someone who was a celebrity in the past is the appropriate answer.

(8) Qui fumait de célèbre au bistro dans les années 60?
    Who smoked of famous in the bar in the 60’s?
    ‘Which famous person smoked in bars in the 60’s?’

A linguistic analysis fairly compatible with Clahsen and Felser’s own assumptions combines a parameterized movement for checking of a wh-feature and a universal semantic-computational mechanism. When a wh-phrase (qui) moves to Spec, CP to check a (strong) wh-feature, it can optionally take its adjectival restrictions (de célèbre) along for the ride, resulting in the structures in (9) and (10) (Dekydtspotter and Sprouse’s (3) and (4)).

(9) [CP Qui de célèbre [C [TP t_{qui de célèbre} fumait [VP t_{qui de célèbre} [V' t_{fumait}] au bistro]]]]?
    who of famous smoked at-the bar

(10) [CP Qui [C [IP t_{qui} [I' fumait [VP [t_{qui} de célèbre] [V' t_{fumait}] au bistro]]]]]?
     who smoked of famous at-the bar

In order to come up with the meanings described above, speakers need to interpret the wh-phrase with or without its associate de célèbre at each of the intermediate sites. (More details of this analysis will come in chapter 7.) That is, correct interpretation crucially depends on positing and processing intermediate traces. Intermediate as well as advanced learners of French with English as their native language show statistically significant sensitivity to this contrast, correctly choosing speech time construals more often with continuous interrogatives than with discontinuous interrogatives. Even learners of limited exposure to French (three semesters in an American university) demonstrated use of intermediate traces in their L2 processing. This was of course an untimed experimental task and cannot be directly compared to timed reaction time tasks. However, the point is that even if the experiment participants had all the time they needed to perform the task, they wouldn’t have been able to do it without processing traces. This result cannot be explained by the Shallow Structure Hypothesis, which claims that L2 speakers rely on argument structure assignment and pragmatics in processing. The verbs in the two test sentences are the same; therefore learners should have come up with a similar shallow analysis for them.

In conclusion, I believe that the Shallow Structure Hypothesis of Clahsen and Felser (2006), although not a completely articulated transition theory, is a major step in the right direction, because it formulates a framework that can account for some psycholinguistic results and because it makes testable predictions for further psycholinguistic research. As it happens, there is much data in the L2 literature that it cannot account for.
The most important issue that this model has to resolve, however, is the relationship between the parser and the grammar (see Duffield’s 2006 commentary). We would be wise to be suspicious of the common assumption that the product (processing patterns) necessarily reflects the underlying grammar in a straightforward manner. In other words, it might not be warranted to infer that differences in performance in L1/L2 processing imply different underlying processes or, conversely, that similarities in performance imply similar processes. In sum, positing an equal sign between processing and grammar is dangerous, although our investigations of the grammar are inevitably going to be mediated by processing.

2.4. Constructionism

Herschensohn (2000) offers another transition theory, assuming Minimalism as its property theory. In this sense, the underlying assumptions and many conclusions of this model are in agreement with the Bottleneck Hypothesis that I am proposing in this chapter. That is why I will very briefly outline the major claims of this model, paying more attention to where our two models diverge. Adapting elements of recent theories of language acquisition, Herschensohn proposes that morphology and the lexicon are crucial to the emergence of the L2 grammar. Her morpholexical approach to L2 acquisition, called Constructionism, accounts for variability in the final outcomes and in the stages of parameter development. Adopting the minimalist position that language variation is solely morpholexical, the constructionist hypothesis proposes that L2 learning is substantially a matter of vocabulary and morphology acquisition with a progressive fleshing out of [±interpretable] features to gain the correct level for each parameter. A central tenet of this proposal is the gradual shifting of parameters, construction-by-construction. For example, the verb movement parameter (Pollock, 1989; studied in L2 acquisition by White, 1990/91, 1991, 1992) is known to be exemplified by at least three constructions: parameterized movement of the finite verb over negation as in (11), movement over an adverb as in (12), and optional movement of finite main verbs from I to C in questions, (13). The following examples are from Herschensohn (2000: 125), her (6)-(9).

(11) a. Vous n[e] (*pas) embrassez (pas) Marie.
     b. You do (not) kiss (*not) Mary.
(12) a. Vous (*souvent) embrassez (souvent) Marie.
     b. You (often) kiss (*often) Mary.
(13) a. Aime-t-il Marie?
     b. *Loves he Mary?/ Does he love Mary?

In classic parameter theory (Principles and Parameters), the prediction was that the cluster of superficially unrelated constructions will appear together in the grammar of the learner (the child as well as the adult L2 learner), purportedly activated by a common salient trigger and reflective of the same underlying property of the grammar, in this case, verb movement. Since studies in the L2 acquisition of this parameter have shown that clustering of constructions in acquisition is not the case (White, 1990/91,


1991, 1992), Herschensohn proposes that parameter resetting happens progressively; the first movement acquired is the one over negation, while the one over adverbs comes into the grammar later.7 As an aside, note that evidence to the contrary, that is, a cross-sectional acquisition picture compatible with clustering of constructions, as present in some of my own work (Slabakova, 2001, 2002, 2006a), is not sufficient to support the general prediction of clustering. The fact that it happens in some cases does not mean that it happens always. Rather, such diverging results pose the theoretically interesting question of exactly which parametrically related constructions may cluster and which others do not (see section 3.5 for the beginning of an answer). The stages of L2 learning in Herschensohn’s proposal are: an Initial state, where L1 values persist; an Intermediate state where the L1 value is unset; then, L2 constructions are progressively gained, and subsequently [+interpretable] morphology is gradually acquired. The final, Expert state is characterized by L2 values for syntax and mastery of the functional lexicon (2000: 112, see also Herschensohn, 1998). What explains the initial errors and the gradual acquisition? Herschensohn assumes that functional categories are initially underspecified à la Eubank (1996) and have the option of projecting or not (Grimshaw, 1994). “The restricted morphology and lexicon of the initial stage also limit the possibility of realizing functional projections. … L2 learning relies heavily on mastery of specific lexical patterns as the key to syntactic parameters.” (p. 219) This approach is reminiscent of the Lexical Learning Hypothesis in L1 acquisition (Clahsen, Eisenbeiss and Vainikka, 1994; Clahsen, Penke and Parodi 1993/94), in which lexical learning of the functional morphology drives the acquisition of functional categories (see much more on this in the next section). I do not believe the evidence for this view is overwhelming. 
Specifically for L2 acquisition, lexical learning would entail that between the initial state, when learners operate with their native functional categories, and the intermediate state, when learners start gaining gradual L2 functional knowledge, there must be an obligatory step in which the learners obliterate all the L1 functional categories in anticipation of acquiring the non-native ones. So far, there is no empirical support for this implication. This is exactly where my own proposal diverges from Herschensohn’s. I will propose, following Lardiere (2005, 2007, 2008), that the functional morphology is indeed the tight spot in the L2 acquisition process (the Bottleneck Hypothesis), but that problems with morphology mapping are not caused by the lack of functional categories in learners’ grammar representations.

3. Functional morphology is the bottleneck, syntax and semantics flow smoothly

In this section, I will bring together and elaborate on all the different strands of evidence we have seen in the preceding chapters, which bear on the main argument of this book, namely, that functional morphology is the tight spot in the flow of acquisition. The different arguments will be presented in their own short subsections. First, I will repeat arguments from White (2003) that knowledge of syntax emerges before full suppliance of the functional morphology. Next, I will argue that semantic properties also depend on features encoded in the Functional Lexicon. Thirdly, I will summarize recent findings from neuroscience documenting different neural pathways for syntax and semantics. Finally, I shall consider why the functional morphology is the bottleneck of L2 acquisition.

3.1. Syntax is easier than functional morphology

White (2003, ch. 6) describes two views of the morphology-syntax connection, which she labels Morphology-before-syntax and Syntax-before-morphology (p. 182-4). On the morphology-before-syntax view (Clahsen, Penke and Parodi, 1993/94; Radford, 1990), lexical acquisition of functional morphology actually drives the acquisition of functional categories, as we mentioned above. The Lexical Learning Hypothesis would fall squarely within this view. Until the morphology is in place, learners lack knowledge of functional categories. A closely related view is the Rich Agreement Hypothesis (Rohrbacher, 1994, 1999; Vikner, 1995, 1997), according to which acquisition of the morphological paradigm determines the acquisition of feature strength, strong features being associated with rich morphological paradigms. This latter hypothesis assumes that rich agreement morphology is causally related to overt syntactic movement (not an uncontroversial position, see Sprouse 1998).
White (2003) collapses both these closely related proposals under the label morphology-before-syntax. The syntax-before-morphology view, on the other hand, assumes that abstract morphological features, those that have an effect on sentence syntax and semantics, should be treated as distinct from the surface morphological forms (Beard, 1995, Lardiere 2000). On the latter view, L2 learners who do not have perfect performance on the inflectional morphology can still have engaged the functional categories related to that morphology and


have the abstract syntactic features represented in their interlanguage grammar. Evidence comes from several studies of child and adult L2 production (Haznedar and Schwartz, 1997; Haznedar, 2001; Ionin and Wexler 2002; Lardiere, 1998a,b). White (2003: 189) summarizes the data of the three studies as follows, see Table 1 on next page. The common trait running through them is the contrast between morphological variability in learner production and demonstrated knowledge of the syntactic properties usually associated with these inflections. For example, marking of past tense and third person singular present tense inflection is taken as instantiation of the functional category Tense, head of TP. At the same time, this functional category is responsible for attracting an overt subject to its Spec by providing a strong EPP feature, checking the nominative case on the subject and (depending on the analysis) for supplying the weak feature that ensures the verb does not move in English. Thus, the first two columns in Table 1 present the percent suppliance of the functional morphology in obligatory contexts, while the last three columns present evidence for acquisition of the syntactic reflexes of TP. Haznedar (2001) is a study of the production of a Turkish-native child learning English, whose linguistic development was followed over eighteen months. Ionin and Wexler (2002) report on the English L2 production of twenty Russian children aged from 3;9 to 13;10. Lardiere’s (1998a,b) subject is Patty, whose native languages are Hokkien and Mandarin Chinese and who acquired English as an adult immigrant in the US. Patty’s English did not “improve” in the eight years between the two recording sessions, that is why she is classified as an end-state L2 speaker of English, or a fossilized learner. 
What is especially striking in the data presented in Table 1 is the clear dissociation between the incidence of verbal inflection (ranging between 46.5% and 4.5%) and the various syntactic phenomena related to it, like overt subjects, nominative case on the subject, and verb staying in VP (above 98% accuracy). But knowledge of all the properties reflected in Table 1 is purportedly knowledge related to the same underlying functional category, IP, and its features. In view of such data, it is hard to maintain that morphology drives the syntactic acquisition.
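The measure behind this dissociation, percent suppliance in obligatory contexts, is simple arithmetic, and a minimal Python sketch may make the comparison concrete. The function name `suppliance_rate` and the counts used below are hypothetical illustrations and are not data from any of the studies cited.

```python
def suppliance_rate(supplied, obligatory_contexts):
    """Percent of obligatory contexts in which the target form was supplied."""
    if obligatory_contexts <= 0:
        raise ValueError("need at least one obligatory context")
    return 100.0 * supplied / obligatory_contexts


# Hypothetical counts for one learner (NOT taken from Table 1):
# inflection is supplied in 45 of 100 obligatory contexts,
# while overt subjects appear in 99 of 100 obligatory contexts.
morphology = suppliance_rate(45, 100)
syntax = suppliance_rate(99, 100)

# A large gap of this kind is the dissociation discussed above.
print(f"morphology: {morphology:.1f}%  syntax: {syntax:.1f}%")
```

The same calculation applies to each column of a table like Table 1; only the definition of an obligatory context changes from property to property.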


Prévost and White (2000a,b) explain the learning situation reflected in Table 1 with the Missing Surface Inflection Hypothesis (MSIH). It maintains that learners have syntactic knowledge of functional categories, such as nominative case assignment, lack of verb movement in English, overt subjects, etc., even in the absence of surface manifestation of inflection. The title of this hypothesis, missing surface inflection, emphasizes the fact that underlying morphosyntactic features are indeed activated and used correctly. It is another question altogether, and a very interesting one, to explain the variability in inflection suppliance across subjects. A recent hypothesis, the Prosodic Transfer Hypothesis (Goad, White and Steele, 2003; Goad and White, 2004, 2006) argues that prosodic effects stemming from the native language can go a long way towards an explanation, thus categorizing inflection omission as a rather superficial phenomenon. Lardiere (2005, 2007, 2008), on the other hand, takes the approach that the difficulties are due to form-function mapping problems (see an extended discussion of Lardiere’s view in section 5.3.4). To sum up the basic point of this section, suppliance of correct morphosyntax lags behind L2 learners’ knowledge of the syntax of the same functional categories.

3.2. Semantics also depends on the functional morphology

Now, the next logical step to take is to examine the connection between functional morphology and semantics. Current generative linguistic theory argues that not only overt syntactic properties but also properties computable at the syntax-semantics interface depend crucially on features encoded in the functional lexicon. To reiterate a previously given example, and simplifying dramatically, overt versus covert wh-movement is explained by a combination of a universal requirement on interpretation and a parameterized property.
In order for sentences containing wh-words to be interpreted as questions, these words need to take (high) scope position at the interface, that is, they need to be in the CP projection. This is a universal requirement. The visible versus invisible movement has been taken to depend parametrically on features encoded in the wh-words in a language. Thus, interpretive properties encoded at the syntax-semantics interface like wh-movement do not seem to be qualitatively different from purely syntactic properties such as verb movement, which do not give rise to interpretive differences between languages. Put differently, in the familiar verb movement (think English versus French, see examples (11)-(13) in section 2.4)

the verb does not move for interpretive reasons. In wh-movement, the wh-word does move for scope-taking reasons. However, both movements are triggered by properties encoded in the functional lexicon. Furthermore, there is a considerable overlap between syntactic and semantic effects in language, because it is very often the case that the semantics is read off the syntax. For example, the functional category TP is responsible for a range of syntactic effects (nominative case marking on subject, verb movement, etc.) as well as temporal interpretations of sentences. In principle, then, we should expect a similar pattern of processing and similar behavioral patterns of acquisition for abstract, subtle syntactic and semantic properties. A view that can be dubbed semantics-before-morphology on analogy with syntax-before-morphology is not inconceivable. We shall evaluate such a view in chapter 6, section 3.6.

3.3. Processing of morphology in the L2: Neuroimaging and ERP findings

In an indirect way, the data from the neuroimaging and the ERP studies of bilinguals, which I discussed in chapter 4, also provide support for the Bottleneck Hypothesis. Let us summarize the ERP findings first. Recall that there are two different syntax-related components. The first is a left anterior negativity (LAN) between 150 and 350 ms which reflects morphosyntactic processes related to word form and word category identification. Within the LAN time interval, studies mostly out of the Friederici lab have argued for an early LAN, an ELAN, correlating with rapidly-detectable word category errors while the LAN correlates with morphosyntactic errors.8 Agreement, inflection and word category violations do not trigger LAN or ELAN effects in post-puberty learners (Hahne, 2001; Hahne and Friederici, 2001; Proverbio, Cok, and Zani, 2002; Sabourin, 2003; Hahne, Müller and Clahsen, 2006).
The second ERP effect is a centro-parietally distributed positivity at around 600ms, the P600, which correlates with processes of syntactic revision, such as reanalysis and repair, as in garden-path sentences, incorporation of morphological information into the syntax, or generally syntactic processing difficulty. Syntactic violations as captured by P600 are processed similarly by natives and learners of high proficiency, even if they were immersed in the second language after puberty (Weber-Fox and Neville, 1996; Hahne, 2001; Hahne and Friederici, 2001; Proverbio, Cok, and


Zani, 2002; Sabourin, 2003; Hahne et al. 2006). Studies frequently show that the P600 may be delayed and with lower amplitude. A further line of inquiry using ERPs is indicative of the same tendency, too. Neville, Mills, and Lawson (1992) and Weber-Fox and Neville (2001) compared the processing of closed-class (function) words versus open-class (content) words in L2 learners and found either a missing or a much delayed N280 effect for closed-class words (determiners, auxiliaries, prepositions) as opposed to an N400 similar to that of native speakers’.9 In sum, and with the amply recorded caveats, ERP findings suggest a syntax-semantics dissociation in L2 processing similar to that in native processing, and crucially, lack of rapid morphosyntactic processing in the same time window as in the native speakers’ processing. Table 2 tabulates these conclusions.

Table 2. ERP effects in L1 and L2 processing
__________________________________________________________________________
Linguistic stimuli                 ERP effect in L1   ERP effect in L2
__________________________________________________________________________
Morphosyntax                       ELAN, LAN          No effectᵃ
Syntactic integration/reanalysis   P600               P600 delayed, lower amplitude
Semantics                          N400               N400 slightly delayed
__________________________________________________________________________
ᵃ Hahne et al. (2006)’s findings on the acquisition of German participial morphology are the sole exception.

Some support for these (very tentative) conclusions comes from hemodynamic studies of L2 processing as well. Indefrey (2006a) offers a meta-analysis of such studies and concludes that to date, and based on 14 experiments, “L2 syntactic processing seems to use the same cortical areas as L1 syntactic processing” (p. 293). However, in four studies, which found stronger activation for listening and reading L2 sentences (Hasegawa, Carpenter, and Just, 2002; Luke, Liu, Wai, Wan, and Tan 2002; Rüschemeyer, Fiebach, Kempe and Friederici, 2005; Wartenburger et al, 2003), the subjects saw syntactically correct and incorrect sentences and had to perform a grammaticality judgment or a verification task. The same areas used in native processing are more strongly recruited, then, when there is an extra processing load on the subjects, as in judgment tasks, or when more awareness of the syntax is required. The errors that triggered this stronger activation of the implicated areas were all morphosyntactic errors.

Finally, we will report in this section on a study that explicitly sets out to test inflectional morphology through behavioral and ERP experiments. Hahne, Müller and Clahsen (2006) investigated how L2 learners process inflected words online. Participle formation and plural in German were tested; subjects were 18 highly proficient Russian natives. Proficiency was established with a self-report (learners gave themselves a mean of 5 out of 6 points for fluency in German) and they had lived in Germany for 4.5 years on average (range 0.5 to 12 years). Hahne et al. tested the same morphology in two behavioral production tasks, and the learners were highly accurate, over 95% correct for participles and 86% correct for plurals, indicating that they were familiar with the critical morphology tested. Note that 86% is considered fairly high accuracy for non-native speakers’ production of morphology. Participants were presented with sentences containing two kinds of violations: incorrect regulars (*gelachen instead of gelacht ‘laughed’, *Waggonen instead of Waggons ‘wagons’) and overregularizations (*gelauft for gelaufen ‘run’, *Vases for Vasen ‘vases’). The expectation was that a misapplication of irregular inflection would be perceived as a lexical error and would trigger an N400 effect, while a misapplication of a regular rule of inflection would be perceived as a morphosyntactic violation and would trigger a LAN and/or a P600. Results in the participle experiment show both a significant anterior negativity between 250 and 600 ms as well as a small parietal positivity between 600 and 1000 ms for the regularizations, and a centrally distributed negativity between 450 and 600 ms for the irregularizations. Although the effects are more focal and consistent in the native speakers, these findings are consistent with native German processing results.
On the other hand, in the plural experiment, regularizations elicited only a late positivity, while irregularizations again elicited a delayed N400. Table 3 adapted from Clahsen and Felser (2006: 11, their Table 3) compares the native and non-native contrasts. In general, Table 3 shows that in L2 morphological processing, violations of regular morphosyntactic rules produce the expected ERP effects (P600 and LAN), while violations of irregular morphology trigger an effect related to lexical processing (the N400). Furthermore, given that a LAN effect is indicative of early automatic processes of word-internal morphological decomposition, these results suggest that L2 learners employ these processes for participles but not for plurals. Why should this be? The authors explain this discrepancy with the fact that “[t]he German noun plural system is rather unusual in that it has a default rule (the plural –s) with an


extremely low frequency, which means that more than 95% of the German nouns form their plurals according to one of the various irregular patterns. The participle formation system has a more common frequency distribution (in that irregular verbs do not outnumber regular ones) and a relatively small number of irregular patterns. Moreover, the plural system is linguistically more diverse in that it comprises five different endings, whereas participle formation only involves the choice between –t and –n. These factors make it easier for L2 learners to acquire German participle formation in nativelike ways than to learn the noun plural system of German.” (p. 130)

Table 3. ERP effects on morphological violations in L2 learners
__________________________________________________________________
                        L2 learners      Native speakers
__________________________________________________________________
Incorrect regulars
  Participles           N400             No effectᵃ
  Plurals               N400             N400ᵇ
Overregularizations
  Participles           LAN, P600        LANᵃ
  Plurals               P600             LANᵇ,ᶜ, P600ᶜ
__________________________________________________________________
ᵃ Penke, Weyerts, Gross, Zander, Münte, and Clahsen (1997)
ᵇ Weyerts, Penke, Dohrn, Clahsen, and Münte (1997)
ᶜ Lück, Hahne, Friederici, and Clahsen (2001)

While I agree with the authors’ explanation, I would stress that the German plural system is not that unusual in terms of its linguistic diversity (more on this in the next section). What is more, we should note that the fairly high 86% accuracy on plurals that the same learners achieved on the behavioral tests did not translate into LAN effects (indicative of fast automatic processing); only the even higher 95% accuracy on participle formation did. We should also consider the fact that these Russian learners live in Germany and use German in their everyday communication; in fact, they are very proficient speakers. My point is that although Hahne et al.’s results demonstrate successful L2 usage of the two processing routes posited by dual-mechanism models (lexical storage and morphological decomposition), it takes a very high degree of automaticity for the inflectional morphology to be processed in a nativelike manner, while a much lower threshold of automaticity can be posited for syntactic and semantic calculations. In other words, inflectional morphology is the bottleneck of online linguistic processing.
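The two routes of a dual-mechanism model can be pictured with a toy procedure for German participles. The Python sketch below is only an illustration under strong simplifying assumptions: the names `STORED_IRREGULARS` and `participle` are hypothetical, the stored lexicon holds a single entry, and the default rule ignores stems ending in -t or -d, inseparable prefixes, and verbs in -ieren. It is not the model tested by Hahne et al.

```python
# Hypothetical stored lexicon of irregular participles (illustrative entry only).
STORED_IRREGULARS = {"laufen": "gelaufen"}


def participle(infinitive, stored=None):
    """Return a German past participle via a toy dual-route procedure."""
    if stored is None:
        stored = STORED_IRREGULARS
    # Route 1: lexical storage. A stored irregular form blocks the rule.
    if infinitive in stored:
        return stored[infinitive]
    # Route 2: morphological composition by the default rule: ge- + stem + -t.
    stem = infinitive[:-2] if infinitive.endswith("en") else infinitive
    return "ge" + stem + "t"


print(participle("lachen"))             # regular verb, built by rule
print(participle("laufen"))             # irregular verb, retrieved from storage
print(participle("laufen", stored={}))  # retrieval fails, so the rule applies
```

When retrieval fails, as in the last call, the default rule yields the overregularized form *gelauft, which is the kind of violation used in the participle experiment.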

3.4. Why is morphology so difficult to learn and to process?

In answering this question, I will adopt the definition of morphological competence provided by Lardiere (2005):

“Morphological competence includes, most obviously, the knowledge of which forms ‘go with’ which features. But consider what additional kinds of knowledge are required: What are the conditioning factors and are these phonological, morphosyntactic, semantic or discourse-linked? Are certain forms optional or obligatory, and what constitutes an obligatory context? In which domains are various features expressed, in combination with what other features, and why is supposedly the same feature expressed in some domains in some languages but not in others?” (p. 179)

In elaborating on morphological competence, Lardiere (2005, 2007, 2008) considers concrete examples of morphological mapping difficulty and then follows the same linguistic properties in the fossilized grammar of Patty (see section 5.3.1). I will repeat here some of Lardiere’s examples because they clearly outline the mapping difficulty. Take the feature [±past]. It is customary in the L2 acquisition literature (and not only there) to treat it as an interpretable feature, which is going to contribute to interpretation at LF (having been checked but remaining undeleted). Languages can select to represent or not represent this feature with an overt morpheme. The often-cited languages that differ in this respect are Mandarin Chinese, without past tense marking, and English, with past tense marking (e.g. Hawkins, 2000). The semantic import of the –ed past tense morpheme is taken to be marking an event or state that has terminated before the interval including the moment of speech. However, the mapping between meaning and form in English is considerably more varied than this simple, one-to-one situation. As Lardiere (2005) points out, it is a many-to-many mapping. First of all, the –ed morphology encodes perfective aspect in English (Smith, 1997), as the example in (14) demonstrates.

(14) We devoured the pizza on the terrace (and there is no more pizza left.)

The devouring-of-pizza event in this sentence is not only in the past, it is also complete, so we can grammatically utter the second conjoint in (14) as a continuation of the first. Secondly, the past morphology is used as a marker of uncertainty and hence politeness in requests:

(15) I was wondering whether you would be free tomorrow.

Both past tense forms was –ing and would stand for a present state (“I am wondering as I speak”) and a future state (“are you going to be free tomorrow?”). Thirdly, past morphology is used in sequence of tenses to refer to a nonpast state or event, even leading to ambiguity.

(16) Jane said that Joyce was pregnant.

The utterance in (16) can mean either that Joyce was pregnant at the time when Jane and Joyce saw each other, but Joyce has given birth since; or it can mean that Joyce is still pregnant today, at present. In this utterance the past form is actually ambiguous between a present and a past temporal reference. Finally, historical present usages are very common in everyday speech and in literature. The following excerpt is from Marc Nesbitt’s short story “Gigantic”, published in The New Yorker on July 9th, 2001, p. 76.

(17) “I rake dead bats from the hay floor of the bat cage and throw them in a black plastic bag. … I pick up a sign that says, “Quiet! ______ sleeping!,” slip in the “Bats” panel, and place it up front, where all the kids can see it. An hour till we open, I go see our one elephant, Clarice.”

Note that all the events described by the underlined verbs denote a sequence of closed events in the past and not ongoing or incomplete events. This usage of the present simple (a.k.a. the historical present) is a pragmatic convention of English, and its meaning is calculated at the semantics-discourse interface. Sticking to English past tense morphology and barely scratching the surface, I have demonstrated, following Lardiere (2005, 2007, 2008), that –ed in English can encode [+past] but also [–past], politeness, irrealis mood in conditionals, and perfective aspect in events, among other grammatical and pragmatic meanings. On the other hand, the simple present tense can also encode a sequence of complete events in the past. Thus, a mapping of many-to-many exists between English past tense meanings and forms. A further comparison between English, Irish, and Somali points to another source of mismatch: the lexical categories on which a feature is

expressed may differ across languages. The examples are from McCloskey (1979) and Lecarme (2003, 2004), as cited in Lardiere (2005: 180). Irish expresses the tense distinction on its complementizers in agreement with the tense in the embedded IP as in (18) while in Somali [past] is encoded on nouns within the DP, (19).

(18) Deir sé gurL thuig sé an scéal                                  Irish
     says he that-Past understood he the story
     ‘He says that he understood the story.’

(19) a. árday-gii hore                                               Somali
        student-Det.Masc.Past before
        ‘the former student’
     b. (weligay) dúhur-kii baan wax cunaa
        (always) noon-Det.Masc.Past Fem.1S thing eat-Pres
        ‘I always eat at noon.’
     c. Inán-tii hálkée bay joogta?
        girl-Det.Fem.Past place-Det.Masc.Q Fem.3S stay-Fem.Pres
        ‘Where is the girl?’

Let us take another type of example from the temporal morphology of two closely related Romance languages: Italian and Portuguese. The data are from Giorgi and Pianesi (1997: 50). The crucial difference concerns the Portuguese past tense morpheme.

(20) a. Ho mangiato alle quattro.                                    Italian
        ‘I have eaten at four.’
     b. Mangiai alle quattro.
        ‘I ate at four.’

(21) a. Comi as quatro.                                              Portuguese
        ‘I have eaten at four / I ate at four.’
     b. Tenho comido as quatro.
        ‘I took the habit of eating at four’

Both Portuguese and Italian have what looks like a simple past form and a present perfect form. However, the Portuguese simple past tense form comi in (21a) has both a present perfect meaning as in the Italian (20a) and a past meaning as in the Italian (20b). The periphrastic form, tenho comido as in (21b), although it is made up of the equivalent form of the auxiliary verb to have and a past participle as in the Italian (20a), has a completely different meaning: a present habit. Thus, in Portuguese one form has two meanings, each one of those expressed by a single morpheme in Italian, while the seemingly equivalent forms have different meanings altogether. This is just one of numerous examples of functional morphology-meaning divergence, and comparative grammars are full of similar mismatches. Learning the functional morphology of a second language, then, is complicated enormously by such form-function mismatches.

This is not all, however. The formal expressions of the –ed past morpheme in English are phonologically determined: they surface as [-t], [-d] or [-ɪd] depending on the surrounding sounds. Furthermore, if one considers the frequent irregular verbs with their (partly) phonologically determined classes of past tense marking, it becomes abundantly clear that a one form-one meaning mapping in the functional morphology of human languages is very rare indeed. Considering at length the past tense marking in English, Lardiere (2007: 131) concludes:

“…, unless the same features or properties are always clustered in exactly the same way cross-linguistically, such that they are uniformly realized by the same (past tense marking) morphological means in each language— which surely does not appear to be the case—it is doubtful we can speak of that amalgamated feature as being parameterized in the sense intended by Hawkins, such that some languages have it and some don’t.”

In sum, learning the functional morphology in a second language is complicated because:

• a functional meaning may be represented on one lexical category in the native language and on another lexical category in the second language, or not at all overtly represented (cf. Irish, English and Somali expressions of Tense);
• a functional meaning encoded in a piece of morphology (say, a simple past verbal ending) may be encoded in another piece of morphology in the second language (a present perfect form, in our Italian-Portuguese contrast);
• a functional meaning, say ‘past’, can be expressed with dedicated morphology in a language, but under special discourse conditions and optionally, it can also be expressed with forms lacking this morphology (the historical present);
• a functional form, even if it has constant meaning, may have varied expressions depending on different phonological and morphological environments.

As the above form-function mismatches attest, the learning task involves much more than a simple pairing between meanings and pieces of inflectional morphology. I will return to a more general discussion of “difficulty” in acquisition in chapter 8.

4. Another transition theory proposal: Variational Learning

In this section, I will present a recently proposed transition theory for native language acquisition and language change, a theory that fits well with the logic of the Bottleneck Hypothesis. This theory has not yet been tested in L2 acquisition, but the predictions it makes are clearly applicable to the issues at hand, namely, how second language acquisition proceeds and what the areas of difficulty for the second language learner are.¹⁰ Given the contemporary position assumed in this book that much of language variation and thus acquisition comes down to the acquisition of the lexicon and the functional morphology, it is fitting to develop a theory of experience-dependent language learning.

Yang (2002, 2004, 2006) and Legate and Yang (in press) propose the Variational Learning theory (see also Roeper, 2000; Kroch, 2001; Crain and Pietroski, 2002, for similar approaches). The starting point is the observation that the so-called transformational approach to language learning is not capable of explaining 1) children’s gradual development of specific linguistic properties, and 2) the lack of abrupt changes in children’s production. Under the transformational approach (Chomsky, 1981; Gibson and Wexler, 1994), the mental state of the learner undergoes direct changes, as one hypothesis about a parameter setting is replaced by another hypothesis. Learning is failure-driven in the sense that rules are added or changed when input cannot be analyzed with the old rules. The old rules are deleted or fall out of use.
If only one grammar (albeit not the final-state grammar) is responsible for children’s production at any one point in time, then why is children’s production so variable? Also, as the child moves from grammar to grammar, abrupt changes in production should be observed, contrary to fact. Yang (2002: 21) takes a well-known test case from children’s production of null subjects, because there is a large body of crosslinguistic and developmental data on this property. Hyams (1986) suggests that null subjects in English child language result from mis-setting their language to an optional subject grammar such as Italian, where null subjects are allowed. However, Valian (1991) shows that the rates of null subject use are very different: 70% for Italian children and only 31% for English children. Alternatively, Hyams (1991) suggests that during the null subject stage (2;0-3;0), English children mistake their grammar for a Chinese-type grammar, where topics (already mentioned nominal phrases) can be dropped. Again, Wang, Lillo-Martin, Best and Levitt (1992) contend that the rates of production are not really similar: Chinese children in the same age group drop subjects at the rate of 55%. Furthermore, if English children did in fact have a stage in their grammar when they were operating with a Chinese parameter value, one would expect that object drop should also be robustly attested. However, compared to Chinese children with 20% object drop, English children only drop objects at the rate of 8%. Yang argues that these well-known acquisitional data cannot be accommodated under a model which proposes that the child has one single grammar at a time.

Under Variational Learning, on the other hand, the child’s language is modeled as a competition among grammars. Each grammar (or more specifically, each parameter value) is defined by the innate parameter space of UG and is associated with a probability of being selected. It is this probability distribution that varies adaptively in response to the linguistic input. Schematically, learning proceeds as follows (Legate and Yang, 2005: 5).

“For an input sentence s, the child
a. with probability Pi selects a grammar Gi,
b. analyzes s with Gi
c. – if successful, rewards Gi by increasing Pi
   – or otherwise punishes Gi by decreasing Pi”

The Variational Learning theory makes a number of predictions that distinguish it from the transformational approach. First, the rise of the target grammar is gradual since the probability P of the target grammar gradually approaches 1. Second, the disappearance of the non-target grammar is also gradual, since before being eliminated, the non-target parameter value will be accessed and used probabilistically.
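The selection-and-reward schema quoted above can be made concrete with a small simulation. What follows is only a minimal sketch under stated assumptions, not the authors' implementation: the quoted passage says only that Pi is increased or decreased, so the linear reward-penalty update, the learning rate gamma, the starting probability of 0.5, the 0.95 dominance threshold, the restriction to two competing grammars, and all function names are my own illustrative choices. The target grammar is assumed to analyze every input sentence, while the competitor fails only on the share of sentences that unambiguously require the target value.

```python
import random
from statistics import mean


def sentences_until_dominance(p_unambiguous, gamma=0.02, threshold=0.95,
                              max_sentences=200_000, seed=0):
    """Count input sentences until the target grammar's probability
    reaches `threshold` (capped at `max_sentences`).

    p_unambiguous: share of input sentences that only the target grammar
    can analyze. All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    p_target = 0.5  # assumption: both grammars start out equally likely
    for n in range(1, max_sentences + 1):
        # Step a: select a grammar with its current probability.
        picked_target = rng.random() < p_target
        # Step b: the competitor fails only on unambiguous sentences.
        competitor_fails = rng.random() < p_unambiguous
        # Step c: reward a successful grammar, punish a failing one.
        if picked_target or competitor_fails:
            # Target rewarded, or competitor punished: target gains.
            p_target += gamma * (1.0 - p_target)
        else:
            # Competitor analyzed the sentence and is rewarded.
            p_target -= gamma * p_target
        if p_target >= threshold:
            return n
    return max_sentences


def mean_sentences(p_unambiguous, runs=10):
    """Average the sentence count over several seeded runs."""
    return mean(sentences_until_dominance(p_unambiguous, seed=s)
                for s in range(runs))


if __name__ == "__main__":
    # Hypothetical shares of unambiguous evidence in the input.
    for share in (0.25, 0.07, 0.012):
        print(f"{share:.3f} -> {mean_sentences(share):.0f} sentences on average")
```

Under these assumptions the expected change in the target probability at each step is proportional to the share of unambiguous sentences, so the rise is gradual and is slower when the relevant evidence is rarer. Individual runs are noisy, which is why the sketch averages over several seeds.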
Thus, children’s “errors” are actually potential alternative grammars made available by UG that the children are “unlearning”. This mechanism is reminiscent of infants’ losing sensitivity to acoustic contrasts that are unavailable in their native language but that they were sensitive to at birth. Werker and her colleagues have demonstrated that the decline in the ability to acoustically discriminate non-native contrasts occurs within the first year of life (Werker and Tees, 1984, among others). To make the parallel even more interesting, it has been established that perceptual sensitivity to some contrasts is lost before others, so that the gradual decline proceeds in a systematic order. Brown (1998, 2000) links this decline in ability to discriminate some non-native contrasts to the gradual establishment of the feature geometry in the child’s grammar: “The degradation of the perceptual capacities and the increase in the ability to distinguish sounds phonologically are the result of the same internal mechanism, namely, the construction of phonological representations” (Brown 2000: 16). Although Brown’s explanation is not probabilistic and competition based, clearly similar ideas are at play.

The most important prediction of the Variational Learning model is the following: learning is not only gradual, but the speed with which a parameter value rises to dominance is correlated with how incompatible the competitor parameter value is with the input, and how much input is indeed relevant to the parametric choice. This prediction is eminently testable. Table 4 adapted from Yang (2004: 455, his Table 1) relates the input statistics directly to the longitudinal trends in syntax acquisition and summarizes several cases, in which the timing of the setting correlates with the frequency of the necessary evidence in child-directed speech.

Table 4. Correlations between input and output in parameter setting

Parameter         Target language   Requisite evidence           Appears in % of    Time of acquisition
                                                                 input sentences
________________________________________________________________________________________________________
Wh fronting       English           wh-questions                 25                 very early (Stromswold, 1995)
verb raising      French            negation/adverb placement    7                  1;8¹¹ (Pierce, 1992)
no null subject   English           expletive subjects           1.2                3;0 (Valian, 1991; Wang et al. 1992)
verb second       German/Dutch      OVS sentences                1.2                3;0-3;2 (Clahsen, 1986)
scope marking¹²   English           long-distance wh-questions   0.2                4;0+ (Thornton and Crain, 1994)
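The ordering claim behind Table 4 can be checked mechanically. The sketch below is an illustration only: it transcribes the frequencies and ages from the table, converts the years;months age notation to months, and tests whether parameters with rarer evidence are set no earlier than parameters with more frequent evidence. Two choices are my own assumptions rather than part of the source: a range such as 3;0-3;2 or an open-ended age such as 4;0+ is represented by its lower bound, and the row marked only as “very early” is left out because it has no numeric age.

```python
import re

# (parameter, % of input sentences, time of acquisition), as in Table 4.
TABLE_4 = [
    ("wh fronting", 25.0, "very early"),
    ("verb raising", 7.0, "1;8"),
    ("no null subject", 1.2, "3;0"),
    ("verb second", 1.2, "3;0-3;2"),
    ("scope marking", 0.2, "4;0+"),
]


def age_in_months(age):
    """Convert a 'years;months' age to months.

    Ranges and open-ended ages are reduced to their lower bound
    (an assumption); non-numeric entries give None.
    """
    match = re.match(r"(\d+);(\d+)", age)
    if match is None:
        return None
    years, months = map(int, match.groups())
    return 12 * years + months


def rarer_evidence_is_later(rows):
    """True if acquisition age never decreases as evidence gets rarer."""
    dated = [(freq, age_in_months(age)) for _, freq, age in rows
             if age_in_months(age) is not None]
    dated.sort(key=lambda pair: -pair[0])  # most frequent evidence first
    ages = [age for _, age in dated]
    return all(earlier <= later for earlier, later in zip(ages, ages[1:]))


if __name__ == "__main__":
    for name, freq, age in TABLE_4:
        print(f"{name:16s} {freq:5.1f}%  {age_in_months(age)} months")
    print("monotone:", rarer_evidence_is_later(TABLE_4))
```

With only four datable rows this is a consistency check on the table, not a statistical test of the correlation.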

Legate and Yang (2007) use the Variational Learning model to account for the well-known Optional Infinitive (a.k.a. Root Infinitive) phenomenon in child language. This phenomenon is attested in many languages and widely discussed in the L1 acquisition literature. Children use infinitival forms in root contexts, as in (22) where the adult grammar does not allow them, examples from Hyams (2000).

(22) a. Zahne pussen                    German
        teeth brush-INF
     b. Dormir tout nu                  French
        sleep-INF all naked
     c. Niet neus snuiten               Dutch
        not nose blow-INF

It is imperative to note that the root infinitive phenomenon is a gradient phenomenon in the sense that it takes up to three years to disappear from children’s production. Its crosslinguistic distribution is also not categorical, in that some child languages observe the phenomenon and some don’t. The production of root infinitives ranges from short and lower in frequency in Italian and Spanish to prolonged and frequent in Dutch, German and English (Guasti, 2002). Legate and Yang relate these two characteristics, gradience in time and across learner languages, to morphological learning and the competition between parametric properties, which is correlated to how clear the evidence is of obligatorily finite verbs in root clauses.

What does the child need to learn in order to eliminate [–Tense] verbs from root clauses? She needs to hear unambiguous evidence that the verbs in main clauses are tensed. English is a tricky language in this respect, and Legate and Yang argue that even forms that are not, strictly speaking, tense-marking, like the agreement marker –s for third person singular, actually depend on tense (because –s only appears in the present tense), so they should be counted as unambiguous evidence. The forms that are compatible with a [–Tense] grammar are likely to punish the target [+Tense] grammar, up to a point. That is why it is important to calculate the ratio of these two types of morphology in a given language. The authors chose to compare English, Spanish and French. Spanish is a language with relatively infrequent use of root infinitives, around 10% before 2;0 and down to 5% by 2;6 (Grinstead, 1994); English has a prolonged root infinitive stage (until after the age of 3;0, and sometimes with over 50% frequency, according to Phillips, 1995); while French falls in between with 15-30% for children between 1;8 and 2;6 (Rasetti, 2003).
Legate and Yang predicted that “the morphological evidence for [+Tense] in languages with shorter root infinitive stages is far more abundant than the morphological evidence in languages with extended root infinitive use” (p. 8).

Their experimental data consist of counts of unambiguously tensed verb forms in the child-directed speech, based on the CHILDES database. I combine their results in Table 5.

Table 5. Frequency of tensed verb forms in child-directed speech (in percent, actual ratios in brackets)

                             Spanish        French         English
__________________________________________________________________
Verb forms rewarding [+T]    77.2%          69.9%          54.4%
                             (677/877)      (733/1049)     (14793/27198)
Verb forms punishing [+T]    22.8%          30.1%          45.6%
                             (200/877)      (316/1049)     (12405/27198)
Difference                   54.4%          39.8%          8.8%
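The percentages and differences in Table 5 follow directly from the bracketed counts, and the arithmetic can be reproduced in a few lines. The sketch below is only an illustration; the variable and function names are my own. The English total is taken to be 27198, the denominator given in the punishing row, which is also the sum of the two English counts (14793 + 12405).

```python
# Counts of verb forms in child-directed speech, as given in Table 5:
# language -> (forms rewarding [+T], forms punishing [+T]).
COUNTS = {
    "Spanish": (677, 200),
    "French": (733, 316),
    "English": (14793, 12405),
}


def tense_evidence(rewarding, punishing):
    """Return (% rewarding, % punishing, difference in percentage points)."""
    total = rewarding + punishing
    pct_rewarding = 100 * rewarding / total
    pct_punishing = 100 * punishing / total
    return pct_rewarding, pct_punishing, pct_rewarding - pct_punishing


if __name__ == "__main__":
    for language, (rewarding, punishing) in COUNTS.items():
        reward, punish, diff = tense_evidence(rewarding, punishing)
        print(f"{language:8s} {reward:5.1f}% {punish:5.1f}% {diff:5.1f}")
```

The difference row is the quantity of interest for the learning model: it measures how far the evidence rewarding the [+Tense] value outweighs the evidence punishing it, and it shrinks from Spanish to French to English.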

As Table 5 reveals, the morphological evidence available to children directly correlates with how long the root infinitive stage lasts in Spanish, French, and English. Thus, in Legate and Yang’s account, the child’s grammar is a constantly-changing combination of potential adult-like grammars and the child is using the unambiguous evidence from the morphological tense paradigm in her language to get to the target [+Tense] parameter value, ensuring that only finite verbs are allowed in root clauses.

Yang’s (2002, 2004, 2006) Variational Learning model is logically extendable to second language acquisition. First of all, there is much more variation in learner production to explain in second language learning compared to child language learning. The learner-directed speech is also quite variable, depending on a range of classroom and naturalistic settings. Finally, the unambiguous evidence for a certain parameter value in the input is also likely to vary dramatically. Other variables may have to be added to the equation of second language acquisition: How good at morphology learning are adult L2 learners as opposed to child L2 learners? How do we measure who is a good learner of functional morphology? What is the status of the native parameter setting, is it just one among many, or does it have some privileged status among values?¹³ Those are only three of the many relevant questions one might have. But I believe the main idea is promising.

Instead of employing the overused but probably inaccurate metaphor of L2 learners “having access to UG”, we can make it more concrete by hypothesizing that the “discarded” or “unused” parameter settings at the end of native language acquisition are not banished out of the UG hypothesis space for good; rather, they can be brought into play again when the second language input cannot be analyzed successfully with the native setting. Of course, the new, and testable, idea is that the rise of the target value from probability 0 to 1, and even conceivably fossilizing at 0.8, is going to be correlated with the percentage of sentences in the input that unambiguously reward the target parameter value and punish all others.

The Variational Learning framework is a theory supported by learner production data, assuming that learner variable performance is indicative of the competition between UG-sanctioned parameter values. There probably are additional performance hurdles in L2 acquisition on top of those existing in child language acquisition, like slower and less accurate mental lexicon access, slower processing, etc. But the proposal is about a general language learning mechanism. When the language learner uses her native parameter value as a possible analysis for the incoming message, it is comprehension or miscomprehension of the message that measures success or failure.

In the minimalist view, it is the functional morphology that either drives the parameter values (as in the optional infinitive case discussed above) or, if the parameter values are reflected by word order (as in the wh-fronting parameter), the functional morphology encodes these values with the help of features. In either case, if we assume Minimalism and a learning procedure similar to Variational Learning, we have a more articulate metaphor for the Bottleneck Hypothesis. Learning a second language parameter value different from the native one is a competition between the native parameter value and the UG-allowed other value(s) as reflected in the functional morphology of the target language.
The more unambiguous the evidence from functional morphology, the more often will the target parameter value be rewarded with increased probability of access, and the easier it will be to raise this parameter value probability closer to 1. The gap between L2 production and L2 comprehension may still exist, as argued for by the Missing Surface Inflection Hypothesis, but it will be a moving target. I expect that the Variational Learning procedure has the potential to explain variability in L2 acquisition better than Carroll’s Autonomous Induction Theory and Truscott and Sharwood Smith’s Acquisition by Processing Theory. It gains an enormous advantage over those two proposals by making testable predictions. In the end, it may still turn out to be unsupported in the L2 acquisition process but at least it will die fighting. Variational Learning still needs to be articulated more specifically in terms of concrete processing mechanisms to become what Gregg (2003) calls an actual sequence explanation, but it is already promising as a robust process explanation, and second language researchers owe it a proper testing.

5. Conclusions

In this chapter, I reviewed processing models that (to a different extent) offer a transition theory for second language acquisition. While admiring the comprehensive breadth of the Autonomous Induction Theory and the innovative sweep of the Acquisition by Processing Theory, I argued that both models suffer from being untestable proposals of how things might look in the L2 acquisition process. On the other hand, I argued that Clahsen and Felser’s Shallow Structure Hypothesis, while eminently testable, seems to be unsupported by recent research on the L2 acquisition of semantics. I suggested that Yang’s Variational Learning framework may be a profitable way to get out of the impasse, at this point. It remains to be seen, of course, how it will acquit itself when imminent research takes a good look at the data.

I have proposed the Bottleneck Hypothesis, whose rationale is made up of three related arguments:
1) showing that the functional morphology is hard, and is processed differently than syntax proper and semantics (sections 3.3 and 3.4)
2) showing that syntax flows (supported by research summarized in White, 2003, especially chapter 6)
3) showing that semantics flows.

I next examine the empirical evidence. The last point is what I turn to in the next two chapters. In chapter 6, I present learning situations characterized by “simple syntax and complex semantics” while in chapter 7 I review studies testing linguistic properties combining “complex syntax and simple semantics.”

Notes

1. Emergentist approaches, of course, reject UG altogether, claiming instead that, in Ellis’s words, “structural regularities of language emerge from learners’ lifetime analysis of the distributional characteristics of the language input and, thus, that the knowledge of a speaker/hearer cannot be understood as an innate grammar, but rather as a statistical ensemble of language experiences that changes slightly every time a new utterance is pronounced” (Ellis, 2003: 634). We will not get into this debate here (see, for example, articles in MacWhinney, 1999; O’Grady, 1999, 2003).

2. In my discussion of Carroll’s book, I will omit reference to many of her arguments and claims and will concentrate on her Autonomous Induction Theory. For example, I will not dwell on Carroll’s rejection of parameter theory, but see Lardiere (2004) and Selinker, Kim, and Bandi-Rao (2004) for extensive critiques.

3. As Lardiere (2004) points out, “Interactionists will be irritated here with Carroll’s failure to include in this section more recent work that might address some of her criticisms, including Long’s more updated (1996) version of the Interaction Hypothesis.” (p. 464)

4. If at this initial stage IP is not present, then presumably Infl as head of IP should not be projected. It is still not clear to me why the conceptual structure would accept is as Infl, and not reject it. Another question that arises is what happens in languages without an obligatory copula. Are there any consequences for learning Infl?

5. The theory predicts that the competing representations generated by the syntactic processor will be activated for a short period of time, the unacceptable parses will be removed immediately, and then the correct representation will receive increased activation. To test this claim, the authors propose a hypothetical experiment: priming subjects with a temporarily ambiguous sentence like The floor was uncovered wood. The idea is that during parsing, a brief passive representation will surface, which will then be rejected at the appearance of the final noun. This passive construction, the rejected competitor, should then NOT prime subsequent production of passive structures. As Harrington (2004) points out in his commentary (p. 30), it is difficult to test experimentally what is not happening in parsing, and secondly, evidence contrary to this prediction already exists.

6. Another explanation that comes to mind could be that the readers are maintaining two representations in working memory, equally plausible without context. More details of the experimental procedure and results need to be known before the interpretation of underspecified representations can be accepted.

7. The question arises as to whether the proposed sequence of stages presupposes that the adverb is higher in the structure than Negation.

8. It should also be noted that a number of studies do not report LAN effects where they should be expected. Secondly, the distribution of the LAN effect is somewhat variable and not infrequently bilateral. See further discussion of this effect in Osterhout, McLaughlin, Kim, Greenwald and Inoue (2004: 294-7).

9. However, as Osterhout et al. (2004: 294) point out, many variables are confounded with the content word/function word distinction. For example, word length and word frequency are two variables that can hardly be distinguished from word class: function words are shorter (at least in English) and more frequent. See Osterhout, Allen and McLaughlin (2002) for an experiment showing that the word length factor better predicts the distribution of the ERP functions.

10. Charles Yang’s new book is tellingly titled “The Infinite Gift”, Scribner, 2006.

11. Yang (2002: 103) correctly adds negation to the evidence of verb raising. The claim is that Vfinite Neg/adverb word orders are 7% of all sentences heard by French children, in the CHILDES database. It is not discussed why I-to-C movement in questions cannot serve as the relevant input for verb placement in French.

12. In German, Hindi and other languages, long-distance wh-questions contain overt intermediate copies of wh-movement as in (i).

    (i) Wer glaubst du wer Recht hat?
        Who think you who right has
        ‘Who do you think is right?’

    For children to know that English doesn’t use this option, long-distance wh-questions must be heard in the input. For many children, the German option persists for quite some time, producing sentences like ‘Who do you think who is in the box?’ (Thornton and Crain, 1994).

13. Assuming L1 transfer, the native value would be the logical starting point of the learner, accessing the others only if the native one fails.

Chapter 6 Evidence from behavioral studies: Simple Syntax–Complex Semantics

The next two chapters offer experimental support for the Bottleneck Hypothesis. More specifically, I will review evidence for the third argument in the rationale of the Bottleneck Hypothesis, namely, that acquisition of semantics is not an insurmountable problem for L2 learners. The learning situations and challenges are divided into two types.

In this chapter, I will look at studies where the syntax presents little or no difficulty to the learners, at least to advanced ones. Not surprisingly, native speakers in these experiments show the regular range of accuracy in experimental studies (around 80-90%). The learning challenges lie, however, at the syntax-semantics interface. At this interface, learners have to figure out which forms are mapped on which meanings in the target language, since there is no one-to-one correspondence. These learning situations are memorably dubbed: simple syntax–complex semantics. When we consider results at all levels of proficiency from beginner to near-native, it will become clear that knowledge of the properties under scrutiny emerges gradually but surely.

In the next chapter, I will discuss studies where the semantic effects to be acquired involve quite complex syntax, in the sense that sentences involve less frequent constructions (double genitives, discontinuous constituents, scrambling, etc.). Very often, the native speakers in these experiments show far lower acceptance rates than the ones we are used to seeing in the L2 literature. However, at the syntax-semantics interface, these same properties do not present much difficulty, as there are no mismatches. That is why I label them properties combining complex syntax and simple semantics. If learners have constructed the right syntactic sentence representation, the presence or absence of semantic interpretation follows, because learners are using their universal semantic computation mechanisms. 
That is why learners sometimes have higher rates of acceptance than native speakers. In all the cases, learners demonstrate that a contrast exists in their grammar between the allowed and the disallowed interpretation.

1. Some methodological considerations

The studies that will be discussed attest to the explosion of interest in studying interlanguage knowledge at the syntax-semantics interface in the last decade. Without a doubt, the impetus for such enhanced recent interest lies in the Poverty of the Stimulus (POS) argument. Since the beginning of generative second language acquisition in the 1980s, the seminal research question of this framework has been: Do adult L2 learners have access to UG? Arguably the best test cases that can address this research question are POS situations, where the input underdetermines what learners have to acquire. Most of these POS learning situations are at the syntax-semantics interface. If it can be shown that a native interpretation does not exist in the target language (therefore, there is no positive evidence for it in the L2 input) but its absence is still acquired by adult speakers, then a clear case of UG access can be made. As the reader can verify, almost all studies discussed in this book present such POS situations to their learners.

But how do we gain access to the linguistic interpretations learners attribute to the input strings? The tasks at our disposal for studying interpretive properties have also evolved from the staple Grammaticality Judgment tasks (GJT) assumed to be the main tool of the generative linguist. Most of the studies I discuss here use the Truth Value Judgment Task (TVJT) (Crain and McKee, 1985), especially versions of it adapted to the needs of adult L2 acquisition. The original task, presented to children sometimes as young as two or three years of age, involves two experimenters, one acting out a certain situation and the other manipulating a puppet that “says” what the act-out represented. The child is asked to decide whether the puppet described the event correctly, and to reward the puppet with a cookie, if he said it right, but to punish him by giving him a sock, if he was wrong. The child need not pronounce anything in this experimental set-up; she just observes the experimental situation and judges whether it can be described adequately with the test sentence pronounced by the puppet. Testing is individual.

Since the written format of this task is quite different from the classic set up for children, I present it in a general form. A story is supplied, sometimes in the native language of the learners (Dekydtspotter and Sprouse pioneered the presentation of the story in the native language, see also Borgonovo et al., 2006; Gürel, 2006; and Slabakova, 2003) to establish clear and unambiguous context. A test sentence in the target language appears below the story. Learners are asked to judge whether the test sentence is appropriate, or goes with (describes) the story well. Participants answer with Yes or No, True or False (see an exception forced by the linguistic facts in Slabakova 2006b). Whenever a story supplies an interpretation that does not go with the test sentence, learners are expected to reject it. Some test sentences are ambiguous so a story supplies only one of their two available interpretations; in such a case, those sentences appear under another story as well, supporting their second interpretation. Typically, stories and test sentences are squared in a 2 x 2 design, giving a quadruple of story-test sentence combinations. Acceptability in the quadruple is often as illustrated below:

(1)                      Meaning 1     Meaning 2
     Test sentence 1         ✓             ✓
     Test sentence 2         ✓             *

Here is a quadruple of story-test sentence combinations from Slabakova (2003). The experiment investigates whether speakers of English know that a bare infinitive such as eat must refer to a complete event (of eating a cake, for example), while the gerund eating only refers to the process and need not refer to a complete event (see more on this study in section 3.7). These four story-sentence combinations do not appear together in the test; they are mixed in with other items. Each story-sentence pairing is judged on its own. (2)

Matt had an enormous appetite. He was one of those people who could eat a whole cake at one sitting. But these days he is much more careful what he eats. For example, yesterday he bought a chocolate and vanilla ice cream cake, but ate only half of it after dinner. I know, because I was there with him. I observed Matt eat a cake.

False

True

Matt had an enormous appetite. He was one of those people who could eat a whole cake at one sitting. But these days he is much more careful what he eats. For example, yesterday he bought a chocolate and vanilla ice cream cake, but ate only half of it after dinner. I know, because I was there with him. I observed Matt eating a cake.

True

False

Alicia is a thin person, but she has an astounding capacity for eating big quantities of food. Once when I was at her house, she took a whole ice cream cake out of the freezer and ate it all. I almost got sick, just watching her. I watched Alicia eat a cake.

False

True

Alicia is a thin person, but she has an astounding capacity for eating big quantities of food. Once when I was at her house, she took a whole ice cream cake out of the freezer and ate it all. I almost got sick, just watching her. I watched Alicia eating a cake.

False

True
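The expected target-like responses to the quadruple in (2) can be sketched in a few lines of Python. This is only an illustration and not part of the study's materials: the names STORIES, FORMS, and expected_answer are hypothetical, and the truth values simply encode the expected responses for (2), namely that the bare infinitive is rejected after the incomplete-event story and that every other pairing is accepted.

```python
from itertools import product

# Hypothetical labels for the two story types and the two verb forms in (2).
STORIES = ("incomplete", "complete")      # half the cake eaten vs. all of it
FORMS = ("bare_infinitive", "gerund")     # "eat a cake" vs. "eating a cake"


def expected_answer(story: str, form: str) -> bool:
    """Expected target-like truth value for one story-sentence pairing."""
    if form == "bare_infinitive":
        # The bare infinitive is accepted only when the event is complete.
        return story == "complete"
    # The gerund does not commit the speaker to completion.
    return True


# Cross the two factors to obtain the four story-sentence combinations.
quadruple = {(s, f): expected_answer(s, f) for s, f in product(STORIES, FORMS)}

for (story, form), answer in quadruple.items():
    print(f"{story:<10} {form:<15} -> {answer}")
```

In an actual test the four pairings would be scattered among other items, so a sketch like this serves only as an answer key for scoring.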

The event of eating a cake comprises a process of eating as well as a culmination point when all the cake is gone and the event is complete. The verbal phrase with a bare infinitive eat a cake can refer to the incomplete process as well as the complete event;1 that is, the sentence I observed Matt eat a cake is ambiguous between a process and a complete event interpretation (Meaning 1 and Meaning 2 in the design illustrated in (1) above). On the other hand, the gerund eating cannot refer to a complete event but only to a process. Hence, I observed Matt eating a cake has only one interpretation and the speaker of the sentence cannot commit as to whether the culmination has come to pass. The first story in the quadruple in (2) presents an unfinished event (the cake was half-eaten); consequently, only the sentence with the gerund describes it correctly and the sentence with the bare infinitive should be rejected by a speaker who knows the two meanings. The second story in (2) represents a complete event, so both Test Sentence 1 with a bare infinitive and Test Sentence 2 with a gerund are true.

Note that all the test sentences are grammatical with some interpretation in the target language, so learners are not invited to think about the form of the sentences but just to consider their meaning. Nevertheless, this task reveals much about the learners’ grammars, and more specifically, about the interpretations they map onto linguistic expressions. Its main advantage is that learners do not access metalinguistic knowledge that may be due to language instruction but rather engage their true linguistic competence.

Another advantage of the TVJT is that it does away with judgment preferences, since the expected answer is categorically True or False, and never both. In this respect, it is interesting to note that White et al. (1997), investigating reflexive binding in French-English and Japanese-English interlanguage, addressed the methodological question of which task, a TVJT or a picture selection task, better represented the interlanguage competence of the learners. I will only give examples from English here for illustration purposes, but we shall discuss this study in more detail in the next section. Take the example in (3).

(3) Mary_i showed Susan_j a portrait of herself_i/j

If we have to find out whether learners interpret herself to refer to Mary or to Susan, or possibly to either one, we can test their interpretation with a picture selection task, widely used in the L2 literature on binding. Participants would be offered a picture in which Mary is showing Susan a prominent portrait of Susan, and a sentence underneath it like the one in (3) without the indexes. Participants have to indicate whether what is going on in the picture matches the sentence. If the learners allow Susan, the object of the sentence, and the reflexive to co-refer, they will answer positively. The same sentence will appear under another picture (not side by side but at another location in the test), this time of Mary showing Susan a portrait of Mary, to check whether learners allow binding to subject.2 It has been noticed (see White et al., 1997: 148 for discussion) that these picture (selection) tests mostly reflect the linguistic preferences of the learners. In the case of (3), for example, learners prefer to interpret the reflexive as coreferring with the subject and not the object. This does not mean that the other interpretation is missing from their grammar, but it does mean that experimental results capturing this preference actually underestimate the learners’ competence. White et al. (1997) used both a picture task and a TVJT and showed that the native speakers as well as the learners were significantly more consistent in accepting local objects as reflexive antecedents on the TVJT. For example, only 3 out of 14 native speakers (21%) consistently accepted object antecedents (allowed by the English grammar) in the picture task, as opposed to 17 out of 19 (89%) who did so on the TVJT. Since the two tasks are arguably tapping the same linguistic competence, it is clear that the TVJT better deals with licit but dispreferred interpretations of ambiguous sentences, disposing of preferences to a larger degree.
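The size of the task effect reported for the native speakers can be checked with simple arithmetic. The short Python sketch below uses only the counts quoted above (3 of 14 in the picture task, 17 of 19 on the TVJT); the function name consistency_rate is a hypothetical label chosen for illustration.

```python
def consistency_rate(n_consistent: int, n_total: int) -> float:
    """Percentage of participants who consistently accept a reading."""
    return 100 * n_consistent / n_total


# Native speakers consistently accepting object antecedents, per task.
picture_task = consistency_rate(3, 14)
tvjt = consistency_rate(17, 19)

print(f"picture task: {picture_task:.0f}%")
print(f"TVJT:         {tvjt:.0f}%")
print(f"difference:   {tvjt - picture_task:.0f} percentage points")
```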
Another written version of the TVJT is used in Borgonovo et al. (2006). The researchers supplied the context story in the native language, and then followed each story with a minimal pair of test sentences, differing only in the verbal mood inflection (indicative or subjunctive). The test sentences were “produced” by one of the characters in the story. The participants had to rate, using a scale from -2 to +2, whether or not each test sentence appropriately corresponded to the situation described in the story. The presentation of two target sentences is probably not optimal in this task. First, the choice is made easier if the two inflectional endings are presented side by side for easy comparison; second, the participants cannot focus on the appropriateness of one form in one context.3 In addition, rating scales using negative and positive integers have been criticized for including the zero (White, 2003). It is not clear whether the zero is treated as a value between -1 and +1 and thus a rating of intermediate appropriateness, or whether the learners use it to mean “I don’t know”.

A third version of the TVJT was used by Gabriele (2005). The researcher investigated semantic entailments of progressive tenses and presented the context story aurally, supported by two pictures on a computer screen, capturing two stages of an event. Then the participants heard and saw one test sentence. They had to indicate whether the test sentence was appropriate in the context of the story, using a scale of 1 to 5. The presentation of the context story in this experiment is superior in that it draws attention to the two stages of the unfolding situation (process and completion). Even if scales do not include zero, however, their use is hard to justify in an interpretation task. If a learner has categorical judgments of whether, for example, is arriving entails completion or not, then there are no degrees to that knowledge. One completion cannot be half as appropriate as another. If the learners were asked to choose between True or False, their attention would be drawn to the truth conditions of the sentence under consideration, in a way that numerical rating does not do.
I believe that scales unnecessarily complicate this version of the TVJT for the participant.

A second type of task tapping interpretive judgments (pioneered in Slabakova, 1997a, 2001, but see also Montrul and Slabakova, 2002; Gabriele, Martohardjono and McClure, 2003; Gabriele and Martohardjono, 2005) is a sentence conjunction judgment task, a semantics-based task, in which the participants are asked to decide whether the two clauses in a complex sentence go well together or not. For example, take the sentences in (4) and (5).

(4) Allison worked in a bakery and made cakes.
(5) Allison worked in a bakery and made a cake.

The first clause presents the context, and the felicity of combining the first and the second clause is judged. The two clauses in (4) are a good fit because they represent two habitual activities, while the pairing in (5) is less compatible because a habitual and a one-time event are combined. This task could be used in learning situations where the TVJT is not appropriate, although the TVJT is superior to it because it establishes the context in a clearer way.

A third type of interpretation task (used in Kanno, 1997; Gürel, 2006; Slabakova, 2004, 2005) presents the learners with a test sentence and spells out the three (or more) interpretations, as the example in (6) from Kanno (1997: 269) illustrates. In this case, the instructions made clear that participants were allowed to choose both (a) and (b) as possible answers, if this seemed appropriate.

(6) Dare_i-ga [pro_i kuruma-o katta to] itta no?
    who-NOM car-ACC bought that said Q
    ‘Who_i said (he_i) bought a car?’
    (a) the same person as dare
    (b) another person

In my opinion, this task is also less effective than the TVJT, specifically because learners find it more difficult to externalize how they interpret coreference or DP specificity or complete event, whatever the case may be. In a way, this task expects them to think about the meaning of the test sentence and then choose from a couple of provided interpretations, while the TVJT allows them to focus on the story context and then judge the test sentence in a more natural way, abstracting away from its grammatical form. However, Gürel (2006) used this task in conjunction with the TVJT to find out whether her learners allow pronominal elements to be ambiguous, and her findings on the two tasks were similar, suggesting that her learners were able to overcome the problems mentioned above.

In this and the next chapter, I do not include studies that test semantic properties using a GJT. It is hard to argue that a grammaticality judgment captures the interpretation a speaker attributes to a sentence. In a GJT, participants are asked to attend to the form of sentences and whether they sound OK to them. Even if considerations of learners applying metalinguistic knowledge are excluded, it is still not always clear what the reason(s) may be for marking a sentence as unacceptable (unless a correction task is included as well). It is instructive to relate Borgonovo et al.’s (2006) experience in this respect. This study investigated usage of the subjunctive in Spanish relative clauses, which is dependent on the specificity of the DP modified by the clause (see discussion of this study in section 3.6). The researchers used a GJT with sentences establishing context through various adverbs and other grammatical means, and supplying an appropriate or inappropriate inflectional ending (indicative or subjunctive). They also used a TVJT. In the GJT, the learners did not always judge the subjunctive as “grammatical” and the indicative as “ungrammatical”. The authors say: “In fact, the results for the TVJT allow us to see that L2 learners may have a better knowledge of mood selection than what the results of the GJT may have led us to believe. … This suggests that in some cases, syntactic information alone, i.e., without context, is not sufficient to establish the specificity status of the DP and to determine the appropriateness of the subjunctive.” (p. 22 of manuscript)

In summary, versions of three tasks are predominantly used in the L2 literature to probe semantic interpretations. I have argued that the best one is the written TVJT with answers of True and False, followed by the sentence conjunction judgment task, and the multiple choice of explicitly spelled-out interpretations task. I have argued also that the GJT is not appropriate for addressing research questions related to interpretation. In the rest of the chapter, I turn to empirical investigations of acquisition of binding, aspect, mood, and articles.

2. Interpretive dependencies of binding

2.1. Binding of reflexives

There has been considerable research on how principles of the Binding Theory (Chomsky, 1981) are acquired by second language learners. Much of that research has addressed the question of whether L2 learners can reset parameters related to Principle A, the principle that determines the distribution of reflexives and reciprocals.4 We will concentrate here on the research which assumes proposals linking the morphological form of the reflexive with the possible position of its antecedent (Cole, Hermon, and Sung, 1990; Katada, 1991; Pica, 1987; Reinhart and Reuland, 1993, a.o.) (L2 studies: Christie and Lantolf, 1998; Hamilton, 1998; Thomas, 1995; White, Hirakawa, and Kawasaki, 1996; Yuan, 1998). Morphologically complex reflexives include English herself while simplex ones are equivalent to self, e.g., Japanese zibun. Essentially, morphologically complex reflexives have to be bound locally (within their clause or NP) by a subject or a non-subject antecedent; morphologically simplex reflexives can be long-distance oriented, but only to subject. They cannot be bound by non-subjects. The following sentences illustrate these observations. In (7), the morphologically complex reflexive herself cannot refer to the non-local subject Mary; the only possible antecedent is the local subject Susan. In (8), both the subject Mary and the object Susan are possible antecedents for the English reflexive. Examples (7)-(10) are from White et al. (1997: 147, their examples 1 through 4).

(7) Mary_i thought that Susan_j blamed herself_*i/j
(8) Mary_i showed Susan_j a picture of herself_i/j

In Japanese, on the other hand, the judgments are reversed; see (9) and (10). Both the local and the long-distance subject can be antecedents for the monomorphemic anaphor zibun in (9), but it cannot refer to the object in the simple clause as in (10).

(9) Mary-ga_i Susan-ga_j zibun-o_i/j semeta to omot-ta
    Mary-Nom Susan-Nom self-Acc blamed that think-past
    ‘Mary_i thought that Susan_j blamed herself_i/j.’
(10) Mary-ga_i Susan-ni_j zibun-no_i/*j shasin-o mise-ta
     Mary-Nom Susan-Dat self-Gen picture-Acc show-past
     ‘Mary_i showed Susan_j self’s_i/*j picture’

This L1-L2 learning situation is a perfect example of a mismatch at the syntax-semantics interface. Note that the domain of binding (local versus long-distance) and the orientation (only to subject versus to subject and non-subject) are associated in a one-way implication: if the anaphor is bound long-distance, then it is only subject-oriented, although the reverse implication does not hold, as Pica (1987: 489) pointed out. The exact details of Pica’s linguistic explanation need not concern us here; however, most recent analyses agree that these interpretive contrasts can be explained by LF movement relative to the morphological form of the reflexive.5 The important thing to notice is that the visible morphological form of the anaphor can serve as a clue to the possible binding domain, and, associated with that, its orientation. Since this form-meaning relationship is only a tendency and not without exceptions (e.g., French has a reflexive clitic se, a morphologically simplex anaphor, which is only locally bound by subject), re-classifications on the part of learners of anaphors as pronouns or other elements are not ruled out. Therefore, White (2003: 45) provides the following succinct summary of the linguistic facts (see also Thomas, 1995):

(11) a. Long-distance anaphors must be subject-oriented
     b. Anaphors which allow non-subject antecedents must be local

White et al. (1997) investigate the acquisition of English reflexives by Japanese and French learners. In addition to zibun, Japanese has morphologically complex anaphors like the English ones (kare-zisin ‘himself’, kanojo-zisin ‘herself’), although they are very rare. French also has the latter, as well as the clitic se, locally bound by subject. The common meaning of all reflexives is universal: they pick their reference from a c-commanding antecedent. What the learners have to acquire, then, is how the English reflexives himself, herself should be classified: as monomorphemic (X⁰) or as bimorphemic (XP). The relevant interpretations will follow from the universal tendency described above. For the French learners, there is a match between French reflexives and the English ones, at least in terms of binding domain, all French reflexives being locally bound. For the Japanese learners, there exists a mismatch between the most common reflexive zibun and the target English ones.

In order to study learners’ interpretations, White et al. (1997) used two different tasks, a TVJT and a picture selection task (see section 1). We will only report the individual results of the TVJT here, since the authors show that the results of the picture selection task under-represent the learners’ grammars (at least in monoclausal sentences). The TVJT offered the participants a written story, 2 to 5 sentences long, followed by a comment sentence. Stories like the one in (12) check whether learners allow binding to non-subjects.

(12) Susan wanted a job in a hospital. A nurse interviewed Susan for the job. The nurse asked Susan about her experience, her education, and whether she got on well with people.
The nurse asked Susan about herself.

True    False

Learners who allow binding to object will answer True, and that is the correct answer for English. Learners whose grammars disallow binding to object will look at the other possible antecedent, the nurse, but will answer False since the story does not support this interpretation.

Stories as in (13) check whether the learners allow long-distance binding (available only in the Japanese grammar).

(13) Johnny and a little boy were playing with matches. Johnny lit a match and then dropped it on the little boy’s leg. The little boy went screaming to his father and told him what had happened.
The little boy said that Johnny burned himself.

True    False
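The logic of items (12) and (13) can be sketched as a mapping from a hypothesized grammar to a predicted response. The Python below is a minimal illustration under stated assumptions: the class ReflexiveGrammar and the two answer functions are hypothetical names, and each grammar is reduced to two yes/no properties (object antecedents, long-distance antecedents), which is a simplification of the binding options discussed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReflexiveGrammar:
    """Hypothetical two-property summary of how a grammar treats a reflexive."""
    allows_object_antecedent: bool   # local binding by a non-subject
    allows_long_distance: bool       # binding by a non-local subject


def answer_item_12(g: ReflexiveGrammar) -> bool:
    # Story (12) supports only the reading where herself = Susan (the object).
    return g.allows_object_antecedent


def answer_item_13(g: ReflexiveGrammar) -> bool:
    # Story (13) supports only the reading where himself = the little boy
    # (the non-local subject), since Johnny burned the boy, not himself.
    return g.allows_long_distance


target_english = ReflexiveGrammar(allows_object_antecedent=True,
                                  allows_long_distance=False)
zibun_like = ReflexiveGrammar(allows_object_antecedent=False,
                              allows_long_distance=True)

for name, g in [("target English", target_english), ("zibun-like", zibun_like)]:
    print(name, "-> (12):", answer_item_12(g), " (13):", answer_item_13(g))
```

On this sketch a target-like learner answers True to (12) and False to (13), while a learner who treats himself and herself like zibun shows the reverse pattern.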

If a learner allows long-distance binding, then she will answer True to this type of story, contrary to the target English grammar. If a learner allows local binding only, her answer should be False, since the story does not support local binding. Two language groups were included to check for the influence of the native language: French and Japanese. French, like English, allows only local binding; Japanese allows long-distance binding. Group results indicate that both the French and the Japanese intermediate learners of English were significantly less accurate on the local binding to object than to subject. On stories where the context favored the long-distance interpretation as in (13), both learner groups were less accurate in rejecting long-distance binding than the native participants. Table 1 (adapted from White et al.’s Figures 1 and 3) gives the mean accuracy on the important properties. However, the group results are hiding the re-classification of reflexives happening in the grammars of individual participants. Individual results were calculated with 3 or 4 correct responses out of a possible 4 being considered as consistent.

Table 1. Mean accuracy (out of 4)

                                                    Japanese learners   French learners   Native speakers
                                                    (n=19)              (n=22)            (n=19)
Allow subject antecedent in monoclausal sentences   3.3 (82.5%)         3.7 (92.5%)       3.4 (85%)
Allow object antecedent in monoclausal sentences    2.5 (62.5%)         3.0 (75%)         3.5 (87.5%)
Reject long-distance subject antecedents            2.3 (57.5%)         2.7 (67.5%)       3.8 (95%)
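The two scoring measures used here, group mean accuracy and the individual consistency criterion, can be sketched as follows. The function names are hypothetical, and the list of individual scores is invented purely for illustration; it is not data from White et al. (1997). Only the conversion of the Table 1 means to percentages uses reported figures.

```python
def mean_accuracy(scores):
    """Group mean of per-participant correct responses."""
    return sum(scores) / len(scores)


def as_percent(mean_score, n_items=4):
    """Express a mean score out of n_items as a percentage."""
    return 100 * mean_score / n_items


def is_consistent(n_correct, threshold=3):
    """Criterion from the text: 3 or 4 correct responses out of 4."""
    return n_correct >= threshold


# Reported means from Table 1 (Japanese learners), converted to percentages.
for label, mean in [("subject", 3.3), ("object", 2.5), ("long-distance", 2.3)]:
    print(f"{label:<14} {mean} -> {as_percent(mean):.1f}%")

# Hypothetical individual scores, for illustration only.
scores = [4, 3, 2, 4, 1]
n_consistent = sum(is_consistent(s) for s in scores)
print(f"mean = {mean_accuracy(scores):.1f}, consistent = {n_consistent} of {len(scores)}")
```

The sketch shows why the two measures can diverge: a group mean averages over participants, whereas the consistency count asks how many individual grammars reliably allow a given interpretation.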

Table 2, adapted from White et al., their Table 2, shows the number of individual participants per group who gave consistent answers in monoclausal sentences.

Table 2. Number of participants giving consistent responses to monoclausal sentences

                                                    Japanese learners   French learners   Native speakers
                                                    (n=19)              (n=22)            (n=19)
Number consistently accepting subject antecedents   16 (84%)            20 (91%)          18 (95%)
Number consistently accepting object antecedents    11 (58%)            16 (73%)          17 (89%)
Number consistently rejecting subject antecedents   4 (21%)             2 (9%)            2 (11%)

Although White et al. (1997) do not provide the individual results of the biclausal sentence interpretations, in footnote 9 they address the important question of whether the participants observe the form-meaning mapping tendency (bimorphemic reflexives are locally bound by subject and object; monomorphemic reflexives are long-distance and subject oriented). Only 4, or 21%, of the Japanese and 5, or 23%, of the French learners allow both long-distance antecedents and at the same time local object antecedents. Since these options are simultaneously allowed in natural languages like Icelandic (sig) and Serbo-Croatian (sebe), these learners are opting for a reclassification of their native reflexives, not mapping them onto the target English ones but still using other UG-sanctioned options. In summary, at least in monoclausal sentences (see Table 2), the majority of individual Japanese learners (at intermediate proficiency) have acquired that English reflexives can be bound by an object, an interpretation not supported by their native grammar.

More generally speaking, the binding of reflexives has proven a very difficult area of acquisition at the syntax-semantics interface. The overall possible meanings are abundantly clear: reflexives pick out c-commanding antecedents. However, the clarity ends here. Antecedents can be local or non-local, they can be subjects or non-subjects. It is not the case that “any mapping goes”, but almost. The only option consistently excluded from UG is the following: non-local objects are never antecedents for reflexives. However, linguistic theory has not yet provided the definitive judgment explaining all binding facts.6 Both the Binding Theory (Chomsky, 1981) and Reflexivity Theory (Reinhart and Reuland, 1993) leave certain swaths of facts unexplained. Against this background, and armed with only a (nonexceptionless) universal tendency provided by UG, most learners in the majority of experimental studies testing interpretation of reflexives, including the White et al. (1997) study we discussed in more detail here, are successful at re-classifying the reflexives in the language they are learning. Not surprisingly, a certain amount of misclassification occurs. For example, in the Thomas (1995) study, 30% of her participants (six lower proficiency learners and one advanced learner) allowed the UG-disallowed binding to long-distance object. Other studies have also reported some learners as well as some native speakers allowing this possibility. While we can attribute the native speaker performance to “experimental noise”, Thomas (1995) and Hamilton (1998) offer misclassification explanations for the learner data. Thomas suggests that her learners have misanalyzed the reflexive as a pronoun, which can indeed be bound by a long-distance non-subject. Hamilton proposes that some learners treat reflexives as logophoric (e.g., There were five people in the room besides myself), which do not have to be bound at all. Whatever the explanations for these unsuccessful attempts at reclassification of reflexives may be, we ought to acknowledge the tremendous complexity of the learning task and the insufficient evidence learners have to work with. By insufficient evidence, I mean not only the general lack of consistent instruction and negative evidence,7 but more importantly, the general permissiveness of UG-sanctioned options, if we include misclassifications of reflexives as pronouns and logophors.
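The space of antecedent options just described is small enough to enumerate. The sketch below is an illustration only: ug_permits and the REFLEXIVES dictionary are hypothetical names, each antecedent is reduced to two binary properties (local or not, subject or not), and the three reflexive profiles are simplified from the descriptions of English herself, Japanese zibun, and French se given earlier in this section.

```python
from itertools import product


def ug_permits(local: bool, subject: bool) -> bool:
    """Every antecedent type is allowed except a non-local non-subject."""
    return local or subject


# The four logically possible antecedent types: (local?, subject?).
CONFIGS = list(product((True, False), repeat=2))
excluded = [(loc, subj) for loc, subj in CONFIGS if not ug_permits(loc, subj)]
print("excluded:", excluded)

# Simplified antecedent options per reflexive, as described in the text.
REFLEXIVES = {
    "English herself": {(True, True), (True, False)},   # local subject or object
    "Japanese zibun": {(True, True), (False, True)},    # local or non-local subject
    "French se": {(True, True)},                        # local subject only
}

all_licit = all(
    ug_permits(loc, subj)
    for options in REFLEXIVES.values()
    for loc, subj in options
)
print("all attested options licit:", all_licit)
```

Three of the four cells are available in principle, which is one way of seeing why the input, rather than UG alone, has to settle the exact configuration for a given target reflexive.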
This being the case, it is not surprising at all that learners have a very hard time acquiring and consistently interpreting reflexives in a second language, as the existing research studies have amply documented. Reflexive binding is an area at the syntax-semantics interface where too many available options are provided by UG, and learners have to pay particular attention to the comprehensible input in order to acquire the exact configuration of domain and orientation that their target language reflexives use. It could be an exercise in frustration, metaphorically speaking. Of course, such acquisition is not impossible in the long run, as the majority of learners in these studies do demonstrate successful reclassification of reflexives. The only way to get a misclassification corrected is to pay attention to the available interpretations in the input, and a critical mass of relevant examples may be needed for that to happen.

2.2. The Overt Pronoun Constraint

The Overt Pronoun Constraint (OPC; Montalbetti, 1984) is a well-known example of subtle interpretive differences between languages, severely underdetermined by the input, which children and L2 learners still manage to acquire. In this section, the interpretive contrast will be briefly presented, the results of two studies on L2 Spanish and L2 Japanese will be mentioned, but most of our attention will be directed to a recent study of L2 Turkish (Gürel, 2002, 2006). This choice of focus is determined by the even more severe exercise in frustration that English-Turkish syntax-semantics mapping represents, in comparison with Spanish and Japanese. The Overt Pronoun Constraint regulates the interpretation of null and overt subject pronouns. As such, it is relevant to [+null subject] languages only. In [−null subject] languages like English, the OPC does not have visible effects, since there is no contrast of two possible elements in subject position.

(15) a. Bill_i thinks that he_i/j is smart.
     b. Everyone_i in this room thinks that he_i/j is smart.

The overt pronouns in the embedded clauses’ subject positions are ambiguous: they can refer to the subjects of the main clauses (Bill, everyone) or they can refer to a third party in the discourse, say, John. There is a difference though: in (15a) the main subject is a referential NP while in (15b) it is a quantifier. Quantifiers in subject positions give rise to a bound variable interpretation.8 In [+null subject] languages like Spanish, Japanese, Korean, and Turkish, a contrast exists between the interpretive possibilities in sentences equivalent to (15). The following examples illustrate how the OPC works in Japanese (Montalbetti 1984: 85):

i. Referential antecedent context:
(16) a. Tanaka-san_i wa [kare-ga_i/j kaisya de itiban da to] itte-iru
        Tanaka-Mr TOP he-NOM company in best is COMP say-PRES
     b. Tanaka-san_i wa [pro_i/j kaisya de itiban da to] itte-iru
        Tanaka-Mr TOP pro company in best is COMP say-PRES
        ‘Mr Tanaka is saying that he is the best in the company.’

ii. Quantified antecedent context:
(17) a. Daremo-ga_i [kare-ga_*i/j atama-ga ii to] omotte iru
        Everyone-NOM he-NOM be-smart COMP think-PRES
     b. Daremo-ga_i [pro_i/j atama-ga ii to] omotte iru
        Everyone-NOM pro be-smart COMP think-PRES
        ‘Everyone thinks that he is smart.’

In these examples, a contrast in binding of overt embedded subject pronouns is demonstrated in the scope of referential versus quantified antecedents. When the antecedent is a referential NP, the overt pronoun (16a) and pro (16b) have the same referential properties. Both can refer to either the main clause subject or to a third party. However, when the antecedent is quantified, the overt pronoun (17a), unlike pro (17b), cannot be bound by the quantified antecedent; it can only refer to a third party. In other words, pro can take a quantified antecedent while overt pronouns cannot. Thus, if the language has pro as an option (i.e., it is a [+null subject] language), then interpretive possibilities of overt pronouns are constrained by the OPC.

Let us consider the learning task facing the English learner of Japanese. The first major challenge is the null subject, of course. In subject positions, where in English there are nominative marked subjects (in main and, particularly important, embedded clauses9), there is an alternation between overt and null morphemes in [+null subject] languages. This calls for a serious re-classification of the whole pronominal system. The overt and the null elements appear under different pragmatic conditions (into which we will not go here). But even if we leave pragmatic constraints aside, the simplest assumption learners can make is that null subjects have the same syntactic and interpretive properties as the overt pronominal subjects. However, in bound variable contexts as in the Japanese examples in (17) this similarity disappears. The overt pronoun in (17a) differs from pro in (17b) in not allowing a bound interpretation.
Thus only pro remains ambiguous while kare has only one meaning. This interpretive contrast is not taught or discussed in language classrooms (Kanno, 1997; Pérez-Leroux and Glass, 1999) and sentences along the lines of (16) contrasted with (17) are rare in the input. Even more importantly, the overt pronoun kare is missing an interpretation; that is, sentences like (17a) are never going to appear in the input, hence the frequency of the other three licit interpretations is irrelevant. When learners have to acquire a gap in a paradigm of available interpretations, positive evidence for the missing interpretation is not available and negative evidence (correction, explicit teaching) is arguably missing, too. Furthermore, the native overt pronouns he/she can appear as a bound variable (that is, transfer from the L1 would be misleading). This learning situation presents a classic Poverty of the Stimulus (POS) problem: the input underdetermines the grammar. It has been argued that in such situations, UG provides the knowledge that language learners acquire.

Two experimental studies investigating this POS problem in Japanese and Spanish interlanguage, respectively, are Kanno (1997) and Pérez-Leroux and Glass (1999). In Pérez-Leroux and Glass (1999), English learners of Spanish were tested by means of a task that involved translating biclausal sentences from their native English into Spanish, following written contexts which supported either a bound variable or a discourse referent (third party) interpretation. Even at the elementary level, learners used significantly more null than overt subjects in the bound variable context, while the advanced learners used no overt subjects. The findings suggest that even at low levels of proficiency, learners’ grammars have established the contrast between overt and null pronoun interpretation, and restrict it to the correct contexts. Kanno (1997) used a coreference judgment task, in which participants were given biclausal sentences with referential or quantified main clause subjects and were asked to choose, out of three possible answers, who is the possible referent for the embedded subject (literally, who is doing the action in the embedded clause). (We will discuss the same experimental design in the Gürel, 2006 study.) Kanno’s English-native learners of Japanese also demonstrated that they have established the interpretive contrast in their grammars.
Although there were differences between the two studies with respect to the use of the null pronoun, the important similarity in their findings is that learners of all proficiency levels as well as native speakers use or accept the overt pronoun in bound variable contexts significantly less than they do the null subject. We will next turn to the relevant Turkish facts before acquisition of L2 Turkish is discussed. Turkish is a null subject language. It has two overt pronominals, o and kendisi10 whose behavior is illustrated below (examples from Gürel, 2006). The reflexive pronoun stem kendi means ‘self’ and a suffix is attached to it to indicate the person and number of the subject. This form is used to express reflexive relations as in (18a). However, with the third person singular (kendisi), it can be used as a pronoun, see (18b) (Özsoy 1987).

(18) a. Elif_i kendi-ni_i beğen-iyor
        Elif self-ACC like-PROG
        ‘Elif likes herself’
     b. O / kendi-si / pro toplantı-ya git-ti
        S/he self-3SG pro meeting-DAT go-PAST
        ‘S/he went to a meeting’

As the example in (18b) illustrates, the subject position of a sentence can be occupied by the overt pronouns o and kendisi and the null pronoun pro. The following examples illustrate the binding possibilities of Turkish pronouns in embedded subject positions.

i. Referential antecedent context:
(19) a. Elif_i [o-nun_*i/j çok inatçı ol-duğ-u]-nu bil-iyor
        Elif s/he-GEN very stubborn be-NOMZ-3SgPOSS-ACC know-PROG
     b. Elif_i [pro_i/j çok inatçı ol-duğ-u]-nu bil-iyor
        Elif pro very stubborn be-NOMZ-3SgPOSS-ACC know-PROG
     c. Elif_i [kendi-si-nin_i/j çok inatçı ol-duğ-u]-nu bil-iyor
        Elif self-3SG-GEN very stubborn be-NOMZ-3SgPOSS-ACC know-PROG
        ‘Elif knows (that) s/he is very stubborn.’

ii. Quantified antecedent context:
(20) a. Herkes_i [o-nun_*i/j dahi ol-duğ-u]-nu düşün-üyor
        Everyone s/he-GEN genius be-NOMZ-3SgPOSS-ACC think-PROG
     b. Herkes_i [pro_i/j dahi ol-duğ-u]-nu düşün-üyor
        Everyone pro genius be-NOMZ-3SgPOSS-ACC think-PROG
     c. Herkes_i [kendi-si-nin_i/j dahi ol-duğ-u]-nu düşün-üyor
        Everyone self-3SG-GEN genius be-NOMZ-3SgPOSS-ACC think-PROG
        ‘Everyone thinks (that) s/he is genius’

As in Japanese, the overt pronoun o in embedded clause position can never be coreferential with the main clause subject in Turkish. It only allows a disjoint reading, as in (19a) and (20a). In contrast to Japanese, the overt pronoun kendisi and pro are both unconstrained in their binding properties: they can refer to the main clause subject or refer to a third person in the discourse. For those reasons, Gürel (2002, 2006) argues that the OPC is not clearly exemplified in Turkish. “Given the similar binding properties of pro and kendisi, it appears that the overt counterpart of pro is not the pronoun o but the pronoun kendisi” (Gürel 2006: 267). Let us consider some more sentences with the pronoun o.

(21) a. Elif_i [o-nun_*i/j gel-eceğ-i]-ni söyle-di
        Elif s/he-GEN come-NOMZ-3SgPOSS-ACC say-PAST
        ‘Elif said (that) s/he would come’ (literally, Elif said her/his coming)
     b. Elif_i said (that) [she_i/j would come]
(22) a. Elif_i [DP o-nun_*i/j koca-sı]-nı öp-tü
        Elif she-GEN husband-3SgPOSS-ACC kiss-PAST
        ‘Elif kissed her husband’
     b. Elif_i kissed [DP her_i/j husband]

The overt pronoun o can only refer to somebody else’s coming but not Elif’s coming in (21a), and only to someone else’s husband but not Elif’s in (22a). Gürel attributes this contrast to the category of embedded clauses in Turkish versus English, coupled with the Governing Category (GC) requirement of Principle B. In Turkish, embedded clauses are nominal categories (DPs but not CPs), and DPs do not constitute an obligatory binding domain for the pronoun o. The binding domain is the whole clause; the pronoun cannot be bound within it, following Principle B, hence disjoint reference is the only possible reading. In English, on the other hand, tensed embedded clauses and DPs do constitute binding domains, so the pronoun can be bound from outside its domain without violating Principle B, or it can pick up a discourse referent. Table 3 summarizes all the relevant reference facts in Turkish and English.

Table 3. Possible antecedents for embedded subject pronouns and anaphors
__________________________________________________________________
                          Turkish                  English
                          o     kendisi  pro       she/he   *her/himself†
__________________________________________________________________
Referential antecedents   no    yes      yes       yes      n/a
Quantified antecedents    no    yes      yes       yes      n/a
Discourse antecedents     yes   yes      yes       yes      n/a
__________________________________________________________________
† Note that English sentences with reflexives as embedded subjects of tensed clauses are ungrammatical: *Bill thinks that himself is smart.
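The comparison that Table 3 invites can be made explicit with a small sketch. The Python fragment below simply transcribes the table and lists, for each Turkish form, the antecedent types on which it differs from the English overt pronoun. The dictionary layout and the function name are illustrative assumptions of ours, not part of Gürel's materials.

```python
# Possible antecedents for embedded subject pronouns, transcribed from Table 3.
# True = that antecedent type is possible. Layout and names are illustrative.
ANTECEDENTS = {
    "Turkish": {
        "o":       {"referential": False, "quantified": False, "discourse": True},
        "kendisi": {"referential": True,  "quantified": True,  "discourse": True},
        "pro":     {"referential": True,  "quantified": True,  "discourse": True},
    },
    "English": {
        "she/he":  {"referential": True,  "quantified": True,  "discourse": True},
    },
}


def mismatches(turkish_form, english_form="she/he"):
    """Antecedent types on which a Turkish form differs from the English pronoun."""
    l2 = ANTECEDENTS["Turkish"][turkish_form]
    l1 = ANTECEDENTS["English"][english_form]
    return [kind for kind in l2 if l2[kind] != l1[kind]]


for form in ANTECEDENTS["Turkish"]:
    print(form, mismatches(form))
```

On this encoding only o differs from English she/he, and it does so for both referential and quantified antecedents.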

Table 3 should help us in thinking about the learning task of an English speaker acquiring Turkish. If she has figured out that Turkish is a null subject language, then she should consider kendisi as pro’s overt counterpart (Gürel, 2002), and the simplest assumption would be that those two pronominals have the same binding properties. That is in fact true for Turkish, unlike Japanese and Spanish. So we predict that the coreference possibilities of both pro and kendisi should not constitute a major difficulty for learners, because there is no complex mismatch at the syntax-semantics interface, apart from postulating pro.11 The serious challenge at the interface should come from the pronoun o, whose binding properties deviate from those of English pronouns. Due to binding domain differences between the two languages, the Turkish pronoun o in embedded subject position cannot refer to the main clause subject but only to a discourse referent. This, of course, is not true for English, see (21b), (22b). The prediction, then, is that not kendisi but o would be the bottleneck for acquiring Turkish pronoun binding.

Gürel (2002, 2006) tested 28 end-state L2 speakers of Turkish, aged between 30 and 70 with a mean of 46, who had been living in Turkey for at least 10 years at the time of testing. She administered four tests, of which only two will be reported on here. A TVJT included 36 items, related to 12 stories. Each story appeared with three test sentences, one each containing pro, o, and kendisi. Half the test items had referential main clause subjects, the other half quantified subjects. Following Dekydtspotter, Sprouse, and Anderson (1997), the stories were in English while the test sentences were in Turkish. Here is an example.

(23) Mary and Brian went to a restaurant. Mary ordered seafood and Brian ordered a pizza. The bill came to 50 dollars. Brian complained that the bill was high but Mary didn’t agree.

     Mary_i o-nun_*i/j restoran-ı pahalı bul-duğ-u-nu söyle-di
     Mary s/he-GEN restaurant-ACC expensive find-NOMZ-3SgPOSS-ACC say-PAST
     ‘Mary said (that) s/he found the restaurant expensive’

If the learners choose the True answer, this suggests that they allow the target disjoint reading, which is also supported by the story. If they choose False, this will indicate that they accept the bound reading for the overt pronoun o, an option not allowed in the native Turkish grammar. However, the bound reading for the overt pronoun is possible in their native English, as the translation indicates. That is, the overt pronoun in the embedded subject position can be coreferential with the matrix subject Mary. Thus, any such response will indicate lack of reclassification for the pronominal properties, possibly due to lingering transfer effects.

Table 4. Results of Truth Value Judgment Task (percentages)
__________________________________________________________________
                      Referential antecedents    Quantified antecedents
                      o     kendisi  pro         o     kendisi  pro
__________________________________________________________________
Controls (n=30)
  Bound               4     79       76          3     81       74
  Disjoint            96    21       24          97    19       26
L2 speakers (n=28)
  Bound               38    73       79          27    79       70
  Disjoint            62    27       21          73    21       30
__________________________________________________________________

Table 4 shows that with a story like (23) containing a referential main clause subject, the native speakers accepted a bound reading (i.e., they answered False) only 4% of the time, while the rest of the time they opted for True, pointing to the disjoint reading. In comparison, the L2 learners accepted the bound reading 38% of the time, and the disjoint reading 62% of the time. This difference is statistically significant. We should notice two more things in Table 4. Both natives and learners prefer the bound readings of pro and kendisi, and they do not treat them differently. In other words, it seems that learners are aware of the fact that pro and kendisi have the same binding options. There is no difference in the treatment of pronominal elements in the context of referential or quantified subjects.

But how can we know if natives and learners would accept both a bound and a disjoint reading of pro or kendisi? The TVJT cannot answer this question because it includes categorical answers True or False,12 but another task, the written interpretation task, does address it. That task was adopted from Kanno (1997). It included 48 items with 24 referential and 24 quantified antecedents, where each category had 12 overt (kendisi and o) and 12 null pronouns. Participants were given a Turkish sentence and asked to select possible antecedent(s) from among the three options given for the embedded subject pronoun in complex sentences like (24) below. In this example, participants were expected to circle option (b), as the overt pronoun cannot be coreferential with the matrix subject.

(24) Mehmet_i [o-nun_*i/j sinema-ya gid-eceğ-i]-ni söyle-di
     Mehmet s/he-GEN cinema-DAT go-NOMZ-3SGPOSS-ACC say-PAST
     ‘Mehmet said (that) s/he would go to the movies’

     Soru (question): Sizce bu cümleye göre kim sinemaya gidecek olabilir?
     ‘According to this sentence, who could be the person that would go to the movies?’
     (a) Mehmet
     (b) Başka bir kişi ‘Some other person’
     (c) Hem (a) hem (b) ‘Both (a) and (b)’

In Table 5 below, percentages indicate how many times the participants interpreted each pronoun with a particular interpretation (e.g., bound-only, disjoint-only or ambiguous).
Both in the referential and in the quantified subject conditions, learners allowed significantly more bound readings for o (bound-only and ambiguous responses combined: 29% and 23%, respectively) than the natives did (6% and 11%, respectively) (F(1,56) = 12, p < .001). The learners treat pro essentially similarly to the natives, while some differences surface in the treatment of kendisi. The learners prefer the bound reading for kendisi while the natives opt for the choice of “both readings possible” most of the time, but these are just preferences that do not point to a qualitatively different grammar.

Table 5. Results of the Written Interpretation Task (in percentages)
__________________________________________________________________
                      Referential antecedents    Quantified antecedents
                      o     kendisi  pro         o     kendisi  pro
__________________________________________________________________
Controls (n=30)
  Bound               1     36       16          2     32       10
  Disjoint            94    0        0           89    0        3
  Both                5     64       84          9     68       87
L2 speakers (n=28)
  Bound               7     69       32          5     56       26
  Disjoint            71    7        12          77    11       26
  Both                22    24       56          18    33       48
__________________________________________________________________
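The 29% and 23% (learners) and 6% and 11% (natives) cited in the text for the pronoun o can be recovered from Table 5 by adding the “Bound” and “Both” rows, that is, by counting every response that admits a bound reading. The short Python sketch below does this arithmetic. That the cited figures were obtained by pooling these two rows is our inference from the numbers, and the variable and function names are illustrative.

```python
# Percentages for the overt pronoun o, transcribed from Table 5.
# (group, antecedent type) -> (bound-only, disjoint-only, both)
TABLE5_O = {
    ("controls", "referential"): (1, 94, 5),
    ("controls", "quantified"):  (2, 89, 9),
    ("L2", "referential"):       (7, 71, 22),
    ("L2", "quantified"):        (5, 77, 18),
}


def bound_allowed(group, antecedent):
    """Share of responses that admit a bound reading (bound-only plus both)."""
    bound_only, _disjoint_only, both = TABLE5_O[(group, antecedent)]
    return bound_only + both


for (group, antecedent) in TABLE5_O:
    print(group, antecedent, bound_allowed(group, antecedent))
```

The complementary disjoint-only figures (71% and 77% for learners) correspond to the “about 70%” of target readings for o.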

In summary, both Gürel’s TVJT and written interpretation task indicate that end-state learners of Turkish, even after having lived in Turkey for extended periods, still retain traces of their native English grammar with respect to reclassifying the binding GC for the pronoun o. Note that their grammar is nevertheless native-like, in the sense that they do exhibit a contrast between the two interpretations in Turkish, allowing only around 30% bound readings as opposed to about 70% correct disjoint readings. Thus, it seems that pronoun o is just a bottleneck, but not a complete bottle stopper, for these end-state learners. This, of course, is expected, since positive evidence for these properties should appear in the input. Questions remain as to the complete picture of Turkish binding facts, for example, whether o, pro, and kendisi in fact have different GCs. But even with parts of the binding picture as background, this study demonstrates that carefully examining the syntax-semantics interface mismatches between two languages leads to testable predictions, fully supported in this case.

3. Aspectual challenges

3.1. Preliminaries

Since there is a lot of diversity in the definitions and assumptions in the literature on aspect, I will start with some preliminary definitions.13 This exposition will serve as background for the next few sections. The term “aspect” refers to the internal temporal structure of events as described by verbs and phrases (Comrie, 1976; Chung and Timberlake, 1985; Smith 1997). It is the property that makes it possible for a sentence to denote a complete or an incomplete event. The concept of aspect is defined on two different levels: grammatical and lexical.

“Lexical aspect” or “Aktionsart” (also termed “situation aspect” by Smith, 1997) refers to aspectual classes of verbs. All the verbal phrases in human languages can be viewed as reflecting either a state or an event. The most widely accepted cognitively based classification is that of Vendler (1967), who, following Aristotle, proposed a quadripartition of situational types into states, activities, accomplishments, and achievements. Vendler’s four classes can be presented as a combination of the following two underlying ontological features: [±Process] and [±Change of State].

(25)                      [–Process]     [+Process]
     [–Change of State]   State          Activity
     [+Change of State]   Achievement    Accomplishment

Philosophers have argued that such an ontological partition of eventualities reflects the basic situational types observable in the world (Kenny, 1963). The classification is non-linguistic, as it concerns situational categories, but linguistic criteria are used to distinguish the classes from one another. Some examples of English verbal phrases falling into the different classes are given in (26).

(26) States: know the answer, be sick, love Mary, be tall
     Activities: swim laps, travel, burn, eat sushi
     Accomplishments: swim 10 laps, travel from X to Y, burn down, eat a plate of sushi
     Achievements: die, arrive, find a wallet, recognize
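The two-feature decomposition of Vendler's classes amounts to a simple lookup, sketched below in Python. The feature values assigned to the sample predicates follow their classification in (26), with one predicate chosen per class. The encoding itself (booleans for + and –, the function name) is only an illustrative assumption.

```python
# Feature combinations behind Vendler's four classes.
# Keys are (process, change_of_state); True stands for "+", False for "-".
VENDLER_CLASSES = {
    (False, False): "state",
    (True, False): "activity",
    (False, True): "achievement",
    (True, True): "accomplishment",
}


def vendler_class(process: bool, change_of_state: bool) -> str:
    """Return the aspectual class for a pair of feature values."""
    return VENDLER_CLASSES[(process, change_of_state)]


# One predicate per class; feature values are hand-assigned to match (26).
EXAMPLES = {
    "know the answer": (False, False),
    "swim laps": (True, False),
    "find a wallet": (False, True),
    "swim 10 laps": (True, True),
}

for vp, (process, change) in EXAMPLES.items():
    print(f"{vp}: {vendler_class(process, change)}")
```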

A state is defined as a stable condition of some entity for a period of time, where no change appears from Time 1 to Time 2. To take an example from above, the sentence John loves Mary is true if at any instant between those times, John loves Mary. An important test for states in English involves the progressive tense: states are usually ungrammatical or awkward with the progressive, as the degraded status of the sentences in (27a,b) indicates.

(27) a. *John is loving Mary.
     b. *Lea was knowing Bistra’s new telephone number.

Events, on the other hand, are dynamic situations where some change or changes obtain from Time 1 to Time 2. They can be counted and quantified over. The first type of event is an activity.

(28) a. John ran in the park.
     b. John ran a mile.
     c. John found a wallet.

The verb run in the sentence in (28a) denotes a homogeneous process going on in time with no inherent goal. A second type of event is an accomplishment, a situation that involves a process going on in time and an inherent culmination point, after which the event can no longer continue, as in (28b). We can imagine an accomplishment as a complex event containing an activity and a culmination of that activity. Finally, an achievement is similar to an accomplishment in that it also has an inherent endpoint, but in this class the process that leads to the culmination is instantaneous. Such a momentary event is represented by the sentence in (28c).

Another useful term to describe lexical classes is “telicity”. An event that has an inherent endpoint, after which the same event cannot continue to unfold, is labeled a telic event. In the above classification, accomplishments and achievements are telic, while states and activities are atelic.

On the other hand, “grammatical aspect” (also known as “viewpoint aspect” (Smith, 1997), or “sentential aspect”) is indicated by perfective and imperfective aspectual tense morphemes.
Comrie (1976: 3) argues that they represent “different ways of viewing the internal temporal constituency of a situation.” The perfective viewpoint (Smith, 1997), represented by the past simple tense as in (29a), looks at the situation from outside and disregards the internal structure of the situation. On the other hand, the imperfective viewpoint, as represented by the progressive tense (cf. (29b)), looks at the situation from inside and is concerned with its internal structure without specifying the beginning or end of the situation. Note that in terms of lexical aspect, both (29a) and (29b) represent a telic event.

(29) a. Laura built a house.          (perfective)
     b. Laura was building a house.   (imperfective)

We shall see examples of mismatches between grammatical aspectual meanings in different languages in sections 3.5 (English and Japanese) and 3.6 (English and Spanish).

Returning to the marking of lexical aspect, two major mechanisms of “composing” telicity have been proposed (Krifka, 1992, 1998; Verkuyl 1972, 1993). One mechanism is to combine a non-stative (dynamic) verb with an object which is marked as exhaustively countable or measurable (a quantized object, in Krifka’s terminology; a specific quantity object, in Verkuyl’s terminology). English uses this object-marking mechanism in (most) accomplishment and activity predicates. Quantized nominal arguments combined with dynamic verbs bring forward a telic interpretation as in (30); cumulative objects contribute to an atelic interpretation as in (31) (Verkuyl, 1972; Krifka, 1998). Aspectual particles signaling telicity as in (30b) are optional.

(30) a. Claire ate an apple/the apple/three apples in 5 minutes.
     b. Claire ate (up) her breakfast in 5 minutes.
(31) Claire ate apples/popcorn for an hour/*in an hour.

However, this object-marking mechanism does not extend to all VPs: telicity is partly determined by the lexical semantics of the verb. The difference between the telicity values of drive a car (atelic) versus make a car (telic) is clearly due to differences between the two verbs. Verbs of creation (make, write) and verbs of consumption (eat, drink), among others, are unified by the following property: they have Incremental Theme objects (Dowty, 1991). These objects are affected by the event in a special way, and according to three recent theoretical accounts, they “measure out” the progress of the event (Tenny, 1994); their discrete parts map to parts of the event (Krifka, 1989); or they serve as an “event odometer” (Verkuyl, 1993).14
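The English object-marking mechanism described above can be summarized as a small decision procedure. The Python sketch below is only an illustration under simplifying assumptions: the three boolean features are hand-assigned, the function name is ours, and the rule is restricted to the default case of verbs with Incremental Theme objects. For other dynamic verbs (such as drive a car) the function returns None, since their telicity is not decided by the object.

```python
from typing import Optional


def english_vp_telicity(verb_dynamic: bool,
                        incremental_theme: bool,
                        object_quantized: bool) -> Optional[str]:
    """Telicity of an English VP under the object-marking mechanism.

    Returns None when the object does not decide the matter, i.e. for
    dynamic verbs that do not take an Incremental Theme object.
    """
    if not verb_dynamic:
        return "atelic"   # states are atelic
    if not incremental_theme:
        return None       # depends on the lexical semantics of the verb
    return "telic" if object_quantized else "atelic"


# (verb_dynamic, incremental_theme, object_quantized), hand-assigned.
EXAMPLES = {
    "eat an apple": (True, True, True),
    "eat apples": (True, True, False),
    "drive a car": (True, False, True),
    "love Mary": (False, False, True),
}

for vp, features in EXAMPLES.items():
    print(vp, "->", english_vp_telicity(*features))
```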

The second mechanism of marking compositional telicity is to utilize a specific prefix on the verbal form. In article-less Slavic languages (e.g., Russian, Polish, Czech), the object’s quantization does not play a large role in compositional telicity. These languages utilize the verb-marking mechanism of signaling (a)telicity, at least for activities and accomplishments with Incremental Theme objects. Examples in (32)-(33) from Russian illustrate that the VP telicity value (as a rule) depends on the presence or absence of a perfective prefix, and not on the object’s quantization. The sentences in (32a) and (33a) have an atelic VP, although the object is non-quantized in (32a) and quantized in (33a). The VPs in (32b) and (33b) are both interpreted as telic, since they have a perfective verb, regardless of the non-quantized (32b) or quantized (33b) objects.

(32) a. Maša jela tort                  (atelic)
        Masha IMP-eat-PAST cake
        ‘Masha was eating cake/Masha used to eat cake.’
     b. Maša s-jela tort                (telic)
        Masha PERF-eat-PAST cake
        ‘Masha ate the cake.’
(33) a. Maša jela kusoček torta         (atelic)
        Masha IMP-eat-PAST piece cake-GEN
        ‘Masha was eating a piece of cake/Masha used to eat a piece of cake.’
     b. Maša s-jela kusoček torta       (telic)
        Masha PERF-eat-PAST piece cake-GEN
        ‘Masha ate the/a piece of cake.’

Following Smith (1997), I have argued (Slabakova, 1997a, 2000, 2001) that Russian and English exhibit two different values of a telicity marking parameter. In capturing the aspectual distinction in phrase structure, I adopt the syntactic decomposition of eventive verbs approach, following Larson (1988), Pustejovsky (1991), Hale and Keyser (1993), and Travis (1992). The trees in (34a,b) illustrate the proposed phrase markers for English and Russian respectively.
Readers should keep in mind that these trees and the compositional telicity calculation they argue for are only valid for the default case of eventive verbs taking Incremental Theme objects (see Slabakova, 2001 for analyses of the other aspectual classes).

(34) a. [vP t_subj [v' v(CAUSE) [AspP DP_obj [Asp' Asp[+telic] [VP t_obj [V' V]]]]]]

     b. [vP t_subj [v' v(CAUSE) [PerfP Perf(prefix = [+telic]) [AspP DP_obj [Asp' Asp [VP t_obj [V' V]]]]]]]

Let us focus on the English tree in (34a) first. The VP shell structure (Larson, 1988) reflects the semantic fact that events may be viewed as having at least two subevents (Dowty, 1979): a causative subevent and a resultant state. The light vP denotes the causative subevent and the lower VP denotes the resultant state subevent of the eventive classes. This decomposition is reflected by postulating a null CAUSE morpheme in the head of the vP in a VP shell structure (Hale and Keyser, 1993; Pesetsky, 1995; Chomsky, 1995). Event participants (arguments) take part in the aspectual composition through case checking in AspP (accusative case) and TP (nominative case). AspP is an important functional category for telicity construal. The object moves to the spec of AspP to check accusative case and the verb moves to the head Asp (Borer, 1994; van Hout, 1996; Schmitt, 1996; Travis, 1992). It is at this point, in a spec-head relationship with the verb, that the object DP imparts its temporal-bounding properties to the verb. Depending on a verbal feature (stative or dynamic) and on a nominal feature (specified or unspecified cardinality), the telicity of the whole VP is calculated (Verkuyl, 1993). Whenever the object is of specified cardinality, the interpretation is one of a telic event. Whenever the object is of unspecified cardinality, being a mass or bare plural noun, the interpretation is atelic. Thus the independently needed mechanism of accusative case checking is also used for aspectual feature checking at the syntax-semantics interface.

In Russian (see 34b), the telic morpheme is overt; it is a derivational morpheme, usually a prefix, on the verb. It occupies the head of a functional projection Perfectivity Phrase (PerfP), a position higher than the one in English. If a prefix is in Perf°, a position from which it c-commands the object, the interpretation is telic. If there is no prefix in Perf°, then the interpretation is atelic. Consequently, the cardinality of the object in Russian does not matter for aspectual interpretation; it is only the presence or absence of the prefix that signals (a)telicity.

An important feature of this analysis is that it represents perfective prefixes, which are after all derivational morphemes, as having grammatical (functional) meaning and thus occupying a functional category. I believe this analysis is justified, because the various polysemantic derivational morphemes, barring some exceptions, have a common function to perform, namely, to signal a telic VP. But why should telicity have a privileged status among other lexical meanings of perfective prefixes to the extent that we view it as a sort of “grammatical” meaning? The concept of completion of an event has been viewed as seminal and somehow primitive in philosophical and semantic work on aspect (see Kenny 1963, Vendler 1967, Verkuyl 1972, 1993). It is equated with the feature [+Change of State] in the quadripartition of situational types: i.e., it embodies one of the basic features distinguishing between situational types observable in the extralinguistic reality. Perhaps even more important in arguing for a privileged status of the telic meaning is the fact that languages of the world often mark it with grammatical means. In the English accomplishments with Incremental Themes, the telic meaning is signaled by the presence of definite, indefinite, or quantified objects and the atelic meaning is signaled by mass or bare plural objects, i.e., by inflectional morphology. In Finnish the same meanings are marked by accusative and partitive case endings, respectively, that is, by inflectional morphology.
Compared to telicity, none of the other lexical meanings reflected in perfective prefixes (e.g., manner, intensity, big number of affected participants, repetition, iterative or semelfactive nature of the event) is as cognitively primitive, nor is it commonly rendered by grammatical morphology in languages of the world.

The analysis presented above is just one proposal among many. The experimental tasks in the studies we shall look at do not depend on the details of the syntactic trees and the exact mechanism for case and telicity calculation. What is important to note is the different character of the telicity marking morphemes: an obligatory perfective prefix in Russian accomplishments, as opposed to an optional telic particle and obligatory quantization marking on the object in English. Any syntactic analysis will have to account for these linguistic facts.
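Independently of the syntactic details, the two values of the telicity marking parameter can be contrasted on the Russian sentences in (32)-(33). The Python sketch below applies a verb-marking rule (the Russian-type value) and an object-marking rule (the English-type value) to the same four VPs. The feature coding follows the description of those examples, while the function names and data layout are illustrative assumptions.

```python
# The four Russian VPs from (32)-(33), coded for prefix and object type.
RUSSIAN_VPS = {
    "32a": {"prefix": False, "object_quantized": False},  # jela tort
    "32b": {"prefix": True,  "object_quantized": False},  # s-jela tort
    "33a": {"prefix": False, "object_quantized": True},   # jela kusocek torta
    "33b": {"prefix": True,  "object_quantized": True},   # s-jela kusocek torta
}


def verb_marking(prefix: bool, object_quantized: bool) -> str:
    """Russian-type value: only the perfective prefix matters."""
    return "telic" if prefix else "atelic"


def object_marking(prefix: bool, object_quantized: bool) -> str:
    """English-type value: only the quantization of the object matters."""
    return "telic" if object_quantized else "atelic"


for label, feats in RUSSIAN_VPS.items():
    print(label, verb_marking(**feats), object_marking(**feats))

# Sentences on which the two parameter values give different readings.
DIVERGENT = [label for label, feats in RUSSIAN_VPS.items()
             if verb_marking(**feats) != object_marking(**feats)]
print(DIVERGENT)
```

The two rules agree on (32a) and (33b) and come apart on (32b) and (33a), the sentences in which prefixation and object quantization point in opposite directions.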

3.2. Telicity marking in Bulgarian-English interlanguage

What exactly does a Slavic language speaker learning English have to acquire, if we assume that telicity is a universal semantic meaning? In the Slavic to English direction, the mismatch concerns the phrasal categories where telicity is marked. Although there is nothing really complicated about the meaning of telicity, Slavic speakers “know” that telicity is marked on the verb, and that is where they look for telicity marking at first. Beginning and even intermediate learners will consider the verb as crucial in determining the aspectual interpretation of the sentence and will not be aware of the fact that in English the cardinality of the object is crucial in determining telicity. One prediction is that beginning learners will treat all verbal phrases as atelic, since there are no visible prefixes on the English verbs, hence learners would not know how to make the aspectual distinction. This entails that they will perform more accurately in comprehending atelic verb phrases (eat cakes) than telic ones (eat a cake). However, it can also be expected that learners will demonstrate gradual but fairly quick acquisition of the English telicity marking mechanism, since the input abounds with positive evidence.

Slabakova (1997, 2001) tested these predictions in the interlanguage of Bulgarian learners of English. A hundred and thirty adult learners of English and 32 native English controls participated. Sixteen British English speakers and 16 American English speakers were tested to control for dialect differences. Three different tasks tested telicity marking knowledge, but only the results of the combinatory felicity task will be reported on here. Test sentences contained two simple clauses and participants had to decide, on a scale from −3 to +3, whether they were a felicitous combination. The first clause established context and the second clause presented telic or atelic events.

(35) a. Antonia worked in a bakery and made a cake.
     b. Antonia worked in a bakery and made cakes.

Both (35a) and (35b) are, strictly speaking, grammatical. However, in the context of the first clause expressing a habitual activity, a second clause also reflecting an open-ended activity, as in (35b), sounds more natural. In (35a), on the other hand, the telic second clause is not contradictory to the first clause, but the combination is not as felicitous as (35b). Native speakers proved to be sensitive to this subtle interpretive distinction, and so did learners. Table 6 summarizes the results.

Table 6. Mean accuracy on combinatory felicity task (from −3 to +3)
__________________________________________________________________
                            Telic           Atelic          Contrast
                            second clause   second clause
__________________________________________________________________
AmE controls (n=16)         0.19            2.09            p < .0001
BrE controls (n=16)         0.81            2.41            p < .0001
Advanced learners (n=45)    0.41            2.00            p < .0001
High Intermediate (n=50)    0.48            1.75            p < .0001
Low Intermediate (n=35)     1.48            1.94            p = .03
__________________________________________________________________
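The size of the telic/atelic contrast in Table 6 can be read off by subtracting the mean rating of telic combinations from that of atelic ones. The Python sketch below does this descriptive arithmetic for each group. It is not a reconstruction of the statistical tests behind the reported p-values, and the names used are illustrative.

```python
# Mean ratings from Table 6: group -> (telic second clause, atelic second clause).
TABLE6 = {
    "AmE controls": (0.19, 2.09),
    "BrE controls": (0.81, 2.41),
    "Advanced learners": (0.41, 2.00),
    "High Intermediate": (0.48, 1.75),
    "Low Intermediate": (1.48, 1.94),
}


def contrast(group: str) -> float:
    """Difference between the atelic and telic mean ratings for a group."""
    telic, atelic = TABLE6[group]
    return round(atelic - telic, 2)


for group in TABLE6:
    print(f"{group}: {contrast(group):.2f}")

# Group with the smallest difference between the two conditions.
smallest = min(TABLE6, key=contrast)
print(smallest)
```

On these figures the difference is smallest in the low intermediate group, which rates telic combinations much higher than any other group does.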

One thing to notice in Table 6 is that native speakers did not reject combinations of a habitual or characteristic clause with a telic clause as in (35a), but they all rated them significantly lower than the combination with an atelic clause. The advanced and the high intermediate learners demonstrated the same sensitivity. Even the low intermediate learners as a group showed a contrast between these two types of complex sentences. However, individual results calculated by means of a t-test for each participant reveal that most of the intermediate learners were not fully consistent in judging these combinations of sentences.15 Notice that the other prediction, namely, that beginning learners will be more accurate on atelic than on telic sentences, is in fact supported (see the low intermediate group in Table 6). They rate habitual and atelic combinations with a mean of 1.94 (out of 3) and habitual and telic combinations with a mean of 1.48. There are no significant group differences in accuracy on atelic sentences, while there is a significant effect of group on the telic sentences, due to the beginners. Thus, the significant contrasts in learners’ grammars indicate that the English telicity marking mechanism is not particularly complicated to learn, probably because there is abundant evidence in the input. Another reason may be that Bulgarian is one of two Slavic languages (the other being Macedonian) which have articles, and transfer in this respect may be aiding the Bulgarian learners of English to “notice” the function of the articles.

3.3. Telicity marking in English-Russian interlanguage

We shall next look at the other acquisition direction. What do English native speakers have to acquire in their acquisition of Russian? The learning task is considerably more complicated. Unlike the English telicity marking mechanism that is quite regular, the Russian telicity marking morphemes, the prefixes, are derivational morphemes. When English native speakers are acquiring Russian, they are faced with the following two tasks regarding lexical aspect: They have to acquire the fact that most prefixed verbs are telic; and they have to learn each individual verb with its subset of perfective prefixes. For example, in learning the lexical item for ‘write’, a learner has to know that the root pisat’ is imperfective and denotes an atelic event akin to be writing, and also know that its perfective equivalent na-pisat’ denotes a telic, complete event akin to write up. In other words, the learner has to acquire the fact that the interpretable feature of telicity still has to be checked in a functional category, say, PerfP, but unlike in English, the Russian telicity morpheme is a verbal prefix. I consider this process acquisition of a syntax-semantics mismatch, or L2 learning at the syntax-semantics interface. The acquisition of a new functional category, of course, happens to be the easy task.16

However, the prefix na- is not the only pure telicity marker. The verb est’ ‘eat’ takes a different prefix to signal telicity: s-jest’. A number of other prefixes also have this function: vy- as in vy-pit’ ‘drink up’; po- as in po-želat’ ‘wish’; v- as in v-ljubit’sja ‘fall in love’ are just a few examples. Thus, a learner is not faced with a regular inflectional morpheme signaling telicity every time it is attached to an atelic root, but rather with a number of derivational morphemes lexically selected by individual verbs.
The learning task is complicated by multiple prefixes possible with the same root, which signal telicity and have additional lexical meanings (see examples in (36c,d,e,f)).

(36) a. pisat’ ‘write, be writing’
     b. na-pisat’ ‘write up’
     c. pod-pisat’ ‘sign’
     d. do-pisat’ ‘write to the end (something that was started before)’
     e. pere-pisat’ ‘write out again’
     f. po-pisat’ ‘write for a while’

The task is complicated even further by the polysemantic nature of many prefixes, cf. na-, which can have a purely telic sense as in (37a) but which can also have a telic plus lexical sense as in (37b,c).

(37) a. na-pisat’ ‘write up’
     b. na-gotovit’ ‘cook something in big quantities’
     c. na-boltat’sja ‘chat with someone to one’s heart’s content’

In short, there is a lot of lexical learning to be done in acquiring Russian lexical aspect over and above acquiring the syntactic property that aspectual prefixes signal telicity and have to be checked in a functional category. This lexical learning task is the truly difficult part in the acquisition of Russian aspect. Slabakova (2005) attempts to tease those two types of learning apart, and to investigate only knowledge of the grammatical learning at the syntax-semantics interface.

What pattern of acquisition does our theory predict in this case? If UG allows access to all the values of parameters, parameter resetting can occur in most cases. As in the other acquisition direction, there is abundant evidence in the input. Unlike telicity marking in English, Russian aspect marking is widely taught and drilled in language classrooms. However, the lexical learning task is more formidable. What would the initial hypothesis of learners of Russian look like? Assuming their native English value, learners are expected to pay attention to the form of the object at the start of acquisition. If a non-stative, dynamic verb combines with a singular countable object (e.g., pis’mo, ‘letter’) or an overtly determined/quantified object (e.g., etot fil’m ‘this movie’, dva svitera ‘two sweaters’), learners are predicted to treat the whole VP as telic, since quantized objects of this type bring forward a telic interpretation in English. If, on the other hand, the object is a mass or bare plural noun, learners are predicted to initially interpret the VP as atelic. For example, they are expected to interpret both (32a) and (32b) as atelic past events, given that the object in both sentences is the mass noun tort ‘cake’. Along the same lines, learners are predicted to interpret the VPs in (33a,b) as telic, since they contain the quantized object kusoček torta ‘a piece of cake’. However, once learners notice that Russian nominal arguments do not mark telicity, they will know that prefixed verbal forms denote complete events, and will interpret the unprefixed (32a) and (33a) as atelic but the prefixed (32b) and (33b) as telic.

Sixty-six learners of Russian as well as 45 controls took an on-line test, posted on the internet. The main task of the study was an interpretation test.

Participants read a sentence and chose which of the provided continuations was logically possible, or made sense. In order to choose a continuation, Russian speakers had to interpret the event as complete (telic) or incomplete (atelic). In order to test for L1 transfer, three conditions with 10 sentences per condition manipulated the form of the object: mass and bare plural objects, singular count objects, or objects modified by an overt demonstrative pronoun or quantifier.

Condition A: objects are mass and bare plural nouns

(38) Maša vezla detej domoj…
     Masha drove children home
     A. no deti ješčo ne doma
        and the children are not at home yet
     B. i deti uže doma
        and the children are now at home
     C. oba A i B vozmožny                  ← CORRECT
        both continuations above are possible

(39) Maša pri-vezla detej domoj…
     Masha brought children home
     A. i deti uže doma                     ← CORRECT
        and the children are now at home
     B. no deti ješčo ne doma
        and the children are not at home yet
     C. oba A i B vozmožny
        both continuations above are possible

It is important to notice that since the imperfective aspect highlights the progress of the event but not its final endpoint, the correct answer for all imperfective sentences in the three conditions is “both continuations above are possible”. But the atelic answer “and did not finish doing the event” is more salient than the strictly speaking “correct” answer, so it was also accepted as correct. Thus, only one out of three answers was appropriate in perfective sentences versus two out of three in imperfective sentences. Table 7 gives the mean choices of interpretation in perfective sentences, where the telic interpretation is the expected answer.

Table 7. Type of interpretation chosen with Perfective accomplishments with three different types of objects (in percentages)

Interpretation     Controls   Advanced   Hi Interm   Lo Interm
                   (n=45)     (n=26)     (n=20)      (n=20)

Condition A: Mass/Bare plural objects
  telic            94         91.5       85          61
  atelic           0.4        3          4           14
  both             4.6        4.5        11          25

Condition B: Count objects
  telic            92         96.2       87          57
  atelic           2.6        0          5           20
  both             5.4        3.8        8           23

Condition C: Demonstrative objects
  telic            84.6       87.7       77          55
  atelic           0.4        3.8        9           23
  both             15         8.5        14          22

We will focus on the performance on perfective sentences, because telicity marking was what the learners had to acquire, and because the imperfective is the default (or elsewhere) aspect in Russian. Recall that the chance level in this category of sentences is 33.3%. If they were randomly picking answers, the low intermediate learners would be right one time out of three. However, they are accurate between 55% and 61% of the time. This accuracy is different from chance (p = .00026). In other words, when faced with a perfective verb, they are already likely to choose a complete (telic) paraphrase. Furthermore, the object cardinality does not help them in the acquisition process: even the lowest proficiency learners do not show an effect of object cardinality. The individual results in Table 8 confirm the group findings.

Table 8. Individuals per participant group who are 80% accurate on perfective sentences

                     Mass/bare plural object   Count object   Demonstrative object
Controls (n=45)      41 (100%)                 41 (100%)      41 (100%)
Advanced (n=26)      25 (96%)                  25 (96%)       22 (85%)
Hi Interm (n=20)     16 (80%)                  17 (85%)       13 (65%)
Lo Interm (n=20)     8 (40%)                   7 (35%)        8 (40%)
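
The chance-level reasoning used for these results can be made concrete with a small calculation. The sketch below is only an illustration, not the analysis reported in the study: it assumes a hypothetical learner who guesses independently among the three continuations, and a hypothetical set of 10 perfective items (the text gives 10 sentences per condition but does not say how many of them were perfective). Under those assumptions it computes the exact probability of reaching a given number of correct answers by guessing alone.

```python
from fractions import Fraction
from math import comb


def tail_probability(n_items: int, min_correct: int, p_guess: Fraction) -> Fraction:
    """Exact P(X >= min_correct) for X ~ Binomial(n_items, p_guess)."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


# Assumption: three answer choices, exactly one correct (perfective sentences).
CHANCE = Fraction(1, 3)
# Assumption: 10 items; the real number of perfective items is not stated here.
N_ITEMS = 10

if __name__ == "__main__":
    for k in range(5, N_ITEMS + 1):
        p = tail_probability(N_ITEMS, k, CHANCE)
        print(f"P(at least {k}/{N_ITEMS} correct by guessing) = {float(p):.4f}")
```

On these assumptions, a pure guesser would reach an 80% criterion (8 of 10) in well under 1% of cases, which is why individual accuracy at that level is informative.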

These group and individual results show that syntax-semantics mismatch acquisition is not only possible but actually accomplished by the great majority of learners. Advanced learners’ performance on perfective sentences was statistically indistinguishable from that of the native speakers. Even more importantly, the low proficiency learners as a group have also successfully acquired the telicity marking mechanism in the L2. These findings were corroborated by the fact that the majority of individuals in the advanced and high intermediate groups, and even roughly 8 out of 20 low intermediate learners, were 80% accurate on both types of sentences. The effect of the different types of objects on telicity interpretation was only detected with the low intermediate learners. It was expected that learners would be aided by countable and especially demonstrative objects in interpreting perfective sentences, again based on their English grammar (e.g., eat this cake, eat an apple). These predictions were only partially confirmed by the error rates on perfective sentences. The type of object seemed to have a little facilitating effect in perfective sentence comprehension: they (incorrectly) interpreted non-quantized objects to point to an atelic interpretation only 14% of the time, as compared to 20% and 23% with quantized objects. The traces of L1 transfer disappear altogether at the higher proficiency levels. Why then is the perception so prevalent and persistent that Russian aspect is extremely difficult to learn? Where does the real difficulty lie, if it is true, as I have argued, that most of the participants in this experiment have already acquired the basic functional meaning of aspect? I suggest that the (perception of) difficulty comes from the formidable lexical learning task. In light of the theoretical approach assumed, and based on the learners’

high accuracy on the interpretation test, it must be the case that the perceived difficulty in acquiring Russian aspect lies in learning the lexical items signaling telicity, but crucially NOT in learning the grammatical mechanism for telicity marking. Further psycholinguistic research on the mental lexicon of L2 Russian learners is necessary to empirically confirm the first part of this conclusion.

3.4. Telicity and nominal interpretation

Finally, we shall review the findings of a study investigating knowledge of perfective prefix effects on nominal interpretation. Recall from section 6.1 that Russian has no articles, and that in perfective sentences with Incremental Theme objects, bare noun phrases are interpreted as definite, while in imperfective sentences the same objects are interpreted as indefinite; see examples (32) and (33). I argued that this situation comprises a syntax-semantics mismatch in the sense that the meaning “definite” is expressed by articles in English but by verbal prefixes in Russian (among other means). The prediction, then, will be that learners who know the telicity marking mechanism in Russian will also know the perfectivity effect on definiteness. A separate condition in the Russian L2 study (Slabakova, 2005) tested this effect, reported on in Slabakova (2004). The format of the test is choosing from three possible interpretations. Here is an example of a test item.17

(40) Anya stirala odeždu…,
     Anya washed (the) clothes
     A. odeždu voobšče                              ← EXPECTED
        clothes in general
     B. vsyu odeždu kotoraya nuždalas’ v stirke
        all the clothes that needed washing
     C. oba A. and B. vozmožny                      ← EXPECTED
        both continuations above are possible

(41) Maša po-stirala odeždu…,
     Maša PERF-washed (the) clothes
     A. odeždu voobšče
        clothes in general
     B. vsyu odeždu kotoraya nuždalas’ v stirke     ← EXPECTED
        all the clothes that needed washing
     C. oba A. and B. vozmožny
        both continuations above are possible

Table 9 gives the results on perfective sentences, because it was only in those sentences that the effect was expected to obtain. Recall that participants had to indicate which object interpretation (definite, indefinite, or both) they would choose in perfective sentences.

Table 9. Object interpretation in perfective sentences (percentages)

                     Indefinite object   Definite object   Both object interpretations
Controls (n=45)      3.9                 49.8              46.3
Advanced (n=26)      1.4                 81                17.6
Hi Interm (n=20)     6                   79                15
Lo Interm (n=20)     18.4                55                26.6

Table 9 shows some unexpected judgments of Russian native speakers. They chose the indefinite object interpretation in perfective sentences only 3.9% of the time. However, they were more inclusive in their object interpretations, with roughly 46% of answers indicating that both object interpretations are possible in perfective sentences. In other words, they do not clearly demonstrate that they interpret bare plural or mass objects as denoting a specific quantity or not depending solely on the perfectivity of the verb. The advanced and the high intermediate speakers, on the other hand, showed the expected high accuracy in their choice of object interpretation. They displayed superior sensitivity to the quantificational effect of prefixes. Even the Low Intermediate learners chose the definite object interpretation most often (55% of the time), thereby indicating that they have started acquiring the quantificational effect of prefixes. What if L2 learners’ superior knowledge is somehow transferred from their native language? This is not an unreasonable hypothesis. Smith (1991), for one, argues that the simple past tense in English exemplifies perfective aspect, while the progressive tense denotes imperfective aspect. Do English native speakers change their object interpretation based on the aspectual form of the verb? For example, do they consider bare plural and

mass objects to refer to non-specific quantity in progressive sentences and to specific quantity in past simple sentences? In order to rule this hypothesis out, a similar test was given to 25 monolingual English native speakers. It included five quadruples of sentences: past simple or progressive verbal forms were crossed with definite or indefinite objects, for a total of 20 test sentences as in (42).

(42) a. Katherine wrote letters.
     b. George wrote the letters.
     c. Misha was writing letters.
     d. Sean was writing the letters.

     Choice of answers in each case:
     A. I don't know how many letters
     B. I know that it was a specific number of letters
     C. both continuations above are possible

Results of the object interpretation test with the monolingual native speakers indicate that sentential aspect does not change object interpretation. English monolinguals choose the specific quantity interpretation for the definite object in past simple sentences with 83.2% accuracy, and the non-specific quantity interpretation in a similar sentence when the object is indefinite with 85.1% accuracy. Similarly, in progressive sentences, they choose a specific quantity interpretation for definite objects 88.2% of the time, and interpret an indefinite object as non-quantized 80% of the time. This is hardly surprising, of course, since English does have articles to signal specificity and definiteness. As expected, these English-speaking participants judged object interpretation based only on the presence of the definite article. Thus, the Russian learners’ interpretations of Russian objects cannot come from the native language.

But how can the divergence between native and learner performance be explained? At least superficially, it looks like the L2 learners are demonstrating superior ability to manipulate the meaning paraphrases in this particular test as compared to the native speakers. Is it the case that the learners are more aware than native speakers of the quantificational effect of perfective prefixes over bare plural and mass objects in Russian? Since learners demonstrate knowledge of the L2 property but do not transfer it directly from their native language (as the monolingual English test shows), could it be that the object’s specific quantity reading is more salient to the

L2 learners than to the native speakers, since object quantization is precisely what signals telicity in English?

A possible explanation of these results questions the salience of the perfective prefix effect on object interpretation in Russian. Russian is a language in which discourse and context play a larger role than in English. More concretely, word order in Russian interacts with the specific and definite interpretations of the arguments.18 As has been noticed in the literature on Russian, the relatively free word order, or scrambling, gives rise to different discourse information structures. The preverbal position is normally related to topic, or old information, and the postverbal position is related to focus, or new information (see Yokoyama, 1986; Holloway King, 1993; Bailyn, 1995 for more discussion). If a bare subject NP is placed preverbally as in (43a), it is most often interpreted as definite, while if it is postverbal as in (43b), it is interpreted as indefinite, possibly specific. Finally, (44) shows that indefinite nonspecific objects also appear postverbally. (The examples are from Ionin, 2003: 111-112.)

(43) a. Koška v-bežala v komnatu
        cat-NOM PERF-run-PAST into room-ACC
        ‘The cat ran into the room.’
     b. V komnatu v-bežala koška
        into room-ACC PERF-run-PAST cat-NOM
        ‘A cat ran into the room.’

(44) Lena pročla (kakuju-to) knigu. Ja ne znaju kakuju.
     Lena PERF-read-PAST (some) book-ACC I not know which
     ‘Lena read some book. I don’t know which.’

Now, all ten sentences testing the prefix effect on objects in this experimental study had SVO word order (see examples (36) and (37)). On the one hand, the mass and bare plural objects were in the scope of a perfective prefix, which would purportedly give rise to a definite interpretation. On the other hand, the objects were in postverbal position, which would normally lead to indefinite specific as well as non-specific interpretations, depending on the context. Slabakova (2004) suggests that it is this clash of two sources of semantic information that makes Russian native speakers accept both quantized and non-quantized object interpretations in perfective sentences. In other words, word order, information structure, and the semantic effect of prefixes interact in such a way as to make the latter not as

salient in Russian as it could be. If this (admittedly post-hoc) explanation of the native speaker results is on the right track, the linguistic competence of Russian native speakers would be correctly captured by the results of this test. With their more inclusive choice of object interpretation, they are indicating that, if the right pragmatic context exists, the prefix effect may be overruled. This explanation is certainly compatible with recent models of aspectual meaning calculation where pragmatic (context-related) aspectual information can take scope and coerce grammatically-encoded meanings (see de Swart, 1998). The English native learners of Russian, then, may not be aware of the coercing effects that possible pragmatic contexts and scrambled word order can have over the grammatically encoded prefix effect. This situation, if it is true, will certainly be compatible with a line of recent research which documents delays in L2 acquisition of pragmatic topic-focus marking as opposed to related syntactic properties (Fruit, 2006; Lozano, 2004). Of course, this is a conjecture that has to be experimentally supported by further research.

The main point of the Slabakova (2005) and Slabakova (2004) studies, though, is that a major syntax-semantics mismatch between English and Russian, pertaining to where telicity is marked in the sentence and what effect it has on object interpretation, can be acquired. While perfective prefixes and what they entail in sentence interpretation are definitely taught in language classrooms, their effect on object interpretation is not taught at all. Most learners of advanced and even high intermediate proficiency have acquired all the different meanings in the target language. Taken together, then, the results of both studies suggest that explicit teaching of meaning is not really necessary for some semantic effects to be acquired.

3.5. Acquisition of grammatical aspect (English and Japanese L2)

Recall from section 3.1 that grammatical aspect is usually reflected in inflectional morphemes like past simple and progressive tenses in English, preterite and imperfect tenses in Spanish, Passé Composé and Imparfait in French. The grammatical meanings of completed, ongoing, or habitual event interact with the telicity or atelicity encoded in the VP. Since grammatical aspect is encoded higher in the structure, it can “undo” the lower aspectual values in predictable ways. A telic event under a progressive operator will still remain telic, that is, having a potential endpoint, but the attention of the speaker will be focused on the lack of completion entailment,

as in (46) repeated here from section 1. The perfective viewpoint expressed by the English past tense as in (45) adds an entailment of completion to telic VPs.

(45) Laura built a house.             Perfective viewpoint
     [//////////]
      I        F

(46) Laura was building a house.      Imperfective viewpoint
     [... //// ...]
      I          F

This is how Smith (1991, 1997) visualizes the fact that the initial (I) and final (F) moments of the event of building a house are included in the event described by the perfective sentence (45). The imperfective, on the other hand, looks at the situation from inside and is concerned with the internal structure without specifying the beginning or end of the situation.19 Dowty’s (1979) Imperfective Paradox observes important entailment relations between sentences as in (45) and (46): while the perfective sentence proposition entails the imperfective sentence proposition, the opposite is not true. That is, if you have built a house, then there was necessarily a previous interval during which you were building a house. But if you were building a house at some interval in time, something (say, a shortage of funds) may have prevented you from completing the house.

We shall now look at some syntax-semantics mismatches in grammatical aspect meanings in languages of the world. Gabriele (2005), a bidirectional study, investigates differences in the truth-values of the progressive operator in English and Japanese and how they are acquired. Montrul and Slabakova (2002) examine interpretations of grammatical aspect in Spanish-English interlanguage. Both experimental studies include comparisons between knowledge of semantic entailments and knowledge of inflectional morphology. Thus, both studies are especially relevant to the question posed in this book, namely, is inflectional morphology a bottleneck for the acquisition of meaning?20

On the surface, it looks like the Japanese inflectional morpheme te-iru and the English progressive form be V-ing are a perfect match. Semanticists (e.g., Landman, 1992; de Swart, 1998) have analyzed the progressive tense as a semantic operator PROG that interacts with the lexical aspect of the VP to which it attaches, and brings forward an ongoing interpretation. To start with, both te-iru and be V-ing cannot normally be used with states (although both morphemes have some additional meanings such as “temporary state”, e.g. Mary is being lazy today, see Slabakova, 2003). With most accomplishment verbs (48a) and all activities (48b), te-iru has an ongoing interpretation, as does be V-ing (47a,b). The mismatch is manifested only with achievement verbs: while be V-ing is unnatural with some achievements (49a), with some others it highlights the process immediately preceding the change of state (49b). Crucially, with achievements te-iru cannot have an ongoing interpretation but only a completion, that is, perfective interpretation (50). The Japanese examples are from Gabriele (2005).

(47) a. Samantha is making a cake.
     b. Samantha is running.

(48) a. Ken-wa isu-o tukut-te i-ru
        Ken-TOP chair-ACC make-ASP-NONPAST
        ‘Ken is making a chair.’
     b. Ken-ga utat-te i-ru
        Ken-NOM sing-ASP-NONPAST
        ‘Ken is singing.’

(49) a. *I am losing my wallet.
     b. The plane is arriving at the airport.

(50) Hikōki-ga kūkō-ni tui-te i-ru
     plane-NOM airport at arrive-ASP-NONPAST
     ‘The plane has arrived at the airport.’
     # ‘The plane is arriving at the airport.’

This intriguing semantic contrast between English and Japanese has been noticed and widely discussed. Two basic proposals have been put forward in the formal semantics literature to explain it. Ogihara (1998, 1999) adopts Landman’s (1992) analysis of the PROG operator and postulates that English and Japanese have different lexical semantic representations for achievement verbs. On the other hand, McClure (1995) attributes the difference to the semantics of the PROG operator, but not to the lexical semantics of achievements. There is some evidence from Japanese dialects that with a different aspectual morpheme, achievements can indeed be interpreted as a process leading to the change of state, just as in English. This fact suggests that it is not Japanese and English achievement verbs that are different; hence the difference has to be sought elsewhere. Gabriele (2005) adopts McClure’s solution.

In analyzing the different truth conditions of the PROG operator in the two languages, McClure (1995) makes reference to an interval of evaluation, the point at which the truth value of the proposition is computed. The semantics of English be V-ing necessitates that the event being evaluated has started, and it may not have been completed; that is, there are some segments of the event that will follow the interval of evaluation. The semantics for Japanese te-iru, however, has one added feature: it requires at least one whole segment entailed by the predicate to be manifested completely. Segments of an event are defined as “any continuous and proper subset of the situations which define a particular predicate” (Gabriele, 2005:51). States have no segments; achievements are made up of one segment where s’ is the final segment (in a way, they begin and are over at the same time); activities are an open-ended sequence of segments, while accomplishments are a combination of a finite series of segments plus a final segment. Consider how the two different PROG morphemes interact with lexical classes. In McClure’s analysis, the truth conditions of be V-ing can be satisfied with any aspectual verb class except states. The progressive with activities and accomplishments is true as soon as one of their process segments is manifested. In achievements, which consist of one single event (the change of state), the interval of evaluation cannot come after that instantaneous change of state (as per the definition of PROG). Therefore, achievements make reference to the point just before the change of state.21 Japanese te-iru, on the other hand, imposes the condition that at least one complete segment (an event) has been manifested before the interval of evaluation. 
This can easily be satisfied by activities and accomplishments.22 Since the only event of achievements includes its final segment, te-iru’s truth conditions are satisfied with achievements only when the change of state has already occurred. Note that on this account, the difference between the English and Japanese progressive morphemes is due to a mismatch at the syntax-semantics interface: both morphemes are composed of universal features, or impose truth-conditional restrictions on interpretation, but the restrictions they impose are not the same. It may be argued that the Japanese morpheme is in some respect more complex, because it imposes an additional restriction over and above the English morpheme’s restrictions, and because it leads to two different interpretations (perfective as well as ongoing) with different aspectual classes of verbs. If McClure’s semantic account adopted by Gabriele is on the right track, the learners’ task is to discover the truth conditional restrictions of both morphemes. The different interpretations of achievements in the progressive will follow.

Gabriele (2005) presents two studies of this syntax-semantics mismatch, one of Japanese as a second language and one of English as a second language. She had somewhat different research questions and employed complex tests with many conditions, but we shall only concentrate here on the results relevant to the acquisition of the present progressive be V-ing and present te-iru (the past form is te-ita). In order to test sentence interpretation, she used a modified Truth Value Judgment Task with pictures. In the two studies, the stories were presented aurally in the target language, not in the native language. Learners listened to recorded stories and saw two pictures. They were then presented with a sentence both visually and aurally, and asked to judge on a scale from 1 (worst) to 5 (best) whether the sentence made sense as a description of the story. For each verb, a complete context and an incomplete context were provided. Here are some examples:

(51) a. Complete event context for paint a portrait, an accomplishment VP
        Picture 1: Ken is an artist. At 12:00 he begins to paint a portrait of his family.
        Picture 2: At 8:00 he gives the portrait to his mother for her birthday.
        Ken is painting a portrait of his family.
        (Expected answer 1 in Japanese and English)
     b. Incomplete event context for paint a portrait, an accomplishment VP
        Picture 1: Ken is an artist. At 12:00 he begins to paint a portrait of his family.
        Picture 2: At 12:30 he paints his mother and father.
        Ken is painting a portrait of his family.
        (Expected answer 5 in Japanese and English)

(52) a. Complete event context for arrive, an achievement VP
        Picture 1: This is the plane to Tokyo. At 4:00 the plane is near the airport.
        Picture 2: At 5:00 the passengers are at the airport.

        The plane is arriving at the airport.
        (Expected answer 1 in English, 5 in Japanese)
     b. Incomplete event context for arrive, an achievement VP
        Picture 1: This is the plane to Tokyo. At 4:00 the plane is near the airport.
        Picture 2: There is a lot of wind. At 4:30 the plane is still in the air.
        The plane is arriving at the airport.
        (Expected answer 5 in English, 1 in Japanese)

In study 1, Gabriele tested 101 native speakers of Japanese learning English in Japan and 9 near-native Japanese individuals living in New York City, as well as 23 native speakers of English as controls. In study 2, she tested 33 native English speakers learning Japanese in the US, as well as 31 native speaker controls. Proficiency tests were used to divide the participants into groups. One thing to note is that the Japanese L2 low group is of lower proficiency than the English L2 lowest group.

Turning to results, we shall look at the Japanese → English direction first. Learners’ accuracy on accomplishment verbs was very high, as expected. With achievement verbs, learners had to add the ongoing interpretation to their grammar, as well as delearn (i.e., learn the unavailability of, or preempt) the complete event interpretation. Note that these two interpretations are not necessarily in contrast, and it is not immediately obvious in the adopted analysis that there is anything in the grammar to prevent the learners from entertaining both interpretations. Table 10 is adapted from Gabriele (2005).

Table 10. Mean scores on achievements in the English present progressive (between 1 and 5)

                                             Low      Interm   High     Near-native   Native
                                             (n=46)   (n=39)   (n=16)   (n=9)         (n=23)
is arriving refers to incomplete event       3.08     2.89     4        4.25          4.1
is arriving refers to complete event         3.64     3.45     2.56     3.25          1.5

As the mean scores in Table 10 indicate, it does not seem to be difficult to add the incomplete interpretation to the grammar. The low and intermediate learners do not reject this interpretation strongly, while the high and the near-natives rate it as high as the natives. But learners encounter more difficulty with rejecting the complete event interpretation (which comes into their grammar from the L1). Individual results show that half of the low and intermediate learners strongly believe The plane is arriving actually means The plane has arrived, and some high learners believe so, too. We shall discuss some possible reasons after we present the results of the English → Japanese direction.

Table 11. Mean scores on achievements with te-iru (between 1 and 5)

                                                        Low      High     Native
                                                        (n=16)   (n=17)   (n=31)
tui-te i-ru ‘is arriving’ refers to incomplete event    3.28     2.04     1.35
tui-te i-ru ‘is arriving’ refers to complete event      4.39     3.93     4.84
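
One way to compare the two learning directions at a glance is to reduce each group's two means to a single contrast: the mean rating in the context where native speakers accept the sentence minus the mean rating in the context where they reject it. The sketch below does this with the figures from Tables 10 and 11. The contrast score and all function and variable names are illustrative assumptions, not a measure used by Gabriele (2005).

```python
# Group means copied from Tables 10 and 11 (ratings on a 1-5 scale).
# Each entry is (mean in incomplete-event context, mean in complete-event context).
TABLE_10_ENGLISH_L2 = {
    "Low": (3.08, 3.64),
    "Interm": (2.89, 3.45),
    "High": (4.0, 2.56),
    "Near-native": (4.25, 3.25),
    "Native": (4.1, 1.5),
}
TABLE_11_JAPANESE_L2 = {
    "Low": (3.28, 4.39),
    "High": (2.04, 3.93),
    "Native": (1.35, 4.84),
}


def contrast(means, target_context):
    """Rating in the native-accepted context minus rating in the other context.

    Hypothetical summary score: a positive value means the group prefers the
    interpretation that native speakers prefer.
    """
    incomplete, complete = means
    if target_context == "incomplete":
        return round(incomplete - complete, 2)
    if target_context == "complete":
        return round(complete - incomplete, 2)
    raise ValueError("target_context must be 'incomplete' or 'complete'")


if __name__ == "__main__":
    # English 'is arriving' is accepted by natives in the incomplete-event context.
    for group, means in TABLE_10_ENGLISH_L2.items():
        print("L2 English ", group, contrast(means, "incomplete"))
    # Japanese 'tui-te i-ru' is accepted by natives in the complete-event context.
    for group, means in TABLE_11_JAPANESE_L2.items():
        print("L2 Japanese", group, contrast(means, "complete"))
```

On these figures the low and intermediate learners of English come out with a negative contrast, whereas every group of learners of Japanese comes out positive, which is the asymmetry between the two learning directions discussed in this section.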

As Table 11 shows, the resultative (complete event) interpretation is in place even for the low proficiency group. Results of individual responses point to the fact that many of the low proficiency learners have not completely ruled out a progressive interpretation for achievements + te-iru, but the high proficiency learners have largely overcome the problem, that is, they know that the incomplete event interpretation is not available. In sum and comparing both learning directions, adding a new interpretation is already accomplished by the higher proficiency learners of English and by all learners of Japanese. On the other hand, delearning an interpretation seems to be difficult for all learners of English but only for the low proficiency learners of Japanese. Recall here that the L2 Japanese low group is of lower proficiency than their L2 English counterparts; that is, proficiency alone cannot explain the acquisition facts. These findings were not as expected. There is a pattern of differential difficulty here that calls for an explanation.23

As Gabriele correctly points out, semantic learning is different from syntactic or morphological learning. In order to delearn an interpretation of a perfectly grammatical sentence, the following things need to happen. The sentence must be pronounced (and comprehended) in a situation clearly

relevant to the proposition. The extralinguistic situation observed must also strongly refute the hypothetical interpretation of the learner with respect to the sentence. To use the examples from Gabriele’s stories, take a Japanese learner of English who is at the airport and is observing an incomplete situation, that is, the plane flying in was near the airport but strong winds at the last moment precluded it from landing. The learner hears the sentence The plane is arriving at the airport and tries out her Japanese perfective interpretation of “The plane has arrived at the airport.” Well, the unfolding situation would clearly disabuse her of this interpretive hypothesis, because the final event has obviously not transpired. As in Dowty’s Imperfective Paradox, the ongoing interpretation does not entail completion. On the other hand, consider an English learner of Japanese witnessing a complete situation, that is, the airplane was flying in, came near the airport and finally landed. The learner hears Hikōki-ga kūkō-ni tui-te i-ru ‘The plane has arrived at the airport’ but is employing her English-native interpretation “The plane is arriving at the airport”. The context will not squarely contradict her hypothesis because, as per the Imperfective Paradox, a completed event does entail a previous process leading up to that change of state. The learner may suppose that her interlocutor has not obeyed Grice’s Maxim of being maximally relevant and instead has decided to highlight the process stage of the event. Thus, in the Japanese → English direction, we have a semantic contradiction between context and interpretation. In the English → Japanese direction, we have flouting of pragmatic maxims to explain the context-interpretation discrepancy. I believe this situation can explain why learners of Japanese have a harder time delearning one interpretation than the learners of English.24

Following Montrul and Slabakova (2002), Gabriele (2005) compared knowledge of interpretation and knowledge of the inflectional morphology as tested by a GJT. She included 8 present progressive accomplishments and as many achievements, half of them grammatical and half ungrammatical. GJT stimuli of study 1 are exemplified in (53). The test also contained past simple and past progressive morphology. The study 2 stimuli as in (54) were similarly designed, but due to the morphological differences between English and Japanese, ungrammatical sentences did not involve bare stems, which are purportedly very easy to spot.

(53) a. My mother is drinking a glass of wine right now. (grammatical)
     b. *Today John is draw a portrait of my family. (missing inflection)
     c. *Last week Sam is running a marathon in Boston. (wrong tense)

(54) a. Sengetsu, chefu-wa sushi ni tsuite no hon-o kakimashita. (grammatical)
        last month chef-TOP sushi about book-ACC write-PAST
        ‘Last month the chef wrote a book about sushi.’
     b. Ashita, Yumiko-wa aoi kuruma-o araimashita. (wrong tense)
        tomorrow, Yumiko-TOP blue car-ACC wash-PAST
        ‘Tomorrow, Yumiko washed a blue car.’
     c. Senshū, Takashi-wa resutoran de pizza-o tabemasu. (wrong tense)
        last week, Takashi-TOP restaurant at pizza-ACC eat-PRES
        ‘Last week, Takashi eats pizza at the restaurant.’

Results in the Japanese → English learning direction (see Tables 12 and 13) show that participants were quite accurate in rejecting sentences like (53b), which suggests that they know what the morphological form looks like. At the same time, the low and intermediate proficiency groups were struggling with sentences like (53a), that is, with accepting the form in the appropriate context. In the other direction, te-iru morphology was judged quite accurately by the low and high proficiency learners, in fact, better than they judged the past simple and the past progressive morphology.

Table 12. Mean scores (between 1 and 5) on achievements and accomplishments in the present progressive, GJT

                          Ungrammatical sentences    Grammatical sentences
Group                     Acc         Ach            Acc         Ach
Low (n=46)                1.78        1.9            3.39        3.44
Interm (n=39)             1.49        1.61           3.51        3.45
High (n=16)               1.38        1.41           3.86        3.81
Near-native (n=9)         1.17        1.08           4.69        4.22
Native English (n=23)     1.35        1.35           4.84        4.57

Table 13. Mean scores (between 1 and 5) on achievements and accomplishments with te-iru, GJT

                          Ungrammatical sentences    Grammatical sentences
Group                     Acc         Ach            Acc         Ach
Low (n=14)                2.27        1.77           3.86        3.68
High (n=14)               2.05        1.88           3.77        4.09
Native Japanese (n=31)    1.60        1.40           4.36        4.40

To summarize our discussion of the Gabriele (2005) bidirectional investigation, our necessarily sketchy presentation of the results should have revealed that acquisition of new interpretations of the seemingly equivalent PROG operator is possible even for intermediate learners. At the same time, it was observed that delearning a native interpretation was easier in one learning direction than in the other. I suggested that this was due to the contradictory evidence available to the learners, in one direction, versus only pragmatically inappropriate evidence available in the other direction. Learners’ knowledge of morphology was found to be superior in one direction (English → Japanese), precisely in the same direction where learners’ interpretive knowledge was superior. I shall evaluate the morphology-meaning connection after I discuss the Montrul and Slabakova (2002) study in the next section.

3.6. Acquisition of grammatical aspect (Spanish L2)

In a series of studies, Montrul and Slabakova investigated acquisition of interpretive properties related to the aspectual functional projection AspP, in English-Spanish interlanguage (Montrul and Slabakova, 2002, 2003; Slabakova and Montrul, 2002, 2003). Montrul and Slabakova (2002) was a study specifically designed to probe the connection between acquisition of inflectional morphology and interpretations related to the aspectual tenses preterite and imperfect. To anticipate the presentation of the results, they show that there is a strong connection between the acquisition of the inflectional morphology and the semantic interpretation of these aspectual tenses.

As in Gabriele (2005), their focus is Spanish and English aspectual tenses, which encode different meanings. While the English past progressive tense signifies an ongoing event in the past, Spanish imperfect can have both an ongoing and a habitual interpretation. The English simple past tense, on the other hand, has a one-time finished event interpretation and a habitual interpretation, depending on the lexical class of the predicate. Activities as in (56a) in the past simple denote a habitual event, while telic predicates as in (56b) have a one-time complete event interpretation. The Spanish preterite has only the latter interpretation. The examples below illustrate this:

(55) a. Guillermo robaba en la calle. (habitual)
        Guillermo rob-IMP in the street
        ‘Guillermo habitually robbed (people) in the street.’
     b. Guillermo robó en la calle. (one-time event)
        Guillermo rob-PRET in the street
        ‘Guillermo robbed (someone) in the street.’

(56) a. Felix robbed (people) in the street. (habitual)
     b. Felix robbed a person in the street. (one-time event)

In the diagrams in Figure 1, circles enclose meanings that are encoded by the same piece of inflectional morphology. In restructuring her grammar, the learner has to acquire the fact that it is the imperfect morphology that encodes habituality in Spanish, and not the perfective preterite morphology. Another acquisition task is noticing that the imperfective ending is ambiguous between two interpretations, habitual and ongoing, while the preterite ending only encodes the perfective meaning of a one-time finished event. In this sense, the habitual meaning is now paired with another imperfective meaning (the ongoing one) and crucially does not depend on the lexical class of the predicate. This English-Spanish learning situation presents a mismatch between syntactic structure and conceptual structure. The pieces of inflectional morphology come from the functional lexicon.
The functional projections (e.g., AspP) where features are checked are part of sentence syntax. The aspectual meanings (ongoing event, habitual event, one-time finished event) reside in conceptual structure (Jackendoff, 2002). But different languages have different form-to-meaning mappings, which are calculated at the syntax-semantics interface.

[Figure 1: two diagrams, “Aspectual tense meanings in English” and “Aspectual tense meanings in Spanish”, each containing the labels Perfective, Imperfective, Habitual, and Ongoing.]

Figure 1. Aspectual tense meanings in English and Spanish

Montrul and Slabakova (2002) tested 71 adult learners of Spanish. Based on a proficiency task, they were divided into advanced and intermediate learners. Based on a test of inflectional morphology of aspectual tenses, the intermediate learners were further divided into a Yes-morphology group and a No-morphology group. The cut-off point for successful acquisition of the morphology (thus, grounds for a learner to be classified in the Yes-morphology group) was 80% accuracy. The main test instrument was a sentence conjunction judgment task, which specifically tested the semantic implications of the preterite and imperfect tenses. In this task, subjects were presented with a list of sentences consisting of two coordinated clauses. Some of the combinations were logical while others were contradictory. Subjects had to judge on a scale ranging from –2 (contradiction) to 2 (no contradiction) whether the two clauses made sense together. Following is an example with an accomplishment verb:

(57) a. Joaquín corría (imperf) la carrera de fórmula 1 pero no participó.
        ‘Joaquín was going to participate in the Formula One race but he didn’t take part in it.’
        -2  -1  0  1  2
     b. Pedro corrió (pret) la maratón de Barcelona pero no participó.
        ‘Pedro ran the Barcelona marathon but he didn’t take part in it.’
        -2  -1  0  1  2

Table 14. Mean rates for acceptable/unacceptable combination of clauses with the different lexical classes (between -2 and 2)

Groups                    Accomplishments    States            Achievements
Native speakers           1.34 / -.98*       1.56 / -1.5*      1.39 / -1.69*
Advanced                  1.23 / -1.1*       .92 / -.9*        .25 / -1.79*
Interm. yes-morphology    .42 / -.2*         .53 / -.32*       .03 / -.86*
Interm. no-morphology     .12 / -.25         -.57 / .75**      .24 / -.24*

Note: * The contrast between these two means is significant by t-test. ** The no-morphology group displayed the opposite pattern: they rejected sentences with imperfect and accepted those with preterite.
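As a rough reading aid for Table 14, the sketch below subtracts the mean rating of the contradictory combinations from the mean rating of the logical ones for each group and predicate class. The means are copied from Table 14 in the column order given there; the dictionary layout and the function name `contrast` are assumptions made for illustration, and the difference score is only descriptive, not the t-test the authors ran.

```python
# Mean ratings from Table 14 as (logical, contradictory) pairs on the -2..2 scale.
TABLE_14 = {
    "Native speakers": {
        "accomplishments": (1.34, -0.98),
        "states": (1.56, -1.50),
        "achievements": (1.39, -1.69),
    },
    "Advanced": {
        "accomplishments": (1.23, -1.10),
        "states": (0.92, -0.90),
        "achievements": (0.25, -1.79),
    },
    "Interm. yes-morphology": {
        "accomplishments": (0.42, -0.20),
        "states": (0.53, -0.32),
        "achievements": (0.03, -0.86),
    },
    "Interm. no-morphology": {
        "accomplishments": (0.12, -0.25),
        "states": (-0.57, 0.75),
        "achievements": (0.24, -0.24),
    },
}


def contrast(group, predicate_class):
    """Logical mean minus contradictory mean; positive means the expected direction."""
    logical, contradictory = TABLE_14[group][predicate_class]
    return round(logical - contradictory, 2)


for group, classes in TABLE_14.items():
    for predicate_class in classes:
        print(f"{group:24s}{predicate_class:17s}{contrast(group, predicate_class):6.2f}")
```

On this descriptive measure, a negative value corresponds to the reversed pattern flagged with ** in the table note.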

Table 14 presents the mean rates of acceptable (as in 57a) and unacceptable (as in 57b) combinations of clauses, with the three lexical classes of predicates. If learners distinguish between sentences as in (57a) and (57b), then their mean rating of these clauses should be significantly different (marked with a star in the table). Group results show that advanced and intermediate learners who scored above 80% accuracy with the morphology test appear to have acquired the semantic implications associated with preterite and imperfect tenses in Spanish. By contrast, those intermediate learners who do not have knowledge of the preterite/imperfect morphophonology are not yet sensitive to the semantic contrast between these tenses, especially with achievement and state predicates.

In addition, individual results were calculated with scalar responses converted into absolute values in the following way. If an illogical sentence had been assigned a score between –1 and –2, or a logical sentence a score between 1 and 2, the response was considered correct and was awarded 1 point. Negative responses for logical sentences and positive responses for illogical sentences were considered incorrect and received a 0. All the native speakers had at least 5 out of 7 correct responses per sentence type. Subjects who had at least 5 imperfect and 5 preterite sentences correct with each class of verbs were deemed to have acquired the contrast. The researchers applied this criterion to the two intermediate learner groups to establish whether there was a correlation between knowledge of morphology and knowledge of semantics with the three types of predicate tested (accomplishments, achievements, and states).25 The frequency distributions are displayed in Table 15. (Note that in all cases, there was a small group of subjects for whom the presumed acquisition of the semantic contrast was unclear because they had only 4 sentences correct with imperfect and 5 with preterite, or vice versa.)

Table 15. Distribution of learners according to acquisition of the morphology and semantics of aspectual tenses in Spanish (Yes = have acquired, No = have not acquired)

ACCOMPLISHMENTS (unclear: n = 5; p < 0.0023)
                    Yes morphology    No morphology
Yes semantics       21                2
No semantics        21                22

ACHIEVEMENTS (unclear: n = 4; p < 0.0023)
                    Yes morphology    No morphology
Yes semantics       20                1
No semantics        21                25

STATES (unclear: n = 5; p < 0.0001)
                    Yes morphology    No morphology
Yes semantics       21                2
No semantics        21                22
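The individual scoring procedure described above can be sketched as follows. This is a minimal illustration, not the researchers' own script: the function names are hypothetical, the sample ratings are invented, and two points are assumptions because the text does not settle them, namely that a rating of 0 counts as incorrect and that "unclear" means one tense at 4 correct while the other reaches at least 5.

```python
def score_response(rating, logical):
    """Return 1 for a correct judgment and 0 otherwise.

    Logical sentences are correct when rated 1 or 2; contradictory sentences
    are correct when rated -1 or -2. A rating of 0 is scored as incorrect
    (assumption: the source does not say how 0 was treated).
    """
    if logical:
        return 1 if rating >= 1 else 0
    return 1 if rating <= -1 else 0


def classify_semantics(imperfect_correct, preterite_correct, threshold=5):
    """Classify a learner for one predicate class as 'yes', 'no', or 'unclear'."""
    if imperfect_correct >= threshold and preterite_correct >= threshold:
        return "yes"
    low, high = sorted((imperfect_correct, preterite_correct))
    # Assumption: one tense just below the threshold, the other at or above it.
    if low == threshold - 1 and high >= threshold:
        return "unclear"
    return "no"


# Invented ratings for one learner on one predicate class (7 items per tense).
imperfect_ratings = [2, 1, 2, 0, 1, 2, -1]      # logical combinations
preterite_ratings = [-2, -1, -2, -2, 1, -1, 0]  # contradictory combinations

imperfect_correct = sum(score_response(r, logical=True) for r in imperfect_ratings)
preterite_correct = sum(score_response(r, logical=False) for r in preterite_ratings)
print(imperfect_correct, preterite_correct,
      classify_semantics(imperfect_correct, preterite_correct))
```

Cross-tabulating the resulting labels with the Yes/No morphology grouping would give frequency tables of the kind shown in Table 15.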

The authors interpret these findings as suggesting that knowledge of morphology necessarily precedes knowledge of semantics in this aspectual domain, and that the acquisition of the semantic contrast may be a gradual development which eventually reaches complete native-like knowledge in advanced proficiency learners. They explained the higher accuracy on the morphology with the context of acquisition (aspectual morphology endings are explicitly taught and drilled in language classrooms) and with the nature of the morphology test (recognition of the correct form in context, one out of only two choices). At first glance, the findings of both Gabriele (2005) and Montrul and Slabakova (2002) seem to flatly contradict the semantics-before-morphology view that I offered for examination at the end of section 3.2 in chapter 5. However, it would be premature to abandon this view completely. There are a few considerations to keep in mind. First of all, the syntax-before-morphology view (including the Missing Surface Inflection Hypothesis) is supported with robust evidence from production while the Gabriele (2005) and the Montrul and Slabakova (2002) studies I reviewed here look at recognition of inflectional morphology in comprehension, a much easier task. It is important to keep in mind that “knowledge of morphology” may mean at least three different things in different L2 studies: passive recognition of morphemes in comprehension, successful productive suppliance of the morphology in appropriate context, and knowledge of the morphemes’ semantic entailments. The level of difficulty of achieving these three types of knowledge is obviously different, and a matter of direct empirical comparison. Both studies discussed here (Gabriele, 2005; Montrul and Slabakova 2002) offer a comparison between morpheme recognition and semantic entailments.
Comparisons between production and semantic entailments have been, to my knowledge, only indirect (see discussion of Bardovi-Harlig 1992 below, which compares production and use in appropriate context, but not production and knowledge of entailments, as well as Slabakova 2003 in the next section). A comparison between recognition and production has not, to my knowledge, been attempted (a search of the LLBA database for such a study on 11 July, 2007 came out empty). Secondly, even if a stronger version of semantics-before-morphology is not supported, a weaker version of similar theoretical import is actually the Bottleneck Hypothesis. This hypothesis postulates that it is the functional morphology indeed that is the “tight spot” in the acquisition process flow. It is processed by declarative memory, has to be learned by rote, and its forms (or phonological features) present difficulty for L2 learners not only at beginning stages of acquisition but at later stages, too (Lardiere, 2005). It is a stumbling block in linguistic production but is also crucial in comprehension. After the figurative bottleneck, application of universal semantic principles continues to flow freely, and target interpretations are achieved.

One criticism that can be leveled at both the Gabriele and the Montrul and Slabakova studies is that their participants are instructed learners. Several studies on the acquisition of tense/aspect morphology have indicated that while instructed and uninstructed learners go through the same developmental stages, instructed learners outperform uninstructed learners with the use of morphology at later stages (Bardovi-Harlig, 1992, 1995). This is because tense/aspect morphology is a prominent topic in any instructional intervention, and classroom learners usually receive extensive instruction and intensive drilling of verbal endings. Bardovi-Harlig’s (1992) study on the development of tense/aspect morphology in English suggests that the development of form precedes appropriate use. Learners provide morphological markers, but sometimes in incorrect contexts. That is, fully grammatical forms emerge and are used by the learners before they carry target-like meaning. Of course, English is a morphologically impoverished language with few, often polysemous, morphemes. It remains to be seen how this comparison will fare in languages with richer inflectional morphology.

Another, and related, shortcoming of these two studies is that they are looking at learners’ recognizing the form and basic meanings of grammatical aspect morphemes (the aspectual tenses). Not only the forms but their aspectual meanings are widely taught and drilled in language classrooms. Thus, it is difficult to rule out instruction effects in general. The study we discuss in the next section was designed to control for a possible instruction effect.

3.7. Uninstructed aspect-related semantic properties

The linguistic properties whose acquisition Slabakova (2003) investigates again have to do with grammatical aspect. English differs from German, Romance, and Slavic with respect to the semantics of the present tense. It is well known that the English simple present tense cannot denote ongoing events.

(58) a. *She eats an apple right now. (#ongoing event)
     b. She is eating an apple right now. (ongoing event)
     c. She eats an apple (every day). (habitual event)

With stative predicates, however, the ongoing reading of the English present is possible.

(59) a. Mike is lazy. (characteristic state)
     b. Mike is being lazy today. (temporary state)

Furthermore, the English bare infinitive denotes not only the processual part of an event but includes the completion of that event. English accomplishment and achievement predicates in the infinitive (without any aspectual morphology) have only complete events in their denotations.

(60) a. I saw Mary cross the street. (completion entailed)
     b. I saw Mary crossing the street. (no completion entailed)

In trying to explain the relationship between the facts illustrated in (58a) and (60a), many researchers have noticed that English verbal morphology is impoverished (Bennett and Partee, 1972; Landman, 1992; Roberts, 1993; Zucchi, 1999). The experimental study adopts Giorgi and Pianesi’s (1997) proposal. English verbs, they argue, are “naked” forms that can express several verbal values, such as the bare infinitive, the first and second person singular, and the first, second and third person plural. Many English words are even categorially ambiguous in that they can either identify an “object” or an “action,” such as cry, play, drive, and many others. Giorgi and Pianesi (1997) propose that verbs are categorially disambiguated in English by being marked in the lexicon with the aspectual feature [+perf], standing for ‘perfective.’ English eventive verbs acquire categorial features by being associated with the aspectual marker [+perf]. In other words, English (eventive) verbs are inherently perfective and include both the process part of the event and its endpoint. Thus, children acquiring English can distinguish verbal forms from nominals, whose feature specification bundle will exclude the feature [+perf]. This feature has to be checked in a functional category, say AspP, in the sentential structure. We shall not go into the rest of the analysis here of how the other grammatical aspectual meanings obtain (but see the original study for details).

In Romance, Slavic, and other Germanic languages, on the other hand, all verbal forms have to be inflected for person, number, and tense. Thus, nouns and verbs cannot have the same forms, unlike English, in which zero-derivation abounds. The Bulgarian verb, for example, is associated with typical verbal features such as [+V, person, number] and it is recognizable and learnable as a verb because of these features. Nominal inflections are distinguishable from verbal ones. Bulgarian verbs are therefore not associated with a [+perf] feature. Unlike English, Bulgarian has no present progressive tense and the present simple tense is ambiguous between a habitual and an ongoing event or state. This is true of eventive verbs as in (61) and of stative verbs as in (62) below.

(61) a. Maria sega jade jabălka. (simultaneous event)
        Maria now eat-PRES apple
        ‘Mary is eating an apple right now.’
     b. Maria jade jabălka vseki den. (habitual activity)
        Maria eat-PRES apple every day
        ‘Mary eats an apple every day.’

(62) a. Maria e mărzeliva. (characteristic state)
        Maria is-PRES lazy
        ‘Mary is lazy.’
     b. Maria v momenta e mărzeliva. (temporary state)
        Maria at this moment is-PRES lazy
        ‘Mary is being lazy.’

Bulgarian verbs do not need to be marked [+perf] in the lexicon. Consequently, Bulgarian equivalents to bare infinitives do not entail completion of the event.

(63) Ivan vidja Maria da presiča ulicata. (no completion entailed)
     Ivan saw Maria to cross street-DET
     ‘John saw Mary crossing the street.’

Thus, Bulgarian and English exhibit a contrast in the present viewpoint aspect. It follows that the Bulgarian functional category AspP does not have to check the feature [+perf] because the verbal root does not carry this feature from the lexicon. In the acquisition of English by Bulgarian native speakers, then, the learning task is to notice the trigger of this property: the fact that English inflectional morphology is highly impoverished, lacking many person-number-tense verb endings. The property itself, if Giorgi and Pianesi are correct, is the [+perf] feature that is attached to English eventive verbs in the lexicon. Knowledge of this property will entail knowledge of four different interpretive facts: 1) bare verb forms denote a completed event; 2) present tense has only habitual interpretation; 3) the progressive affix is needed for ongoing interpretation of eventive verbs; 4) states in the progressive denote temporary states. This is a semantics-syntax mismatch that relates a minimal difference between languages (the presence or absence of a feature in the lexicon) to various and superficially unconnected interpretive properties. None of these properties is attested in the native language of the learners. Perhaps even more important, of the four semantic properties enumerated above, the second, third, and fourth are introduced, discussed, and drilled in language classrooms. The first one, however, is not explicitly taught.

A hundred and twelve Bulgarian learners of English took part in the experiment, as well as 24 native speaker controls. The learners were typical classroom instructed learners. All participants took a TVJT with a story in their native language and a test sentence in English. Here is the example of a test quadruple (repeated from section 1):

(64) A quadruple testing completed interpretation of English bare forms (the construction is known as “perceptual reports”)

Matt had an enormous appetite. He was one of those people who could eat a whole cake at one sitting. But these days he is much more careful what he eats. For example, yesterday he bought a chocolate and vanilla ice cream cake, but ate only half of it after dinner. I know, because I was there with him.
I observed Matt eat a cake.    True    False

Matt had an enormous appetite. He was one of those people who could eat a whole cake at one sitting. But these days he is much more careful what he eats. For example, yesterday he bought a chocolate and vanilla ice cream cake, but ate only half of it after dinner. I know, because I was there with him.
I observed Matt eating a cake.    True    False

Alicia is a thin person, but she has an astounding capacity for eating big quantities of food. Once when I was at her house, she took a whole ice cream cake out of the freezer and ate it all. I almost got sick, just watching her.
I watched Alicia eat a cake.    True    False

Alicia is a thin person, but she has an astounding capacity for eating big quantities of food. Once when I was at her house, she took a whole ice cream cake out of the freezer and ate it all. I almost got sick, just watching her.
I watched Alicia eating a cake.    True    False
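The target answers for a quadruple like (64) follow from the contrast in (60): a bare verb entails completion, whereas the -ing form does not. A minimal sketch of that logic is given below; the function name and the string labels "bare" and "ing" are hypothetical conveniences, not part of the original test materials.

```python
def expected_answer(verb_form, event_completed):
    """Target truth value for a perceptual report, given the story context.

    A bare verb is true only of a completed event; the -ing form is
    compatible with both completed and incomplete events.
    """
    if verb_form == "bare":
        return event_completed
    if verb_form == "ing":
        return True
    raise ValueError(f"unknown verb form: {verb_form!r}")


# The four cells of the quadruple: (verb form, is the event completed in the story?)
QUADRUPLE = [("bare", False), ("ing", False), ("bare", True), ("ing", True)]

for form, completed in QUADRUPLE:
    story = "complete" if completed else "incomplete"
    print(f"{form:5s} + {story:10s} story -> {expected_answer(form, completed)}")
```

Only the bare verb paired with the incomplete-event story calls for a False response; the other three combinations are true, although the text notes that -ing with a complete event is dispreferred.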

Results on the acquisition of all four semantic properties pattern the same way. We focus on the instructed properties first. The less proficient learners are quite accurate in mapping the present simple tense to habitual context (roughly 80%), while they are slightly less accurate at recognizing the progressive form semantics (around 65%). This contrast may be due to the fact that the generic, or habitual, meaning can be expressed by the present tense form in their L1, even though it has to be supported by adverbials and/or context. Thus, if beginning learners are directly mapping simple present tense forms in the L1 and the L2, their semantic acquisition process will be facilitated by the fact that the habitual meaning is available in both cases. The progressive meaning, on the other hand, is associated with a different piece of morphology in the L2, making the process of form-function mapping more problematic. The advanced learners are highly accurate on all three properties taught in classrooms. Thus initial L1 transfer and subsequent morphological acquisition are clearly attested in the data.

[Figure 2: bar chart of mean accuracy (0 to 100 per cent) for four groups (Low Int, Hi Int, Advanced, Controls) in four conditions: incomplete event with bare verb (F), incomplete event with -ing (T), complete event with bare verb (T), and complete event with -ing (T).]

Figure 2. Mean accuracy on bare verb versus –ing form on perceptual reports (in per cent)

Figure 2 presents accuracy on the untaught property: in the perceptual report construction, the bare verb has a completed interpretation. As the figure shows, advanced learners are even more accurate than native speakers in their knowledge that an English bare verb denotes a complete event, and consequently is incompatible with an incomplete event story (see first group of columns). Even more importantly, all learner groups are quite accurate in attributing a complete interpretation to the bare verb, a property that cannot transfer from the L1, as example (63) indicates. Note also that both native speakers and advanced learners prefer to combine complete event stories with a bare verb form, although the –ing form is not ungrammatical. In other words, both groups focus on completion in the context of a telic event. Individual accuracy was calculated using three out of four (75%) correct answers as the cut-off point. Table 16 gives those results.

Table 16. Number of individual participants who were accurate on bare verb versus -ing form crossed with complete and incomplete interpretation (percentage in brackets)

Group             bare verb         -ing              bare verb        -ing
                  incomplete (F)    incomplete (T)    complete (T)     complete (T)
Low (n=32)        14 (44%)          20 (62%)          23 (72%)         19 (60%)
High (n=41)       34 (83%)          25 (61%)          22 (53%)         32 (78%)
Adv (n=39)        36 (92%)          32 (82%)          31 (80%)         25 (64%)
Control (n=24)    20 (80%)          20 (80%)          24 (100%)        21 (84%)
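The individual criterion behind Table 16 (three out of four correct answers per condition, i.e. 75%) can be sketched as below. The function names and the sample scores are hypothetical; the sketch only shows how per-participant counts turn into the "number accurate (percentage)" cells of such a table.

```python
def is_accurate(correct, total=4, cutoff=0.75):
    """True if a participant reaches the cut-off (3 of 4 items by default)."""
    return correct / total >= cutoff


def summarize(correct_counts, total=4, cutoff=0.75):
    """Return (number of accurate participants, percentage of the group)."""
    n_accurate = sum(is_accurate(c, total, cutoff) for c in correct_counts)
    return n_accurate, round(100 * n_accurate / len(correct_counts))


# Invented number of correct answers (out of 4) for a small group in one condition.
hypothetical_group = [4, 3, 2, 1]
n_accurate, percentage = summarize(hypothetical_group)
print(f"{n_accurate} ({percentage}%)")
```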

Individual accuracy shows that more than half of individual learners (ranging from 53% to 100%) have acquired successfully every aspect of the taught properties. Importantly, 44% to 72% of individuals were successful on the different mappings of the untaught property (see Table 16). After establishing that it is possible to acquire semantic properties in the second language that are not manifested in the native language, let us now turn to the impact of the instruction variable. Slabakova (2003) reports that extensive scrutiny of the instruction materials and discussions with the instructors ascertained that the present simple and progressive tense meanings are explicitly taught and drilled from the beginning of classroom instruction. On the other hand, the closed denotation of bare verb forms is not taught, and the Bulgarian teachers are not consciously aware of it. Is it the case that instruction is a significant variable and learners were more accurate on the taught than on untaught properties? The short answer is “no.” Analysis of variance was performed on the data for each group, with condition as the sole factor. The Low Group performs equally accurately on all conditions (F(2, 93) = 1.71, p = .185), and so does the Intermediate Group (F(2, 120) = 2.67, p = .07). The Advanced Group shows a marginally significant difference for condition (F(2, 114) = 3.11, p = .05), but it is due solely to the lower accuracy score (68%) on the –ing verb form combined with a telic story. As mentioned above, this combination is not ungrammatical, it is simply dispreferred. Whenever the story involves an event with a salient endpoint, it is possible to highlight the endpoint or the process leading to that endpoint. Advanced Bulgarian learners exhibit a preference for highlighting the endpoint. This preference of the advanced learners is shared by the controls, although less markedly so. In general, there seems to be no effect of instruction in Bulgarian learners’ acquisition of the semantic properties of English present tenses. The theoretical implication of this finding is that all semantic effects of learning the trigger (English verbs are morphologically impoverished) and the related property ([+perf] feature attached to verbs in the lexicon) appear to be engaged at the same time.

4. Acquisition of article interpretation

It is well known that speakers of a language without articles (e.g., Japanese, Chinese, Korean, Russian) have a hard time using articles correctly in English. A number of experimental studies (Huebner, 1983; Leung, 2001; Master, 1987; Murphy, 1997; Robertson, 2000; Thomas, 1989) have identified two types of errors these learners make: they either omit articles altogether or use them in inappropriate contexts. While these studies identified the error pattern in article use and linked it not to the definiteness but to the specificity of the nominal phrase in the target language, they offered no principled explanation as to why L2 learners’ choices should be affected by specificity. A series of recent studies, reported on in Ionin (2003) and Ionin, Ko, and Wexler (2004), provide a principled, parameter-based explanation of learner article interpretation and use. Before looking at the learning task and the syntax-semantics mismatch, I shall give informal definitions of definiteness and specificity, following Ionin, Ko, and Wexler (2004). Definiteness and specificity are semantic features; both are part of the conceptual arsenal of language. They are also discourse-related since they rely on the knowledge of the speaker and the hearer in a communication situation. By using a definite nominal phrase, a speaker refers to a uniquely identified individual in the mind of the speaker and the hearer.
Definiteness is morphologically encoded by the; indefiniteness, by a(n) for singular count nouns and by zero article for plural nouns. Uniqueness can be established either by prior mention, or by shared knowledge. Take (65) for example.

(65) Julie saw a car in the driveway. The car was red and shiny.

Since at the first mention of the car, the object does not satisfy the presupposition of uniqueness in the mind of the speaker and the hearer, it is indefinite and marked with a. At the second mention, the object is already established in the discourse as known to the speaker and the hearer, hence it is definite and marked with the. An object need not be mentioned in the discourse immediately preceding the definite noun phrase, if it is known both to the speaker and the hearer. The example in (66) is an out-of-the-blue statement between a husband and wife who have ordered a computer repair person to visit their home:

(66) The computer guy is coming at 10. Can you stay at home to meet him?

Specificity, on the other hand, again reflects the property of uniqueness, an object or person being activated in the mind, but only of the speaker, not of the hearer. A specific nominal phrase has to be noteworthy in some discourse-related way. Specificity is not marked morphologically in standard English, that is, both indefinite and definite nominal phrases can be specific or not, as the examples in (67) illustrate. Examples (67c,d) are from Ionin, Ko and Wexler (2004), their (9a,b).

(67) a. Jill wants to marry a Canadian. She is going to present him to her family at Christmas. (indefinite specific)
     b. Jill wants to marry a Canadian, but she has not met one yet. (indefinite non-specific)
     c. I want to speak to the winner of today’s race – she is my best friend! (definite specific)
     d. I want to speak to the winner of today’s race – whoever that is; I’m writing a story about this race for the newspaper. (definite non-specific)

Colloquial English does have a marker of specificity, the demonstrative this, but its usage is not obligatory, see (68).

(68) I saw a/this kitty in the pet shop, and I want to buy her for my daughter.

When speakers of a language without articles have to acquire articles in English, their learning task is to map the semantic feature definiteness onto its morphological expression the, and the lack of it to a(n) or zero article.
Note that in article-less languages definiteness is established through context and other linguistic means such as demonstrative pronouns but does not have dedicated morphology. Thus, it is not the semantic feature definiteness that is new to the learners, just its morphological expression is. The two semantic features, specificity and definiteness, are somewhat close in meaning, having to do with establishing uniqueness of an object in the discourse. It is not inconceivable, then, for learners to use both features in semantically bootstrapping themselves into the morphology. However, Ionin (2003) and Ionin, Ko and Wexler (2004) go one step further and propose the Article Choice Parameter as in (69), a principled explanation of how languages choose which feature their articles reflect.

(69) A language that has two articles distinguishes them as follows:
     The definiteness setting: Articles are distinguished on the basis of definiteness (exemplified in English)
     The specificity setting: Articles are distinguished on the basis of specificity (exemplified in Samoan)

Cross-linguistic evidence for this semantic parameter comes mainly from Samoan, a language that purportedly uses the article le with specific nominal phrases and the article se with non-specific ones, regardless of definiteness.26 Based on the Article Choice Parameter, Ionin, Ko and Wexler (2004) predict that learners of English from article-less languages will fluctuate between the different settings of the UG-supplied semantic parameter until the input leads them to the target value (definiteness). Their native language will not aid them in this choice since it distinguishes both features, but neither of them morphologically. Table 17 summarizes these predictions.

Table 17. Predictions for article choice in L2 English

                    Definite NP (Target the)    Indefinite NP (Target a)
Specific NP         Correct use of the          Overuse of the
Non-specific NP     Overuse of a                Correct use of a

Ionin, Ko and Wexler tested beginning, intermediate, and advanced learners of English with Russian or Korean as their native languages. They employed a forced-choice elicitation task and a production task. The forced-choice elicitation task included short dialogs for context; the test sentences were part of the dialogs, as in (70).


(70) a. Definite non-specific context: Conversation between a police officer and a reporter
        Reporter: Several days ago, Mr. James Peterson, a famous politician, was murdered! Are you investigating his murder?
        Police officer: Yes, we are trying to find (a, the, ___) murderer of Mr. Peterson, but we still don’t know who he is.
     b. Indefinite specific context: Phone conversation
        Jeweler: Hello, this is Robertson’s Jewelry. What can I do for you, ma’am? Are you looking for some new jewelry?
        Client: Not quite. I heard that you also buy back people’s old jewelry.
        Jeweler: That is correct.
        Client: In that case, I would like to sell you (a, the, ___) beautiful silver necklace. It is very valuable, it has been in my family for 100 years!

I will report on the results of the elicitation task from the intermediate and advanced Russian learners only. The Korean learners were much more accurate than the Russian learners and did not exhibit the expected fluctuation pattern as clearly as the Russians, since they were performing at ceiling. Table 18, adapted from Ionin, Ko and Wexler’s Table 12, presents the Russian group results. The percentages in the cells do not add up to 100 because article omissions are excluded. As the group results in Table 18 show, the predictions of the Article Choice Parameter are indeed supported. Learners’ behavior is not random: their choice of article is significantly affected by the specificity of the nominal phrase. In the process of acquiring the morphological expressions of definiteness, they sometimes map non-specific definite NPs onto a, and specific indefinite NPs onto the. In other words, they fluctuate between the definiteness and the specificity values of the Article Choice Parameter.

Table 18. Percentage use of articles in different contexts by Russian L2 learners of English

                   Definite NP (Target the)   Indefinite NP (Target a)
Specific NP        79% the, 8% a              36% the, 54% a
Non-specific NP    57% the, 33% a             7% the, 84% a

However, individual results reveal a more complex picture. Of the 25 intermediate and advanced Russian-native learners, only 5 had acquired the target-like English definiteness pattern. Two participants demonstrated the opposite (specificity) setting of the parameter, and 8 fluctuated between definiteness and specificity. A further 3 learners heeded specificity with either indefinite or definite NPs, but not both. Finally, 7 learners showed a truly miscellaneous pattern, unpredicted by the Article Choice Parameter. In addition, Ionin, Ko and Wexler’s production results revealed that learners overused the with specific indefinites, but did not overuse a with non-specific definites. Furthermore, recent results from Tryzna (2007), who used a similar task with Polish native speakers learning English, suggest that the Article Choice Parameter presents an overly neat picture of article semantics (see note 26). Half of Tryzna’s intermediate proficiency learners demonstrated true optionality of article usage, not dependent on specificity, whereas half of her advanced learners had already acquired the target-like pattern. On the other hand, Tryzna’s Chinese learners were either target-like (65%) or fluctuated between definiteness and specificity (35%). In conclusion, the Article Choice Parameter, together with the related fluctuation prediction, is a principled explanation of learners’ ostensibly erratic but actually predictable article choice. It is an eminently testable hypothesis. Moreover, as more English L2 groups from different languages without articles are tested, some interesting variation is being uncovered.
For example, the presence of classifiers in Chinese (which are only used with indefinite NPs) may be aiding the learners in acquiring English articles, while the presence of demonstrative pronouns (which are specificity and deixis markers) is not aiding Polish and Russian learners in a similar way. However, the main conclusion remains that, even though the learners cannot transfer any morphological knowledge from their native languages, universal semantics helps them to bootstrap themselves into language-specific morphological expressions of definiteness.

5. Acquisition of subjunctive mood

In the final section of this chapter, we shall discuss the case of acquiring properties of functional morphemes which do not exist in the native language of the learners. In principle, linguistic theory postulates that languages may not instantiate, or utilize, all grammatical (functional) meanings available from UG. For instance, Bulgarian, a Slavic language, formalizes the distinction between indicative and re-narrated mood. Whenever the event or state in the proposition has not been witnessed by the speaker and she wishes to draw attention to this fact, a special verbal ending appears. This would be equivalent to the difference between She arrived (and I saw her) and They say she arrived in English; see example (71).

(71) a. Tja pristigna
        she arrive-INDICATIVE-PAST
        ‘She arrived (and I saw her).’
     b. Tja pristigna-la
        she arrive-RENARRATED-PAST
        ‘She arrived (so I am told).’

We can still characterize this learning situation as a syntax-semantics mismatch, because Bulgarian expresses the meaning ‘hearsay’ with a functional morpheme while English expresses it with lexical means. In such cases, if we assume that grammatical meanings are checked in functional projections, a whole new functional category has to be acquired by the L2 learner, including its morphosyntax, phonology, and semantic import. Mood morphemes are particularly difficult for language learners because in many contexts (but not all!) they mark optional interpretation. It is often the case, as in (71), that two grammatical strings describe the same event (her arrival) but differ in additional meanings related to the discourse context, and the relevant evidence for the L2 learner may be very difficult to observe. As in the case of all semantic acquisition, learners have to hear the relevant sentences and at the same time observe situations demonstrating to them the additional meanings (e.g., whether or not the speaker witnessed the event). It is well known that mood distinctions are acquired late by children, are the first to undergo attrition in speakers losing features of their native language, and may create insurmountable difficulty for second language learners. Precisely for these reasons, studies looking at mood distinctions in second language acquisition are especially valuable.
Borgonovo et al. (2006) is a pilot study investigating the acquisition of optional subjunctive use in relative clauses depending on the specificity of the head nominal phrase. As the example in (72) illustrates, grammaticality is not at stake here; rather, two different interpretations arise depending on whether or not the speaker has a specific general in mind, along the lines of who versus whoever (all examples from Borgonovo et al., 2006).

(72) El general que ha / haya dado la orden tiene que ser juzgado
     the general who has-IND/has-SUBJ given the order has that be judged
     ‘The general that has given the order must be judged.’

Use of the indicative morpheme correlates with having a specific person in mind, and use of the subjunctive with not having a specific person in mind. More generally speaking, the meaning of the subjunctive is something like “hypothetical verbal event” or “irrealis.” How do these meanings obtain in the structure of the sentence? The authors assume a scopal account of specificity: “A DP is specific if the existential operator that quantifies over it has scope over other operators such as modals or negation, and it is nonspecific if it scopes under those very same operators. It follows from this characterization that the crucial ingredient for licensing a specific or nonspecific interpretation of a DP is context, taken to mean: presence of an operator.” (Borgonovo et al., 2006: 355) The operators of the context can be negation (73a), interrogation (73b), modals (73c), future tense (73d), imperatives (73e), and strong intensional predicates (73f). Of course, under those same contexts, indicative can be embedded as well (see examples in (74)).

(73) a. No veo un coche que me convenga
        NEG see-1sg a car that me suit-SUBJ-3sg
        ‘I don’t see a car that suits me.’
     b. ¿Ves un coche que me convenga?
        see-2sg a car that me suit-SUBJ-3sg
        ‘Do you see a car that suits me?’
     c. Aquí puedes encontrar un coche que te convenga
        here can-2sg find a car that you suit-SUBJ-3sg
        ‘Here you may find a car that suits you.’
     d. Compraré un coche que me convenga
        buy-FUT-1sg a car that me suit-SUBJ-3sg
        ‘I will buy a car that suits me.’
     e. ¡Compra un coche que te convenga!
        buy a car that you suit-SUBJ-3sg
        ‘Buy a car that suits you.’
     f. Te sugiero que compres un coche que te convenga
        you suggest-1sg that buy-SUBJ-2sg a car that you suit-SUBJ-3sg
        ‘I suggest that you buy a car that suits you.’


These are the types of contexts Borgonovo et al. test in this experimental study. The TVJT focused on contexts where both the subjunctive and the indicative were allowed by the grammar, in terms of sentence structure. More specifically, they devised stories that made the specificity of the DP abundantly clear, and then checked whether participants allowed or rejected the subjunctive and indicative in the relative clauses.27 The test contained 48 stories, each followed by 2 sentences in Spanish produced by one of the participants in the story. The learners had to decide whether each of the sentences corresponded to the context of the story, and to rate them from -2 to +2. The two test sentences under each story were identical except for the subjunctive or indicative verb form. They all contained DPs with indefinite determiners. The context and the type of predicate under which the targeted mood was embedded were expected to give the participants enough clues as to the correct choice of mood. The grammatical contexts (operators) included in the TVJT were negation, future, interrogatives, imperatives, modals, and strong intensional predicates. Here is an example of a future story and test sentences.

(74) a. Subjunctive expected
        Mr. López wants to buy a ring for his girlfriend, who hates diamonds but loves emeralds. He knows no emerald is perfect but he wants the best. He says to his friends:
        # Pase lo que pase, encontraré un anillo que tiene una esmeralda casi perfecta.
        √ Pase lo que pase, encontraré un anillo que tenga una esmeralda casi perfecta.
        ‘Whatever happens, I’ll find a ring that has an almost perfect emerald.’
     b. Indicative expected
        Sara and Pedro have a leak in their bathroom faucet but they don’t have money to pay the plumber. Pedro goes to the library, where they have only one book that explains how to change that particular type of faucet. He says to Sara:
        √ Sacaré de la biblioteca un libro que explica cómo arreglar el problema.
        # Sacaré de la biblioteca un libro que explique cómo arreglar el problema.
        ‘I’ll check out a book from the library that explains how to fix the problem.’

Table 19. Mean responses on contexts with ‘appropriate’ subjunctive (choices between -2 and +2)

Context                        Mood    Status   Natives (n=17)   Adv (n=8)   Interm (n=8)
Strong intensional predicate   Subj    √        1.804****        1.75**      1.33
                               Indic   #        -.627            -.5         .458
Future                         Subj    √        1.765****        1.125       .708
                               Indic   #        -.686            .458        .833
Negation                       Subj    √        1.735****        .458        .604
                               Indic   #        -.363            .917        1.229
Interrogative                  Subj    √        1.784**          1.625*      1.458
                               Indic   #        .137             -.292       .667
Imperative                     Subj    √        1.814****        1.333*      1.146
                               Indic   #        -.859            .05         .575
Modal                          Subj    √        1.656****        1.312       1.188
                               Indic   #        -.469            .125        .5

**** p