German Pages 207 [220] Year 2008
Linguistische Arbeiten
520
Edited by Klaus von Heusinger, Gereon Müller, Ingo Plag, Beatrice Primus, Elisabeth Stark and Richard Wiese
Fortgeschrittene Lernervarietäten: Korpuslinguistik und Zweitspracherwerbsforschung
Edited by Maik Walter and Patrick Grommes
Max Niemeyer Verlag Tübingen 2008
Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.ddb.de.
ISBN 978-3-484-30520-5
ISSN 0344-6727
© Max Niemeyer Verlag, Tübingen 2008. An imprint of Walter de Gruyter GmbH & Co. KG. http://www.niemeyer.de
This work, including all of its parts, is protected by copyright. Any use outside the narrow limits of copyright law without the publisher's consent is not permitted.
| INTERSECT | GERMAN > ENGLISH | ENGLISH > GERMAN |
|---|---|---|
| Source texts (no. of tokens) | German: 519 540 | English: 205 359 |
| Target texts (no. of tokens) | English: 585 693 | German: 200 379 |
| Fiction | 6 texts | 5 texts |
| Non-fiction | 21 texts | 10 texts |
With this setup of corpora for CA and CIA it is possible to draw a comprehensive picture of modality in L2 German, but in order to be able to extract meaningful information about different uses of modal expressions, these expressions need to be annotated.
Modality as Indicator of L2 Proficiency?
5. Annotation of modal expressions

As indicated before, the two types of modal expressions dealt with in this article are modal verbs on the one hand and modal adverbials and related adjectival/noun/prepositional phrases and verbal constructions on the other.⁸ As the aim of the annotation process is not only to identify e.g. different uses of modal verbs, but also to be able to compare uses of different types of modal expressions, e.g. modal verbs and modal adverbials, the annotation scheme needs to allow for the integration of these different types into one taxonomy. The tagset that has been devised for this investigation first of all makes a broad distinction between epistemic and non-epistemic modal expressions, which are then categorised further⁹ according to certain criteria. In the epistemic section, modal expressions are sub-divided according to the degree of certainty expressed by the speaker, as can be seen in the (abridged and condensed) table 3. Evidentials are annotated separately (cf. table 4).

Table 3: Taxonomy for epistemic modal expressions

| CATEGORY | MODAL VERB | MODAL ADVERBIALS AND RELATED ADJECTIVAL/NOUN/PREPOSITIONAL PHRASES AND VERBAL CONSTRUCTIONS (EXAMPLES) |
|---|---|---|
| compelling conclusion | müssen | sicher, mit Sicherheit etc. |
| more tentative conclusion | müsste, sollte | höchstwahrscheinlich, ziemlich sicher, meiner Meinung nach, überzeugt sein, denken, glauben etc. |
| confident assumption | werden | voraussichtlich |
| feasible assumption | dürfte | normalerweise, anscheinend, wohl, vermutlich, es ist anzunehmen, vermuten |
| assumption of possibility | können, mögen | |
| tentative assumption of possibility | könnte | möglicherweise, eventuell, vielleicht, unter Umständen etc. |
Table 4: Taxonomy for evidentials

| CATEGORY | MODAL VERB | MODAL ADVERBIALS AND RELATED ADJECTIVAL/NOUN/PREPOSITIONAL PHRASES AND VERBAL CONSTRUCTIONS (EXAMPLES) |
|---|---|---|
| claim by grammatical subject | wollen | |
| claim by others (hearsay) | sollen | angeblich, wie verlautet, gerüchteweise |
––––––––––
8 For ease of reference, I will henceforth refer to this whole group as modal adverbials.
9 The taxonomies are mainly based on the grammatical descriptions by Brinkmann (1971) and Zifonun et al. (1997); additional modal adverbials have been extracted from Dornseiff (1965) and Wehrle (1967).
Ursula Maden-Weinberger
Other epistemic modal adverbials in additional categories were annotated, such as purely assertive (e.g. tatsächlich, wirklich), evaluative assertive (e.g. glücklicherweise, günstigerweise, immerhin, zumindest, bedauerlicherweise), assertive on evidence (e.g. es ist bewiesen, ohne Zweifel, ohne Frage, auf alle Fälle, bekanntlich, offensichtlich, logischerweise) and negative concessive (scheinbar, kaum, schwerlich, eher nicht). In total, a catalogue of approximately 220 items was annotated.

For non-epistemic modality, a similar taxonomy was drawn up. Here, however, the categorisation was along the distinction of obligation – permission for deontic modality in addition to other non-epistemic categories such as volition, ability etc.

Table 5: Taxonomy for non-epistemic modal expressions

| CATEGORY | MODAL VERB | MODAL ADVERBIALS AND RELATED ADJECTIVAL/NOUN/PREPOSITIONAL PHRASES AND VERBAL CONSTRUCTIONS (EXAMPLES) |
|---|---|---|
| necessity (unspecified source) | müssen | es ist notwendig, es besteht die Notwendigkeit, gezwungen sein, es ist erforderlich etc. |
| obligation (necessity through external, authoritative source) | sollen | befohlen, verlangt, gefordert etc. |
| recommendation, suggestion | sollte | es wäre angebracht, richtig, gut etc. |
| volition, intention | wollen | beabsichtigen, planen, streben nach, (neg.) sich weigern, keine Lust haben etc. |
| attenuated volition/wish | möchte | sich wünschen, Lust haben |
| liking/affection | mögen | gern haben, lieben, (neg.) hassen, nicht leiden können, schlimm finden |
| circumstantial possibility | können | möglich sein, die Möglichkeit haben |
| ability | können | vermögen, fähig sein, in der Lage sein, Erfahrung/Kenntnis haben |
| permission | dürfen, können | erlaubt sein, Erlaubnis haben, (neg.) verboten, vorschriftswidrig, gegen Gesetz/Regeln |
Since modal verbs can be ambiguous or carry both epistemic and non-epistemic meaning (see Coates 1983 for detailed explanations), a category “undecided” was included in the tagset. By far the largest portion of cases where a decision between epistemic and non-epistemic meaning could not be reached occurred with the modal verb können; other modal verbs were hardly affected at all. Modal verbs classified as “undecided” were included in the overall counts of modal verb use (figure 2) but excluded from the further analyses of epistemic and non-epistemic modal verb use.

Overall, almost 4800 modal verbs were annotated in CLEG and 1400 in KEDS. This annotation was carried out manually, which also allowed for the annotation of errors in both form and use of modal verbs. Modal adverbials totalled approximately 1900 in CLEG and 850 in KEDS and were annotated semi-automatically with CoAn.¹⁰ Both annotation processes were carried out solely by myself as part of my ongoing PhD research. Although annotations were partly cross-checked at later dates and guidelines for consistent annotation of problematic cases were drawn up, the fact that there is only one annotator obviously increases the possibility of annotation errors due to tiredness etc. However, it at least avoids the problem of inter-annotator inconsistencies.
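The counting rule for “undecided” cases (counted in the overall modal verb totals, left out of the epistemic/non-epistemic breakdown) can be sketched in Python. The tag names, the pair format and the sample annotations are illustrative assumptions; they are not the format used for CLEG or KEDS, and the sketch is not a description of CoAn.

```python
from collections import Counter

# Hypothetical tag names; the chapter describes the categories, not a file format.
EPISTEMIC = "epistemic"
NON_EPISTEMIC = "non-epistemic"
UNDECIDED = "undecided"


def summarise(annotations):
    """Return (overall counts per verb, counts per (verb, reading)).

    `annotations` is an iterable of (modal_verb, reading) pairs.
    Undecided cases count towards the overall totals only.
    """
    annotations = list(annotations)  # allow generators to be read twice
    overall = Counter(verb for verb, _ in annotations)
    by_reading = Counter(
        (verb, reading)
        for verb, reading in annotations
        if reading != UNDECIDED
    )
    return overall, by_reading


# Invented sample data, for illustration only.
sample = [
    ("können", NON_EPISTEMIC),
    ("können", UNDECIDED),
    ("können", EPISTEMIC),
    ("müssen", EPISTEMIC),
]
overall, by_reading = summarise(sample)
```

Keeping the two counters separate means the overall figure for a verb can exceed the sum of its epistemic and non-epistemic readings, which is the intended effect of the “undecided” category.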
6. Results

The following diagram gives an overview of normalised frequencies for modal verbs and modal adverbials in each of the three learner groups as compared to the native speakers:

Figure 2: Normalised frequencies of modal verbs and modal adverbials (per 10 000 words). [Bar chart; series: Modal Verbs, Modal Adverbials; groups: Year A, Year B, Year C, KEDS.]
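The normalisation per 10 000 words and a log-likelihood comparison of a learner group with the native speaker corpus can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and all counts are hypothetical, and the two-cell log-likelihood statistic shown here is the one commonly used in corpus linguistics, which may differ in detail from the calculation used in this study.

```python
import math


def per_10k(count, corpus_size):
    """Frequency normalised to occurrences per 10 000 words."""
    return count / corpus_size * 10_000


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-cell log-likelihood (G2) for one item in two corpora.

    Expected values assume the item is equally frequent in both corpora.
    Cells with an observed count of zero contribute nothing to the sum.
    """
    total = count_a + count_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    ll = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Hypothetical counts: 250 vs. 150 hits in two corpora of 10 000 words each.
learner_rate = per_10k(250, 10_000)
native_rate = per_10k(150, 10_000)
g2 = log_likelihood(250, 10_000, 150, 10_000)
# With one degree of freedom, values above 3.84 correspond to p < 0.05.
significant = g2 > 3.84
```

Normalising first makes corpora of different sizes comparable; the significance test itself works on the raw counts and corpus sizes.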
From calculations of log-likelihood values testing the statistical significance of the difference in frequencies between the respective learner groups (Year A, B and C) and native speakers (KEDS), it can be seen that the overuse of modal verbs is statistically significant at pG
| müssen – must | 78.4 | 77.4 | 80.8 |
| werden – will | 68.7 | 62.4 | 73.4 |
| können – can | 60.0 | 50.8 | 82.8 |
| könnte – could | 51.5 | 45.5 | 58.3 |
| mögen – may | 26.1 | 0.0 | 32.4 |
| sollte – should | 20.4 | 45.7 | 0.0 |
This indicates that the learners rely heavily on those epistemic modal verbs where they know that transfer from their L1 will be positive. This strategy is used by learners in all proficiency groups; if anything, it seems to be more prominent for Year C learners. For sollte and mögen, the two remaining modal verbs whose English cognates can be used for the same meaning but are usually not the preferred option, a look at the corresponding modal adverbials is revealing. In both categories – tentative conclusion (sollte) and assumption of a possibility (mögen) – the modal verbs are underused (sollte is in fact not used at all by learners in Year A), yet the corresponding modal adverbials are overused by the learners (figure 5), despite the general underuse of modal adverbials overall that was noted earlier:
Figure 5: Frequency of modal adverbials corresponding to modal verbs sollte and mögen¹² (normalised per 10 000 words). [Chart; series: SOLLTE, MÖGEN; groups: Year A, Year B, Year C, KEDS.]
When we look closer at the types of modal adverbials and related expressions in these two categories, it becomes apparent that the overuse of these items is concentrated on only a small number of different types of modal adverbials. The following table presents frequency information for the most common types of modal adverbials for the respective categories:

Table 9: Modal adverbials: frequencies per 10 000 words and percentages within the respective categories

| | YEAR A | YEAR B | YEAR C | KEDS |
|---|---|---|---|---|
| tentative conclusion (SOLLTE) | 36.8 (100.0%) | 28.1 (100.0%) | 13.7 (100.0%) | 9.9 (100.0%) |
| meiner Meinung nach / ich bin der Meinung | 14.7 (39.9%) | 6.2 (22.1%) | 5.3 (38.7%) | 4.0 (40.4%) |
| ich glaube | 11.3 (30.7%) | 8.4 (29.9%) | 3.1 (22.6%) | 1.2 (12.1%) |
| ich denke | 7.6 (20.7%) | 9.3 (33.1%) | 1.9 (13.9%) | 1.8 (18.2%) |
| accumulative percentage | 91.3% | 85.1% | 75.2% | 70.7% |
| assumption of possibility (MÖGEN) | 9.1 (100.0%) | 14.9 (100.0%) | 18.4 (100.0%) | 9.7 (100.0%) |
| vielleicht | 7.6 (83.5%) | 12.8 (85.9%) | 15.3 (83.2%) | 5.9 (60.8%) |
––––––––––
12 N.B. These are modal adverbials and related expressions that belong to the same modality categories as the modal verbs (cf. Table 3). The modal verbs are used as category labels in this diagram for ease of reading and comparison.
The percentage figures show that in both categories learners rely heavily on a small number of types. In Year A, 91.3% of instances of modal adverbials for tentative conclusions are made up of just three lexical expressions. This ties in with Dittmar & Ahrenholz’s (1995: 206) conclusion for natural second language acquisition that what they call propositional attitude verbs (such as glauben, denken etc.) and adverbs like vielleicht dominate expressions of epistemic meanings in earlier stages of acquisition, while epistemic readings of modal verbs appear later and less frequently. Hasselgren (1994) coined the term “lexical teddy bears” for items that learners are very fond of using (and hence overuse), because they feel that they have mastered them as “safe” options to avoid errors. Learners thus usually revert to these “teddy bears” whenever they feel that they do not have any alternative ways of expressing the same meaning or when they are unsure whether other options are correct.

It has to be noted that the modal adverbial “teddy bears” in this case are the same items that native speakers prefer to use, but the native speakers make use of a wider range of modal adverbials besides the highest frequency items (and use the corresponding modal verbs more often). For the category mögen, for example, native speakers used 7 different types as compared to 3 (Year A), 5 (Year B) and 6 (Year C) types. We can see, however, that the learners show an increase in variation of modal adverbials with increasing proficiency. This trend is also reflected in the overall number of different types of epistemic modal adverbials, which increases from 29 (Year A) to 45 (Year B) to 50 (Year C) as compared to 65 in KEDS. This indicates that two factors have to be taken into account when considering modal adverbial use as an indicator of proficiency: frequency of use and range of different types.
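The percentages in Table 9 can be checked with a short calculation. The sketch below uses the Year A frequencies for the category tentative conclusion as given in the table; the variable and function names are illustrative assumptions. Since the published frequencies are rounded to one decimal place, recomputed percentages for other columns may differ from the printed ones in the last digit.

```python
# Year A, tentative conclusion (SOLLTE): frequencies per 10 000 words (Table 9).
year_a_total = 36.8
year_a_items = {
    "meiner Meinung nach / ich bin der Meinung": 14.7,
    "ich glaube": 11.3,
    "ich denke": 7.6,
}


def shares(items, total):
    """Percentage of the category total accounted for by each expression."""
    return {name: round(freq / total * 100, 1) for name, freq in items.items()}


def cumulative_share(items, total):
    """Percentage of the category total covered by all listed expressions."""
    return round(sum(items.values()) / total * 100, 1)


item_shares = shares(year_a_items, year_a_total)
top_three = cumulative_share(year_a_items, year_a_total)
```

The cumulative share is a simple concentration measure: the closer it is to 100%, the more a group relies on a handful of expressions.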
7. Conclusion

The study that has been presented in this article is part of an ongoing project. The findings can therefore only serve as a first selection of “snapshots” of modality in learner German in an instructed setting that requires further and deeper analysis in order to make up a comprehensive album. They have shown, however, that the learners at the starting point of the investigation (i.e. in their first year of university studies) have, in principle, mastered the use of modal verbs in non-epistemic contexts. These include those modal verbs that are similar in form and meaning to English modal verbs, but also the ones that are not. There are more complex issues to be investigated, though, as there is still considerable confusion over verb forms (e.g. past tense vs. subjunctive), there are signs of L1 transfer that show especially in typical errors (e.g. negation of müssen as nicht müssen instead of nicht dürfen in analogy to English must – must not), and there is a general overuse of non-epistemic modal verbs that could not yet be explained. It is suggested that a more pragmatically oriented exploration, which looks at the context of modal verb use and the way learners present and structure text involving modal verbs, might shed some light on this issue. It is expected that it is also on these higher-order discourse levels that differences between the learners in lower and higher proficiency groups will arise, which in turn might be established as features of advancedness with regard to modality.
As for epistemic modality, the data shows that learners in the corpus are relying heavily on modal expressions that they feel “safe” and comfortable with. In the case of modal verbs these are those where transfer from the L1 is positive (modal verbs that are similar in form and meaning in German and English). Those epistemic meanings of modal verbs that have no corresponding modal verb in English (e.g. evidential sollen and wollen) are the last to be acquired – or available for production – as we see their first emergence only in the higher proficiency groups. On the one hand, this can be interpreted as corroborating evidence for L1 transfer influences; on the other hand, it ties in with Kasper & Rose's (2002: 176) observation (based on Kärkkäinen 1992) in natural adult second language acquisition that “implicit, syntactically integrated, nonroutinized expressions of epistemic modality are more difficult to acquire than explicit, extra-clausal (parenthetical) and routinized expressions”. In the case of modal adverbials and related lexical expressions, the development starts out with a small number of overused items which is gradually expanded into a more varied set of modal adverbial expressions with increasing proficiency. In this respect, proficiency seems to be related more strongly to general vocabulary development and less influenced by L1 transfer.

On the whole, we can conclude from these initial findings that certain aspects of modality, such as the emergence of particular modal verb meanings, can be seen as markers of proficiency. It has to be acknowledged, however, that the use of modal expressions is probably more strongly influenced by, and therefore tied in with, a more general advancedness in writing ability: the more expertise the learners gain in how to write well-structured and well-argued essays in the foreign language, the more complex and elaborate their use of modal expressions becomes.
References
Aijmer, Karin (2002). Modality in advanced Swedish learners’ written interlanguage. In Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching, Sylviane Granger, Joseph Hung & Stephanie Petch-Tyson (eds.), 55-76. Amsterdam & Philadelphia: Benjamins.
Altenberg, Bengt (1998). Adverbial connectors in English and Swedish. In Out of Corpora, Hilde Hasselgård & Signe Oksefjell (eds.), 249-268. Amsterdam: Rodopi.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan (1999). Longman Grammar of Spoken and Written English. London: Longman.
Brinkmann, Hennig (1971). Die Deutsche Sprache – Gestalt und Leistung. Düsseldorf: Schwann, 2., neubearbeitete und erweiterte Auflage.
Brünner, Gisela & Angelika Redder (1983). Studien zur Verwendung der Modalverben. Tübingen: Narr.
Bybee, Joan (1985). Morphology: A Study of the Relation between Meaning and Form. Series: Typological Studies in Language 9. Amsterdam & Philadelphia: Benjamins.
Bybee, Joan & Suzanne Fleischman (1995). Modality in grammar and discourse: an introductory essay. In Modality in grammar and discourse, Joan Bybee & Suzanne Fleischman (eds.), 1-14. Amsterdam & Philadelphia: Benjamins.
Bybee, Joan, Revere Perkins & William Pagliuca (1994). The Evolution of Grammar: Tense, Aspect and Modality in the Languages of the World. Chicago: Chicago University Press.
Calbert, Joseph (1975). Towards the Semantics of Modality. In Aspekte der Modalität, Joseph Calbert & Heinz Vater (eds.), 1-70. Tübingen: Narr.
Chellas, Brian (1980). Modal Logic: An Introduction. Cambridge: Cambridge University Press.
Coates, Jennifer (1983). The semantics of the modal auxiliaries. London & Canberra: Croom Helm.
Diewald, Gabriele (1999). Die Modalverben im Deutschen. Grammatikalisierung und Polyfunktionalität. Tübingen: Niemeyer.
Dittmar, Norbert (1993). Proto-Semantics and Emergent Grammars. In Modality in Language Acquisition/ Modalité et Acquisition des Langues, Norbert Dittmar & Astrid Reich (eds.), 213-233. Berlin u.a.: Walter de Gruyter.
Dittmar, Norbert, Astrid Reich, Magdalena Schumacher, Romuald Skiba & Heiner Terborg (1990). Die Erlernung modaler Konzepte des Deutschen durch erwachsene polnische Migranten. Info DaF 17: 125-172.
Dittmar, Norbert & Bernt Ahrenholz (1995). The acquisition of modal expressions and related grammatical means by an Italian learner of German in the course of 3 years of longitudinal observation. In From pragmatics to syntax. Modality in second language acquisition, Anna Giacalone Ramat & Grazia Crocco Galèas (eds.), 197-232. Tübingen: Narr.
Dornseiff, Franz (1965). Der deutsche Wortschatz nach Sachgruppen. Berlin u.a.: Walter de Gruyter, 6. Auflage.
Giacalone Ramat, Anna (1992). Grammaticalization processes in the area of temporal and modal relations. Studies in Second Language Acquisition 14: 297-322.
Giacalone Ramat, Anna & Grazia Crocco Galèas (eds.) (1995). From pragmatics to syntax. Modality in second language acquisition. Tübingen: Narr.
Gilquin, Gaëtanelle (2001). The Integrated Contrastive Model. Spicing up your data. Languages in Contrast 3: 95-123.
Granger, Sylviane (1996). From CA to CIA and Back: An Integrated Approach to Computerized Bilingual and Learner Corpora. In Languages in Contrast. Text-based cross-linguistic studies, Karin Aijmer, Bengt Altenberg & Mats Johansson (eds.), 37-51. Lund: Lund University Press.
Hasselgren, Angela (1994). Lexical Teddy Bears and Advanced Learners: a study into the ways Norwegian students cope with vocabulary. International Journal of Applied Linguistics 4: 237-260.
Heine, Bernd (1995). Agent-oriented vs. Epistemic Modality. Some observations on German Modals. In Modality in grammar and discourse, Joan Bybee & Suzanne Fleischman (eds.), 17-54. Amsterdam & Philadelphia: Benjamins.
Huddleston, Rodney & Geoffrey K. Pullum (2002). Cambridge grammar of the English language. Cambridge: Cambridge University Press.
Kärkkäinen, Elise (1992). Modality as a strategy in interaction: Epistemic modality in the language of native and non-native speakers of English. In Pragmatics and language learning, Lawrence Bouton & Yamuna Kachru (eds.), 197-216. Urbana, Illinois: Division of English as an International Language, University of Illinois, Urbana Champaign.
Kasper, Gabriele & Kenneth Rose (2002). Pragmatic Development in a Second Language. Oxford & Malden, MA: Blackwell.
Lyons, John (1977). Semantics. Cambridge: Cambridge University Press.
Nehls, Dietrich (1986). Semantik und Syntax des englischen Verbs. Teil II: Die Modalverben. Eine kontrastive Analyse der Modalverben im Englischen und Deutschen. Heidelberg: Groos.
Palmer, Frank (1986). Mood and Modality. Cambridge: Cambridge University Press.
Salkie, Raphael (1995). INTERSECT: a parallel corpus project at Brighton University. Computers & Texts 9: 4-5.
Stephany, Ursula (1989). Modality in First Language Acquisition: The State of the Art. In Modality in language acquisition, Norbert Dittmar & Astrid Reich (eds.), 133-144. Berlin u.a.: Walter de Gruyter.
Tono, Yukio (2001). The Role of Learner Corpora in SLA Research and Foreign Language Teaching: A Multiple Comparison Approach. Unpublished PhD thesis, Lancaster University.
von Wright, Georg Henrik (1951). An essay in modal logic. Amsterdam: North Holland.
Wehrle, Hans (1967). Deutscher Wortschatz: Ein Wegweiser zum treffenden Ausdruck. 13. Auflage. Stuttgart: Klett.
Weinberger, Ursula (2005). Error Analysis with computer learner corpora: a corpus-based study of errors in the written German of British university students. In Linguistics, Language Learning and Language Teaching, David Allerton, Cornelia Tschichold & Judith Wieser (eds.), 119-130. Basel: Schwabe.
Wichmann, Anne & Jane Nielsen (2000). Rights and obligations in legal contracts: corpus evidence. In Working with German Corpora, Bill Dodd (ed.), 245-266. Birmingham: Birmingham University Press.
Zifonun, Gisela, Ludger Hoffmann & Bruno Strecker (1997). Grammatik der deutschen Sprache. Berlin u.a.: Walter de Gruyter.
Marcus Callies & Konrad Szczesniak
Argument Realisation, Information Status and Syntactic Weight – A Learner-Corpus Study of the Dative Alternation¹
1. Introduction and Motivation

Lately, second language acquisition (SLA) research has seen an increasing interest in advanced stages of acquisition and questions of near-native competence, but there are still relatively few studies of advanced learners compared to those at the early and intermediate stages of the learning process. It has been controversial to what extent and under what circumstances adult speakers of a foreign/second language (L2) can achieve native-like proficiency. Moreover, the monolingual native speaker (NS) as an unquestioned basis for comparison of native-like competence has recently come under attack (e.g. Birdsong 2005). While in many (European) countries the ultimate goal of foreign language teaching at the advanced level has traditionally been for the students to achieve a near-native command of the L2, it is often left unspecified what native-like proficiency exactly means (de Haan 1997: 55). Despite the growing interest in what has also been called the advanced learner variety (ALV), the field is still struggling with both a clarification and definition of the concepts “advanced learner” and “nativelikeness”, as well as an in-depth description of the ALV, especially with regard to learners’ acquisition of optional and highly L2 specific phenomena in all linguistic subsystems.

Advanced learners have typically mastered the L2 rules of morphosyntax, and their written production is mainly free from serious grammatical errors. However, their writing often sounds unidiomatic and shows subtle differences to texts produced by NSs. The exact reasons for this “non-nativeness” or “foreign-soundingness” are difficult to pin down and are frequently explained by using vague cover terms such as “unidiomaticity” or “style”. Recently, learner corpus research has yielded empirical evidence that texts produced by (advanced) learners and NSs differ in terms of frequencies of certain words, phrases or syntactic structures.
In a recent overview of the field, Granger (2004: 135) defines advanced interlanguage as “the result of a very complex interplay of factors: developmental, teaching-induced and transfer-related, some shared by several learner populations, others more specific.” According to her, typical features of the ALV are, for instance, overuse of high frequency vocabulary, overindulgence of a limited number of prefabs and a much higher degree of personal involvement, as well as stylistic deficiencies, often characterised by an overly spoken style or a somewhat puzzling mixture of formal and informal markers.
––––––––––
1 We would like to thank David Smith and an anonymous reviewer for helpful comments on an earlier version.
In addition, there is evidence that another factor that distinguishes advanced learners from NSs is the way they use linguistic structures to organise information in discourse (Carroll et al. 2000). Information structure management turns out to be problematic even for advanced L2 learners as they experience problems with information distribution and the end-weight principle when using certain syntactic patterns (e.g. unusually heavy focus constituents in it-clefts or non-extraposed, thus heavy clausal subjects, see Callies 2006).

To carry the description of the ALV forward, and to find out what it is that remains problematic in the advanced stages of acquisition, we believe that it is fruitful to investigate optional as well as L2-specific phenomena. Optionality is abundant in many patterns of syntactic variation which are sensitive to principles of information structure, such as information status and syntactic weight, whereas verb argument alternations are often highly language-specific.

The present study is a contribution to the description of the ALV and provides a corpus-based examination of the dative alternation (DA) in the written production of NSs and advanced German and Polish learners of English as a foreign language (EFL). The DA has been chosen as the topic of investigation because it is a verb alternation that is subject to semantic constraints, and in which principles of information structure influence the occurrence of either structural variant (see Section 2.1.2. below). In a recent review of issues relating to difficulties encountered in learning L2 grammar, DeKeyser (2005) explicitly points to problems of optionality in form-meaning mappings and argues that

“the acquisition problem is compounded even further when optionality and discourse-motivated preferences for one of the options interact with arbitrary or semantically obscure subcategorization restrictions, such as, [..] the restrictions on dative alternation” (DeKeyser 2005: 10).
We examine the frequencies of use of the two alternative postverbal constituent orderings (the prepositional and the double object construction, see Section 2. below) in the writing of two groups of advanced learners of English with different native languages (L1s) to find out which factors influence the occurrence of either structural variant. In particular, we are interested to see whether concepts such as information status and syntactic weight play a role, and hope to provide answers to the question of how far such principles are operative in the ALV. While the acquisition of argument structure alternations has received considerable attention within approaches to SLA based on Universal Grammar (UG), discourse-functional syntactic aspects of these phenomena have largely been neglected.
2. The Dative Alternation (DA)

The DA has been a much discussed topic in the literature, and many alternative names have been used to describe its dynamics, with one syntactic variant being called ditransitive, dative or double object construction. To avoid confusion, the present paper will consistently use the terminology as illustrated in the examples (1) and (2). In the DA, a verb appears in two related structures in which it takes two complements. In our examples, the phrases a lot of power and wide reaching powers, respectively, are the theme, a role associated with entities which change location or state. The President is the recipient in (1) and, preceded by a preposition, the goal in (2). The two variants differ in terms of their syntactic structures, hence the two distinct names given to them. Variant (1) will be referred to as the double object construction (DOC), and (2) will be called the prepositional construction (PC).

(1) Although he was unable to change the constitution into a fully presidential one he did manage to include two clauses which gave [the President] (indirect object / recipient) [a lot of power] (direct object / theme). (DOC)

(2) The ambiguities such as in Article 34 and 37 over the way legislation is past [sic!] also gave [wide reaching powers] (direct object / theme) [to the President] (indirect object / goal). (PC)
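The distinction between the two variants, together with a simple weight measure, can be sketched in Python using the bracketed constituents of examples (1) and (2). This is only an illustration: the class and function names are hypothetical, and counting words is just one possible way of operationalising syntactic weight, assumed here for simplicity.

```python
from dataclasses import dataclass


@dataclass
class Constituent:
    role: str  # "recipient", "goal" or "theme"
    text: str


def variant(first, second):
    """Classify a dative clause by the order of its postverbal constituents."""
    order = (first.role, second.role)
    if order == ("recipient", "theme"):
        return "DOC"  # double object construction
    if order == ("theme", "goal"):
        return "PC"   # prepositional construction
    raise ValueError(f"unexpected constituent order: {order}")


def weight(constituent):
    """Syntactic weight, approximated as the number of words (an assumption)."""
    return len(constituent.text.split())


# Postverbal constituents of examples (1) and (2).
example_1 = (
    Constituent("recipient", "the President"),
    Constituent("theme", "a lot of power"),
)
example_2 = (
    Constituent("theme", "wide reaching powers"),
    Constituent("goal", "to the President"),
)

for first, second in (example_1, example_2):
    print(variant(first, second), weight(first), weight(second))
```

In example (1) the shorter constituent precedes the longer one, which is the ordering the end-weight principle would favour; in example (2) both constituents have the same length under this measure.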
2.1. Factors that Influence the DA

Dative verbs and the choice of either structural variant are subject to a number of morphological, semantic and discourse constraints. These have been described extensively in the literature, and it is impossible to reproduce the entire body of findings here. However, it is necessary to briefly review the main factors relevant to our study.

2.1.1. Constraints on Dative Verbs

Levin (1993) distinguishes between alternating dative verbs which allow both structural variants, and two groups of non-alternating dative verbs: one group allows the prepositional variant only, mostly verbs of Latin origin, while the second group allows the double-object variant only. According to Pinker (1989: 45f.), the DA is subject to an interaction of morphological and semantic constraints. The morphological constraint says that − with few exceptions, e.g. promise, offer or allow − dative verbs tend to have native, i.e. Germanic stems, while Latinate stress-final stems are disallowed; consider the following examples:

(3) a. Konrad gave / donated / presented a book to the library.
    b. Konrad gave / *donated / *presented the library a book.

(4) a. Marcus told / reported / explained the incident to the librarian.
    b. Marcus told / *reported / *explained the librarian the incident.
The semantic constraint Pinker mentions has been termed the “projected possessor” effect (Green 1974, Oehrle 1976), and refers to the fact that dative verbs “must be capable of denoting prospective possession of the referent of the second object by the referent of the first object” (Pinker 1989: 48). This includes non-literal, metaphorical possession as in tell someone a story or ask someone a question.
The possession interpretation, or more generally, that of an affected recipient, is behind the meaning difference between the sentences in the following much-quoted example. The implication that the students actually learned French is stronger in (5a):

(5) a. Beth taught the students French.
    b. Beth taught French to the students.

This structure, therefore, carries the requirement that the first object be an animate entity capable of projected possession; the inanimacy of the recipients in (6a) and (6b) makes these sentences sound noticeably odd.

(6) a. ?Jim sent New York the package.
    b. *The sailor threw the pier the rope.
The PC is reserved for causation of transfer to spatial goals. Whereas the DOC has a semantic structure “X causes Z to have Y”, the meaning associated with the PC is “X causes Y to go to Z” (Pinker 1989: 82). Gropen et al. (1989) and Pinker (1989) classify the above constraint as belonging under broad range (lexical) rules (BRRs), as the constraint on animacy is believed to be universal. In addition to BRRs, Pinker assumes language-specific narrow range rules (NRRs). For the English DA, the NRR permits some verbs (those describing ballistic motion) in the alternation, while disallowing other potentially alternating verbs (Pinker 1989: 102-111):

(7) a. John tossed the ball to Mary. (ballistic motion)
    b. John tossed Mary the ball.

(8) a. John pushed the ball to Mary. (continuous motion)
    b. *John pushed Mary the ball.
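The judgements in examples (3), (4), (7) and (8) can be collected in a small lookup table. The sketch below is an illustrative assumption in every respect except the judgements themselves: the dictionary and function names are hypothetical, and the table records only the categorical judgements given in these examples, not a general lexicon of dative verbs.

```python
# True: the verb is shown in the double object construction (DOC) without a star.
# False: the DOC is starred in the examples; all verbs are shown in the PC.
DOC_JUDGEMENTS = {
    "give": True,      # (3)
    "donate": False,   # (3), Latinate stem
    "present": False,  # (3), Latinate stem
    "tell": True,      # (4)
    "report": False,   # (4), Latinate stem
    "explain": False,  # (4), Latinate stem
    "toss": True,      # (7), ballistic motion
    "push": False,     # (8), continuous motion
}


def allowed_variants(verb):
    """Return the variants licensed for a verb according to the examples."""
    if verb not in DOC_JUDGEMENTS:
        raise KeyError(f"no judgement recorded for {verb!r}")
    return ["PC", "DOC"] if DOC_JUDGEMENTS[verb] else ["PC"]


alternating = sorted(v for v, doc in DOC_JUDGEMENTS.items() if doc)
```

A table of this kind treats the constraints as categorical; the probabilistic accounts discussed later in this section would replace the Boolean values with graded preferences.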
Wasow & Arnold (2003: 130) discuss another semantic constraint on the DA, namely what they call semantic connectedness, i.e. collocational or idiomatic links between the verb and post-verbal elements as in take into account or bring to an end. This phenomenon is particularly important for fixed expressions and idioms with dative verbs, which are usually believed to be restricted to either the double object (give someone advice / a headache / the creeps) or the prepositional variant (bring to life / to an end, send someone to the devil), respectively (Harley 2002, Rappaport Hovav & Levin 2005: 21ff.). Such phrases are subject to their own restrictions on the alternation. They are possible and attested in alternate variants, but rare. Briefly, their availability for the alternation depends on which of the two constituents is fixed. Expressions with fixed themes appear to be more flexible than those with fixed goals:

(9) Such records, he said, might illuminate general areas in which Miers gave advice to the president but stop short of making her divulge the advice itself. [www.washingtonpost.com/wp-dyn/content/article/2005/10/24/AR2005102401744.html]
A Learner-Corpus Study of the Dative Alternation
(10) The format of the book, which strings together emails peppered with slang and emoticons over 364 pages, could well give a headache to readers accustomed to literature as flowing prose. [http://www.orcon.net.nz/home/entertainment/books/65860]

By contrast, it is much harder, if not impossible, to alternate expressions with fixed goals; consider the examples in (11) below.

(11) a. *The doctor brought back life the frail soldier. (= brought the frail soldier back to life)
     b. *The journalist brought light a number of embarrassing facts. (= brought embarrassing facts to light)
     c. *The allied forces brought an end the war. (= brought the war to an end)
     d. *The young monk took heart the wise man's advice. (= took the advice to heart)

Recently, new proposals have been put forward for the description and explanation of the DA and its (semantic) constraints, in particular since large amounts of corpus data have become more easily available for linguistic research.2 Studies on lexical restrictions of the DA were mainly based on the intuition of linguists.3 In view of new evidence from the World Wide Web, Bresnan & Nikitina (2003: 5-11) and Bresnan et al. (2005) present probabilistic and gradient accounts of the DA, also considering information structural aspects. They argue that “central evidential paradigms that have been used to support semantic explanations for the choice of dative constructions are not well founded empirically. Some widely repeated reports of intuitive contrasts in grammaticality appear to rest instead on judgments of pragmatic probabilities” (Bresnan & Nikitina 2003: 2).
They claim that our own linguistic intuitions often agree with those cited in the literature in that we perceive constructed examples to be (in)correct, and they present a significant number of examples from the WWW that seem to violate semantic constraints, but are obviously used by NSs and also appear grammatically possible. This is especially interesting with respect to fixed expressions and idioms with dative verbs mentioned above. Their fixedness can sometimes be overridden by discourse-motivated factors such as weight or ambiguity avoidance (see Section 2.1.2. below). Bresnan & Nikitina (2003) and Bresnan et al. (2005) argue that violations of established semantic restrictions of the DA are not ungrammatical, but improbable, and propose a modelling of these constraints in terms of stochastic Optimality Theory, in which constraints may be violated with certain probabilities. Thus, the above semantic considerations should be understood as fluid, violable constraints, which, if flouted for the sake of other considerations, do not render a form completely unacceptable.

–––––––—––
2 Thanks to corpus data, it is possible to address phenomena which have so far eluded explanation. For example, Haspelmath (2004) offers a frequency-based account of a constraint precluding some pronominal object combinations in the DOC.
3 See also the recent programmatic article by Wasow and Arnold (2005).
Marcus Callies & Konrad Szczesniak
Although the flexibility of fixed idiomatic expressions represents a very interesting topic, we will not pursue its mechanism further here.4 The availability of these expressions for the alternation probably depends on other constraints (for details, see Harley 2002 and Rappaport Hovav & Levin 2005), and the view adopted here may turn out to be oversimplified. (In theory, some expressions with fixed goals may also alternate, if only marginally.) However, for the purpose of the present study, we adopt the division into expressions with fixed themes and expressions with fixed goals as the main criterion for the availability of fixed expressions for the alternation. Thus, only expressions with fixed themes are considered in the present study.

In sum, semantic approaches explaining which verbs do or do not allow the DA, and under what circumstances this may change, cannot sufficiently explain why a particular variant is chosen in a certain discourse context with alternating dative verbs which allow both structural variants. Corpus-based and experimental studies of grammatical variation patterns in English have shown that the choice of possible variants is largely determined by principles of information structure, to which we now turn.

2.1.2. Discourse Constraints

The DA is interesting not only in terms of argument realisation, but also from the point of view of information structure (Collins 1995, Wasow 1997, Biber et al. 1999, Arnold et al. 2000). Wasow & Arnold (2003) discuss various determinants that influence post-verbal constituent ordering. One of the central factors is syntactic weight, or heaviness: In many languages, less complex constituents precede more complex ones, which are in turn placed towards the end of a sentence, a tendency also known as the end-weight principle. This is the reason why (12b) sounds more natural than (12a) below.
The long theme a beautiful ripe juicy Macintosh apple which I brought from Greece makes the DOC in (12b) sound more acceptable, because long constituents are expressed most felicitously towards the end (underlined below):

(12) a. ?I gave a beautiful ripe juicy Macintosh apple which I brought from Greece to my teacher.
     b. I gave my teacher a beautiful ripe juicy Macintosh apple which I brought from Greece.

The DA has long been known for its sensitivity to the information status of verbal complements (given vs. new information), with the new or noteworthy information occurring in a more prominent position (see Collins 1995). In the PC, the theme typically contains given information, while the goal (to X) represents new information. In the DOC, however, the recipient usually represents given information while the theme constitutes new information. Assuming neutral intonation, this distribution explains why A is the more natural response to the question in example (13), whereas A' is the more acceptable answer to the question in (14):

(13) Q: Who did you give an apple to?
     A: I gave an apple to my teacher. (given before new)
     A': ?I gave my teacher an apple. (new before given)

(14) Q: What did you give to your teacher?
     A: ?I gave an apple to my teacher. (new before given)
     A': I gave my teacher an apple. (given before new)

It needs to be emphasised that there is an interaction and correlation between constituents’ syntactic complexity and their information status. Elements that have already been introduced to the discourse can be referred to by short deictic elements, typically anaphoric pronouns, whereas heavy constituents are more likely to contain new rather than old information.

Another factor mentioned by Wasow & Arnold (2003) is ambiguity avoidance (see also Arnold et al. 2004). If the direct object contains an optional prepositional phrase (PP), it is more felicitous to choose the double-object variant instead of the prepositional variant, where two PPs are in conflict and may cause an attachment ambiguity and hence, processing difficulties. In the following example, the PP to his ex-wife modifies the letter, and is not a complement of send, but the latter, erroneous interpretation is possible if the prepositional variant in (15a) is used. This ambiguity disappears in (15b), where his ex-wife is unlikely to be interpreted as the recipient.

(15) a. He sent the letter to his ex-wife to the lawyer. (attachment ambiguity)
     b. He sent the lawyer the letter to his ex-wife. (disambiguated)

–––––––—––
4 As was pointed out by an anonymous reviewer, the fixed-goal expressions differ from the fixed-theme expressions in that fixed goals are non-referential. This seems to be an accurate observation for the expressions considered in the present study. Fixed goals are non-compositional in the sense of Nunberg et al. (1993).
2.2. The DA in SLA Research

The acquisition of argument structure alternations has received considerable attention within UG approaches to SLA, but fundamental concepts of information structure that these phenomena entail, e.g. information status and syntactic weight, have largely been neglected, and are in general not well-examined topics in the sparse research on advanced learners. Most existing studies investigated what Pinker has called “Baker's paradox”: Given the non-availability of negative evidence, how do children (and L2 learners, for that matter) know that not all verbs that appear in the prepositional variant also dativize, i.e. allow the DOC (Pinker 1989: 8)? Thus, research has focused on the acquisition of constraints on lexical rules and issues of overgeneralization/overextension of the double object variant, as well as transfer and access to Universal Grammar5, investigating only a few dative verbs and a small number of informants (see Juffs 2000 for review). The findings of these studies suggest that the initial hypothesis for knowledge of L2 syntactic frames is the L1, leading to an overgeneralization of broad range rules. Narrow range rules that are not part of the L1 appear to be difficult to acquire, but some advanced learners do acquire them (Juffs 2000: 202).

There are very few studies that have examined discourse aspects of the DA in learner language. Tanaka (1987) used, among other things, acceptability judgements to examine the performance of three proficiency groups of Japanese college students in relation to three constraints on the DA: semantic (properties of theme and goal), discourse (given vs. new information), and perceptual (a language processing view of syntactic complexity). The findings showed that proficiency level had no significant effect on the results. None of the three groups (high, intermediate and low) noticed the discourse constraint, i.e. they accepted the following test items equally well:

(16) a. John gave the book to a boy. (given before new)
     b. ?John gave a boy the book. (new before given)

–––––––—––
5 See Montrul (2000) for a discussion of several models of access to UG in relation to the L2 acquisition of lexical-semantic knowledge and derivational morphology.
The author argued that discourse constraints were “in general subtle and tend to be unnoticed by second-language learners” (Tanaka 1987: 82). Instead, subjects were more aware of the influence of syntactic complexity and thus preferred a light-before-heavy ordering of postverbal constituents. These findings suggest that the Japanese learners were more sensitive to syntactic weight than information status as a factor that governs the DA.

Chang (2004) investigated Chinese EFL learners’ production of the prepositional and double object dative with regard to their awareness of information structure. This should, the author supposed, be reflected in their use of the respective given-new information sequencing in the two variants. To check whether context affected learners’ production, the experimental design contained three written question-response tasks that varied as to the type of question posed.6 The study included three different groups of high- to mid-intermediate learners to examine whether proficiency level and amount of input had any effect. Based on previous work with Chinese learners, Chang hypothesised that his subjects would predominantly echo the question pattern in all but the third task, where they were expected to use either variant at the chance level. The author predicted that the underlying given-new information sequence, which was sometimes in contrast to the stimulus, would be difficult for Chinese learners to notice, let alone acquire without explicit instruction. Chang thus expected a small number of given-new answers in the first two, but a higher proportion of those answers in the third task, regardless of the learners' proficiency level. The results mainly confirmed the author’s initial hypotheses. In particular, in response to simple wh-questions, the large majority of learners from all three groups simply echoed the question pattern and refrained from re-arranging the word order to follow the given-new sequence.
While the proportion of given-new answers hardly increased in the other two tasks, all learner groups showed a strong preference for the prepositional variant (theme first) across all three tasks. Chang concluded that Chinese intermediate learners are not aware of the given-new information flow, and that discourse status simply did not play a major role in the subjects’ decisions. That they preferred the prepositional to the double object variant could be explained 1) by the fact that this pattern was taught earlier and thus the learners simply preferred a structure they had acquired and were familiar with, and 2) by markedness considerations with the prepositional variant as the unmarked counterpart being easier for the learners to access and acquire.

–––––––—––
6 The experiment contained three types of questions that the subjects had to answer: 1) simple wh-questions like What did Nancy teach to Danny?, 2) multiple wh-questions such as Who gave what to Mary?, and 3) contextual questions, e.g. Why was Mary happy? which provided no pattern to echo but were supposed to be answered with a dative construction using cues provided in the task.

Marefat (2005) also aimed to determine whether Persian EFL learners at different proficiency levels were aware of the way discourse factors affect the choice of either structural variant of the DA. The study used elicited production in the form of question-answer sequences, and a recognition task in which subjects were required to indicate the ‘naturalness’ of each response. Another aim was to see if there was any congruity between the two tasks in the choice of one over the other variant. The elicited production data from both English NSs and Persian learners seemed to indicate that neither group’s responses were affected by the information sequencing in the respective variant (given before new information vs. new before given information, see (17) below), with the elementary learners consistently producing prepositional datives. However, Marefat found that these results were influenced by strong echoing effects, i.e. NSs and all learner groups mostly repeated the structure provided in the stimulus:

(17) Q: What did you give to Mary?
     a. I gave a book to Mary. (new before given, echoed)
     b. I gave Mary a book. (given before new, non-echoed)

Moreover, the learners “seem to follow the line given to them throughout their instructional experience. An examination of the instructional material these learners receive at school shows that the exercises they are exposed to most often require them to follow the pattern in the sentence and just make certain modifications” (2005: 77f.).
Naturally, these priming effects might lead the researcher to think that the learners had no awareness as to which dative construction was appropriate in a certain discourse context. However, the results of task 2 showed that they were in fact conscious of the role of information status. Advanced and high-intermediate learners’ ratings were similar to those of NSs because they consistently rated the given-before-new variants higher than the reversed alternatives. By contrast, neither the low-intermediate nor the elementary subjects showed any preference for this type of ordering. In fact, the elementary subjects consistently rated the prepositional variant higher than the other groups did. Marefat explained this in terms of L1 interference since Persian lacks the DOC, and it appears that at the initial stages of learning, Persian learners’ perception and production is constrained by L1 grammar, although the canonical sequence of information is the same in both languages (given before new). The results are interpreted as a “developmental process in the acquisition of dative alternation. For the elementary learners, there is a definite (and statistically significant) L1 grammar effect. Low-intermediate learners show influence from their instructional experience. But high-intermediate and advanced students exhibit performances similar to those of the native speakers” (2005: 81).
In sum, the few studies that were carried out have yielded mixed results which, nevertheless, suggest that advanced learners are likely to be more sensitive than elementary ones to syntactic weight and information status as factors that influence the DA. However, these studies suffer from several methodological shortcomings. In particular, the experimental setting caused detrimental echoing effects, which strongly suggests that a question-answer elicitation format should not be used for this type of research. Moreover, previous studies have used only elicited production or acceptability judgements and did not consider longer stretches of written learner production embedded in context; they are thus unsuitable for an investigation of discourse-functional aspects such as information status and syntactic weight.

2.3. The DA in English, Polish and German

The DA in English is governed by a set of criteria different from those responsible for the alternation in Polish and German. What sets English apart from German and Polish is its historical development that resulted in the almost complete erosion of overt case. This led to a rather fixed SVO word order which had direct consequences for information structure in English. To compensate for the loss of word order freedom, English has developed a great number of alternations which can be exploited for information-structural purposes, as we have seen above. Thus, the absence of an overt dative or accusative case to mark complements of double object verbs led to the emergence of the DA. Rappaport Hovav and Levin (2005) argue that “since English has relatively fixed word order, the two argument realization options defining the dative alternation allow English to express a causation of possession event, while also satisfying other constraints which might place particular demands on word order. If this analysis is correct, the dative alternation should not be necessary in a language which has relatively free word order and, thus, can maintain the same mode of argument realization, while allowing for a reordering of arguments” (Rappaport Hovav & Levin 2005: 32).
They conclude that German does not show the DA with core dative verbs such as give. These observations also hold for Polish, where the DA is available only for verbs like English throw or send, that is for verbs other than core dative verbs. As will be shown in more detail below, in both German and Polish the use of the DA is less widespread than in English, and our interest is in how that affects German and Polish learners’ use of the DA in English. If the DA is not normally used for information-structural purposes in German and Polish, are learners aware of that function in English?

In what follows, we will describe the main differences between the rules responsible for the use of the two variants in English, German and Polish. The constraints that German and Polish impose on the DA will be divided into those that fall within the purview of broad range rules (BRRs) and those that fall within the purview of narrow range rules (NRRs). BRRs like the requirement for an animate recipient are common to the three languages in question, and we would expect that EFL learners are aware of this in L2 production. More interesting are the effects of NRRs, because, as will become clear, German and Polish share NRRs that set these two languages apart from English.
The English dative is regulated by a set of features that are absent in German and Polish. For example, the morphological restriction prohibiting the participation of many Latinate verbs in the DOC is irrelevant for German and Polish, hence both examples below are correct:

(18) Dyktuj  mi ten  tekst. (Polish)
     Dictate me that text.
     ‘Dictate that text to me’

(19) Diktiere mir diesen Text. (German)
     Dictate  me  that   text.
     ‘Dictate that text to me’
An important NRR that is valid in German and Polish is the requirement for a spatial meaning if a verb is to appear in the PC. The default structure seems to be the DOC, which accepts all verbs under consideration. The PC is more restrictive in German and Polish. The functioning of the PC is similar to that in English, in the sense that it does not imply causation of possession. However, as Table 1 illustrates, German and Polish differ from English in that a verb can only participate in the PC if its meaning entails physical distance.

Table 1: Spatial meaning constraint on verbs in the PC in Polish and German

Metaphorical transfer (no physical distance)

  DATIVE                            PREPOSITIONAL
  Wytłumacz mi problem.             *Wytłumacz problem do mnie.
  Explain   me problem               Explain   problem to me
  Erklär  mir das Problem.          *Erklär  das Problem zu mir.
  Explain me  the problem            Explain the problem to me
  ‘Explain the problem to me.’      ‘Explain the problem to me.’

Physical transfer (physical distance)

  DATIVE                            PREPOSITIONAL
  Rzuć  mi piłkę.                   Rzuć  piłkę do mnie.
  Throw me ball                     Throw ball  to me
  Wirf  mir den Ball zu.            Wirf  den Ball zu mir.
  Throw me  the ball to             Throw the ball to me
  ‘Throw me the ball.’              ‘Throw the ball to me.’
Table 2 shows the consequences of the above constraints for the availability of the PC with fifteen frequent dative verbs in the three languages: Fewer verbs can be used in the PC in German and Polish than in English (see also Sprouse 1995: 339, note 1, and Sabel 2002: 231). However, the contrastive analysis is complicated because some German verbs do alternate if they are used with a particle, e.g. weiterreichen (‘to hand/pass on to’) or einreichen (‘to hand in’). If the particle is not used, however, and the verb is used in its basic form, then only the DOC is possible:
(20) a. Er reichte seinem Nachbarn   sein Wörterbuch.
        He passed  his    neighbour  his  dictionary
     b. Er reichte sein Wörterbuch an seinen Nachbarn  weiter.
        He passed  his  dictionary to his    neighbour on
        ‘He passed his dictionary on to his neighbour.’
     c. *Er reichte sein Wörterbuch an seinen Nachbarn.
         He passed  his  dictionary to his    neighbour.
We will restrict the present investigation to verbs used in their basic form. We assume that if transfer7 plays a role in L2 acquisition, this should be reflected by the different statistical frequencies with which the verbs in question are used in their syntactic frames in L2 as compared with the frequencies demonstrated by NSs. Specifically, it would seem natural to expect at least some avoidance of verbs like tell or teach in the PC, as they are not allowed there in German and Polish.

Table 2: Availability of ‘dative’ verbs for the PC in the three languages investigated

ENGLISH          GERMAN                    POLISH
bring     +      bringen          +        przynosić      +
carry     +      tragen           +        nieść          +
send      +      senden           +        słać           +
take      +      nehmen           +        brać           +
write     +      schreiben        +        pisać          +
sell      +      verkaufen        +        sprzedawać     +
pass      +      geben, reichen   (+)      podawać        +
hand      +      geben, reichen   (+)      przekazywać    –
pay       +      (be)zahlen8      (+)      płacić         –
read      +      lesen            –        czytać         –
give      +      geben            –        dawać          –
show      +      zeigen           –        pokazywać      –
teach     +      lehren           –        uczyć          –
tell      +      erzählen         –        opowiadać      –
offer     +      anbieten         –        oferować       –

–––––––—––
7 We use the term language transfer in the sense of cross-linguistic influence, defined by Odlin as “the influence resulting from similarities and differences between the target language and any other language that has been previously (and perhaps imperfectly) acquired” (Odlin 1989: 27), thereby incorporating positive transfer as well as interference, avoidance and overproduction.
8 An anonymous reviewer observed that the alternation in German depends on which equivalent of pay is selected. While bezahlen cannot appear in the PC, it is possible to use zahlen with the preposition an, as in etwas an das Finanzamt zahlen (‘to pay something to the tax office’).
3. Data and Methodology

The fact that there is comparatively little research on very advanced L2 learners could be one reason why there is no set of well-defined criteria to classify such learners as “near-native”. This has obvious implications for the selection and recruitment of participants for research purposes. Near-native subjects have occasionally been selected on the basis of recommendation or “word of mouth” recruiting without further testing (e.g. Coppieters 1987, Leube 2000). Thomas (1994) reviews the assessment of foreign language learners’ proficiency in SLA research and the techniques by which proficiency levels are established. The four major conventions for the assessment of proficiency are: 1) impressionistic judgement, 2) use of institutional status as a proxy for proficiency level, 3) use of research-internal or in-house measures of proficiency, and 4) standardised test scores. According to Thomas’ survey, the most frequent technique employed in SLA studies is learners’ institutional status, defined as “their positions in some hierarchically-organized social structure, for example, as students in first-year versus third-year classes” (Thomas 1994: 317). For the present purposes, the aim was to establish learners’ global proficiency in their L2 English. For practical reasons, external criteria such as institutional status had to be applied.

The basis of our examination is the written production of advanced German and Polish EFL learners taken from the International Corpus of Learner English (ICLE, Granger et al. 2002) and comparable NS writing from the Louvain Corpus of Native English Essays (LOCNESS). ICLE consists of mostly argumentative essays produced by learners with different L1s and was compiled on the basis of rather strict design criteria.
All of the informants who contributed essays to the corpus were undergraduate university students of English in their twenties, had learned English in an EFL context involving classroom instruction, and were usually in their third or fourth year of studies. Thus, their English proficiency level ranges from higher intermediate to advanced (Granger et al. 2002: 14). However, establishing proficiency on institutional status alone is problematic, since beginning university students of English cannot automatically be considered to be a homogeneous group of learners (see Callies 2006 for discussion). Thus, learners’ institutional status was supplemented by another criterion, namely the amount of L2 exposure, defined for the present purposes as the amount of time learners spent in an English-speaking country. This was restricted to a maximum of 12 months, assuming that an extended study-abroad period in the target culture positively affects L2 proficiency. Four subcorpora were compiled for the present study (see Table 3). The two learner corpora were sampled from ICLE and compared to two corpora of similar writing sampled from LOCNESS, which also consists of mostly argumentative essays by British and American students (Granger et al. 2002: 41). The major advantage of these control corpora is that they are comparable to the learner corpus as to both text type (argumentative essays) and participant characteristics (university students).
Table 3: Corpora used in the present study

NAME OF CORPUS          TYPE    NO. OF ESSAYS    NO. OF WORDS
Polish ICLE (PICLE)     NNS     350              224 052
German ICLE (GICLE)     NNS     395              208 308
US LOCNESS              NS      176              149 574
BRIT LOCNESS            NS      165               78 327
total LOCNESS           NS      342              227 901
The corpus analysis focuses on fifteen highly frequent dative verbs (listed in Table 2 earlier) which were identified in the pertinent literature (e.g. Levin 1993, Wasow 1997, Biber et al. 1999). These were checked for frequency through comparison with a large reference corpus of English, the British National Corpus (BNC). All instances of the PC and DOC with these verbs were extracted from the corpora, checked manually and submitted to a qualitative textlinguistic analysis.

Expressions with fixed constituents presented a practical problem for the present study. Idioms like give birth to or bring to light are quite frequent in the learner essays, and if factored in, would heavily affect the statistics. Given that the likelihood of their occurrence in the alternate variant is often only theoretical, a dilemma arises as to whether they should be considered at all. Their inclusion would automatically skew the statistics in favour of one variant, which would then ironically contradict the aim of the study. What we intend to examine is learners’ use of alternations, but fixed expressions do the opposite: They discourage or even block the use of syntactic variants. This dilemma was resolved by including fixed-theme expressions while excluding fixed-goal expressions. Although rare, fixed-theme expressions like give a break are possible in alternations, especially so when discourse factors become involved. Alternations of fixed-goal expressions, on the other hand, are not attested, even under conducive discourse conditions:

(21) a. I sent the two quack doctors and other insane hypocrites to the devil.
     b. *I sent the devil the two quack doctors and other insane hypocrites.

Table 4 shows fixed-goal expressions (these were not considered) and fixed-theme expressions (included) found in the corpora.
Table 4: Fixed expressions found in the corpora

EXPRESSIONS EXCLUDED                   EXPRESSIONS INCLUDED
bring to an end                        give birth to
bring to light / life / reality        give rise to
bring to power                         give way to
bring to a halt, stop, standstill      show consideration for
carry to an extreme                    pay attention to
take into account / consideration
take for a walk
take to heart
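The exclusion rule described above lends itself to a small filter. The following Python sketch is only an illustration under stated assumptions, not the procedure used in the study: the expression list is transcribed from the excluded column of Table 4 (with "bring to a halt, stop, standstill" read as three separate goals), while the inflected verb forms, the function name is_fixed_goal and the pattern-matching heuristic are hypothetical. Hits from such a filter would still need manual checking.

```python
import re

# Fixed-goal expressions from the "excluded" column of Table 4,
# stored as (verb lemma, fixed goal phrase).
FIXED_GOAL = [
    ("bring", "to an end"), ("bring", "to light"), ("bring", "to life"),
    ("bring", "to reality"), ("bring", "to power"), ("bring", "to a halt"),
    ("bring", "to a stop"), ("bring", "to a standstill"),
    ("carry", "to an extreme"), ("take", "into account"),
    ("take", "into consideration"), ("take", "for a walk"),
    ("take", "to heart"),
]

# Hypothetical inflection lists for the three verbs involved (an assumption
# made for this sketch; a lemmatised corpus would make them unnecessary).
FORMS = {
    "bring": ["bring", "brings", "bringing", "brought"],
    "carry": ["carry", "carries", "carrying", "carried"],
    "take": ["take", "takes", "taking", "took", "taken"],
}


def is_fixed_goal(sentence: str) -> bool:
    """Return True if the sentence contains a verb form followed,
    anywhere later in the sentence, by one of its fixed goals."""
    text = sentence.lower()
    for lemma, goal in FIXED_GOAL:
        verb = "|".join(FORMS[lemma])
        pattern = rf"\b(?:{verb})\b.*\b{re.escape(goal)}\b"
        if re.search(pattern, text):
            return True
    return False


if __name__ == "__main__":
    for s in ["The allied forces brought the war to an end.",
              "John tossed the ball to Mary."]:
        print(is_fixed_goal(s), s)
```

Because the pattern allows any material between verb and goal, it deliberately over-matches; this suits a pre-filter whose output is reviewed by hand, as the extracted instances in the study were.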
4. Results and Discussion

The frequencies of use of the two variant structures with the fifteen dative verbs examined are shown in Table 5. Note that we focused only on the uses of a verb with two objects, and excluded instances of use with one object, such as give a reason. Thus, for example, the verb give was used 39 times with the PC by the German learners, compared to 68 times by the Polish learners, and 73 times by the NSs. Generally, those dative verbs which were used sparingly with two objects, and are thus under-represented in the corpora (e.g. read, write or sell), were not considered, and are thus not shown in Table 5. At this point, they contribute little insight as to which is the dominant variant. However, what the low figures do suggest is that these verbs are under-represented in all three corpora. Although some dative verbs like sell have the potential of taking two objects, this option is rarely used, and they usually appear with a single object.

Table 5: Frequency counts of dative verbs used in the alternating variants by learners and NSs

VERB      L1 GERMAN       L1 POLISH       NSS
          PC     DOC      PC     DOC      PC     DOC
bring      2       4      19       9      31       3
give      39     115      68     132      73     152
offer      4      17      11      10       7       9
pay       12       2      19       5       6       7
send      21       0       9       0      14       1
show       9      43       5      23       3      13
teach      3      11       2      48       2      32
tell       1     145       1      49       0      55
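The counts in Table 5 become easier to compare across the three corpora once each verb's DOC uses are expressed as a share of its two-object uses. The sketch below is an illustrative assumption rather than part of the original analysis: the dictionary transcribes the counts reported in Table 5, while the corpus labels (GICLE, PICLE, NS) and the helper names doc_share and doc_shares are introduced here for convenience.

```python
# Counts transcribed from Table 5 as (PC, DOC) pairs per verb and corpus.
# The corpus labels are shorthand chosen for this sketch.
TABLE_5 = {
    "bring": {"GICLE": (2, 4),    "PICLE": (19, 9),   "NS": (31, 3)},
    "give":  {"GICLE": (39, 115), "PICLE": (68, 132), "NS": (73, 152)},
    "offer": {"GICLE": (4, 17),   "PICLE": (11, 10),  "NS": (7, 9)},
    "pay":   {"GICLE": (12, 2),   "PICLE": (19, 5),   "NS": (6, 7)},
    "send":  {"GICLE": (21, 0),   "PICLE": (9, 0),    "NS": (14, 1)},
    "show":  {"GICLE": (9, 43),   "PICLE": (5, 23),   "NS": (3, 13)},
    "teach": {"GICLE": (3, 11),   "PICLE": (2, 48),   "NS": (2, 32)},
    "tell":  {"GICLE": (1, 145),  "PICLE": (1, 49),   "NS": (0, 55)},
}


def doc_share(pc, doc):
    """Proportion of DOC uses among all two-object uses (None if no uses)."""
    total = pc + doc
    return doc / total if total else None


def doc_shares(table):
    """Map every verb and corpus to its DOC share."""
    return {
        verb: {corpus: doc_share(pc, doc) for corpus, (pc, doc) in row.items()}
        for verb, row in table.items()
    }


if __name__ == "__main__":
    for verb, row in doc_shares(TABLE_5).items():
        cells = "  ".join(
            f"{corpus}: {'n/a' if share is None else format(share, '.2f')}"
            for corpus, share in row.items()
        )
        print(f"{verb:<6} {cells}")
```

For give, for instance, this yields a DOC share of roughly 0.75 for the German learners (115 of 154 uses), 0.66 for the Polish learners and 0.68 for the NSs. With counts as low as those for bring or offer in GICLE, such proportions should be read with caution.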
What is most conspicuous is that if one variant dominates over its alternate, it does so across the three corpora: The preference for either variant with a particular verb demonstrated by the NSs is echoed in the figures from the learner corpora. For example, in all three corpora, the verb tell is used overwhelmingly in the DOC. Table 6 groups verbs according to what structures they select most often. This congruence is disturbed only in the case of bring and offer, which show inverted proportions in GICLE, as compared to the Polish and NS corpora. This is probably due to the under-representation of these verbs in the samples used. It is our hypothesis that the distribution would be similar in a larger sample containing more instances of these verbs.

Table 6: Dominant variants in the production of learners and NSs

VERB     L1 GERMAN      L1 POLISH      NSS
         PC    DOC      PC    DOC      PC    DOC
DOC-prone verbs
give     39    115      68    132      73    152
show      9     43       5     23       3     13
teach     3     11       2     48       2     32
tell      1    145       1     49       0     55
PC-prone verbs
pay      12      2      19      5       6      7
send     21      0       9      0      14      1
indeterminate
offer     4     17      11     10       7      9
bring     2      4      19      9      31      3
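The notion of a variant that dominates "across the three corpora" can be made concrete with a small sketch. The counts below are transcribed from Table 5; the function names and the strict decision rule (the same variant must be more frequent in every corpus) are illustrative assumptions, not the procedure used to build Table 6.

```python
# Counts transcribed from Table 5: verb -> corpus -> (PC, DOC).
TABLE5 = {
    "bring": {"German": (2, 4), "Polish": (19, 9), "NS": (31, 3)},
    "give": {"German": (39, 115), "Polish": (68, 132), "NS": (73, 152)},
    "offer": {"German": (4, 17), "Polish": (11, 10), "NS": (7, 9)},
    "pay": {"German": (12, 2), "Polish": (19, 5), "NS": (6, 7)},
    "send": {"German": (21, 0), "Polish": (9, 0), "NS": (14, 1)},
    "show": {"German": (9, 43), "Polish": (5, 23), "NS": (3, 13)},
    "teach": {"German": (3, 11), "Polish": (2, 48), "NS": (2, 32)},
    "tell": {"German": (1, 145), "Polish": (1, 49), "NS": (0, 55)},
}


def doc_share(pc, doc):
    """Proportion of two-object uses realised as DOC (NaN if there are none)."""
    total = pc + doc
    return doc / total if total else float("nan")


def majority(pc, doc):
    """Label the more frequent variant within one corpus."""
    if doc > pc:
        return "DOC"
    if pc > doc:
        return "PC"
    return "tie"


def shared_preference(verb):
    """Return the variant preferred in all three corpora, or None if they differ.

    Hypothetical strict rule: every corpus must show the same majority variant.
    """
    labels = {majority(pc, doc) for pc, doc in TABLE5[verb].values()}
    return labels.pop() if len(labels) == 1 else None


if __name__ == "__main__":
    for verb, corpora in TABLE5.items():
        shares = ", ".join(
            f"{name} {doc_share(pc, doc):.2f}" for name, (pc, doc) in corpora.items()
        )
        print(f"{verb:6s} DOC share: {shares} -> {shared_preference(verb)}")
```

Under this strict rule give, show, teach and tell come out as DOC-preferring and send as PC-preferring in all three corpora, while bring and offer are split. The verb pay is also split, because the NSs use it 6 times in the PC and 7 times in the DOC; the grouping in Table 6 evidently tolerates this near-tie.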
In addition to the fifteen alternating dative verbs, we also examined several non-alternating verbs (explain, demonstrate, describe, recommend, donate, report, present, return). There were very few outright grammatical errors, just as expected for advanced learners, and there occurred only a small number of lexical constraint violations as shown in the following examples: (22) a. *Certainly, pictures or sequences may and frequently do show us parts we could qualify as truth, in the sense that they present us things or ideas as they really are (their essence or core). (GICLE) b. *There was nobody who could explain him all these phenomena which used to scare him or get interested in. (PICLE) Although these examples represent instances of overextension of the double object dative to Latinate verbs that do not allow the alternation in English, the DOC seems to have been chosen for a specific reason: Both structures involve a short anaphoric indirect object (us, him) and a complex direct object that is put at the end of the sentence (given in bold in the above examples). This ties in with our examination of the influence of syntactic weight. As discussed earlier, more complex (and often more noteworthy) constituents are usually placed at the end of the sentence, and this has obvious consequences for the choice of either syntactic variant. Heavy indirect objects should favour the use of the PC, while short indirect objects
are expected to ‘trigger’ the DOC. Table 7 shows that in the DOCs extracted from the learner corpora, the indirect object is rarely over one word long. It is typically an anaphoric pronoun or simple NP. By contrast, the direct object is from 2.5 to almost 7 words long. In sentences where the weight distribution between both objects is more balanced, the PC is the preferred option. It is chosen when the direct object is shorter than the indirect object, but here the length difference between the two constituents is not as striking. We take these observations as evidence that the end-weight principle influences the learners’ choice of either variant.

Table 7: Mean word-length of constituents in PC and DOC for four selected dative verbs

VERB     L1 GERMAN                  L1 POLISH
         PC          DOC            PC          DOC
         DO    IO    IO    DO       DO    IO    IO    DO
bring    1.5   3.5   1.0   4.0      1.8   2.6   1.0   4.8
give     2.1   4.1   1.3   5.0      2.3   3.2   1.6   5.3
offer    1.8   4.0   1.0   5.3      3.5   2.3   1.2   6.8
pay      1.6   4.4   1.0   2.5      1.9   3.2   1.0   3.2
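The end-weight reading of Table 7 can be checked mechanically: in each cell, the constituent that comes second in surface order (the IO in the PC, the DO in the DOC) should be the longer one. The following is a minimal sketch under the assumption that the mean lengths are assigned to verbs in the order in which the table lists them; the data layout and the function name are hypothetical.

```python
# Mean constituent lengths in words, transcribed from Table 7.
# Each value is (first constituent, second constituent) in surface order:
# for the PC that is (DO, IO), for the DOC it is (IO, DO).
TABLE7 = {
    ("German", "PC"): {
        "bring": (1.5, 3.5), "give": (2.1, 4.1), "offer": (1.8, 4.0), "pay": (1.6, 4.4),
    },
    ("German", "DOC"): {
        "bring": (1.0, 4.0), "give": (1.3, 5.0), "offer": (1.0, 5.3), "pay": (1.0, 2.5),
    },
    ("Polish", "PC"): {
        "bring": (1.8, 2.6), "give": (2.3, 3.2), "offer": (3.5, 2.3), "pay": (1.9, 3.2),
    },
    ("Polish", "DOC"): {
        "bring": (1.0, 4.8), "give": (1.6, 5.3), "offer": (1.2, 6.8), "pay": (1.0, 3.2),
    },
}


def end_weight_exceptions():
    """List the (L1, variant, verb) cells whose final constituent is not the longer one."""
    exceptions = []
    for (l1, variant), verbs in TABLE7.items():
        for verb, (first, second) in verbs.items():
            if second <= first:
                exceptions.append((l1, variant, verb))
    return exceptions


if __name__ == "__main__":
    total = sum(len(verbs) for verbs in TABLE7.values())
    exceptions = end_weight_exceptions()
    print(f"{total - len(exceptions)} of {total} cells follow end-weight")
    print("exceptions:", exceptions)
```

Comparing mean lengths cell by cell is a coarse check, since it says nothing about individual sentences, but it keeps the sketch tied to the figures that the table actually reports.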
Let us now turn to the qualitative findings. In the following, we present selected examples from the learner corpora that illustrate how information status and syntactic weight are in operation in the ALV. The corpus data also provide some evidence that the avoidance of potential attachment ambiguities seems to play only a minor role. There are mixed results as to the factor of semantic connectedness. Consequences of these findings will be discussed in Section 5.

In the examples in (23) and (24) we clearly see that information status plays a role in learners’ use of either variant. In fact, we see an interaction of information status and syntactic weight. In the DOCs in (23), them is a light indirect object that has been introduced before, hence constitutes given information⁹, while the structurally more complex direct objects represent new information (given in bold).

(23) a. Being married makes life a lot easier. I told Conny of our problems finding a place to live. Housing owners treat you like foreigners or social outcasts unless you show them the document saying that you are married. (GICLE)
     b. People's using drugs is the result of psychological or emotional problems of some kind. Relying on drugs, in their opinion, brings them relief, pleasure or a means of escaping from these problems. (PICLE)

The reverse order can be observed in the PCs in (24), where it is the indirect objects that constitute heavy constituents, containing new information.
⁹ Underlined constituents are co-referential.
(24) a. There is absolutely no point in separating garbage. This is just a waste of energy and time and on top it cost [sic!] a lot of money. We could sent [sic!] all our garbage to countries of the third world especially to the very poor countries, those having almost nothing that they would have at least something of the industrial countries. (GICLE)
     b. There is a possibility that if homosexuals obtain the rights they fight for, other groups may demand the equal treatment. This may mean giving rights to people whose sexual preferences are now regarded as deviations such as for example pedophilia. (PICLE)

There are several examples that evidence the competition between information status and syntactic weight as factors that influence the occurrence of either variant. (25a) features a ‘new’ direct object, indicated by the use of the indefinite article, which is placed after the more complex indirect object. It seems that information status overrides syntactic weight, and thus can be assumed to have a greater influence on the learners’ choice here. By contrast, in (25b) syntactic weight wins out over information status. Although the direct object some rubbish represents new information, the complex indirect object which takes up previous information is placed at the end.

(25) a. So, together with a friend of mine I went to see a former schoolmate (Alex). [...] He programmed the computer to play a simple rhythm, showed me three chords to play on the key-board and gave my friend who cannot read music nor play any instrument a quick instruction on the base guitar. (GICLE)
     b. I still had to buy. The streets were awfully crowded with people who were obviously in the same situation as I and it was hardly possible to set one foot into one of these shops which were specialised in selling some rubbish to those millions of poor souls who didn't have any idea of what to buy for the oh so beloved members of the family and friends. (GICLE)

That information status does indeed play a role in constituent ordering in the ALV can also be inferred from several examples of the PC. In (26a) below, the indirect object to Poland precedes the direct object for reasons of information structure. The direct object their money appears in end-focus position, which emphasises the consequences of Western businessmen coming into Poland. A similar strategy can be observed in (26b), where the indirect object to many people has been moved to the beginning in order to reserve the end-focus slot for the key information the possibility to see the impossible.

(26) a. The new government initiated several reforms which allowed western businessmen to invest their capital in Polish industry and agriculture. However, these businessmen not only brought to Poland their money, but they also affected the nation socially and practically. (PICLE)
     b. It is proved that television has helped to popularize some games and hobbies. To many people it gave the possibility to see the impossible. (PICLE)

There were not many examples of attachment ambiguities with dative verbs in the corpora, but the few examples we did find suggest that the learners either did not recognize an ambiguous attachment relation, or that this did not influence their choice of variant. It seems that
syntactic complexity played a much more prominent role. In (27) the weight of direct and indirect object is rather balanced, and thus the ordering could easily be reversed to resolve the inherent ambiguity, cf. (27b).

(27) a. The E.C. membership is a ticket which gives wider access to culture and new technology to a country which aspires to be one of the accepted countries. (PICLE)
     b. The E.C. membership is a ticket which gives (to) a country which aspires to be one of the accepted countries wider access to culture and new technology. (disambiguated)

(28) and (29) are examples which are not attachment ambiguities with two PPs, but rather cases that resemble what has been called horror aequi. In these examples, the PC is used with one or more infinitival complements, which results in an awkward style and may cause processing problems. The learners appear to be insensitive to such problems which could easily have been avoided by using the DOC instead.

(28) a. If we decide that each child must be examined separately there is no other way of doing it as to give the right to decide to parents. (PICLE)
     b. […] to give parents the right to decide. (improved)

(29) a. Meanwhile it would seem fair to offer to who seem only capable of evil an opportunity to undergo a rehabilitation. (PICLE)
     b. […] to offer those who seem only capable of evil an opportunity to undergo rehabilitation. (improved)

The corpus data contained relatively little evidence for the influence of semantic connectedness and its interplay with other factors on the DA. There is only one striking example in which a student is neither aware of semantic connectedness nor weight, compare the improved version in (30b).

(30) a. While every noise is dying again in the stadium, he shoots his burning arrow right up in the sky an like a starlet (falling) that can not be caught by any human being it falls (like) in slow motion under the view of 2 billions (of) breathless watching TV-spectators directly in the "melting pot" and gives the Olympic fire that has been dead for the last four years new life. Welcome to the Olympics. (GICLE)
     b. ... gives new life to the Olympic fire that has been dead for the last four years. (improved)
5. Conclusion and Outlook

This paper has examined the two alternative constituent orderings of the DA in advanced learner writing to find out whether major principles of information structure influence learners’ use of either variant. The findings suggest that the fundamental lexical constraints on DA verbs appear to be unproblematic for advanced learners¹⁰, and that information status and syntactic weight play a major role in their use of either the PC or the DOC. In order to gain a deeper understanding of the extent to which factors such as weight, ambiguity avoidance and semantic connectedness play a role in advanced learners’ choice between structural variants – for the influence of the latter two, there was relatively little evidence in the corpora – it may be rewarding to examine other weight-sensitive phenomena such as Heavy NP Shift (HNPS) or verb-particle placement. These are especially interesting with respect to German EFL learners because a characteristic feature of German is its abundance of discontinuous verbal constituents such as verb-particle constructions with other, often rather complex and heavy, clause elements intervening.

Several examples from the corpora suggest that HNPS and verb-particle placement are interesting topics because the learners appear to be uncertain about the conditions for the choice of the two possible variants. (31a) is an example from PICLE where a learner did not use HNPS to move the heavy NP into end-focus position, while in (31b) another student had sufficient confidence to do so.

(31) a. Poland is not an exception here. On the contrary, our politicians have been doing their best to bring the date when our country joins the EU nearer. (PICLE)
     b. News provides you with information on current affairs in the world, numerous educational programmes bring closer the world in all its aspects with its strange phenomena, wonders but also problems. (PICLE)

(32) is an instance of failed particle movement taken from GICLE which results in an ungrammatical sentence.

(32) Another possibility is to build groups of four to five people, who use one car so that you use the full capacity of the car. Just to say, cars ought to be banned, is too easy, we cannot give up them. (GICLE)

Similarly, advanced learners seem to struggle with the interplay of weight and semantic connectedness in idiomatic phrases such as keep in mind or take into account. A quick query in the German learner corpus for these two items yielded mixed results as shown in (33) and (34). In (33a+b) the fixed expression is left intact with a heavy complement positioned at the end, but in (33c) the light element their opinion should ideally be positioned in between the two parts of the idiom resulting in take their opinion into account. In (34a-c), either variant would seem felicitous.

¹⁰ This is especially true for high frequency dative verbs studied here, possibly due to sufficient input, see DeKeyser (2005: 10f.).
(33) a. The Queen’s modest political force in contrast is hardly worth speaking of. After all, who would want to grudge the Royal Family’s income after having taken into account the disadvantages with respect to privacy connected with it! (GICLE)
     b. If this is the case he will either try to be present at the building site as often as possible or he will leave all the work to these people. He has to keep in mind the amount of money he can spend. (GICLE)
     c. As in many areas, there was not much change in this after 1951. The role of the trade unions had become part of the consensus, every party had to take into account their opinion, if it wanted or not. (GICLE)

(34) a. Russell states a tendency and a consequence, and proposes a solution. I will now elaborate on this tendency and take more consequences and solutions into account. (GICLE)
     b. He shows the child the way into real life. These are facts that you can read in every psychology-book. So the girls and boys grow up with these respective rôles and will keep it involuntarily in mind for the rest of their lives. (GICLE)
     c. It's almost the same with older people: They often walk around without noticing anybody or anything, without looking left and right, without keeping the rules of traffic in mind. (GICLE)

An investigation of such relatively specific and infrequent structures may push learner corpus research to its limits. It is well-known that because of their limited size, learner corpora cannot be used for all types of linguistic investigation. Also, there is the “across-the-board” problem, i.e. that despite strict design criteria, individual learner differences are often lost, and reported averages may be skewed. Additionally, there are more serious problems like underdetermination and avoidance of certain phenomena. While learner corpora are well-suited for the analysis of high-frequency lexical items, they are often unsuitable for the study of infrequent, highly L2 specific or optional (syntactic) phenomena¹¹.
Moreover, a corpus of spoken learner English comparable to ICLE is in preparation, but not yet publicly available (see e.g. Brand & Kämmerer 2006). Therefore, a research design for the future investigation of the influence of several factors in learners’ production of weight-sensitive variation patterns such as the DA, HNPS and verb-particle movement should make use of triangulated corpus and experimental data as corroborating evidence. More controlled data collection techniques such as elicited production and judgement tasks are required, because experimental data are often needed as corroborating evidence to supplement corpus data (see Wasow & Arnold 2005).
¹¹ For this very reason, the authors discarded a projected corpus study on advanced EFL learners’ use of a more specific alternation, the middle construction.
References
Arnold, Jennifer E., Thomas Wasow, Anthony Losongco & Ryan Ginstrom (2000). Heaviness vs. newness: The effects of structural complexity and discourse status on constituent ordering. Language 76: 28-55.
Arnold, Jennifer E., Thomas Wasow, Ash Asudeh & Peter Alrenga (2004). Avoiding attachment ambiguities: The role of constituent ordering. Journal of Memory and Language 51: 55-70.
Biber, Douglas, Stig Johansson, Geoffrey Leech, Susan Conrad & Edward Finegan (eds.) (1999). Longman Grammar of Spoken and Written English. Harlow: Longman.
Birdsong, David (2005). Nativelikeness and non-nativelikeness in L2A research. IRAL 43: 319-328.
Brand, Christiane & Susanne Kämmerer (2006). The Louvain International Database of Spoken English Interlanguage (LINDSEI) – Compiling the German component. In Corpus Technology and Language Pedagogy: New Resources, New Tools, New Methods, Sabine Braun, Kurt Kohn & Joybrato Mukherjee (eds.), 127-140. Frankfurt/Main: Peter Lang.
Bresnan, Joan & Tatjana Nikitina (2003). On the Gradience of the Dative Alternation. Ms. Stanford University.
Bresnan, Joan, Anna Cueni, Tatiana Nikitina & Harald Baayen (2005). Predicting the Dative Alternation. To appear in Royal Netherlands Academy of Science Workshop on Foundations of Interpretation proceedings.
Callies, Marcus (2006). Information Highlighting and the Use of Focusing Devices in Advanced German Learner English. A Study in the Syntax-Pragmatics Interface in Second Language Acquisition. PhD dissertation, Philipps-Universität Marburg.
Carroll, Mary, Jorge Murcia-Serra, Marzena Watorek & Alessandra Bendiscioli (2000). The relevance of information organization to second language acquisition studies. The descriptive discourse of advanced adult learners of German. Studies in Second Language Acquisition 22: 441-466.
Chang, Lan Sin (2004). Discourse effects on EFL learners’ production of dative constructions. Journal of National Kaohsiung University of Applied Sciences 33: 145-170.
Collins, Peter (1995). The indirect object construction in English: An informational approach. Linguistics 33: 35-49.
Coppieters, René (1987). Competence differences between native and near-native speakers. Language 63: 545-557.
DeKeyser, Robert M. (2005). What makes learning second language grammar difficult? A review of issues. Language Learning 55: 1-25.
Granger, Sylviane (2004). Computer learner corpus research: Current status and future prospects. In Applied Corpus Linguistics: A Multidimensional Perspective, Ulla Connor & Thomas A. Upton (eds.), 123-145. Amsterdam & Atlanta: Rodopi.
Granger, Sylviane, Estelle Dagneaux & Fanny Meunier (2002). The International Corpus of Learner English. Handbook and CD-ROM. Louvain-la-Neuve: Presses Universitaires de Louvain.
Green, Georgia (1974). Semantics and Syntactic Regularity. Bloomington: Indiana University Press.
Gropen, Jess, Steven Pinker, Michelle Hollander, Richard Goldberg & Ronald Wilson (1989). The learnability and acquisition of the dative alternation in English. Language 65: 203-257.
de Haan, Pieter (1997). How 'native-like' are advanced learners of English? In Explorations in Corpus Linguistics, Antoinette Renouf (ed.), 55-65. Amsterdam: Rodopi.
Harley, Heidi (2002). Possession and the double object construction. Linguistic Variation Yearbook 2: 29-68.
Haspelmath, Martin (2004). Explaining the ditransitive person-role constraint: A usage-based approach. Constructions 2/2004 (http://www.constructions-online.de/articles/35).
Juffs, Alan (2000). An overview of the second language acquisition of links between verb semantics and morpho-syntax. In Second Language Acquisition and Linguistic Theory, John Archibald (ed.), 187-227. Malden/MA: Blackwell.
Leube, Karen (2000). Information Structure and Word Order in the Advanced Learner Variety. Hamburg: bod.Libri.
Levin, Beth (1993). English Verb Classes and Alternations: A Preliminary Investigation. Chicago: The University of Chicago Press.
Marefat, Hamideh (2005). The impact of information structure as a discourse factor on the acquisition of dative alternation by L2 learners. Studia Linguistica 59(1): 66-82.
Montrul, Silvina (2000). Transitivity alternations in L2 acquisition. Toward a modular view of transfer. Studies in Second Language Acquisition 22: 229-273.
Nunberg, Geoffrey, Ivan Sag & Thomas Wasow (1994). Idioms. Language 70: 491-538.
Odlin, Terence (1989). Language Transfer. Cross-Linguistic Influence in Language Learning. Cambridge: CUP.
Oehrle, Richard (1976). The Grammatical Status of the English Dative Alternation. PhD dissertation, MIT.
Pinker, Steven (1989). Learnability and Cognition. The Acquisition of Argument Structure. Cambridge/MA: MIT Press.
Rappaport Hovav, Malka & Beth Levin (2005). All Dative Verbs Are Not Created Equal. Ms. Stanford University.
Sabel, Joachim (2002). Die Doppelobjekt-Konstruktion im Deutschen. Linguistische Berichte 190: 229-244.
Sprouse, Rex (1995). The double object construction in the Germanic languages: Some synchronic and diachronic notes. In Insights in Germanic Linguistics. Volume : Methodology in Transition, Irmengard Rauch & Gerald F. Carr (eds.), 325-342. Berlin: Mouton de Gruyter.
Tanaka, Shigenori (1987). The selective use of specific exemplars in second-language performance: The case of the dative alternation. Language Learning 37: 63-88.
Thomas, Margaret (1994). Assessment of L2 proficiency in second language acquisition research. Language Learning 44: 307-336.
Wasow, Thomas (1997). Remarks on grammatical weight. Language Variation and Change 9: 81-105.
Wasow, Thomas & Jennifer Arnold (2003). Post-verbal constituent ordering in English. In Determinants of Grammatical Variation in English, Günther Rohdenburg & Britta Mondorf (eds.), 119-154. Berlin: Mouton de Gruyter.
Wasow, Thomas & Jennifer Arnold (2005). Intuitions in linguistic argumentation. Lingua 115: 1481-1498.
Ulrike Gut
1. Introduction

The phonology of advanced language learners has been investigated in various ways. In longitudinal studies, the development of one particular phonological feature in the speech of language learners is analysed repeatedly over a certain period of time in order to collect evidence for distinct stages in phonological acquisition. Hansen (2001) and Abrahamsson (2003), for example, carried out longitudinal studies on the acquisition of coda consonants, the final consonants in a syllable, by Mandarin Chinese learners of English and Swedish, respectively. They found a U-shaped developmental sequence with an initial stage in which beginners produce relatively few mistakes followed by a phase with an increased error rate and a subsequent final phase with a decrease of the overall error frequency. The type of errors produced in consonant codas also follows a developmental path: for single consonant codas errors change from initial deletion of the consonant to paragoge, where a vowel is added after the final consonant, and then target-like production. For coda consonant clusters, realisations change from an initial dominant simplification pattern of paragoge or epenthesis (the insertion of an additional vowel between the two consonants) to substitution (change of feature place or manner) in later stages. In general, learners first produce a particular sound as a single coda before it occurs in consonant clusters. In the area of intonation, Grosser (1997) described the sequence of acquisition of utterance-final pitch movements produced by Austrian learners of English. On the whole, falling pitch movements are produced before rising ones and complex (e.g. rising-falling) utterance-final pitch movements follow simple (falling or rising) pitch movements.

Other empirical studies concerned with the phonological structure of advanced L2 speech use cross-sectional data, i.e. speech produced by different speaker groups at one point in time.
This type of study typically employs phonetic measurements of various aspects of the language learners’ speech such as particular vowels and consonants (e.g. Bohn & Flege 1992, Flege & Munro 1994), syllable structure (e.g. Lleó & Vogel 2004), speech rhythm (e.g. Wenk 1985) or word stress (e.g. Mochizuki-Sudo & Kiritani 1991). By means of a group comparison of these phonetic features in the productions of language learners at different competence levels it is demonstrated in which ways advanced L2 phonology differs from less advanced L2 phonology. In addition, the stage of phonological ‘advancement’ of a language learner is judged according to the objectively measurable distance of the phonetic properties under investigation from native speakers’ phonetic properties. For example, it was shown that more advanced learners of English reduce vowels more in unstressed syllables than less advanced learners (Wenk 1985) and that they produce the voice onset time of the plosive /t/ in a more native-like manner than less advanced learners (Flege & Munro 1994).
The attainment of an advanced phonology by language learners is often discussed in connection with the speaker’s age at the beginning of language learning. The central question of the constraints of age of first learning on advanced phonological acquisition is answered in radically opposite ways by two groups of researchers. On the one hand, it is proposed that for naturalistic L2 learners, that is non-native speakers without any formal education in their L2, a critical period exists after which the complete attainment of L2 phonology is impossible. Only during this critical period, it is claimed, can a second language be acquired with a native-like phonology and full fluency, given that the sociological, cultural, psychological and affective circumstances are positive (Patkowski 1994). This Critical Period Hypothesis (CPH) is based on anecdotal and observational evidence discussed in Scovel (1988), Singleton (1989, 2005) and Long (1990, 2005). Several empirical studies showed that the later one learns a language the less native-like the pronunciation is likely to be (Asher & Garcia 1969, Oyama 1976, Suter 1976, Tahta et al. 1981, Thompson 1991, Flege & Fletcher 1992, Flege et al. 1999, Piske et al. 2001). For example, it was reported that the age at arrival of immigrants to the U.S. shows the highest degree of correlation with accent ratings by native speakers (e.g. Oyama 1976 for Italian immigrants; Thompson 1991 for Russian immigrants; Flege & Fletcher 1992 for Spanish immigrants and Flege et al. 1999 for Korean immigrants). On the other hand, a series of publications contradict the assumption that advanced phonological acquisition is constrained by the age of learning. These studies report on learners who successfully acquire the phonology of a foreign language even as adults. Schneiderman & Desmarias (1988), Ioup et al. (1994), Ioup (1995), Bongaerts et al. 
(1997) and Moyer (1999), for example, have shown that phonological attainment which is nativelike to the extent that native speakers cannot detect a trace of foreign accent in the speech of non-native speakers is possible, at least for a small number of learners. In these publications the focus lies on factors other than age for an explanation of success in phonological acquisition. Purcell & Suter (1980), for example, demonstrated that aptitude for oral mimicry, length of residence, living with a native speaker and degree of motivation for pronunciation accuracy predict speakers’ pronunciation better than age of acquisition. Piske et al. (2001) report that the foreign accent of 90 Italian immigrants to Canada varied with the amount of continued use of Italian. Those who spoke their L1 frequently showed a stronger foreign accent in English than those who did not. Purcell & Suter (1980), Bongaerts et al. (1997) and Moyer (1999, 2004) stress the role of motivation in ultimate phonological attainment. Their subjects, who were partly indistinguishable from native speaker controls, all had a very high professional or personal interest in achieving a nativelike pronunciation. All of the current approaches to the investigation of advanced learners’ phonology have a number of methodological drawbacks. Generalisations of results are difficult since research on the phonetic properties of advanced language learners’ phonology tends to be based on a relatively small empirical base with a handful of participants, especially in longitudinal approaches. The types of language learners are very heterogeneous across the different types of studies, ranging from immigrant groups to learners who acquired the language in a formal setting, who did not move to the country where it is spoken and who use it as language teachers in their home country.
In terms of data collection and analysis longitudinal and cross-sectional studies favour experimental data, which is elicited in controlled ways and analysed acoustically. Age-related research into advanced
phonological acquisition, in contrast, usually determines the quality of a language learner’s phonology based on global judgments of foreign accent by native speakers. Typically, speakers are asked to read or repeat individual sentences and these productions are judged by native speakers according to their accentedness. Yet, it has been criticised that artificial tasks such as the reading of particular sentences and words might not faithfully reflect language abilities and that the data collected should consist of longer units of spontaneous speech (Hyltenstam & Abrahamsson 2003). Accent ratings, furthermore, have so far not been correlated with acoustic measurements of features of L2 phonology. In addition, studies typically focus on one isolated aspect of non-native speech only and do not relate the findings to other phonological features so that nothing is known yet about the relationship and inter-dependencies between different phonological structures and domains in advanced learners’ phonology. Finally, in cross-sectional studies, language learners are often assigned to groups of different competence based on criteria other than phonological ones – for example grammatical proficiency tests – so that in some studies speakers classified as beginners show more native-like phonology than speakers classified as intermediate learners (e.g. Lleó & Vogel 2004). It is the objective of the present study to demonstrate that a corpus-based methodology can complement the current research methods in second language acquisition and possibly compensate for some of their weaknesses. It has been suggested variously (Biber et al. 1998, Botley et al. 1996, Kettemann & Marko 2002, Granger et al.
2002, Sinclair 2004, Granger 2004) that the representative sample of natural speech contained in language corpora enables linguists to study patterns of actual language use of a scope not achieved in small-scale experimental studies, which might lead to the discovery of previously unexpected linguistic phenomena. The advantages of corpora in research on second language acquisition lie in the fact that they provide a broad sample of learner speech and genuine patterns of learner language use, which allows the study of non-native speech and its variation in an integrated way and on a hitherto impossible scope (cf. Gut 2006a). This paper presents an acoustic study of advanced learners’ phonology based on the LeaP corpus, a phonetically annotated corpus of learner English and learner German. Two main questions will be addressed. The first one concerns the general properties of advanced learners’ German phonology: Which phonological and phonetic properties characterise the phonology of an advanced learner of German? In order to answer this question first a corpus-based description is given of some general acoustic features that contribute to the difference between native and non-native German phonology. Then, a qualitative corpus analysis shows in which ways the phonology of advanced learners differs from that of less advanced learners. The second question is concerned with the influence of age and other factors on advanced learners’ phonology. With the help of the LeaP corpus it is investigated which non-linguistic factors are correlated with an advanced phonology. In the next section, the LeaP corpus is presented. The corpus analyses described in section 3 demonstrate systematic differences between native and non-native German phonology in terms of consonant cluster reduction, vowel reduction and intonation. In section 4, evidence for distinct structural differences between advanced learners’ German phonology and less advanced learners’ phonology is presented.
Section 5 reports on the investigation of non-linguistic factors constraining the acquisition of an advanced phonology. It further demonstrates the relationship between accent ratings and phonological properties of L2 speech. These findings are summarized and discussed in section 6.
Ulrike Gut
2. The LeaP corpus
The LeaP corpus was collected between May 2001 and July 2003 during the LeaP (Learning Prosody in a Foreign Language) project, which was funded by the Ministry of Education, Research and Science of North-Rhine Westphalia, Germany. The project had two main research aims: the first goal was to provide a detailed description of non-native phonology and a comparison with native speakers’ phonology. The second line of research was concerned with the question of whether and how the phonology of a foreign language can be learned. The project investigated non-linguistic factors such as speaker variables (e.g. native language, age at the beginning of language learning, motivation, musicality) and the type of teaching method that might enhance the outcome and speed of the acquisition process.
The LeaP corpus consists of 359 recordings of non-native and native speech in both German and English, comprising 73,941 words and a total recording time of more than 12 hours (Milde & Gut 2002, Pitsch et al. 2003). During the collection of the corpus data, the aim was to record a representative range of non-native speakers in terms of age, sex, native language/s, level of competence, length of exposure to the target language, age at first exposure to the target language and non-linguistic factors such as motivation to learn the language, musicality and so forth. The corpus contains four different speaking styles: free speech in an interview situation (length between 10 and 30 minutes), reading of a passage (length of about two minutes), retellings of a story (length between two and 10 minutes) and the reading of nonsense word lists (30 to 32 words). It is divided into two sub-corpora: target language German and target language English. The English sub-corpus consists of 176 fully annotated recordings and 45 word lists. The German sub-corpus consists of 183 fully annotated recordings and 57 recorded word lists.
In the interviews, a large amount of metadata was collected, including metadata about the recording (date, place, interviewer and language of the interview), about the non-native speaker (age, sex, native language/s, second language/s, age at first contact with target language, type of contact [formal vs. natural], duration and type of stays abroad, duration and type of formal lessons in prosody [if at all], prosodic knowledge), and about the language learner’s motivation and attitudes (reasons for acquiring the language, motivation to integrate in the target country, attributed importance to competence in pronunciation compared to other aspects of language, interest, experience and ability in music and in acting).
Table 1 presents the speaker variables for the native and the non-native speakers of the German sub-corpus. The age of the 55 non-native speakers of German at the time of recording ranges from 18 to 54 years, with a mean age of 28.9 years. 35 of them are female and 20 are male. Altogether, they have 24 different native languages. The average age at first contact with German is 16.7 years, ranging from three years to 33 years of age. The seven native speakers of Standard German in the corpus are aged between 20 and 59 years.
Phonology of advanced learners of German
Table 1: Number, mean age, sex and mean age at first contact with German of the speakers in the German sub-corpus
                     NUMBER   AGE RANGE   SEX                  MEAN AGE AT FIRST CONTACT WITH GERMAN   NUMBER OF NATIVE LANGUAGES
learners of German   55       18-54       35 female, 20 male   16.7                                    24
native speakers      7        20-59       3 female, 4 male     n.a.                                    n.a.
A multi-level annotation was carried out for all reading passages, retellings and two-minute extracts of each interview. During the annotation process, text-to-tone alignment of each annotated element was added. This alignment links the transcriptions with the audio recording by setting time-stamps at the beginning and end of each word, syllable, phoneme etc. The manual annotation comprised six tiers; two further tiers were added automatically:
1. On the phrase tier, speech and non-speech events are annotated. The learner’s speech is divided into quasi-intonational phrases.
2. On the words tier, the beginning and end of each word is annotated and an orthographic transcription is provided.
3. On the syllables tier, the beginning and end of each syllable is marked and the syllable is transcribed with SAMPA (Wells et al. 1992).
4. On the segments tier, all vocalic and consonantal intervals plus the intervening pauses are annotated.
5. On the tone tier, pitch accents and boundary tones are annotated using a modified ToBI system (Silverman et al. 1992).
6. On the pitch tier, the initial high pitch, the final low pitch and intervening high peaks and low valleys are annotated.
7. On the POS tier, part-of-speech coding was annotated automatically.
8. On the lemma tier, lemmata were annotated automatically.
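The time-aligned tiers described above can be pictured as lists of labelled intervals that share one time axis. The following Python fragment is only a minimal sketch of that idea and not the format actually used in the LeaP corpus: the names Interval and events_within, the time-stamps and the SAMPA labels for the word “Käse” are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One annotated event: a label with start and end time-stamps in seconds."""
    label: str
    start: float
    end: float

    @property
    def duration_ms(self) -> float:
        # Duration of the event in milliseconds.
        return (self.end - self.start) * 1000.0


def events_within(tier: list[Interval], span: Interval) -> list[Interval]:
    """Return the events of `tier` that lie inside the time span of `span`."""
    return [e for e in tier if e.start >= span.start and e.end <= span.end]


# Hypothetical time-stamps and labels; not taken from the LeaP corpus.
words = [Interval("Käse", 0.50, 0.90)]
syllables = [Interval("kE:", 0.50, 0.72), Interval("z@", 0.72, 0.90)]

# Because both tiers share one time axis, each word can be linked to its syllables.
for word in words:
    for syl in events_within(syllables, word):
        print(word.label, syl.label, round(syl.duration_ms))
```

Because every tier refers to the same time-stamps, a query on one tier (for example, a word) can be related to events on another tier (its syllables, segments or tones) without any further bookkeeping, which is what makes duration-based measurements on such a corpus straightforward.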
For a recording about one minute in length, 1000 events were annotated on average. Figure 1 illustrates the annotation of the phrase “ein grosses Stück Käse liegen sahen” (“saw a big lump of cheese lying…”) with the oscillogram and the five manually annotated tiers words, phrase, syllables, segments and tone (from top to bottom).
Figure 1: Annotation of the phrase “ein grosses Stück Käse liegen sahen”.
3. Acoustic features of native and non-native German phonology
The initial quantitative search of the LeaP corpus focussed on systematic differences between native and non-native German phonology. In a first line of investigation, reduction processes in both consonant clusters and vowels were examined. Vowel reduction and vowel deletion occur regularly in particular phonetic contexts in German and are thought to contribute to the specific speech rhythm of the language (e.g. Ramus et al. 1999). In German, vowel reduction occurs regularly in unstressed syllables, as for example in the production of the schwa /ə/ as the second vowel in the German word diesem [dizəm]. In phonetic terms, reduced vowels are typically shorter than non-reduced full vowels and have a different quality, i.e. are more central (e.g. Delattre 1981, Gut 2006b). Vowel deletion often occurs in fast speech, especially in post-tonic unstressed syllables such as the second syllable in the German word laufen, which is usually realized as [laʊfn̩]. Likewise, word-final consonant clusters, i.e. sequences of two or more consonants, are regularly reduced in connected speech. This means that, for example, in the word riefst (called 2nd ps. sg.) one consonant of the cluster is deleted so that it is realized as [ri:fs] (e.g. Kohler 1995).
In the LeaP corpus, the following measurements of reduction processes were taken for all three speaking styles (readings, retellings and free speech):
– mean length reduced syllable: mean length of syllables containing a reduced vowel (in ms)
– vowel reduction ratio: mean durational ratio of all syllable pairs in which a syllable with a full vowel is followed by a syllable with a reduced or a deleted vowel
– final cluster retention: retention rate (i.e. no deletion) of all word-final consonant clusters
– medial cluster retention: retention rate of all word-medial consonant clusters
A total of 40 274 syllables produced by the non-native speakers of German and 3 261 syllables produced by the native speakers of German were analysed in terms of vowel reduction. In addition, a total of 4 045 words with underlying word-final clusters were analysed in the speech of the non-native speakers of German. The native German speakers produced a total of 232 such words. Table 2 illustrates that there are significant differences in vowel and consonant cluster reduction between non-native German and native German. The mean length of syllables with reduced vowels is shorter in native German than in non-native German (t=8.07, df=139, p