180 31 2MB
English Pages 221 Year 2006
English with a Latin Beat
Studies in Bilingualism (SiBil)
Editors Kees de Bot University of Groningen
Thom Huebner San José State University
Editorial Board Michael Clyne, University of Melbourne Kathryn Davis, University of Hawaii at Manoa Joshua Fishman, Yeshiva University François Grosjean, Université de Neuchâtel Wolfgang Klein, Max Planck Institut für Psycholinguistik Georges Lüdi, University of Basel Christina Bratt Paulston, University of Pittsburgh Suzanne Romaine, Merton College, Oxford Merrill Swain, Ontario Institute for Studies in Education Richard Tucker, Carnegie Mellon University
Volume 31 English with a Latin Beat: Studies in Portuguese/Spanish – English Interphonology Edited by Barbara O. Baptista and Michael Alan Watkins
English with a Latin Beat Studies in Portuguese/Spanish – English Interphonology
Edited by
Barbara O. Baptista Universidade Federal de Santa Catarina, Brazil
Michael Alan Watkins Universidade Federal do Paraná, Brazil
John Benjamins Publishing Company Amsterdam/Philadelphia
8
TM
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ansi z39.48-1984.
Library of Congress Cataloging-in-Publication Data English with a Latin beat : studies in Portuguese/Spanish – English interphonology / edited by Barbara O. Baptista and Michael Alan Watkins. p. cm. (Studies in Bilingualism, issn 0928–1533 ; v. 31) Includes bibliographical references and indexes. 1. English language--Pronunciation by foreign speakers. 2. English language--Spoken English. 3. English language--Study and teaching-Spanish speakers. 4. English language--Study and teaching--Portuguese speakers. PE1137.E567 2006 421/.52--dc22 isbn 90 272 4142 2 (Hb; alk. paper)
2006043045
© 2006 – John Benjamins B.V. No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher. John Benjamins Publishing Co. · P.O. Box 36224 · 1020 me Amsterdam · The Netherlands John Benjamins North America · P.O. Box 27519 · Philadelphia pa 19118-0519 · usa
Table of contents
Introduction Barbara O. Baptista and Michael Alan Watkins
1
Part I. Segmental-level studies: Vowels Adult phonetic learning of a second language vowel system Barbara O. Baptista The phonological and phonetic development of new vowel contrasts in Spanish learners of English Paola Escudero Age and native language influence on the perception of English vowels Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
19
41 57
Part II. Syllable-level studies: Codas and onset clusters The influence of voicing and sonority relationships on the production of English final consonants Barbara O. Baptista and Jair L. A. da Silva Filho Perception and production of vowel paragoge by Brazilian EFL students Rosana Denise Koerich The sonority cycle and the acquisition of complex onsets Robert S. Carlisle
73 91 105
The influence of voicing on the production of initial /s/-clusters by Brazilian learners Jeanne Teixeira Rebello and Barbara O. Baptista
139
Production of English initial /s/-clusters by speakers of Brazilian Portuguese and Argentine Spanish Andréia Schurt Rauber
155
Table of contents
Part III. Prosodic-level studies: Stress and rhythm Variability in the use of weak forms of prepositions Michael Alan Watkins
171
Perception of double stress by Spanish learners of English Ma Luisa García Lecumberri
185
The production of compound stress by Brazilian learners of English L. Armando Silveiro and Michael Alan Watkins
199
Author index
211
Subject index
213
Introduction Barbara O. Baptista and Michael Alan Watkins Universidade Federal de Santa Catarina, Brazil / Universidade Federal do Paraná, Brazil
The scope of this collection Over the last twenty-five years interest in the patterns and causes of error in second language pronunciation has grown rapidly, with the result that interphonology has now become firmly established as an important research field in its own right. A number of collections of articles published over the last two decades have been devoted entirely to the topic: James and Leather (1986, 1996) and Ioup and Weinberger (1987) were instrumental in bringing the findings of second language phonology research to the attention of a wider audience and sparking off further research around the world, while the New Sounds conferences brought together interphonology specialists from every continent. The collections of papers presented at these conferences (Leather & James 1990, 1992, 1997; James & Leather 2002) constitute yet another valuable source of information for researchers. In addition, there have been publications devoted to research carried out by single authors (Archibald 1998; Major 2001), and a particularly relevant collection of papers (Strange 1995) on the role of perception in L2 phonology. Although not all the research presented in these collections concerns English as the target language, the great majority does, which is not surprising in view of the seemingly unstoppable rise of English as a lingua franca around the globe. While there is this predominant theme of the analysis of difficulties in producing the sounds of English by non-native speakers, there has as yet been no corresponding unity in the published literature regarding the L1 of the participants. For example, the volumes mentioned above typically juxtapose studies of L1 speakers of such languages as Chinese, Japanese, Korean, Vietnamese, Arabic, and many different European languages, some of whom are acquiring or have acquired English in a naturalistic setting, while others are learning it as a foreign language in an instructional setting.
Barbara O. Baptista and Michael Alan Watkins
This linguistic and situational heterogeneity among the studies has undoubtedly been valuable in some ways, highlighting the different solutions that speakers of different L1s find for the same problems, such as coda consonants, and has contributed to the development of various models of second language phonological acquisition theory, such as those of Major, Eckman and Flege (discussed below). However, people wishing to focus in depth on the specific characteristics of speakers of a particular language or group of closely-related languages have had to scour the literature to find relevant research, and it could be useful to have access to collections of relevant studies carried out with speakers of just those L1s. The grouping of Spanish and Portuguese together as L1s can be justified on two counts: geographical and linguistic. Geographically, the Iberian Peninsula and Latin America both form clearly defined regions of the globe in which the great majority of inhabitants have one or other of these languages as their L1, and where increasingly large numbers learn English in a classroom setting. Portuguese and Spanish-speakers also constitute fast-growing minorities in the USA. Socially, culturally and historically, these two linguistic communities could be considered to have more in common with each other than with other speech communities. Linguistically also, Portuguese and Spanish have much in common – both developed from Latin over a similar period, and both incorporated a number of Arabic words during the long Moorish occupation of the peninsula. However, while the two languages are superficially similar (in terms of their lexis, morphology, syntax, and stress patterns), some aspects of their phonology are more different than is often realized. EFL materials tend to be thought of by publishers as “for the Iberian market”, or “the Latin American market”, treating Spanish and Portuguese as if they were more or less the same, and in fact, in South America at least, speakers of one language often feel that they do not need to make any effort to learn the other, just changing a few superficial features to produce an intermediate form popularly known as portuñol or portunhol. With the growth of tourism and trade relations among the southern-most countries of South America, however, it is becoming increasingly plain to both Brazilians and Spanish-speakers that there are considerable differences between the two languages, and that just speaking a pidgin variety of each other’s language is no longer good enough. While only one of the papers (Rauber) actually compares speakers of these two languages, there is a fairly balanced representation of both Spanish and Portuguese speakers in the collection, with four papers dealing exclusively with the former, a fifth (as mentioned above) comparing the two, and six involving speakers of Brazilian Portuguese (BP). Moreover, almost all the studies were carried out with participants who had learned or were currently learning English as a foreign language in a classroom environment. The fact that European Portuguese (EP) does not feature in this volume is largely accidental, as it was simply the case that no relevant research was avail-
Introduction
able to us, but the omission could perhaps also be justified on other grounds: in some important respects BP phonology is closer to that of all varieties of Spanish than is EP, which differs most noticeably from BP and Spanish in having a more stress-timed rhythm, resulting in certain segmental differences as well. In fact, all the studies in this book address difficulties which are liable to affect, to a greater or lesser degree, native speakers (NSs) of both Spanish, no matter what the variety, and of BP: the perception and production of vowel distinctions which neither language possesses, strategies for dealing with English initial consonant clusters and final consonants in general, which both languages disallow or limit to some extent, and issues of stress placement and vowel reduction.
Theoretical models of L2 phonological acquisition In addition to focusing on the same two L1s, the papers are also linked by the fact that they are largely based (though not always explicitly) on two somewhat related theoretical assumptions: the role of L1 influence, as formulated in Flege’s (1995) speech learning model (SLM), and universal markedness, as applied to Eckman’s markedness differential hypothesis (MDH, 1977) and his more recent interlanguage structural conformity hypothesis (SCH, 1991). The SLM, while it has up to now been applied by Flege and colleagues to segmental-level acquisition, is extended by several contributors in this volume to other domains. While the MDH and the SCH are not explicitly referred to in the vowel studies, as these are mostly at the phonetic level, the concept of markedness is implicit in Escudero’s study in the preference of the Spanish speakers for the length feature over spectral quality in L2 vowel perception, even though length is not a relevant feature of vowels in their L1. The most relevant SLM hypothesis for the studies in this volume is that although adults maintain the phonetic abilities of childhood, they often neglect to establish phonetic categories for L2 segments because of the process of equivalence classification, which causes them to perceive similar L2 sounds through the “grid” of their L1 categories, thus ignoring subtle phonetic differences between them. Rather than markedness, the key concept in Eckman’s hypotheses, a key concept in the SLM is similarity. Because the process of equivalence classification depends on the degree of similarity of the L2 sound in question to a sound of the L1, the likelihood of an adult learner establishing an L2 phonetic category is greater for those sounds without an obvious corresponding sound in the L1. This is claimed by Flege to be the main cause of foreign accent, since production of L2 sounds is supposed to ultimately depend on the phonetic category representation in longterm memory. The age effect in the learning of L2 pronunciation, rather than a result of the biological loss of a childhood ability, is then seen as due to the fact that in younger learners the L1 categories are less firmly established than in the
Barbara O. Baptista and Michael Alan Watkins
adult, enabling them to perceive subtle distinctions ignored by the older learners. Most of the key studies supporting the SLM are studies on voice onset time (VOT) of L2 stops, especially voiceless stops (e.g., Flege 1987, 1991; Flege & Eefting 1987, 1988; Flege, Munro, & MacKay 1995; Flege & Schmidt 1995), and studies on vowels (e.g., Flege, Munro, & Fox 1994; Gottfried 1983; Munro, Flege, & MacKay 1996), with increasing importance being given to perception studies, under the assumption that the phonetic category representation is formed according to the way the target sounds are perceived. Eckman’s (1977) markedness differential hypothesis (MDH) adapted the basic model of the contrastive analysis hypothesis by introducing the factor of markedness. The claim was now that only structures that were not just different from the L1 but also more marked would be difficult for learners to acquire and that the degree of difficulty would depend on degree of markedness. Eckman supported the MDH by citing directionality of difficulty. For example, German speakers have difficulty suppressing their L1 final-consonant devoicing process in English, but English speakers do not have difficulty learning the devoicing process in German because final voiced obstruents are more marked. That is to say, all languages with final voiced consonants also have voiceless ones, but not vice versa, and while it is difficult to acquire a more unusual feature than your L1 has, learning to suppress a familiar one is not problematic. The hypothesis was supported by many early interphonology studies on syllable structure (Anderson 1987; Broselow 1983, 1984; Karimi 1987; Sato 1984; Tarone 1980; Weinberger 1987). Still, Eckman (1991) later preferred his interlanguage structural conformity hypothesis (SCH), which no longer referred to the L1, but simply said that generalizations valid for primary languages were also valid for interlanguages. The generalizations Eckman refers to are mostly implicational relationships, which say that if A exists in a language, then B must also exist. So, to use the same phonological example used for the MDH, just as primary languages will not be found that have final voiced obstruents without final voiceless obstruents, interlanguages that have only the more marked segments without the less marked ones will not be found either. Eckman justified the replacement of the previous model, pointing out that the newer one was more falsifiable and that the MDH could not account for difficulties L2 learners are found to have with structures which are quite similar to those of the L1. Flege’s SLM and Eckman’s MDH are related in that they each partially explain L2 speech difficulties by reference to the L1, although the explanations are quite different. They complement each other in the field of interphonology because the SLM functions more on the phonetic level while the MDH, as well as the SCH, deal with the phonological level. The SLM explains the learning (or lack of learning) of L2 speech as a gradual phonetic approximation of the target language sounds or phones, which is frequently never completed for some phones because equivalence classification does not allow the relevant L2 phonetic categories (mental
Introduction
representations) to be constructed in long-term memory. Rather, the L2 phones are perceived by the learner as “positional allophones” of similar phones in their L1, and as such, they are perceived (and therefore produced) without the differential L2 phonetic detail. Studies on vowels and the VOT of stops are particularly relevant because both can be represented on a continuum, so that learning can be described as a gradual approximation of the L2 positional allophones to the target phones. The MDH and especially the SCH both deal with the learning of L2 sounds as an all-or-nothing affair; that is, the target segment or sequence is either produced correctly or not in a particular occurrence (MDH) and either exists or does not exist in the IL in SCH studies. In the SCH Eckman specifically uses a cutoff point of 80% correct production to say that a particular item has been acquired. These two hypotheses lend themselves in particular to the investigation of L2 syllable structure. Two other recent influential theories in the area of interphonology, those of Major and Best, were less central to the studies in this collection, but important to mention here because of their appearance in some of the papers. Major’s (2001) ontogeny phylogeny model (OPM), an extension of the earlier ontogeny model (OM, Major 1986), says basically that transfer processes will be more prevalent in the early stages of learning but will be gradually replaced by developmental (universal) processes, which will increase and then decrease as they are replaced by L2 processes. As the developmental processes essentially refer to those governed by markedness, this model has certain similarities to Eckman’s two hypotheses. Although it refers more generally to different variants produced in place of the target variant, without specifying the size of the unit (e.g., segment or sequence) being acquired (or not), the variants themselves are distinct rather than belonging to a continuum as in Flege. As its name implies, the perceptual assimilation model (PAM) (Best 1995; Best, McRoberts, & Goodel 2001) deals specifically with the perception of L2 contrasts, which are expected to be discriminated with more or less facility depending on which of the following ways the sounds of each pair have been assimilated to L1 categories: (a) each L2 sound is assimilated to a different L1 category; (b) both to the same L1 category, but one a better exemplar than the other; (c) both to the same category; (d) both assimilated as unknown speech sounds (these would be called new phones by Flege); (e) one assimilated to an L1 category and the other as an unknown; (f) both heard as non-speech sounds.
Overview of the papers The papers in this collection are divided into three groups according to the phonological domain they deal with: (a) the segment, with three papers on vowels; (b)
Barbara O. Baptista and Michael Alan Watkins
the syllable, with five papers on vowel insertion – prothesis (added to the beginning of the word) before initial /s/-clusters and paragoge (added to the end of the word) after consonants; and (c) prosodic levels, with three papers dealing with stress and rhythm in words, compounds and phrases.
Segmental-level studies The segmental studies in this collection all deal with English vowels, as produced by speakers of BP (Baptista) and of Spanish (Escudero; Gallardo del Puerto, García Lecumberri, & Cenoz). Portuguese has a wider set of monophthongal vowels than Spanish – essentially the same five vowels /i, e, a, o, u/, plus two additional ones /7, f/. Thus, speakers of both Spanish and Portuguese can be expected to have difficulties with the English vowels that are not part of their L1 inventory: /I/, /æ/, /%/, /~/, /6/ for both groups of speakers, plus /7/ and /f/ for the Spanish speakers, and /6/ for those of both groups learning the British vowels. The three papers in this section all have quite different objectives from one another: Baptista investigates the production of the front vowels plus /%/ and /"/, Escudero measures perception of the /i/-/I/ contrast, and Gallardo del Puerto et al. look at the perception of the entire monophthongal vowel system of Southern Standard British (SSB). Baptista is a longitudinal study which examined the L2 front vowels as a system, with the objective of finding out whether the acquisition of each new IL vowel depends on the accuracy of the neighboring vowels, which may differ only in subtle ways from their L1 counterparts. The participants were all Brazilians who had recently arrived in the United States. Although the number of learners was very small, the study provides evidence for the claim that the acquisition of each new vowel distinction depends on adjustments in the neighboring vowels; that is, the vowels are not learned in isolation, but as part of an IL vowel system. To fit these results into the SLM, Baptista suggests the inclusion in the model of a phonetic supercategory: a mental representation of the vowel system itself, to which each of the vowel categories is connected. The learners in Escudero’s study were Spanish speakers learning Standard Scottish English (SSE), in which, unlike General American (GA) or SSB, vowels are distinguished almost exclusively by spectral quality, not by length or diphthongal trajectories. Escudero classified the results of the Spanish-speaking participants into four categories, which she suggests might constitute a development sequence: (a) those who were not able to discriminate the English vowels /i/-/I/; (b) those who were able to discriminate the pair but did so using duration almost exclusively, contrary to the NSs; (c) those who discriminated the pair using both features – duration and spectral quality; and (d) those who discriminated the vowels using basically spectral quality, as the NSs did. Escudero concludes that it is possible for
Introduction
L2 learners to learn to perceive target language (TL) contrasts, but that they do not necessarily do so by using the same features as those used by native speakers. Gallardo del Puerto et al. tested the perception of three groups of Spanishspeaking children, in three different age ranges, of 11 SSB vowels, including the long central vowel /8˜/. Four of the vowels were classified as identical to L1 vowels, six as similar and one – the long central vowel – as new. Contrary to expectations, they did not find a lower starting age, or age of onset of learning (AOL), to be an advantage. However, they did find the expected influence of degree of similarity of the vowels. In accordance with the predictions based on the SLM, the vowels classified as identical (/i˜, u˜, e, f˜/) were those most poorly identified, while the ones classified as similar (/I, ~, æ, f˜, 6, %/) or new (/8˜/) were identified with much greater frequency. The classification might be questioned, as it was based mainly on transcription conventions, but there is ample precedence for this criterion in the literature on the SLM. Interestingly, there was also an interaction between age and similarity: the tendencies regarding which vowels were more easily identified varied as a function of AOL. Most notably, the group with the highest AOL was best able to identify the vowels considered to be identical, a result suggested by the authors to be due to their more developed metalinguistic knowledge. All three vowel studies imply the importance of the L1 (in addition to age, in Gallardo del Puerto et al.) for determining which L2 vowels are difficult, but the strategies for acquiring these vowels range from making adjustments in neighboring vowels (Baptista), to use of a universally easy-to-perceive feature – length (Escudero), to using metalinguistic knowledge (Gallardo del Puerto et al.).
Syllable-level studies One of the main differences in syllable structure between Spanish and BP is that there are more coda possibilities in the former. Nasals, laterals and dental fricatives are all pronounced in word-final position in Spanish (e.g., pan, sal, libertad), whereas in BP word-final nasals are omitted as their nasality spreads to the preceding vowel (e.g., fim [f~i]), the lateral is vocalized (e.g., sal [sa~]), turning the sequence into a diphthong, and obstruents never occur word-finally. In addition, Spanish permits all stops in word-medial coda position (e.g., apto, advérbio, magnífico), whereas when they appear in the written form in BP they are regularly resyllabified in speech as onsets by the addition of an epenthetic [i] (e.g., advogado [adivoÁgadu], ritmo [Áritimu]). The only coda consonants fully realized phonetically as consonants in BP are /r/ and /s/. In this section there are two studies on the production of English word-final codas by Brazilians: Baptista and Silva Filho examine production, while Koerich investigates both perception and production. As theirs was the first study on the production of English word-final single consonants by BP speakers, Baptista and Silva Filho examined a number of
Barbara O. Baptista and Michael Alan Watkins
variables found in previous studies to be important for speakers of other L1s with other syllable structures. Paragoge was almost the sole strategy used by the Brazilians to deal with coda consonants, except for nasals, which were frequently pronounced according to their L1 process for coda nasals described above. In general the results of this study were consistent in demonstrating that the likelihood of paragoge was determined by markedness in terms of both voicing and sonority of the target consonants, as well as of the context segments in the following word. Thus, while the L1 was the factor determining the strategy used (word-internal vowel epenthesis is a productive L1 process in BP, as mentioned above, and paragoge is frequently used to adapt foreign borrowings such as the English club, which becomes clube), markedness and context determined when and where it was used. Although there are no papers in this volume dealing with English codas produced by Spanish speakers, they are known to use devoicing or omission, rather than paragoge, to simplify the syllable. Research is needed to see whether they would be influenced by the same variables as the BP participants in Baptista and Silva. Koerich embarked on a study of the perception and production of English codas by beginner-level Brazilian learners, with the idea that the difficulties in the pronunciation of the final consonants might be more than just articulatory. In addition to a sentence-reading task, she used a discrimination task to test whether the participants were able to perceive the presence or absence of the paragogic vowel. A significant correlation was obtained between the discrimination and production results, indicating that the same students who had difficulty producing the final consonants without inserting a vowel frequently did not perceive whether the vowel was there or not in the discrimination task. This is particularly interesting because, while Flege developed the SLM for individual segments, the results of this study suggest that there might also be phonetic categories for sequences of segments (e.g., syllable or coda), and that speakers of a language with almost no codas might simply not have a category for coda and, thus, hear every coda consonant as an onset followed by the L1 default vowel. Another way in which both Spanish and Portuguese syllables are more constrained than those of English is with regard to what clusters are permitted in initial position (neither language permits final clusters). English has a number of words beginning with /s/-clusters, and among these even /s/+ obstruent clusters, which appear to violate the sonority sequencing principle (Clements 1990). Neither Spanish nor Portuguese permits /s/ to occur as part of an onset cluster, and Spanish and Portuguese cognates of English words like school and Spanish always begin with [e] or [i] before the cluster: escuela/escola and español/espanhol, spelled with an initial letter “e”. There are three studies in this section on the production of English /s/-clusters – one involving Spanish speakers, another involving Brazilians, and a third comparing Spanish speakers and Brazilians. In all cases the preferred
Introduction
strategy for dealing with the clusters is prothesis – the addition of a vowel to the beginning of the word. Carlisle’s paper reports on two related studies investigating the production of English initial /s/-clusters by Spanish speakers. The results of the first study, longitudinal with three data collections, confirm predictions that the longer clusters would be more difficult than the shorter ones and that a consonant in the preceding context would be more difficult than a vowel. Putting the three times of the study together, he was also able to confirm that no learner ever acquired a longer cluster before its corresponding shorter component cluster. The second study was cross-sectional and confirmed findings of his previous studies that differences in sonority made the /st/ more difficult than the /sn/, which was in turn more difficult than the /sl/. It also corroborates the findings of the first study of the paper regarding the comparative difficulty of vowels and consonants in the preceding environment. Possibly the most interesting contribution of this paper is his analysis, based on Clements’ (1990) sonority cycle, of the phonological processes applied by Brazilian learners to /s/+obstruent versus /s/+sonorant clusters, accounting for the totally different results obtained by Rebello and Baptista and by Rauber, compared to Carlisle’s own results with Spanish speakers. Rebello and Baptista investigated the effects of markedness on Brazilians’ production of English /s/-clusters, but obtained results quite different from those of Carlisle. Surprisingly, they found the size of the cluster did not have a significant effect on the Brazilian learners and suggested that instruction could interfere with the effect of the universal CV syllable. In regard to sonority within the cluster, the results were the opposite of Carlisle’s – it was the /s/+sonorant clusters that caused the higher rate of epenthesis, which was attributed to the transfer of L1 voicing assimilation of sibilants, resulting in a voiced cluster, more marked than the voiceless /s/+obstruent clusters. As to the preceding environment, the only significant influence was found to be the difference between voiced and voiceless obstruents, the former causing more paragoge than the latter. These results suggest that the transfer of a quite productive L1 voicing assimilation process is capable of mediating the influence of universal markedness, making markedness in relation to voicing more important than the markedness in relation to sonority sequencing. Finally, Rauber, seeing the discrepancies between the results of Carlisle’s studies and Rebello and Baptista’s, carried out a study using the same instrument for both Spanish and BP-speaking participants. She found the effect of cluster length to be very weak for both groups, supporting Rebello and Baptista’s suggestion concerning the effect of classroom learning. The effect of sonority within the cluster was significant for the Spanish speakers, corroborating Carlisle, but insignificant for the BP speakers, there being hardly any difference between /s/+obstruents and /s/+sonorants. Applying Major’s OPM, Rauber concluded that the less proficient learners in Rebello and Baptista were still applying more transfer processes, re-
Barbara O. Baptista and Michael Alan Watkins
sulting in greater importance being given to voicing than to sonority, whereas for her more advanced learners, the competing influences of transfer and universal sonority sequencing canceled each other out, making the two types of cluster of approximately equal difficulty. In sum, the results of the three studies on initial /s/-clusters illustrate very clearly the importance of the L1, both directly and in the interaction with markedness.
Prosodic-level studies: Stress and rhythm It is in the higher domains, above syllable level, that the differences between BP and Spanish become least noticeable. While the rhythm of EP is characterized by what Crosswhite (2004) calls “extreme reduction” and even syncope, BP never has more than partial vowel reduction, preserving a three-way contrast even in unstressed word-final syllables, and Spanish has no vowel reduction whatsoever – only consonant lenition and deletion – which makes any secondary stresses generally less prominent. Watkins investigated variability in the use of reduced forms of four English prepositions by advanced Brazilian speakers. BP has a clear preference for binary feet, giving a fairly regular alternation of strong and weak syllables, and this tends to be carried across into English, so that a phrase such as going to the beach, normally realized as two feet by native speakers [go.ing.to.the][beach], would typically be given an extra stress by Brazilians [go.ing.][to. the][beach]. Of the three prepositions left in the final run of the VARBRUL analysis, for and of were found to resist reduction much more than to, probably because of their extra syllable weight. Perception was not investigated, and although Gómez Lacabex, García Lecumberri and Cooke (2005) found no significant correlation between perception and production of vowel reduction in a study of untrained Spanish learners of English, this might not be the case with BP speakers. Whereas vowel reduction is not something that L1 Spanish speakers would normally be familiar with, BP does have some degree of vowel reduction, and a study comparing NS’s and Brazilians’ perception of reduced vowels might help to explain why Brazilians find it so hard to produce fully reduced vowels with native-like consistency. The next paper in this section, by García Lecumberri, compares NSs of English with Spanish university-level learners to see to what extent both groups were able to perceive secondary and primary stress correctly in end-stressed words and compounds, in citation form and with stress shift. The fact that the NSs did not do significantly better than the Spanish group on most of the tasks is attributed to the latter’s greater metalinguistic awareness of stress, since their L1 makes extensive use of accents to mark primary stressed syllables. However, the fact that secondary stress in Spanish is very weak and not marked orthographically may explain the tendency of the Spanish participants to hear only one stress per word.
Introduction
Portuguese-speakers also (assuming they are literate) must have some metalinguistic awareness of primary stress, since the use of accents to mark stress in the written form is similar to (though not the same as) their use in Spanish. In the same way, Portuguese-speakers are likely to lack this metalinguistic awareness when it comes to secondary stress, since, although it is far more salient than in Spanish, it is not fixed on certain syllables as it is in English. As Collischonn (1994) points out, there are alternative ways of stressing a Portuguese word with an odd number of pretonic syllables, such as responsabilidade: you can either stress the first syllable and then the fourth, so that you create a ternary foot (Àres.pon.sa.Àbi.li.Áda.de), or opt for binary feet but leave the first syllable unfooted (res.Àpon.sa.Àbi.li.Áda.de). In the final paper in this collection, Silveiro and Watkins describe an experiment with Brazilian learners of English to measure their accuracy of production of end-stressed compared with front-stressed constructions. As expected, there was a very clear tendency to use the phrasal stress pattern incorrectly for compounds, suggesting either that the learners simply do not notice the contrast between the two patterns when they hear it (or at least its syntactic significance), or that the influence of L1 metrical patterns is too strong. Since incorrect stress placement on compounds was so prevalent in the results of this study and can affect comprehensibility quite seriously, Silveiro and Watkins emphasize the importance of drawing learners’ attention explicitly to the different meanings signaled by the phrasal and compound stress patterns in English, since neither Spanish nor Portuguese uses stress to make syntactic distinctions in this way.
Some general implications Running through all the papers in this collection has been a recurrent theme that seems to have important implications for teachers: L2 users do not automatically hear what is physically there in the speech signal. Unless we have our attention drawn to certain features of the L2 which are different from our L1, and to our own interlanguage productions of these L2 features, we are unlikely to notice them, and therefore will not produce them correctly. Noticing appears to be crucial for correct production, at all levels, from segments to intonation patterns. The tendency when speaking an L2 is to make the minimum effort necessary to be understood. There is a strong temptation to use short cuts, reducing retrieval time for lexical items, though often at the expense of phonological accuracy. If one can understand and be understood in the target language without being aware of certain distinctions, which is especially possible when the distinctions are not reflected in the written form, as is the case with full versus reduced vowels and stress-placement in English, then a foreign accent is likely to persist.
Barbara O. Baptista and Michael Alan Watkins
Although fully native-like pronunciation is not a realistic target, most learners in our experience regard good pronunciation as a high priority, and want to know how to achieve it. Even for those whose motivation is wholly instrumental, it is still fundamental to be easily understood. In Munro and Derwing (1995), a distinction is drawn between intelligibility (the extent to which an utterance is actually understood), comprehensibility (listeners’ perception of difficulty in understanding particular utterances), and accentedness (how strong the talker’s foreign accent is perceived to be). These dimensions are considered to be related, but partially independent: utterances may be highly intelligible and comprehensible, yet rated as heavily accented. The difference between comprehensibility and intelligibility involves processing difficulty: two utterances may both be understood, but one may require special top-down processing to resolve doubts about an initially unintelligible word, thus causing the listener to assign a low comprehensibility score. The authors found a relationship between comprehensibility and listener response times, but not between accentedness and response times, and concluded that an accent, even a strong one, is by no means an inevitable barrier to communication. In fact, Graddol (2006) reports an increasing preference for non-native teachers of English in Asia, one reason being that the accents of native speakers are hard to understand. He also notes that research shows many NSs of English to be poor at communicating with L2 users, and that discussions in English where no NS is present tend to progress more smoothly. He concludes that one of the more anachronistic ideas about the teaching of English is that learners should adopt a native speaker accent. But as English becomes more widely used as a global language, it will become expected that learners will signal their nationality, and other aspects of their identity, through English. Lack of a native-speaker accent will not be seen, therefore, as a sign of poor competence. (Graddol 2006: 117)
Although this is a trend that cannot be ignored, it is not yet the situation in which most L1 speakers of Spanish and Portuguese find themselves. Many need to interact with native speakers in their professional life. A substantial number have opted to live in countries where English is the predominant language, and where they must compete directly with NSs in educational and professional contexts. Comprehensibility is crucial, but speaking more slowly in order to achieve this may not be a realistic strategy in these situations. Various studies have shown that a foreign accent can put one at a disadvantage in an English-speaking country by arousing prejudice (Brennan & Brennan 1981; Ryan, Carranza, & Moffie 1977) or causing a feeling of irritation in the listener due to extra processing demands (Munro & Derwing 1995), and there are certain types of error, such as the insertion of a vowel after a word-final obstruent (typical of Brazilian-accented English), which some listeners may find particularly irritating as they can distort the meaning of a
Introduction
whole utterance. However fast the trend towards global multilingualism may be – a trend, as Graddol (2006) points out, from which such traditional NS preserves as the USA, Britain and Australia are by no means immune – it clearly continues to be worthwhile for speakers of Spanish and Portuguese to work on eliminating those features of their pronunciation that prevent them from being easily understood, whether they are speaking to native speakers or other L2 users of English. The papers in this collection may have both a practical and a more theoretical application: practical in that they point out clearly to those involved in pronunciation instruction where some of the areas of greatest difficulty for Spanish and Portuguese-speaking learners of English lie, and theoretical in that they shed light on how human beings learn, or fail to learn, aspects of a complex skill, in this case a foreign language. More specifically, the findings reported here may have implications for models of second or foreign language learning or acquisition – possible adaptations of existing models or the development of future ones. There were specific suggestions in Baptista for the extension of Flege’s SLM; the two papers on vowel perception (Escudero; Gallardo del Puerto et al.) have unstated implications for the SLM and Best’s PAM; all of the papers on the syllable have implications for the role of transfer and universals in L2 production, either supporting or challenging the claims of Eckman’s models (the differential role of the L1 in the MDH compared to the SCH); and the papers on prosody all have unstated implications for Schmidt’s noticing hypothesis (1990, 1993). All of the theories of second language acquisition can be considered part of the developing science of cognition, of which the learning and processing of language – both L1 and L2 – are an integral part.
Acknowledgment A special thanks goes to Andréia Schurt Rauber, who spent long days and nights with us formatting, checking, revising, and incorporating last-minute changes.
References Anderson, J. I. (1987). The markedness differential hypothesis and syllable structure difficulty. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 279–291). Cambridge, MA: Newbury House. Archibald, J. (1998). Second Language Phonology. Amsterdam: John Benjamins. Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 171–204). Timonium, MD: York Press.
Barbara O. Baptista and Michael Alan Watkins
Best, C. T., McRoberts, G. W., & Goodel, E. (2001). Discrimination of non-native consonant contrasts varying in perceptual assimilation to the listener’s native phonological system. Journal of the Acoustical Society of America, 101, 775–794. Broselow, E. (1983). Non-obvious transfer: On predicting epenthesis errors. In S. Gass & L. Selinker (Eds.), Language Transfer in Language Learning (pp. 269–280). Rowley, MA: Newbury House. Broselow, E. (1984). An investigation of transfer in second language acquisition. International Review of Applied Linguistics, 22, 253–269. Brennan, E. & Brennan, J. (1981). Accent scaling and language attitudes: Reactions to Mexican American English speech. Language and Speech, 24, 207–221. Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M. Beckman (Eds.), Papers in Laboratory Phonology I (pp. 283–333). Cambridge: CUP. Collischonn, G. (1994). Acento secundário em português. Letras de Hoje, 29, 43–53. Crosswhite, K. M. (2004). Vowel reduction. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically Based Phonology (pp. 191–231). Cambridge: CUP. Eckman, F. R. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330. Eckman, F. R. (1991). The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–41. Flege, J. E. (1987). The production of “new” and “similar” phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15, 47–65. Flege, J. E. (1991). Age of learning affects the authenticity of voice onset time (VOT) in stop consonants produced in a second language. Journal of the Acoustical Society of America, 89, 395–411. Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 233–277). Timonium, MD: York Press. Flege, J. E. & Eefting, W. (1987). The production and perception of English stops by Spanish speakers of English. Journal of Phonetics, 15, 67–83. Flege, J. E. & Eefting, W. (1988). Imitation of a VOT continuum by native speakers of English and Spanish: Evidence for phonetic category formation. Journal of the Acoustical Society of America, 83, 729–740. Flege, J. E., Munro, M., & Fox, R. A. (1994). Auditory and categorial effects on cross-language vowel perception. Journal of the Acoustical Society of America, 95, 3623–3641. Flege, J. E., Munro, M., & MacKay, I. (1995). Effects of age of second-language learning on the production of English consonants. Speech Communication, 16, 1–26. Flege, J. E. & Schmidt, A. (1995). Native speakers of Spanish show rate-dependent processing of English stop consonants. Phonetica, 52, 90–111. Gómez Lacabex, E. G., García Lecumberri, M. L., & Cooke, M. (2005). English vowel reduction by untrained Spanish learners: Perception and production. Paper presented at the Phonetics Teaching and Learning Conference (PTLC), London, 27–30 July, 2005. Gottfried, T. (1983). Effects of consonant context on the perception of French vowels. Journal of Phonetics, 12, 91–114. Graddol, D. (2006). English Next: Why Global English May Mean the End of ‘English as a Foreign Language’. Retrieved May 27, 2006, from http://www.britishcouncil.org/files/documents/ learning-research-english-next.pdf
Introduction
Ioup, G. & Weinberger, S. H. (Eds.). (1987). Interlanguage Phonology: The acquisition of a second language sound system. Cambridge, MA: Newbury House. James, A. & Leather, J. (Eds.). (1986). Sound Patterns in Second Language Acquisition. Dordrecht: Foris. James, A. & Leather, J. (Eds.). (1996). Second-Language Speech: Structure and process. Berlin: Mouton de Gruyter. James, A. & Leather, J. (Eds.). (2002). New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech. Klagenfurt: University of Klagenfurt. Karimi, S. (1987). Farsi speakers and the initial consonant cluster in English. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 305–318). Cambridge, MA: Newbury House. Leather, J. & James, A. (Eds.). (1990). New Sounds 90: Proceedings of the 1990 Amsterdam Symposium on the Acquisition of Second-Language Speech. Amsterdam: University of Amsterdam. Leather, J. & James, A. (Eds.). (1992). New Sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-Language Speech. Amsterdam: University of Amsterdam. Leather, J. & James, A. (Eds.). (1997). New Sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech. Klagenfurt: University of Klagenfurt. Major, R. C. (1986). The ontogeny model: Evidence from L2 acquisition of Spanish r. Language Learning, 36, 453–503. Major, R. C. (2001). Foreign Accent: The ontogeny and phylogeny of second language phonology. Mahwah, NJ: Lawrence Erlbaum Associates Munro, M. & Derwing, T. (1995). Processing time, accent and comprehensibility in the perception of native and foreign-accented speech. Language and Speech, 38, 289–306. Munro, M., Flege, J. E., & MacKay, I. (1996). Effects of age of second-language learning on the production of English vowels. Applied Psycholinguistics, 17, 313–334. Ryan, E., Carranza, M., & Moffie, R. (1977). Reactions toward varying degrees of accentedness in the speech of Spanish-English bilinguals. Language and Speech, 20, 267–273. Sato, C. (1984). Phonological processes in second language acquisition: Another look at interlanguage syllable structure. Language Learning, 34, 43–57. Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158. Schmidt, R. (1993). Awareness and second language acquisition. Annual Review of Applied Linguistics, 13, 206–226. Strange, W. (Ed.). (1995). Speech Perception and Linguistic Experience: Issues in cross-language research. Timonium, MD: York Press. Tarone, E. (1980). Some influences on the syllable structure of interlanguage phonology. International Review of Applied Linguistics, 18, 139–152. Weinberger, S. H. (1987). The influence of linguistic context on syllable simplification. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 401–417). Cambridge, MA: Newbury House.
Segmental-level studies: Vowels
Adult phonetic learning of a second language vowel system* Barbara O. Baptista Universidade Federal de Santa Catarina, Brazil
A longitudinal study of the acquisition of English vowels by Brazilian Portuguese speakers provides evidence for the claim of Flege’s (1995) speech learning model (SLM) that phonetic learning is possible in adulthood, and further that the results of this learning are apparent even in the early stages and in the phonetic approximation of “similar” vowels. In addition to the expected link between certain L1 and interlanguage (IL) vowels, the study provides evidence for a link among the IL vowels themselves, which appear to undergo changes in relation to one another. The results suggest that any description of the phonetic representation of newly-formed IL categories should include their location within the phonetic space relative to the location of other L2 categories.
.
Introduction
Flege’s speech learning model (SLM) (Flege 1995) claims that the adult maintains the phonetic learning mechanisms and processes that were used for first language (L1) acquisition, including the ability to form new phonetic categories, and that these abilities can be applied to the learning of the sounds of a second language at any age. The problem is that the process of equivalence classification, so necessary in L1 acquisition to allow the child to identify a variety of acoustic realizations as belonging to the same phonetic category, causes similar sounds of the second language (L2) to be identified with those of the L1. This identification can impede the formation of phonetic categories for similar sounds of the L2. Although the SLM makes reference to the L1 sound system, the L2 phonetic categories referred to in the four postulates and seven hypotheses are apparently only those of individual L2 phones. Several of the studies cited as examples of * This paper is a revised version of Baptista (2002).
Barbara O. Baptista
L2 consonant acquisition, however, make it clear that phonetic categories for consonant phones are actually sub-categories of larger categories which would correspond to classes of consonants. Flege (1987), for example, in a study of English/French and French/English bilinguals, investigated the influence of the L2 voice onset time (VOT) values on the production of L1 word-initial /p, t, k/. Similarly, Flege (1991) refers to the compromise values of the VOT of English /p, t, k/ produced by native Spanish-speaking early learners of English. Both of these are clear references to the class of initial voiceless stops. Flege, Munro and MacKay (1995) examined the effect of age of learning of native Italian speakers on the VOT of their word-initial English /p, t, k/ and on the production and identification of word-final English /p, t, k/ and /b, d, g/ in terms of duration of preceding vowel, stop closure, and closure voicing. This is also a reference to three classes of consonants, whose individual L1 or L2 phonetic categories can be assumed to be stored in long-term memory (LTM) representations within three larger categories. L2 vowels, however, in the studies cited in Flege (1995), are generally studied as individual phones, the phonetic categories of which appear to be formed in isolation from other L2 vowels. Flege (1995: 242) cites Liljencrants and Lindblom (1972) and Lindblom (1986) in pointing out the importance of vowel dispersion for maintaining sufficient auditory contrast. However, whereas Lindblom was referring to contrast within a language, the SLM relates dispersion to the “deflection” (presumably in the sense of turning away) of an L2 vowel from an L1 vowel (Flege 1995: 239), and not to the positioning of L2 vowels in relation to one another. Reference is made to neighboring L2 vowels in the SLM and in the studies cited by Flege (1995) only to point out that a distinction is or is not being made between them or that one is produced more accurately than another. If vowel dispersion is important for maintaining sufficient perceptual contrast, it makes sense that it would be more important to maintain this perceptual contrast within a language (L1 or L2) than between one language and another, since in most communicative situations a listener listens to one language at a time. This seems logical especially considering the evidence provided by Ladefoged and Broadbent (1957), using synthetic stimuli, and by Ladefoged (1989), using natural speech, that listeners perceive each vowel in relation to the speaker’s total acoustic vowel space, which they calibrate from the formant frequency patterns in the rest of the ongoing speech. Of course, listeners can perceive vowels in relation to other vowels only if speakers produce them accurately relative to one another. Thus, it can be argued that in order to learn to produce reasonably accurate L2 vowels, the learner must not only form new L2 phonetic categories, but these phonetic categories need to be linked in some fashion in the LTM, so that the representation for each vowel can include its position relative to the other vowels of the L2 system. If this is so, L2 research which treats vowels in isolation may be neglecting some very important information.
Adult phonetic learning
The treatment of vowels in isolation is the case regarding not only Flege’s SLM, but in L2 vowel studies in general. Even recent L2 vowel studies, whether production (e.g., Piske, Flege, MacKay, & Meador 2002), or perception (e.g., Escudero this volume; Morrison 2002; Flege & MacKay 2004) examine individual vowels or pairs of vowels, rarely making any further link among vowels in the system. Piske et al., as well as Flege and MacKay, investigated all and most, respectively, of the vowels in the English vowel system, but the former examined native-speaker goodness ratings of individual English vowels as produced by Italian L1 speakers, and the latter compared the perceptual distinction between individual vowel pairs, with no attempt to relate one vowel or pair of vowels to another. Even in studies where a relationship between L2 vowels is suggested, the relationship is rarely made clear. Major (1987), in a study of English /7/ and /æ/ production by Brazilian Portuguese native speakers, found that the better the global foreign accent ratings of his speakers, the better also were the acceptability judgments of their /æ/ productions, but the worse were the acceptability judgments of their /7/ productions. After considering two other possible explanations, Major apparently prefers to explain these unexpected results by reference to an interaction between the two vowels, where the improvement of one would cause the deterioration of the other. A more likely explanation (one of the two considered, but discarded) is that there was no distinction made by any of the learners, but the /7-æ/ productions of the learners with a better accent were lower because of a greater influence of the target /æ/. Bohn (1992) offers an alternate interpretation of the findings of Major (1987), Bohn and Flege (1992), and Flege (1987), all as the result of deflection. However, rather than deflection away from an L1 category as suggested by Flege, Bohn talks about deflection of a similar vowel toward a new L2 vowel for which a phonetic category has not yet been established. It appears that what Bohn is referring to is actually an attraction (the act of drawing near) rather than a deflection, and amounts to simply a lack of distinction between two target vowels, where the resulting single vowel is influenced by the input from both. The best way to obtain a more satisfactory picture of intralingual influences among the L2 vowels is to investigate larger portions of the learners’ L2 vowel systems. This has rarely been done in interphonology research up until now. However, Morais (1995), in a study of the acquisition of French vowels by Brazilian students, states that when they learn to produce the French vowel /y/ higher, more fronted and labialized, they also produce a more fronted /i/ and an /u/ farther back, thus adjusting the perceptual distance between the three high vowels. This paper is a reanalysis of the data of Baptista (1992, 2000) in terms of Flege’s SLM (1995), with the purpose of investigating to what extent L2 phonetic vowel categories can be considered to be linked to one another in LTM. The analysis in the original study was based heavily on information-processing theory, with only brief
Barbara O. Baptista
mention of Flege’s phonetic application of equivalence classification. The development of Flege’s theories into a full model has since allowed for a more explanatory analysis of the data, as well as suggestions for an expansion of the model.
. Method . Participants Participants were 11 recently arrived (one week to six months) Brazilian-Portuguese speakers (5 men and 6 women), from four different states of Brazil, who were residing temporarily in Los Angeles, California. All participants had completed high school and some had done a few semesters of university study. They had had varying amounts of English instruction in Brazil, but none, at the time of the first recording, was able to utter complete sentences in English without considerable hesitation, frequent pauses and backtracking. Most claimed to socialize mainly with other Brazilians and used English only at work – minimally, as their jobs generally required little communication – and only 4 were taking English classes. Table 1 lists the participants identified by initials, followed by the number of monthly recordings made by each, their age, sex, length of residence in the US at the time of the first recording, their Brazilian state of origin (CE = Ceará, RJ = Rio de Janeiro, RS = Rio Grande do Sul, SP = São Paulo), their initial communicative competence level, and whether they were using their English socially, at work, or in English classes. Table 1. Background characteristics of participants Partic.
No. Rec.
Age
Sex
Time in US
Br. State
Comp
Eng. use
FR MR NR NT SN CL CR DN MN TR VN
6 6 6 6 8 8 6 4 6 6 8
31 34 28 35 26 21 18 25 25 30 41
M M M M M F F F F F F
2 mo. 3 mo. 2 wk. 4 mo. 5 mo. 1 mo. 1 wk. 1 mo. 6 mo. 2 mo. 3 mo.
RS RJ RS RJ RS SP RS SP RS RS CE
2 3 1 2 1 3 3 3 2 0 3
work work work class work/class work/class work/soc work work class work/soc
Adult phonetic learning
. Procedure Participants were recorded individually in their homes once a month, over a period ranging from four to eight months. Each session began with a warm-up involving the reading and retelling of a story in English, which used many of the test words. The participants’ communicative competence level was evaluated subjectively according to the level at which they were able to retell the story in the first session: (0) not at all, (1) with extreme difficulty, (2) at a very basic level, and (3) at a low-intermediate level. After this, they each read 42 monosyllabic English words containing the seven vowels /i/, /I/, /eI/, /7/, /æ/, /"/, and /%/ (six for each), plus thirteen distractors, all contextualized in the same carrier sentence Say X now (Appendix A), in a different randomized order for each participant. In the first session, the English warm-up was preceded by a recording of 24 two and threesyllable Portuguese words containing the vowels /i/, /e/, /ei/, /7/, /a/, and /f/ plus three distractors, contextualized in a similar manner in the carrier sentence Fala X de novo (‘Say X again,’ Appendix B), in one randomized order for all. In both the English and Portuguese sentence reading, participants were asked to maintain a constant statement intonation, and occasionally asked to repeat when they stumbled or fell into a list-reading intonation. For both the English and Portuguese words, the first and second formants of each target vowel were extracted through analysis of spectrograms and power spectra, taking a 25-ms steady-state or near steady-state portion near the center or before any obvious diphthongal movement (especially of /eI/, to avoid measuring the off-glide) of the second formant. Ellipses of two standard deviations on each axis were plotted for each vowel for each month, using an inverted mel scale,1 in order to approximate a typical articulatory vowel plot. As six tokens per vowel per month is too few to guarantee that standard deviations are representative, wherever the ellipse was very large or the data were obviously skewed, the vowel tokens are also shown.
. Results . Construction of the early IL vowel system This study provided only production data regarding the English vowels of the eleven participants, but in the following analysis these production data are as. A “psychologically realistic scale of pitch” developed from experiments carried out in the 1930s by Stevens and Volkmann “to investigate the relationship between frequency and the psychological quantity of pitch” (Hayward 2000).
Barbara O. Baptista
sumed to reflect to a certain degree their mental representations (at least those used to construct realization plans), justified by the seventh hypothesis of the SLM (Flege 1995: 239): “The production of a sound eventually corresponds to the properties represented in its phonetic category representation.” To avoid confusion between the L2 assumed to be the target and the L2 as perceived and produced by the participants, the terms used in this analysis will be native language (NL), target language (TL) and interlanguage (IL). Based on Figures 1 and 2, where the vowels are plotted for males and females respectively according to average formant frequencies, a comparison can be made of the Californian English vowels assumed to be the TL vowels, from Ohnishi (1991), the eleven participants’ Portuguese NL vowels, and their English IL vowels. The following three observations show that, even in the early stages of acquisition, these learners were not simply using their NL vowels for production in the TL. While there is obvious influence of the NL, there is also evidence for an awareness (at some level) on the part of the learners that the TL vowel system is different. First, Figure 1 shows that the male TL /i/ and /eI/ are somewhat more fronted than the corresponding NL vowels, while the TL /7/ is quite similar to the NL vowel. In Figure 2, the female TL /eI/ and /7/ can be seen to be forward of the NL front vowel line, while /i/ is not noticeably different. In both the male and female plots, however, whereas the individual IL vowels /i-I/ (not distinguished), /eI/, and /7-æ/ (also not distinguished) maintain the approximate relative positions of the NL vowels /i/, /e/-/ei/ (the first element of the diphthong /ei/ is not noticeably different from the nucleus of /e/), and /7/, the entire IL front vowel line is more fronted than the NL vowel line. Thus, while the learners may well be using the NL vowel categories as a point of departure for what they perceive to be similar vowels, there is evidence, very early on, for what appears to be the result of a systemic phonetic approximation of the learners’ perception of the TL front vowel line. It could be said that they are using a different articulatory setting 2 for the IL vowels, which would not be possible without some kind of holistic mental representation, that is, a superordinate phonetic category (however inaccurate) in the LTM for this new vowel system. Second, in both Figure 1 and Figure 2 it can be seen that the IL system (from what can be observed from the selection of vowels included in this study) is quadrilateral like that of the TL, rather than triangular like the NL system. The IL /"/ in Figures 1 and 2 was clearly not modeled on the NL /a/ as might be expected. . Defined by Thornbury (1993: 127) as “features of accent that result from the characteristic disposition and use of the articulatory organs by speakers of a particular language, and which affect the production of all the individual sounds common to that language”, and by Gick, Wilson, Koch and Cook (2004: 220) as “underlying or default articulator positions” linked to “speech rest position”.
Adult phonetic learning
Figure 1. Plot of male means: Portuguese NL, English IL, CA English TL (Ohnishi 1991)
Figure 2. Plot of female means: Portuguese NL, English IL, CA English TL (Ohnishi 1991)
Whereas the male /"/ appears to have been modeled on the NL /f/, it is not clear whether the female /"/ was modeled on a NL vowel or perceived as a new vowel. At any rate, both male and female learners have apparently perceived that there can be no low central vowel in a quadrilateral IL system. It is as though they have modeled the lower vowel line of their quadrilateral IL system on what would be the /7/-/f/ line of the NL system without the /a/. The third observation concerns the placement of the IL /%/ within the IL system. Both figures demonstrate that this vowel was apparently perceived to be a new vowel, as it was not produced close to any NL vowel. In fact, in terms of absolute
Barbara O. Baptista
position, it is not close to any TL vowel either. However, if we look at the relative position of this vowel within the IL vowel space, it is almost identical to that of the TL /%/ within the TL vowel space – central and at a distance above the IL lower vowel line similar to that of the TL /%/ above the TL lower vowel line. Since the IL lower vowel line is higher than that of the TL, so is the position of the IL /%/. From these three observations about the early IL vowel system, we might make the following modifications to the SLM (Flege 1995: 239): (a) Postulate 4 states: “Bilinguals strive to maintain contrast between L1 and L2 phonetic categories [my emphasis], which exist in a common phonological space.” This postulate might be rewritten as follows: Learners, as well as bilinguals, strive to maintain contrast between L1 and L2 phonetic categories and between the L1 and L2 phonetic systems, which they may perceive from early on as occupying overlapping but different portions of a common phonetic/phonological space. (b) Hypothesis 1 states: “Sounds in the L1 and L2 are related perceptually to one another at a position-sensitive allophonic level, rather than at a more abstract phonemic level.” To this hypothesis it might be added that sounds (in particular, vowels) in the L1 and L2 are related perceptually at a systemic level and that the vowels of the L2 are related perceptually also to one another. . Evolution of the IL vowel system The comparison of the first-month IL vowel productions of the eleven participants led to the suggestion that the IL vowel system is constructed already in approximation to the learners’ perception of the TL system, and that the vowels of the TL, and thus also those of the IL, are related perceptually, not only to vowels of the NL, but also to one another. An analysis of the evolution of the vowels of the individual participants demonstrates further the relationship among the vowels of the IL system. As pointed out in the previous section, there was no distinction between the vowel pairs /i/-/I/ or /7/-/æ/ in any of the eleven participants’ IL systems in the firstmonth recordings. Unfortunately, their success in the acquisition of the new vowel distinctions was limited; however, the following revealing tendencies were noted. Nine of the eleven participants failed to acquire the /i/-/I/ distinction during their four to eight months of participation. In the vowel systems of these nine, the emergence of an /I/ appeared to have been literally blocked by the proximity of the inappropriately high IL /eI/ (modeled after the Portuguese /e/) to the IL /I/, as schematized in Figure 3. If the IL vowels are located in the acoustic vowel space relative to one another, as suggested above, the emergence of the new vowel would be impossible in these nine systems. An emerging IL /I/ must maintain sufficient perceptual distance from IL /i/ (Lindblom 1986), which, taking into account a certain variance of each vowel, would put it at approximately the same height as
Adult phonetic learning
Figure 3. Acquisition of IL /I/ (schematized plot)
Figure 4. Plot of MR’s English IL vowels /i/, /I/, /eI/ in month 3
the first element of the still NL-influenced /eI/. Since the diphthong /eI/ should rise toward the height of the /I/, it obviously cannot start already at that position. The plot of MR’s third-month high front vowels (Figure 4) typifies this scenario. Consistent with this analysis, the only two participants to successfully acquire the /I/ during the study – CL and CR – lowered their IL /eI/ at approximately the same time as they gradually lowered and separated /I/ from /i/, as schematized in Figure 5. Figures 6, 7, 8, and 9 show the evolution of CL’s high front vowels in months 4, 6, 7, and 8. Month 4 typifies the schema of Figure 5 – with /eI/ too high to separate /i/ and /I/; in month 6 we see a lowering of /eI/; in month 7 /I/ also
Barbara O. Baptista
Figure 5. Acquisition of IL /I/ (schematized plot)
Figure 6. Plot of CL’s English IL vowels /i/, /I/, /eI/ in month 4
lowers, allowed by the lowering of /eI/; and in month 8 four of the six /eI/ tokens move back to a reasonably native-like position, although with too much variation. Figures 10, 11, and 12 show a similar evolution of CR’s vowels in months 1, 2, and 3, but in a slightly different order. In month 1 CR already produced two of the six /I/ tokens lower not only than /i/, but than /eI/ as well; in month 2 we have only two /I/ tokens overlapping with three /i/ tokens, and three /I/ tokens produced appropriately farther back than /eI/; the month 3 plot shows /eI/ being “forced” down by the new position of /I/ so that appropriate relative positions can be obtained. Thus, the intersystemic influence of the NL /e/-/ei/ was gradually replaced by an intrasystemic need to leave sufficient acoustic space for the new IL vowel. Put in
Adult phonetic learning
Figure 7. Plot of CL’s English IL vowels /i/, /I/, /eI/ in month 63
Figure 8. Plot of CL’s English IL vowels /i/, /I/, /eI/ in month 7
terms of the SLM, the similar vowel /eI/ had to undergo adjustments relative to the IL system (or at least the front vowel line) – the superordinate phonetic category – in order for the learners to form a separate phonetic category for /I/, which was . In Figures 7 to 23 where it was considered important to show the distribution of the individual tokens, they were plotted as the appropriate IPA symbols, sometimes overlapping.
Barbara O. Baptista
Figure 9. Plot of CL’s English IL vowels /i/, /I/, /eI/ in month 8
Figure 10. Plot of CR’s English IL vowels /i/, /I/, /eI/ in month 1
perceived by all learners as similar in the beginning, but later on as new by these two participants. Although changes in the /7/-/æ/ pair were in general less noteworthy, they also demonstrate the importance of relative position within the IL system. It was noted above that the first-month IL vowel systems were quadrilateral like the TL systems. They were, however, much smaller than the TL systems. In particular, they did not extend as far into the lower vowel space, the three vowels /"/, /%/, and /f/ all being positioned rather high, as in the schematized plot of Figure 13. The independent
Adult phonetic learning
Figure 11. Plot of CR’s English IL vowels /i/, /I/, /eI/ in month 2
Figure 12. Plot of CR’s English IL vowels /i/, /I/, /eI/ in month 3
acquisition of /æ/ would change the overall shape of the vowel system, lowering the front portion in relation to the back, as in Figure 14. This did not occur for any of the participants. One participant whose IL /eI/ was especially high even in month 4 – NR (Figure 15) – in month 5 tried inappropriately raising his IL /7/ to distinguish it from /æ/ while lowering /%/ and /"/ (Figure 16), but returned to the month 5 distribution in the following month. Another participant – CL – in month 4 was still producing her three lower vowels too high and making no distinction between /7/ and /æ/ (Figure 17). In
Barbara O. Baptista
Figure 13. Insufficient lower vowel space for IL /æ/ (schematized plot)
Figure 14. Not found: Lowering of /æ/ without /"/ and /%/ (schematized plot)
months 7 and 8 (Figures 18 and 19), however, she extended more or less symmetrically the entire lower portion of the vowel system, including /7/, by increasing the variance of /7/, /æ/ and /%/ and increasing then decreasing the variance of the /"/, as though in preparation for a yet-to-occur /7/-/æ/ distinction. Since month 8 was her last month of participation in the research, it cannot be said for certain whether later she separated these two vowels, but she certainly improved the size and shape of the system as a whole. Finally, SN’s lower vowels underwent an evolution similar to that of CL (Figures 20 to 23), but by the end of the study he had managed an almost complete separation of /æ/ from /7/. In the first month his vowel system was even more reduced than that of CL, and he produced no valid /7/ tokens (unaccustomed to the English orthographic system, he systematically produced /7/ as /i/, e.g., bet as beet).
Adult phonetic learning
Figure 15. Plot of NR’s IL vowels /7/, /æ/, /%/, /"/ in month 4
Figure 16. Plot of NR’s IL vowels /7/, /æ/, /%/, /"/ in month 5
By month 3 he was becoming familiar with English spelling (producing valid /7/ tokens) and began increasing the variance of /%/ and /"/. In month 4 there was a parallel lowering of /7/, /æ/, /%/, and /"/, and by month 8 three of his /æ/ tokens were produced much lower than any of the tokens for /7/. Consistent with the interpretation of the adjustments made by CL, the parallel lowering of all lower vowels seemed to be necessary in order for SN to begin to separate /æ/ from /7/. My interpretation of the evolution of the /7/ and /æ/ is similar to, if somewhat more complicated than that of the /i/-/I/ pair. The intersystemic influence of the NL was responsible not only for the original positioning of IL /7-æ/, but
Barbara O. Baptista
Figure 17. Plot of CL’s IL vowels /7/, /æ/, /%/, /"/ in month 4
Figure 18. Plot of CL’s IL vowels /7/, /æ/, /%/, /"/ in month 7
also for the initial reluctance or inability to enter the anterior and posterior lower vowel space (Portuguese has only a central vowel this low). Some intrasystemic influence was already present in the first-month recordings to give the IL system its TL-like quadrilateral shape. However, the intrasystemic influence (strength of connections within the IL vowel schema) had to become stronger in relation to the intersystemic influence (strength of connections between the two schemata) to overcome the avoidance of the unfamiliar vowel space. Whereas both /I/ and /æ/ were probably perceived as similar vowels by all participants in the early stages of acquisition, two participants apparently managed,
Adult phonetic learning
Figure 19. Plot of CL’s IL vowels /7/, /æ/, /%/, /"/ in month 8
Figure 20. Plot of SN’s IL vowels /7/, /æ/, /%/, /"/ in month 1
within the period of the study, to perceive /I/ as a new vowel and create a new phonetic category for it (and thus produce it differently), whereas only one participant managed to do the same with /æ/. In order to make these distinctions, it was necessary in all three cases to make other adjustments in the relevant region of the acoustic vowel space. The intervening similar vowel /eI/ had to be lowered to make room for the /I/, and the lower IL acoustic vowel space had to be re-dimensioned to allow for the /æ/. All of these adjustments could only be made within an IL vowel system in which vowels are positioned relative to one another.
Barbara O. Baptista
Figure 21. Plot of SN’s IL vowels /7/, /æ/, /%/, /"/ in month 3
Figure 22. Plot of SN’s IL vowels /7/, /æ/, /%/, /"/ in month 4
. Discussion It is tempting, based on the above interpretation of the data, to say that the more radical repositioning required for the acquisition of /æ/ should make this vowel more difficult to acquire. This might be supported by a claim that /7/ and /æ/ are more similar than /i/ and /I/ – they are somewhat closer for both males and females, according to Ohnishi’s data (Figures 1 and 2), and they are usually closer in duration as well. Unfortunately, the data (one successful participant, compared
Adult phonetic learning
Figure 23. Plot of SN’s IL vowels /7/, /æ/, /%/, /"/ in month 8
to two) are insufficient to support either of these claims. It can only be said that the two learning tasks are similar, but differ in important respects. The only TL vowel that was apparently perceived as new by the participants from the beginning – the central /%/ – should, according to the SLM, eventually be produced more accurately than the others. Since none of the participants reached an advanced level during the period of the study, this prediction cannot be confirmed or disconfirmed. However, what the data do show is that the only three participants who began to lower this vowel in approximation of the target position were the same ones who simultaneously lowered their productions of the neighboring vowels /æ/ and /"/. The tentative conclusion to be drawn here is that whereas the degree of similarity of the target vowel may be important in determining the likelihood of a phonetic category being constructed, the accuracy of the phonetic category may depend on relationships among neighboring IL vowels and the learner’s perception of the limits of the TL vowel space. In fact, with the exception of the vowels that are perceived as new from the beginning (only the /%/ in this study), it may not be such a simple matter to say whether (or when) a separate phonetic category has been formed for an L2 vowel. The gradual separation of one IL vowel from another can be thought of as depending on a gradual increase in the weight of connections within the IL vowel schema and a gradual decrease in the weight of the connections between the NL and IL vowel schemata. This is because the accurate perception and production of a vowel, whether it be a NL or TL vowel, depends on an accurate representation of the entire acoustic vowel space for that language, or at least of that portion of the vowel space in which the vowel in question is located.
Barbara O. Baptista
. Conclusions The following sequence for L2 vowel acquisition can be proposed, based on the data from this study: (a) an IL vowel schema is constructed, very early on, based on the individual vowels of the NL system and the learner’s holistic perception of the overall differences in acoustic vowel space occupied by the two languages; (b) IL phonetic vowel categories are formed for any TL vowel for which no obvious link can be made with a single NL vowel (as occurred with /%/); (c) the IL system is gradually adjusted – with phonetic approximation toward the target vowels – as the links between its vowels become stronger than the cross-system links, which allows for the formation of additional vowel categories, as IL vowels begin to adjust in relation to one another. The first two parts of the above proposal – regarding the construction of the initial vowel schema – are based on the initial recordings of only eleven participants. The third part – concerning the evolution of this vowel schema – is based on only four, as the other seven showed very little progress during the period of study. This is a very limited quantity of data on which to base a proposal. However, I believe the proposal is coherent and intuitively sound enough to warrant further research in the direction of examining IL vowel systems as integrated systems, that is, possible inter-relationships among neighboring IL vowels. Furthermore, since this study investigated only the production of IL vowels, the discussion so far might be referring only to representations valid for use in motor programs, as it is still unknown if the same representations are used for production and perception or whether there are separate representations for each. Most methods used so far for the investigation of the perception of L2 vowels (e.g., two-alternative – or more – forced-choice labeling tasks, same-as or odd-item-out discrimination tasks, perceptual weighting tasks with synthesized stimuli, etc.) do not lend themselves to the investigation of the relationship among the IL vowels. Bradlow (1996), however, carried out a cross-language investigation on the perception of vowels of English and Spanish by native speakers of each, and of each language by native speakers of the other, and the effect of the presence or absence of a neighboring vowel on the perceptual boundaries of corresponding vowels in the two languages. The goodness-rating method used in Bradlow’s study could be used in the investigation of the perception of L2 vowels as a system, and the resulting perceptual boundaries could indicate relationships among the assumed perceptual L2 phonetic categories.
Adult phonetic learning
Acknowledgment The original research on which this paper is based was funded by a grant from CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), of the Brazilian Ministry of Education.
References Baptista, B. O. (1992). The Acquisition of English Vowels by Eleven Brazilian Portuguese Speakers: An acoustic analysis. PhD Dissertation, University of California, Los Angeles. Baptista, B. O. (2000). The Acquisition of English Vowels by Brazilian Portuguese Speakers [Advanced Research in English Series (ARES)]. Florianópolis, Brazil: Pós-Graduação em Inglês, Universidade Federal de Santa Catarina. Baptista, B. O. (2002). Adult phonetic learning of a second language vowel system. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second Language Speech (pp. 32–41). Klagenfurt: University of Klagenfurt. Bohn, O.-S. (1992). Influence of new vowels on the production of similar vowels. In J. Leather & A. James (Eds.), New Sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-language Speech (pp. 29–46). Amsterdam: University of Amsterdam. Bohn, O.-S. & Flege, J. E. (1992). The production of new and similar vowels by adult German learners of English. Studies in Second Language Acquisition, 14, 131–158. Bradlow, A. R. (1996). A perceptual comparison of the /i/-/e/ and /u/-/o/ contrasts in English and in Spanish: Universal and language-specific aspects. Phonetica, 53, 55–85. Escudero, P. (2006). The phonological and phonetic development of new vowel contrasts in Spanish learners of English. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English Interphonology. Amsterdam: John Benjamins. Flege, J. E. (1987). The production of “new” and “similar” phones in a foreign language: Evidence for the effect of equivalence classification. Journal of Phonetics, 15, 47–65. Flege, J. E. (1991). Age of learning affects the authenticity of voice-onset time (VOT) in stop consonants produced in a 2nd language. Journal of the Acoustical Society of America, 89, 395–411. Flege, J. E. (1995). Second language speech learning theory, findings and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 233– 277). Timonium, MD: York Press. Flege, J. E. & MacKay, I. R. A. (2004). Perceiving vowels in a second language. Studies in Second Language Acquisition, 26, 1–34. Flege, J. E., Munro, M. J., & MacKay, I. R. A. (1995). Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 3125– 3133. Gick, B., Wilson, I., Koch, K., & Cook, C. (2004). Language-specific articulatory settings: Evidence from inter-utterance rest position. Phonetica, 61, 220–233. Hayward, K. (2000). Experimental Phonetics. London: Longman. Ladefoged, P. (1989). A note on ‘Information conveyed by vowels’. Journal of the Acoustical Society of America, 85, 2223–2224.
Barbara O. Baptista
Ladefoged, P. & Broadbent, D. (1957). Information conveyed by vowels. Journal of the Acoustical Society of America, 29, 98–104. Liljencrants, J. & Lindblom, B. (1972). Numerical simulation of vowel quality systems: The role of perceptual contrast. Language, 48, 839–862. Lindblom, B. (1986). Phonetic universals in vowel systems. In J. J. Ohala & J. Jaeger (Eds.), Experimental Phonology (pp. 13–44). Orlando, FL: Academic Press. Major, R. C. (1987). Phonological similarity, markedness, and rate of L2 acquisition. Studies in Second Language Acquisition, 9, 63–82. Morais, C. A. (1995). Labialização das Vogais Orais do Sistema Vocálico Francês por Alunos Brasileiros: Caso Particular /y/, Estudo Acústico. MA thesis, Universidade Federal de Santa Catarina, Florianópolis, Brazil. Morrison, G. S. (2002). Perception of English /i/ and /I/ by Japanese and Spanish listeners: Longitudinal results. In G. S. Morrison & L. Zsoldos (Eds.), Proceedings of the North West Linguistics Conference 2002 (pp. 29–48). Burnaby, BC, Canada: Simon Fraser University, Linguistics Graduate Student Association. Ohnishi, M. (1991). A Spectrographic investigation of the vowels of Californian English (Southwest General American). Paper presented at the Convention of the Phonetic Society of Japan, Osaka, Japan. Piske, T., Flege, J. E., MacKay, I. R. A., & Meador, D. (2002). The production of English vowels by fluent early and late Italian-English bilinguals. Phonetica, 59, 49–71. Thornbury, S. (1993). Having a good jaw: Voice-setting phonology. ELT Journal, 47, 126–131.
Appendix A: English Corpus Forty-two test words plus thirteen distractor words, all contextualized in the carrier sentence: Say _____ now. Test Words meet neat beet feet seat heat
Distractor Words
mitt knit bit fit sit hit
mate bait fate gate Kate hate
met net bet pet set get
mat gnat bat pat sat cat
not pot shot cot got hot
nut but shut cut gut hut
fish head home nose plan
please reel smile straw
thing ten they wore
Appendix B: Portuguese Corpus Thirty test words plus three distractor words, all contextualized in the carrier sentence: Fala ______ de novo. Test Words cita Rita fita brita
Distractors preta caneta chupeta mêta
respeita aceita deita enfeita
reta meta neta seta
nata mata rata pata
bota vota cota rota
libra cinta pausa
The phonological and phonetic development of new vowel contrasts in Spanish learners of English Paola Escudero University of Amsterdam, The Netherlands
This paper reports the findings of an experimental study that investigated the perception of Standard Scottish English (SSE) /i/-/I/ by Spanish speakers. Many of the learners of this study adequately identified and discriminated the new contrast, but did so by using vowel duration, an auditory dimension that serves as a secondary cue in native SSE vowel perception. In addition, the individual L2 results patterned in a way that suggested a stage-like development in the phonological and phonetic acquisition of the new contrast. These results demonstrate that L2 learners can learn to perceive new L2 vowels phonologically in a native-like fashion and that, in same cases, they can also learn to adjust to the fine phonetic differences of such new vowels.
.
Introduction
Part of the phonology of a language consists of sound distinctions that speakers perceive and produce. These sound distinctions are signaled by a number of subproperties that integrate to constitute phonological contrasts, and native listeners have a particular perceptual weighting of such sub-properties or acoustic cues, some being primary and others secondary (Nittrouer 2000; Scobbie 1998). For instance, native listeners of Standard Scottish English (SSE) perceive the /i/-/I/ vowel contrast by means of both spectral and durational cues, but spectral information is primary (Escudero 2001). A number of studies have been devoted to the perception of non-native sounds (see Strange 1995) and how it may differ from the perception of native sounds. The two most influential models of non-native perception explain the distinction between the two on the basis of previous linguistic experience. The speech learning model or SLM (Flege 1995) assumes that L2 speech is parsed through the already existing L1 phonetic categories, which are represented in long-term memory. The model predicts that new sounds in the target language (TL) can potentially be
Paola Escudero
learned through detection of the new sub-segmental (i.e., phonetic) properties that signal the new sounds, which allows the learner to create new phonetic categories. However, there is no guarantee that learners will be successful in doing so. The perceptual assimilation model or PAM (Best 1995) assumes that new sounds will be assimilated to the categories that already exist in the speaker’s L1, proposes the degrees and types of assimilation of TL sounds to the L1 perceptual categories (e.g., they may be assimilated to a single L1 category or to two categories), and then predicts the effects of such assimilation patterns on the ability to discriminate contrastive TL sounds. This model claims that since non-native speakers have already tuned their linguistic perceptual device to particular features, they would have difficulties in detecting TL features that are not found in their L1. However, the model does not address the question of L2 development itself. From the theories mentioned above, it can be inferred that native-like L2 perception is possible provided L2 speakers detect the phonetic information or the features in the TL sounds and construct new categories based on them. The models do not mention that L2 speakers should also detect the relative importance of the phonetic information present in the TL sound categories, nor do they explain what the developmental process for the acquisition of new sounds or contrasts is. It is proposed here that it is necessary to consider not only the development of L2 categories but also the developmental acquisition of the appropriate mapping of the speech signal onto those categories, which occurs via the acquisition of an appropriate perceptual weighting of the phonetic information. The present paper reports the findings of an experimental study which investigated adult L2 perception of new vowel contrasts, as well as the way in which such perception may develop. The study tested the perception of the SSE tense-lax /i/-/I/ vowel contrast by L1 and Spanish L2 subjects. Studies have shown that the relevant acoustic information for the perception of tense-lax vowel contrasts such as /i/-/I/ falls into two basic types, namely quality and quantity differences. In terms of quality, both the steady-state formant frequency of the vowel and the transition of those formants from the vowel onset to its offset can be considered. Quantity cues are the vowel’s intrinsic duration as well as its duration depending on the linguistic context in which it is produced (Nearey 1989). In SSE /i/ is produced as a monophthong, and therefore formant transition will not play a major role in signaling the distinction between SSE /i/ and /I/. Thus, for this L2 study, the focus was on the relative role of spectral steady-state cues and intrinsic vowel duration cues. The listeners were presented with two vowel continua with stimuli that varied in only one cue, while the other cue was kept constant at an ambiguous value. Preliminary findings show that SSE speakers use both cues (spectrum and duration) for the /i/-/I/ contrast (Escudero 2000), and that they use spectral cues as the primary information when identifying /i/ and /I/.
The phonological and phonetic development
A number of other studies have also investigated the perception of the /i/-/I/ contrast by Spanish learners of English. For instance, Fox, Flege and Munro (1995) concluded that Spanish speakers of English do not have access to durational cues when perceiving vowel sounds, and suggest that this is the reason why they have difficulty in perceiving and producing English tense/lax vowel distinctions such as /i/ and /I/. Bohn (1995) assumes that assimilation into a single category is the starting point for Spanish speakers’ perception of /i/-/I/ and that discrimination comes later. Contrary to Fox et al. (1995), Bohn’s findings suggest that Spanish speakers rely on durational cues more than native speakers to discriminate the sounds, although his stimuli had fewer steps for durational cues (this dimension had larger steps and less variability), which could have caused the subjects to rely more heavily on this cue. Flege, Bohn and Jang (1997) used the same stimuli as Bohn (1995), but concluded that Spanish speakers treat both cues similarly, and that their reliance on spectral cues is not significantly different from that of native speakers. However, it seems that, unlike the native speakers, the Spanish subjects relied equally on both spectral and durational cues to perceive the /i/-/I/ contrast, which in turn suggests that Bohn’s findings may represent a prior stage in their perception of the /i/-/I/ contrast. Despite the interesting findings, these previous studies reached contradictory conclusions regarding the weighting of acoustic cues by Spanish speakers when identifying /i/-/I/, and it was therefore decided to retest their perception of this contrast. A preliminary study, which used the same identification task that had been used for native speakers of SSE (Escudero 2000), found that the majority of the Spanish subjects used durational cues exclusively or primarily to identify the vowels. However, an analysis of their individual performances revealed three different patterns: subjects who used durational cues exclusively, those who used both cues but used durational cues primarily and much more than L1 speakers did, and those who behaved like native speakers. Despite the revealing findings, the fact that only nine L2 subjects were tested and that the stimuli proved to be problematic casts doubt on the validity of the perceptual data. The nature and development of Spanish speakers’ perception of the contrast therefore remained a puzzle.
. The experiment In this new study, the relation between the phonological and the phonetic perception of SSE /i/-/I/ by native speakers of Spanish was examined. Three specific questions and respective hypotheses were formulated. First, can Spanish speakers learn to identify and discriminate new contrasts? Following the predictions made by the SLM, it was hypothesized that they can learn to identify and discriminate new sounds, as shown by their performance in discrimination and identification
Paola Escudero
tests. Second, do Spanish-speakers use the same cues as native speakers of English to perceive new contrasts? Since previous studies (Bohn 1995; Flege et al. 1997; Escudero 2000) have suggested that Spanish-speakers may use non-native means to perceive new contrasts, it was hypothesized that they would perceive the new contrast by using L1 secondary cues exclusively or primarily, that is, they would perceive the SSE /i/-/I/ distinction by paying attention to the durational differences. Third, can more developmental patterns than the traditional contrast and no-contrast stages (for this view see Brown 1998, among others) be found in the perception of new contrasts? Following Escudero (2000), it was hypothesized that at least two intermediate patterns could be found, namely the use of a secondary cue exclusively, and the use of a secondary cue primarily. . Subjects and materials There were a total of 50 subjects: 20 native speakers of SSE, and 30 subjects whose L1 was Spanish. All subjects confirmed that they had no hearing problems and agreed to participate in the study voluntarily. The control group consisted of 10 female and 10 male SSE speakers between the ages of 23 and 35 who reported that they had lived in Edinburgh for most of their lives. The experimental group consisted of 15 female and 15 male subjects between the ages of 15 and 58, originally from various parts of Spain and South America. Given that the present study, to my knowledge, is the first one to test the perception of the SSE variety in Spanish learners, the subjects were not selected on the basis of their target L2 dialect but simply on the basis of their general English proficiency. All subjects reported having started learning English after the age of 12 in their country of origin, while 24 of the subjects reported that they had been exposed to native English in an English speaking country for more than a month. All learners were undergraduate and postgraduate students and employees, with only six of them living in Edinburgh at the time of the study. Two vowel sounds, based on the naturally produced vowels of two L1 SSE speakers, were synthesized using the Sensyn version of the Klatt parameter synthesizer. The HLSyn version of the same synthesizer had been used in a previous experiment (Escudero 2000) and a comparison of the outcome of the two syntheses suggested that the sounds generated in Sensyn had a much better quality, being much closer to the naturally produced ones. The two synthesized versions of prototypical productions of /i/ and /I/ were used to make seven continua of six steps each (seven elements in each continuum). An auditory scale was used to establish the values for the six steps in all seven continua. That is, the values in Hertz (acoustic scale) were converted to mels (auditory scale) and then the values for seven elements were computed by linear interpolation. For the synthesis, the mels were converted back to Hertz because Sensyn only accepts Hertz values.
The phonological and phonetic development
Figure 1. Stimuli used in the experiment divided in different types of continua
The first continuum, the phonological continuum (Figure 1, left), was called AC and had seven elements (AC1, AC2, AC3, AC4, AC5, AC6 and AC7), with AC1 and AC7 representing the endpoints /I/ and /i/ respectively. This continuum was to be used in a forced identification and a discrimination same/different test to measure perception of the phonological contrast, including placement of the perceptual category boundary. Six other continua were generated in order to measure the subjects’ phonetic development of the contrast, i.e., their perceptual weighting of durational and spectral cues. Three continua (AFB, EG, and DHC in Figure 1, center) were to measure the perceptual weighting of the durational cue. In these continua, the spectral values were kept constant in three parts of the AC continuum, namely AC1, AC4 and AC7, and the duration was manipulated in the six steps. Therefore, if subjects relied on durational cues, they would show a very good performance on these continua, as represented by a clear category boundary. However, it was expected that there would be variability as a function of vowel continua, individual subject performance and language background. The other three continua (AED, FH, and BGC in Figure 1) consisted of stimuli with a constant duration value and manipulation of spectral information. The first continuum had an AC1 durational value, the second an AC4 value and the third an AC7 one. Thus, if subjects relied on spectral cues, their performance on these continua would be very good. However, variation across continua, individual listeners and language groups was also expected. . Procedure The 50 subjects listened to all the stimuli under comfortable hearing conditions and were tested in a soundproof room. The 35-minute long experiment was created in Psyscope, an experiment-designing program (Cohen, MacWhinney, Flatt, & Provost 1993), and consisted of three parts. For all three parts of the experi-
Paola Escudero
ment, the subjects had a visual display in every trial, similar to what they had on the button box, which reminded them of what to press; they were instructed both verbally and in writing; they were told to always give an answer and guess if unsure, because the next trial would not appear if they did not make a decision. The first part was a same or different discrimination test, in which the subjects had to indicate whether they thought the pairs of sounds they heard were the same or different by pressing the corresponding button. They were presented with ten repetitions of ten different pairs of sounds, organized in five blocks and played in random order. The pairs were formed from the stimuli AC1, AC3, AC5 and AC7 from the AC continuum (the phonological continuum). These stimuli were paired as follows, AC1-AC1, AC1-AC3, AC3-AC1, AC3-AC3, AC3-AC5, AC5-AC3, AC5-AC5, AC5-AC7, AC7-AC5, AC7-AC7, following Pisoni (1973). The second and the third parts were forced-choice identification tests. The subjects had to indicate whether the vowel sound they heard was represented by the picture of a ship or that of a sheep by pressing the corresponding button. In the second part, the subjects were presented with ten repetitions of the seven elements of the AC continuum (the phonological continuum) in five blocks, each containing two repetitions of the AC elements played in random order. In the last part, they were presented with ten blocks each containing the thirty-seven vowel sounds (the elements of all seven continua) played in random order. All subjects were asked for the names of the objects in the pictures (sheep and ship) before the identification tests in order to find out if they would produce the expected names of the two objects and pronounce them differently, showing an awareness of the vowel distinction. The experimenter never produced the words for either of the pictures nor did she tell them that the words were different. Some of the L1 subjects identified the ship as a boat or a yacht, in which case the experimenter explained that it was something else until they produced the expected word (ship). All the L1 subjects and 90% of the L2 subjects produced distinct words for the two pictures. Unlike the L1 subjects, the majority of the L2 subjects appeared to produce sheep with a long /i/ and ship with a short one, and referred to the vowel difference as being one of length, calling them “the long vowel” and “the short vowel”. . Results for the /I/-/i/ phonological contrast (AC continuum) An initial analysis showed that four of the L2 subjects had performed in a deviant manner, in that they either changed their choice of button from Part 2 to Part 3, or thought that the “sheep” picture represented /I/ and the “ship” picture /i/ (see discussion section for explanations). In the analysis of the AC (phonological) continuum, only the 26 L2 subjects who consistently showed native-like distribution (shown by all 20 L1 listeners) of /i/ and /I/ were considered, that is, only those who used the sheep button to represent /i/ and the ship button to represent /I/. Figure
The phonological and phonetic development
Figure 2. L1 and L2 identification and discrimination of the AC continuum (/I/-/i/ contrast)
2 shows the identification and discrimination scores, averaged across individual performance and experiment part (Part 2 and Part 3), for all 20 L1 subjects and 26 of the L2 subjects. The discrimination scores showed a peak at the category boundary for both language groups (L1 and L2). For identification, the graphs show that both groups have a steep category boundary at almost the same location along the vowel continuum. Taken together, and in line with traditional categorical perception studies (e.g., Pisoni 1973; Strange, Edman, & Jenkins 1979), the discrimination and the identification data show that both groups of listeners have a categorical perception of the phonological contrast. In line with previous studies, the data also showed that vowel perception is not as categorical as consonant perception, especially for the L2 group. A factor analysis, with language group and stimuli as factors, revealed that the L1 and the L2 performances were significantly different (p < .05): the L2 group’s perception was less categorical (i.e., the boundary was less steep and the steps were more gradual) than that of the L1 group. However, not all the L2 subjects showed different perception from that of the L1 subjects. A hierarchical cluster analysis showed that 21 L2 subjects clustered with the L1 subjects. That is, the node that minimally included the L1 subjects also included 21 of the 26 L2 subjects. The other five L2 subjects were found to be the ones who differed significantly from the L1 group and who were thus considered to be unable to perceive the phonological contrast with native-like accuracy. The 21
Paola Escudero
successful L2 subjects were considered for the cue weighting and reliance analysis that is shown in the next subsection. . Results for the phonetic treatment of the contrast: Cue weighting and reliance The phonetic perception was measured of the 21 L2 subjects who manifested a native-like performance on the phonological continuum. The L1 group performed very well on the spectral continua, as measured by a steep slope at the category boundary, and very poorly on the durational continua. In contrast, the L2 group performed much better on the durational continua than on the spectral continua. A univariate factor analysis between language and stimuli was run for each of the continua. The difference between the language groups turned out to be significant for all continua (p < .05). The category boundary for the contrast, the cue reliance or the individual use of a dimension in order to perceive a phonological contrast, and the cue weighting or the relative phonetic attention paid to acoustic dimensions that signal a phonological contrast, were computed and plotted using PRAAT statistics and graphs. In Figure 3, the vertical percentages represent the use of the spectral cue to perceive the contrast, i.e., spectral reliance, while the horizontal percentages represent reliance on the durational cue. The cue reliance values were computed by subtracting the score of the first stimulus from that of the seventh stimulus (the endpoints of the continuum) along the vertical continua for spectral reliance and along the horizontal continua for durational reliance, following Bohn (1995) and Flege et al. (1997).1 From these reliance values, the relative perceptual weighting of the two acoustic dimensions are computed by dividing each reliance value by the sum of both reliance values, the two resulting values having to add up to 100%. For instance, the reliance values for the control group were 93.4 for spectrum and 10.6 for duration, so that their cue weightings are .899 = 90% and .101 = 10%, respectively. Figure 3 shows the category boundary lines or the lines along which the chances of a token being one or the other category are 50%, and the reliance values (dur.rel and spec.rel); the cue weightings are given in the caption. The black areas represent what was always reported as “sheep” and the white areas what was always reported as “ship”. From the mean values given in Figure 3, it could be inferred that the L1 group used spectral cues much more than durational cues and that, conversely, the L2 group used durational cues more than spectral ones. Consequently, . The endpoints of all possible 14 continua, i.e., the edges of ABCD, were used for the cue reliance computation. For example, the duration reliance was computed as follows: ((AFB7 – AFB1) + (BG2 – AED2) + (BG3 – AED3) + (EG7 – EG1) + (BG5 –AED5) + (BG6 – AED6) + (DHC7 – DHC1)) / 7.
The phonological and phonetic development
Figure 3. Left: L1 perception, dur. weighting 10%; right: L2 perception, dur. weighting 62%
it could be hypothesized that the L2 subjects may have stored a phonetic category represented by a duration or length feature; i.e., /i/ is long and /I/ is short, rather than tense and lax respectively. However, there seemed to be great individual variability in the L2 group’s perception, which suggests that not all the subjects had this length contrast. . Patterns in L2 individual phonetic perception The variability in the L2 group’s individual phonetic perception of the vowel contrast turned out to be systematic. Three well-defined and well-distributed patterns can be observed in Figure 4: the first pattern (shown by seven subjects) was characterized by a very high reliance on duration and a zero or negative reliance for spectrum, the corresponding perceptual weighting being 100% for duration;2 the second pattern (six subjects) showed high durational reliance and weighting and low spectral reliance and weighting, with the spectral weighting being no more than 25%. These two groups may indeed have a length rather than a quality (i.e., tense-lax) representation of the vowels. On the other hand, eight subjects were found to match the L1 group’s perception in that they used spectral information primarily or exclusively. In sum, four clear patterns were found in the development of the new /i/-/I/ contrast by the L2 group. It is proposed here that these patterns may constitute a developmental sequence. In the first stage, Spanish learners of English are not able to identify tokens of /i/ and /I/ with native-like accuracy, that is, a no-contrast pattern constitutes their starting point; the results reported in Section 2.3 indicate that five of the L2 subjects showed this pattern. In the second stage, once learners are able to reliably identify and discriminate the contrast, they may do so by using . Negative spectral reliance values were small enough to suggest that they were due to chance, so spectral reliance can be assumed to have been zero.
Paola Escudero
Figure 4. L2 individual reliance: Pattern 1 (left), pattern 2 (center), and pattern 3 (right)
the durational information in the input exclusively (unlike native speakers). In the third stage, learners are able to make use of both types of information but still give priority to durational cues (unlike native speakers); and in the final stage they use spectral information primarily or exclusively (like native speakers).3 Support for this sequential development hypothesis comes from the apparently contradictory findings of previous studies, in which differences in L2 subjects’ performance can probably be accounted for by their level of English proficiency. In accordance with this sequential development hypothesis, two extra patterns that were not found in this study are also predicted, namely a real “no-contrast” stage in absolute beginners, in which they will perform randomly, as well as a pattern characterized by an equal reliance and weighting of spectral and durational cues. A further hypothesis is that sequential development may vary as a function of dialectal variation in the L1 and the target L2 variety (see Escudero 2001; Escudero & Boersma 2004; Escudero 2005).
. Discussion This section discusses (a) the possibility that the L2 group’s performance was based on non-linguistic auditory perception, (b) possible explanations for the divergent performance of the four L2 subjects who were excluded from further analyses, and (c) the plausibility of the hypothesized developmental sequence in the acquisition of the new vowel contrast.
. This last pattern may apply only to learners of SSE and perhaps not, for instance, to learners of Standard Southern British English, because these two English varieties differ in the phonetic properties with which high front vowels are produced (for details, see Escudero & Boersma 2004; Escudero 2005).
The phonological and phonetic development
. Linguistic or auditory strategies in L2 perception Recall that the data showed that 21 out of 30 L2 subjects had a native-like performance on the diagonal AC continuum. It was thought that this continuum would test perception of the phonological contrast, and not the weighting of the cues, because the two acoustic cues involved were varied at the same time. During measurement of cue weighting strategies, it emerged that 13 of the 21 subjects with native-like performance were, unlike the natives, using durational information exclusively or primarily to perceive the contrast. On the basis of these data, it was suggested that the phonological contrast was stored in the subjects’ phonological space as a short-long distinction, not a tense-lax one. However, there could be a different explanation for the L2 subjects’ performance, namely that they might have used a general auditory, and not phonological or phonetic, strategy, categorizing the vowel sounds as non-speech sounds differing in length only. In addition, it could be argued that there was no clear evidence for the existence of an independent phonetic category for /I/ because the English vowels could have been mapped onto a single Spanish vowel (i.e., /i/). However, there appears to be evidence of phonetic categorization on the basis of phonological length because 26 out of 30 L2 subjects consistently decided that sheep represented the long vowel and ship the short one, despite the fact that either picture had the same probability of being chosen as the long or short vowel. Also if it were the case that English /I/ was not represented as different from /i/, then it would be unlikely that the subjects could produce a difference between the vowel sounds when asked to name the objects in the pictures. It should be recalled that, even though the experimenter never told the subjects that the words were or sounded different, 90% of them produced a clear difference between them. These two pieces of evidence together suggest that the subjects used their linguistic representations of long and short vowels when identifying the vowel sounds. These findings suggest, in line with the L1 acquisition literature (Scobbie 1998; see also Gerrits 2000; Nittrouer 2000), that the Spanish listeners may have a covert contrast when perceiving and producing /i/ and /I/; in other words, they may perceive a contrast by different means than those used in adult L1 perception. . Different performance by four L2 subjects Two of the four subjects with a deviant performance changed their choice of button from the second to the third part of the experiment; i.e., they chose the sheep button for the /i/-like sounds in one of the experiments and the ship button for the same stimuli in the other experiment. Therefore, it cannot be determined whether they could perceive the contrast in a native-like way or whether they used non-linguistic strategies to perceive the contrast, nor can their representa-
Paola Escudero
tion of the vowel sounds be described. The other two subjects consistently used the ship button to represent the higher vowel and the sheep button for the lower one. There are at least three possible explanations for this performance. First, they could have used a non-linguistic auditory strategy (the one mentioned in the previous sub-section, i.e., the duration of the vowels) to make their decisions, but got the pictures the wrong way round, with ship representing a long sound and sheep representing a short one. However, their cue weighting data showed that their decisions were made mostly on the basis of spectral information, and that when they used duration they opted for the correct distribution; i.e., they chose sheep when the sound was long and ship when it was short. A second explanation for their performance could be that they simply got the buttons mixed up, which is unlikely given the number of times they saw the display on the screen. A third, and more convincing, explanation for these last two subjects’ performance is the possible use of an “orthographic strategy” (as suggested in Flege et al. 1997) combined with a two-category assimilation to the Spanish vowel sounds /i/ and /e/. That is, these subjects may have used the spelling of the words represented by the pictures and matched the vowel sounds that they heard to the closest Spanish vowel sound. This is likely because Spanish has a very transparent spelling system in which the letter i, as in the word ship, always represents an /i/ sound and the letter e, as in the word sheep, always represents an /e/, which is why the vowel in sheep may have been considered to be long (a double Spanish /e/ sound). However, it cannot be determined with certainty what these subjects actually did or what their category representations are on the basis of this experiment. Nevertheless, if they were in fact using the spelling of the words to cope with the task, their performance could be seen as evidence for an even stronger effect of literacy on L2 phonology (Young-Scholten 1995). Another interesting feature of this performance is the possible assimilation of Scottish English /I/ to Spanish /e/, which had never been documented in previous studies of Spanish learners of English. . L2 performance patterns and a possible stage-like development Despite the fact that the study reported here was not longitudinal, a stage-like development in the L2 subjects was hypothesized. According to some authors (Wode 1976; Werker, Gilbert, Humphrey, & Tees 1981; Polka & Werker 1994), it is plausible to infer a developmental sequence from cross-sectional patterns, especially if stages are considered to be patterns found in the development of a particular knowledge or performance that remain stable for a period of time before more learning takes place. However, the ordering of the patterns or stages cannot be evidenced; i.e., from the data shown here, it cannot be ascertained which of the intermediate stages occurs first. Nevertheless, it is still justifiable to hypothesize the sequential ordering of the patterns found here, as proposed at the end of Sec-
The phonological and phonetic development
tion 2.5, and see if both the patterns and their proposed sequential ordering can be confirmed longitudinally.
. Conclusions The hypothesis that Spanish listeners can learn to perceive new L2 contrasts was supported by the results of the 21 subjects who were able to identify the elements of the /i/-/I/ phonological vowel continuum with native-like accuracy. However, the majority of those 21 learners turned out to have a non-native-like phonetic perception of the contrast; i.e., they used durational cues exclusively or primarily to perceive the contrast, rather than spectral cues, which constitute the primary information for native speakers of SSE. It can therefore be concluded that learners go through more stages than just those of contrast and no-contrast in the learning of new phonological distinctions, one of these stages being the perception of the contrast by means of a non-native weighting of the acoustic information. In addition, it was found that individual L2 performances patterned in four well-defined stages, which led to the formulation of a sequential development hypothesis, namely that the patterns found may constitute sequential, stage-like development in the learning of the SSE /i/-/I/ contrast by Spanish listeners. In order to confirm or disconfirm the tentative conclusion that L2 speakers of SSE with Spanish as L1 use a linguistic strategy based on phonological length, further research with experiments that could access learners’ linguistic representations more clearly are needed. It might be possible to achieve this by means of an experiment consisting of an AXB categorical discrimination task in which the inter-stimulus interval (ISI) is long enough to force subjects to rely solely on their long-term memory representations (see Werker & Logan 1985 for a discussion of this type of experiment and a comparison of types of perception as a function of ISI manipulations). Longitudinal studies of at least a year’s duration would be needed to confirm or disconfirm the hypothesized sequence in L2 development as well as to find out how stable the patterns found are. An emphasis on the analysis of individual performance should be considered. In addition, comparisons between different target language varieties could provide evidence for the starting point and further development of subjects with similar L1 backgrounds but who are learning different varieties of the target language (see Escudero 2001; Escudero & Boersma 2004, for a comparison between Standard Southern British English and Scottish English). Finally, it would be very interesting to compare the perception of new contrasts with their production and to try to find out if the developmental changes that happen in perception occur in the same way in production. It is predicted, in line
Paola Escudero
with the SLM (Flege 1995), that ultimate attainment in perception would normally precede that of production.
References Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Theoretical and methodological issues (pp. 171– 203). Timonium, MD: York Press. Bohn, O.-S. (1995). Cross-language speech perception in adults’ first language transfer doesn’t tell it all. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Theoretical and methodological issues (pp. 279–304). Timonium, MD: York Press. Brown, C. (1998). The role of the L1 grammar in the L2 acquisition of segmental structure. Second Language Research, 14, 136–193. Cohen J. D., MacWhinney, B., Flatt, M., & Provost, J. (1993). PsyScope: A new graphic interactive environment for designing psychology experiments. Behavioral Research Methods, Instruments, and Computers, 25, 257–271. Escudero, P. (2000). The Perception of English Vowels by Spanish Speakers: Spectral and temporal effects in the perception of the /i/-/I/ contrast. Unpublished manuscript, University of Edinburgh. Escudero, P. (2001). The role of the input in the development of L1 and L2 sound contrasts: Language-specific cue weighting for vowels. In A. Do, L. Dominguez, & A. Johansen (Eds.), Proceedings of the 25th Annual Boston University Conference on Language Development (pp. 250–261). Somerville, MA: Cascadilla. Escudero, P. (2005). Linguistic Perception and L2 Acquisition: Explaining the attainment of optimal phonological categorization [LOT dissertation series 113]. Utrecht: Utrecht University. Escudero, P. & Boersma, P. (2004). Bridging the gap between L2 speech perception research and phonological research. Studies in Second Language Acquisition, 26, 551–585. Flege, J. E. (1995). Second language speech theory, findings and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Theoretical and methodological issues (pp. 233– 277). Timonium, MD: York Press. Flege, J. E, Bohn, O.-S., & Jang, S. (1997). Effects of experience on non-native speakers’ production and perception of English vowels. Journal of Phonetics, 25, 437–470. Fox, R. A., Flege, J. E., & Munro, M. J. (1995). The perception of English and Spanish vowels by native English and Spanish listeners: A multidimensional scaling analysis. Journal of the Acoustical Society of America, 97, 2540–2550. Gerrits, E. (2000). The perceptual weighting of acoustic cues by Dutch children. Paper presented at the 8th Meeting of the International Clinical Phonetics and Linguistics Association, Queen Margaret University College, Edinburgh. Nearey, T. M. (1989). Static, dynamic and relational properties in vowel perception. Journal of the Acoustical Society of America, 85, 2088–2113. Nittrouer, S. (2000). Learning to apprehend phonetic structure from the speech signal: The hows and whys. Paper presented at the 8th Meeting of the International Clinical Phonetics and Linguistics Association, Queen Margaret University College, Edinburgh. Pisoni, D. (1973). Auditory and phonetic memory codes in the discrimination of consonants and vowels. Perception and Psychophysics, 13, 253–260.
The phonological and phonetic development
Polka, L. & Werker, J. F. (1994). Developmental changes in the perception of non-native vowel contrasts. Journal of Experimental Psychology: Human Perception and Performance, 20, 421– 435. Scobbie, J. (1998). Interactions between the acquisition of phonetics and phonology. In K. Gruber, D. Higgins, K. Olsen, & T. Wysochi (Eds.), Proceedings of the 34th Annual Meeting of the Chicago Linguistic Society (pp. 343–358). Chicago, IL: Chicago Linguistic Society. Strange, W. (Ed.). (1995). Speech Perception and Linguistic Experience: Issues in cross-language research. Timonium, MD: York Press. Strange, W., Edman, T. R., & Jenkins, J. J. (1979). Acoustic and phonological factors in vowel identification. Journal of Experimental Psychology: Human Perception and Performance, 5, 643–656. Werker, J. F., Gilbert, J. V. H., Humphrey, K., & Tees, R. C. (1981). Developmental aspects of cross-language speech perception. Child Development, 52, 349–355. Werker, J. F. & Logan, J. S. (1985). Cross-language evidence for three factors in speech perception. Perception and Psychophysics, 37, 35–44. Wode, H. (1976). Developmental sequences in naturalistic second language acquisition. Working Papers in Bilingualism, 11, 1–31. Young-Scholten, M. (1995). The negative effects of ‘positive’ evidence on L2 phonology. In L. Eubank, L. Selinker, & M. Sharwood Smith (Eds.), The Current State of Interlanguage (pp. 107–121). Amsterdam: John Benjamins.
Age and native language influence on the perception of English vowels Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz Universidad del País Vasco, Spain
This paper examines, within the framework of Flege’s speech learning model (SLM), the relevance of two factors – age and native language – for the acquisition of vowel perception abilities by young Spanish learners of English in a formal instructional environment. Findings indicate that age did influence the participants’ ability to identify English vowels, but not in the expected direction and only for those vowels considered to be identical. Thus, the results do not provide evidence of the influence of the critical period for classroom learning, but they do support the SLM’s proposal concerning the influence of type of interlingual identification: that new and identical vowels will be easier than similar vowels.
.
Introduction
The influence of learners’ age on non-native language acquisition has been one of the most frequently considered topics in the field of second language (L2) acquisition. Where pronunciation is concerned, the bulk of research has been carried out in naturalistic learning environments and suggests that age of onset of learning (AOL) is a relevant variable (Singleton 1989, 1995), especially insofar as it relates to eventual attainment and rate of acquisition. As far as eventual attainment is concerned, the general finding is that the earlier the AOL, the more native-like the target sound system becomes in the long run; that is to say, early AOL is a strong predictor of ultimate phonological proficiency (Asher & García 1969; Thompson 1991; Flege, Munro, & MacKay 1995; Munro, Flege, & MacKay 1996). As for rate of acquisition, there is empirical evidence to suggest that, if the amount of exposure is held constant, the later the AOL the faster the phonology is acquired in the early stages of learning. In other words, late AOL seems to be a good predictor of phonetic achievement in the short term (Snow & Hoefnagel-Höhle 1978; Long 1990). This initial advantage on the part of older starters, however, has been found to be
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
shorter-lived in the case of acquisition of the phonological system than in other linguistic domains (Krashen, Scarcella, & Long 1982). It has also been shown that not all phonological components are affected equally by the age factor nor do they manifest the same rates of acquisition. Vowel sounds, which are the concern of this paper, appear to be especially influenced by age. Flege, Bohn and Jang (1997) concluded that AOL is a much more important variable for the acquisition of vowels than of consonants. While consonant acquisition does not seem to be strongly affected by AOL but is strongly determined instead by amount of exposure (Flege, Takagi, & Mann 1995, 1996), vowel perception and production are influenced by age in that early AOL is a good predictor of native-like vowel production (Munro et al. 1996). The acquisition of vowel sounds seems to need a longer time than other components. Hecht and Mulford’s (1982) 6-year-old Icelandic child showed traces of foreign accent for a longer time in his pronunciation of English vowels than of consonants. However, the situations in which the target language (TL) is actively used in the community have little in common with environments in which the TL has a foreign language status (FL) and there is limited exposure to it outside the classroom. In fact, research findings from acquisition in formal contexts indicate that early AOL does not guarantee native-like pronunciation. Studies such as those by Fullana and Muñoz (1999), Gallardo del Puerto, Garcia Lecumberri and Cenoz (2000, 2001a, 2001b), and García Lecumberri and Gallardo del Puerto (2003) have even found a directly proportional relationship between AOL and phonological proficiency; that is, the later the AOL of learners, the better their speech perception or production.1 These results coincide with those related to other linguistic components in formal contexts (Cenoz 2003; García Mayo 2003; Lasagabaster & Doiz 2003) and clearly point in the opposite direction to naturalistic research findings. The apparent discrepancies between findings in naturalistic and non-naturalistic settings can be understood in terms of differences in quantity of exposure between these two kinds of context. Singleton (1989, 1995) has pointed out that the concepts of initial advantage and eventual attainment relate to longer real-time periods of exposure in those environments in which the target language is little used outside the classroom, since in such contexts exposure to the TL is less intense than in naturalistic situations. It is important to realize that, in terms of amount of exposure, one year of natural acquisition, which is more or less the time younger beginners need to overtake older learners (Snow & Hoefnagel-Höhle 1978), may correspond to up to 18 years of formal school FL instruction (Singleton 1989).
. In these studies, as well as the present one, variability in AOL and in age at the time of testing are linked, as can be seen in Table 1, and this fact will be considered in our discussion.
Age and native language influence
Needless to say, any advantage related to early AOL will take longer to emerge, if it does so at all, in such formal instructional environments. Many investigators of the age issue have focused on a very specific age effect, which is possibly the main concern of age-related empirical and theoretical research, the so-called critical period hypothesis. This hypothesis, first formulated by Lenneberg (1967), proposes that the innate language acquisition faculty which children are born with stops functioning after a particular maturational point, which makes it impossible for a learner to acquire a language in a native-like way beyond a certain age. Later, Oyama (1976) proposed the term sensitive instead of critical, suggesting that children do not suddenly lose this capacity but that it diminishes with time, which appears to be in accordance with most empirical phonetic research findings (Asher & García 1969; Fathman 1975; Oyama 1976; Thompson 1991). The age limit for native-like phonological acquisition has been shown to be earlier than for other linguistic areas (Krashen et al. 1982). Several studies (Long 1990; Flege, Munro, & MacKay 1995; Munro et al. 1996) have suggested that the age of 6 is the upper limit of the sensitive period for phonology. The influence that the phonetics and phonology of the first language (L1) exert on the L2 sound system has been linked to the critical or sensitive period, for example in Flege’s speech learning model (SLM) (Flege 1991, 1992, 1995; Flege, Munro, & MacKay 1995; Munro et al. 1996). It has been shown that the formation of correct phonetic categories for the target language is more difficult after a certain age because the L1 sound system has been fully acquired. From then onwards, L1 categories will interfere and prevent the learner from developing native-like ability. By a process of interlingual identification (Flege 1991), L2 sounds are identified with L1 phonetic categories, which causes the ability to discriminate TL sound contrasts to diminish. This non-native perception will block native-like category formation, resulting in a foreign accent. The SLM categorizes TL sounds in three different ways depending on how similar they are to L1 sounds (Flege 1992). An identical sound is one that is the same as an L1 sound, a similar sound is acoustically similar but not exactly the same as an L1 sound, and a new sound is one that differs substantially from any L1 sound. The model also predicts how the learner will behave on the basis of the differences between L1 and TL sounds. Sounds classified as identical will not be problematic for the learner, as all the knowledge is available in the L1, and positive transfer will operate optimally. Sounds classified as new may also be learnt successfully, once enough input has been given, since they are sufficiently different from any L1 sound to trigger the formation of a new phonetic category in the target language without L1 interference. Those sounds which are judged to be similar, however, will present the most serious learning difficulty, because they are neither the same as an L1 sound nor yet so different as to enable learners to establish a new phonetic category. As a consequence, learners will succumb to interlingual identi-
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
fication or equivalence classification, which will impede native-like production. As Flege (1991: 707) states, “L2 sounds that match an L1 sound closely or else differ considerably from any sound in the L1 may be produced authentically, whereas L2 sounds that partially resemble an L1 sound may be pronounced poorly.” The purpose of this research in a formal FL instructional environment was two-fold: firstly, to find age-related effects on the perception of English vowels in the light of literature on the age factor, and secondly and more specifically, to investigate the influence of the L1 on vowel perception in the light of SLM predictions and in relation to learners’ age. As a consequence of these two objectives, an additional outcome of the study was an assessment of the applicability of the SLM to formal learning contexts.
. Method . Subjects The subjects in this cross-sectional study were all participants in a project focusing on the learning of English by Spanish schoolchildren. English is just one subject on the school curriculum and has FL status in the community. Sixty learners took part in the experiment, and were divided into three groups according to their age. All subjects had to satisfy the condition of not having received any significant amount of exposure to the English language outside the school context. The fulfilment of this condition forced us to take students belonging to two different grades in each age group so as to get a reasonable number of subjects in each of the three age groups. In order to control the amount-ofexposure variable, each age group was composed of ten students in the middle of their sixth year of learning English, and ten in the middle of their seventh. The three groups had thus received the same mean exposure to English (six years of formal instruction) but they differed from one another in AOL and in their age at the time of testing. The distribution of the sample is shown in Table 1. The 20 learners in Group 1 began learning English when they were four years old and had a mean age of 10 when they took the tests. The 20 students in Group 2 started learning English at the age of 8 and had a mean age of 14 at the time of Table 1. Distribution of sample
Group 1 Group 2 Group 3
AOL
Mean age
Mean exposure
4 8 11
10 14 17
6 years 6 years 6 years
Age and native language influence
testing. Finally, the 20 learners in Group 3 began learning English at the age of 11 and had a mean age of 17 when they participated in the experiment. . Stimuli The data-collection instrument was a perceptual exercise prepared in the form of a two-alternative forced-choice identification test and designed as a picturepointing task. Stimuli were 22 minimal pairs of English words containing 22 vowel contrasts. All words were monosyllabic so that listeners would focus on the vowel contrasts. Words with a CVC structure were chosen, since consonantal onsets and especially codas have been shown to favor vowel identification (Strange, Edman, & Jenkins 1979). Most of the codas selected were voiced alveolar stops, as it has been demonstrated that /d/, followed by /n/, constitute the most favorable contexts for identification of the preceding vowel by L1 speakers of Spanish (Stevens & House 1963; García Lecumberri & Cenoz 1997). All the monophthong vowels in Standard Southern British (SSB),2 except for the weak vowel schwa, which does not occur in stressed syllables, were included as target sounds for the identification test. Each of these 11 vowel sounds appeared twice in the test, contrasting with two potentially confusing vowel phonemes which were selected from the results of a previous study on English vowel perception (García Lecumberri & Cenoz 1997). Two phoneme contrasts were thus chosen for each of the English vowels, examples being the minimal pairs lead-lid and bead-bird (see Appendix). Stimuli words were recorded by a female native speaker of SSB and randomized. . Procedure A list of the vocabulary which was to appear in the perceptual task was delivered to the school’s teachers three months before the experiment. This way, teachers had enough time to include the selected words in their English lessons so that participants were familiar with the aural stimuli before testing. Teachers were L1 speakers of Spanish and were not aware of the aim of the experiment; that is, they were not informed that phonological proficiency was to be tested, nor were they asked to focus specially on the phonetic characteristics of the words when using them in the classroom. Identification tests were administered individually and participants did not have any specific training beforehand. Stimuli words were presented to listeners . SSB (Shockey 2003) is a non-rhotic accent, that is to say, the sound /r/ only appears followed by a vowel. Therefore, in words such as “bird” /r/ is not pronounced and the vowel quality is different to that found in rhotic accents, such as General American.
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
Table 2. Crosslingual comparison of SSB and Spanish vowels Identical
Similar
English
Spanish
English
Spanish
/i˜/ /u˜/ /e/ /f˜/
/i/ /u/ /e/ /o/
/I/ /~/ /6/ /æ/ /"˜/ /%/
/i/ /u/ /o/ /a/ /a/ /a/
New English /8˜/
on audio-tape, and a card with two possible answers, represented by means of drawings and the written form, was presented while each stimulus was played. In order to minimize the influence of spelling on listeners’ perceptions, they were urged to point to the drawings and not to the printed words when answering. For the same reason, the drawings were much larger than the written words. The participants’ responses were recorded on prepared answer sheets by the researcher. . Analyses Statistical analyses were conducted so as to (a) determine the influence of age on the students’ vowel identifications, and (b) explore whether the L1 vowel system had the influence predicted by the SLM on the perception of TL vowels. Mean scores and standard deviations for each of the age groups were obtained. ANOVA tests were used to establish comparisons among the three age groups and Scheffé tests were used for comparisons between any two groups. To investigate L1 influence, the English vowel phonemes were classified, in accordance with Flege’s SLM, as identical, similar or new on the basis of their relation to Spanish vowel phonemes. Correct identification totals and percentages were calculated for each of the three classes of sound. A 2 by 3 chi-square analysis contrasting correct and incorrect identifications for the three types of sound was applied. In order to distribute the stimuli among these three classes, the researchers’ experience as FL learners and teachers of English and the qualitative distance between English and Spanish vowels were taken into account.3 Table 2 shows how each English vowel was classified (identical, similar or new) and the Spanish vowel with which each English vowel is assumed to be identified. It should be remembered that Spanish has just five monophthong vowels whereas SSB has 11 different monophthong vowel phonemes. . The SLM uses three criteria to establish FL sound class: IPA symbols, acoustic detail and listener judgements, although the first of these, IPA symbols, is the one most often used (Markham 1997; Rochet 1995).
Age and native language influence
. Results First, we present the results of the analyses concerning the influence of learners’ age on the identification of English vowels (Tables 3 and 4). This will be followed by a consideration of the analyses regarding the influence of the L1 on L2 vowel perception (Table 5). Table 3 displays mean identifications by the three age groups for all vowels and for identical, similar and new vowels, along with ANOVA comparisons of the results among the three groups of students. As far as global results – total vowels – are concerned, the mean scores obtained by each age group indicate that the younger the learners are, the worse their perceptual ability. The youngest subjects (Group 1) attained the lowest means, the oldest participants (Group 3) the highest, and intermediate age students (Group 2) fell in between. These differences were statistically significant. Next, looking at results by age and relating them to the L1 according to the SLM sound classes (identical, similar and new), we find that the age variable did not affect all three classes in the same way. Mean scores for correct identification of those vowels classified as identical increase as a factor of age; that is, the ability to perceive identical vowels increased with age in a linear way. In this respect, Table 3. Means, standard deviations and ANOVA comparisons of correct identifications in the three age groups by sound classes Maximum Score Identical Vowels Similar Vowels
8 12
New Vowels
2
Total Vowels
22
Group 1 n = 20
Group 2 n = 20
Group 3 n = 20
F
p
5.35 (0.99) 7.77 (1.74) 1.70 (0.57) 14.80 (2.48)
5.75 (1.59) 8.60 (1.76) 1.45 (0.60) 15.50 (2.57)
7.00 (0.92) 8.50 (1.61) 1.60 (0.60) 17.10 (2.02)
10.26
0.00*
1.49
0.23
0.90
0.41
4.73
0.01*
* Significant, with significance level set at p < .05. ** Standard deviation scores appear in parentheses.
Table 4. Scheffé two-way age-group comparisons of mean scores for correct identification
Identical Vowels Similar Vowels New Vowels All Vowels
Groups 1–2
Groups 2–3
Groups 1–3
0.57 0.29 0.41 0.41
0.07# 0.98 0.72 0.23
0.00* 0.38 0.86 0.01*
* Significant, with significance level set at p < .05; # close to significance, i.e., tendency.
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
Table 5. Correct identifications and percentages for all subjects
Max score x 60 subjects Correct identifications Percentage correct
Identical Vowels
Similar Vowels
New Vowels
480 362 75.41%
720 497 69.03%
120 95 79.17%
differences among the three groups were highly significant. As for the other two classes of non-native sound (similar and new), a direct relationship between age and perceptual skill was not found for either class. The older students were clearly better at identifying those TL sounds considered to be identical to L1 categories, mistaking them for other TL vowels less frequently than the younger students. On the other hand, for the correct identification of TL vowels which were considered only similar to L1 vowels or totally new, the participants’ age was not relevant. Scheffé tests were subsequently carried out in order to identify which pairs of age groups were responsible for the significant differences reported in Table 3. Table 4 presents these two-way comparisons. It was found that the oldest subjects (Group 3) identified English vowel sounds as a whole significantly better than the youngest ones. No significant differences were found between the intermediate age group and either of the other two. Taking each sound class individually, the significant 3-way score for the identical vowels given in Table 3 was mostly due to the comparison between the youngest and oldest groups, although the comparison between Groups 2 and 3 was close to significance. No statistical significance or even tendency was found in the comparison of the two younger groups. These results could be taken as an indication that starting English instruction at age 11 actually favors the acquisition of English vowel sounds, whereas first exposure at age 4 or age 8 makes little difference. However, the results are also consistent with an interpretation of learners’ age at time of testing being the relevant factor. Thus, being older at the time of testing may be a favoring condition. In order to compare the differences among the identifications of identical, similar and new sounds, by all the students as a whole, percentages of correct identifications for these sound classes in the whole sample were calculated. In this way, it was possible to check whether the Spanish vowel system exerted an influence on TL vowel perceptions. Table 5 presents these percentages as well as the totals (maximum scores and correct identification scores) for all subjects. The SLM predicts that identical and new sounds will be acquired easily, while similar sounds will cause difficulty. Table 5 shows that, in accordance with the model, similar vowels were the sounds with the lowest percentage of correct identifications while values were higher for identical and new sounds. Although the percentage differences between the sound classes appear to be relatively small, a
Age and native language influence
2-way chi-square, with incorrect as the other level of the variable identifications, shows them to be significant: χ2 (2, N = 1320) = 8.98, p < .05.
. Discussion The first clear conclusion which emerges from the data presented here is that there was a noticeable influence of age on the participants’ ability to perceive English vowel phonemes, in that the older they were, the better their identification of the vowels. The comparisons that were statistically significant were those in which Group 3 (AOL 11 years) was involved, whereas differences between Groups 1 (AOL 4 years) and 2 (AOL 8 years) were never found to be significant. This means that starting English instruction at ages 4 or 8 may not exert a positive influence on vowel perception. It would be all too easy to extrapolate the notion that late AOL is a favoring factor. However, we cannot ignore the fact that AOL and age at testing are two variables which, in order to keep amount of exposure constant, could not be separated in this study. Thus, the significantly better results of the oldest learners could also be attributed to their higher cognitive development at the time they participated in the experiment. The results are in agreement with the bulk of investigations carried out in formal FL-learning contexts with regard to both phonology (Fullana & Muñoz 1999; Gallardo del Puerto et al. 2000, 2001a, 2001b; García Lecumberri & Gallardo del Puerto 2003) and other linguistic areas (Cenoz 2003; García Mayo 2003; Lasagabaster & Doiz 2003). In all of these studies, late beginners/older participants attained significantly higher scores than early beginners/younger participants. Contrary to the “earlier the better” position, the results show that, after six years of FL instruction, younger learners experienced more difficulties in identifying English vowel sounds than those who started later. One might reasonably suppose that the amount of English exposure the participants had received was not enough for the early advantage apparent in naturalistic contexts to manifest itself. If we keep in mind Singleton’s (1989) calculations, the students’ six-year period would correspond to approximately four months of natural acquisition, which is more or less a third of the time younger learners seem to need to catch up with or overtake older ones as far as phonological competence is concerned (Snow & Hoefnagel-Höhle 1978). It may therefore be that, given more years of instruction, early AOL may yet prove to be advantageous. According to the critical period hypothesis and Flege, Munro and MacKay’s (1995) claims regarding age, the earlier the AOL the more likely a learner is to hear the differences between native and non-native sounds. Specifically, they claim that children who begin L2 acquisition before age 6 may benefit particularly from their starting age, since L1 phonetic categories will not interfere with the formation of
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
TL categories. The youngest students in this study were within this favorable or critical/sensitive age period when they were first exposed to English. Nevertheless, they obtained worse results than older students on most of the vowels, indicating that early exposure did not favor the formation of native-like phonetic categories for the FL vowels. Although our results do not support the critical period hypothesis they cannot necessarily be seen as contradicting Flege et al.’s claims, because of differences in the populations studied. The SLM is primarily a model of ultimate attainment (Flege 1995) and our research study was conducted in a foreign language context with learners who were at an early stage of acquisition and with teachers who were not L1 speakers of English, and in these school environments it is impossible to measure ultimate achievement. In the light of these results and the general findings in formal acquisition research (Cenoz 2003; García Lecumberri & Gallardo del Puerto 2003; García Mayo 2003; Lasagabaster & Doiz 2003), we conclude that, in such contexts, starting age is not as important as other variables such as quantity, intensity and quality of exposure, teaching methodology or cognitive development. The results corroborate to a certain extent Flege’s SLM predictions about L1 influence on L2 acquisition. Similar sounds were the most difficult to perceive, while better results were achieved with identical and new vowels. Interestingly, new sounds turned out to be slightly better identified than identical sounds. There might well be an explanation for this result: most of the English vowels that were classified here as identical compete with another similar TL vowel for assimilation to the L1 category. This may create a confusion area in the interlanguage system, where the identical sound is sometimes ascribed to the similar one, as has been found in previous studies (García Lecumberri & Cenoz 1997, 2002). In any case, the differences among the three classes of target sound were small, which suggests that learners’ differing performance for the SLM’s three degrees of crosslingual identification were not as clear-cut as predicted by the model. Finally, an interaction was detected between age and degree of similarity between L1 and TL sounds. Specifically, it was found that age affected the perception of identical, similar and new sounds in different ways. The identification of identical sounds was influenced by age, in that late AOL (and, therefore, older testing age) was associated with better identification, whereas age did not play an important role in the perception of similar and new vowels. In other words, the perceptual skills of the oldest group were better only in the identification of the TL vowels which were most strongly assimilated to L1 vowels. Probably the older learners’ higher metalinguistic awareness enabled them to identify the likeness between English and Spanish vowel properties; so it could be said that their equivalence classification processes worked more efficiently than those of the younger groups.
Age and native language influence
Acknowledgements This research was funded by the Ministerio Español de Ciencia e Investigación, Proyecto de Investigación PB97-0611, and the Gobierno Vasco, Proyecto de Investigación PI1998-96. We would also like to thank the editors of this volume and an anonymous reviewer for valuable comments on the manuscript.
References Asher, J. & García, G. (1969). The optimal age to learn a foreign language. Modern Language Journal, 38, 334–341. Cenoz, J. (2003). The influence of age on the acquisition of English: General proficiency, attitudes and code mixing. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 77–93). Clevedon: Multilingual Matters. Fathman, A. (1975). The relationship between age and second language productive ability. Language Learning, 25, 245–253. Flege, J. E. (1991). The interlingual identification of Spanish and English vowels: Orthographic evidence. The Quarterly Journal of Experimental Psychology, 43A (3), 701–731. Flege, J. E. (1992). Speech learning in a second language. In C. Ferguson, L. Menn, & C. StoelGammon (Eds.), Phonological Development: Models, research, and application (pp. 565– 603). Timonium, MD: York Press. Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 233–277). Timonium, MD: York Press. Flege, J. E., Bohn, O.-S., & Jang, S. (1997). Effects of experience on non-native speakers’ production and perception of English vowels. Journal of Phonetics, 25, 437–470. Flege, J. E., Munro, M., & MacKay, I. (1995). Factors affecting strength of perceived foreign accent in a second language. Journal of the Acoustical Society of America, 97, 3125–3134. Flege, J. E., Takagi, N., & Mann, V. (1995). Japanese adults can learn to produce English /r/ and /l/ accurately. Language and Speech, 38, 25–55. Flege, J. E., Takagi, N., & Mann, V. (1996). Lexical familiarity and English-language experience affect Japanese adults’ perception of /r/ and /l/. Journal of the Acoustical Society of America, 99, 1161–1173. Fullana, N. & Muñoz, C. (1999). The development of auditory discrimination skills in EFL learners of different ages [CD-ROM]. Proceedings of the XXIII International AEDEAN Conference. León: Universidad de León. Gallardo del Puerto, F., García Lecumberri, M. L., & Cenoz, J. (2000). English simple vowel discrimination in learners with different ages of first exposure to EFL [CD-ROM]. Proceedings of the XXIV International AEDEAN Conference. Ciudad Real: Universidad de Castilla-La Mancha. Gallardo del Puerto, F., García Lecumberri, M. L., & Cenoz, J. (2001a). La influencia del factor edad en la percepción de vocales y diptongos ingleses. Proceedings of the II Congress of Experimental Phonetics, 204–208. Sevilla: Universidad de Sevilla.
Francisco Gallardo del Puerto, Ma Luisa García Lecumberri and Jasone Cenoz
Gallardo del Puerto, F., García Lecumberri, M. L., & Cenoz, J. (2001b). L3 English vowel and consonant discrimination in learners with different ages of first exposure. Proceedings of the 2nd International Contrastive Linguistic Conference, 419–426. Santiago de Compostela: Universidad de Santiago de Compostela. García Lecumberri, M. L. & Cenoz, J. (1997). Identification by L2 learners of English vowels in different phonetic contexts. In J. Leather & A. James (Eds.), New Sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 196– 205). Klagenfurt: University of Klagenfurt. García Lecumberri, M. L. & Cenoz, J. (2002). Phonetic context variation vs vowel perception in a foreign language. In A. Braun & H. R. Masthoff (Eds.), Phonetics and its Applications: Festschrift for Jens-Peter Koester on the Occasion of his 60th Birthday (pp. 178–188). Stuttgart: Steiner. García Lecumberri, M. L. & Gallardo del Puerto, F. (2003). English FL sounds in school learners of different ages. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 115–135). Clevedon: Multilingual Matters. García Mayo, M. P. (2003). Age, length of exposure and grammaticality judgements in the acquisition of English as a foreign language. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 94–114). Clevedon: Multilingual Matters. Hecht, B. F. & Mulford, R. (1982). The acquisition of a second language phonology: Interaction of transfer and developmental factors. Applied Psycholinguistics, 3, 313–328. Krashen, S. D., Scarcella, R. C., & Long, M. H. (Eds.). (1982). Child-Adult Differences in Second Language Acquisition. Rowley, MA: Newbury House. Lasagabaster, D. & Doiz, A. (2003). Maturational constraints on foreign-language written production. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 136–160). Clevedon: Multilingual Matters. Lenneberg, E. (1967). Biological Foundations of Language. New York, NY: Wiley. Long, M. H. (1990). Maturational constraints on language development. Studies in Second Language Acquisition, 12, 251–285. Markham, D. (1997). Phonetic Imitation, Accent, and the Learner. [Travaux de l’Institut de Linguistique de Lund 33]. Lund: Lund University Press. Munro, M. J., Flege, J. E., & MacKay, I. (1996). The effects of age of second language learning on the production of English vowels. Applied Psycholinguistics, 17, 313–334. Oyama, S. (1976). A sensitive period for the acquisition of a non-native phonological system. Journal of Psycholinguistic Research, 5, 261–284. Rochet, B. L. (1995). Perception and production of second-language speech sounds by adults. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in Cross-Language Research (pp. 379–410). Timonium, MD: York Press. Shockey, L. (2003). Sound Patterns of Spoken English. Oxford: Blackwell. Singleton, D. (1989). Language Acquisition: The Age Factor. Clevedon: Multilingual Matters. Singleton, D. (1995). A critical look at the critical period hypothesis in second language acquisition research. In D. Singleton & Z. Lengyel (Eds.), The Age Factor in Second Language Acquisition (pp. 1–29). Clevedon: Multilingual Matters. Snow, C. & Hoefnagel-Höhle, M. (1978). The critical period for language acquisition: Evidence from second language learning. Child Development, 49, 1114–1128. Stevens, K. N. & House, A. S. (1963). Perturbations of vowel articulations by consonantal context: An acoustical study. Journal of Speech and Hearing Research, 6, 111–128.
Age and native language influence
Strange, W., Edman, T. R., & Jenkins, J. J. (1979). Acoustic and phonological factors in vowel identification. Journal of Experimental Psychology, Human Perception and Performance, 5, 643–656. Thompson, I. (1991). Foreign accents revisited: The English pronunciation of Russian immigrants. Language Learning, 41 (2), 177–204.
Appendix Minimal pairs used in the perception test (aural stimuli in italics) lead lid head bad bun barn bird cod cord full fool
lid lead heard bud barn ban bad cord cod fool full
bead bin bread ban bud barn bird don lord good shoot
bird Ben Brad barn bad bun bard dun load god short
Syllable-level studies: Codas and onset clusters
The influence of voicing and sonority relationships on the production of English final consonants* Barbara O. Baptista† and Jair L. A. da Silva Filho†,‡ † ‡
Universidade Federal de Santa Catarina, Brazil / Centro Federal de Educação Tecnológica de Santa Catarina
This study investigated the influence of voicing and universal sonority relationships, both within and across syllables, on the production of English word-final consonants by Brazilian learners. Paragoge was found to be more frequent after final voiced than voiceless consonants, nasal codas were found to be easier than obstruent codas, and among the voiced obstruents, place of articulation made a difference. Finally, as predicted by Murray and Vennemann’s (1983) syllable contact law for diachronic language change, the difference in degree of sonority across words was found to be a determiner of frequency of paragoge. In sum, the frequency of paragoge, related to the productive native language process of vowel epenthesis, was shown to depend on markedness and phonotactic universals.
.
Introduction
As pointed out by Carlisle (1994: 224), markedness relationships and phonological environment can be at least as important in influencing the phonological or phonetic variants produced by foreign/second language learners as external constraints such as style shifting. Thus, it is important to consider, in the study of second language (L2) phonology, not only theories of second language acquisition (SLA) dealing with those external constraints, but also linguistic theories of SLA and linguistic (especially phonological) theory in general, as these considerations will have an important influence on the way in which data are analyzed. The reverse is also true. In discussions of universal grammar (UG) dealing with what * This paper is a revised version of Baptista and Silva Filho (1997).
Barbara O. Baptista and Jair L. A. da Silva Filho
aspects of UG are accessible at what ages, the results of research in SLA phonology (as well as syntax, etc.) are extremely important. Eckman (1977) took the lead in considering implicational markedness as an important factor affecting the difficulty of target language (TL) structures, whether syntactic or phonological. In his markedness differential hypothesis (MDH), he claims that TL structures which are both different from those of the native language (NL) and more marked than those of the NL will be difficult, and that the degree of difficulty will depend on the degree of markedness. An often cited example of this relationship is that of word-final obstruents. English distinguishes between voiced and voiceless word-final obstruents, while in German the distinction is neutralized, all obstruents being devoiced in this position. The fact that German speakers have difficulty making the distinction in English while English speakers generally have little difficulty suppressing the distinction in German can be predicted by the fact that voiced obstruents are more marked than voiceless obstruents in final position. Later Eckman (1991: 24) developed the structural conformity hypothesis (SCH), which simply states that “the universal generalizations that hold for the primary languages hold also for interlanguages”, ignoring differences between the NL and the TL. The universal generalizations that he tested in this later study were the fricative-stop principle and the resolvability principle, both involving implicational markedness, but the SCH as stated could refer to any other implicational markedness relationship as well, such as the voiced/voiceless relationship mentioned above. In this relationship the SCH would make predictions similar to those of the MDH regarding the German-speaking learner of English, but without necessarily predicting difficulty. Thus, according to the SCH, we may meet a Germanspeaking learner of English whose interlanguage (IL) contains voiceless but not voiced obstruents in final position (non-native-like) and one whose IL contains both voiced and voiceless obstruents in final position (native-like), but we should not find any such learner whose IL includes voiced but not voiceless obstruents in this position. As to English-speaking learners of German, it makes similar predictions: that there may be some whose IL contains only voiceless obstruents in final position (native-like) and some whose IL contains both voiced and voiceless obstruents in this position (non-native-like), but none whose IL contains only voiced obstruents in this position. Eckman (1991: 32) claims that the SCH is stronger than the MDH because it is more falsifiable, and more explanatory than the MDH because it can account for difficulties where the NL and TL structures are not different (p. 33). While these two claims may be difficult to dispute, one thing that appears to have been forgotten is predictability. On the one hand, the SCH has the advantage of allowing for the German-speaking learner of English who has no difficulty with final voiced
The influence of voicing and sonority relationships
obstruents and for the English-speaking learner of German who insists on maintaining the voiced/voiceless distinction in this position. No doubt both cases exist. On the other hand, it has the disadvantage of being unable to make the prediction that the German-speaking learner is more likely to have difficulty in making the distinction in English than the English-speaking learner in suppressing the distinction in German, because it does not take into consideration differences in degree of markedness between the corresponding structures of the two languages in contact. Since considerable theoretical research has been done in the area of markedness in syllable structure, this is an obvious place to apply the MDH or the SCH in SLA phonology. The first studies to examine the syllable in SLA phonology were concerned only with the universal CV syllable as a cause of consonant cluster reduction in interlanguage (Anderson 1987; Broselow 1983, 1984; Karimi 1987; Sato 1984; Tarone 1980; Weinberger 1987). In general, this view was supported by these studies, but several of them found transfer to be just as important. Studies carried out later began looking further at the voicing distinction in TL final single consonants (Eckman 1981; Edge 1991; Flege & Davidian 1984; Flege, McCutcheon, & Smith 1987; Major & Faudree 1996), generally supporting the universality of the tendency to devoice final consonants, showing that even speakers of NLs which do not have this process tend to follow the same strategy as the German speakers, although not to the same extent. The year 1987 saw innovation by Tropf, who introduced the concept of sonority to the production of L2 syllable structure, and by Eckman, who began applying Greenberg’s implicational universals regarding consonant clusters to L2 production. Tropf (1987), who examined the production of German initial clusters, final clusters, and final single consonants by Spanish-speaking learners, found relative sonority to be most important in final position, where the frequency of deletion of single consonants was inversely related to the degree of sonority, and thus positively correlated with the degree of markedness. His results showed that plosives – the most marked – were deleted most frequently, followed by fricatives, then nasals, and finally laterals. As for final clusters, the frequency of deletion of the second consonant was related to the degree of sonority of the first. Eckman (1987), in a study of Japanese, Cantonese and Korean learners of English, demonstrated, based on Greenberg’s implicational universals regarding consonant clusters, (a) that more marked clusters would not be produced correctly by a learner who did not also produce the less marked clusters correctly, and (b) that the consonant deleted from tri-literal clusters would depend on the degree of markedness of the resulting bi-literal cluster. Further innovation was seen in Carlisle (1991a), the first study to examine the influence of phonological environment on the production of English initial /s/-clusters by Spanish speakers, a variable which was examined further in Carlisle (1992, 1997, this volume), with the consistent finding that the
Barbara O. Baptista and Jair L. A. da Silva Filho
production of the difficult onset was facilitated by a preceding vowel and made more difficult by a preceding consonant. Tropf, Eckman, and Carlisle have been influential in determining the direction followed by studies on the L2 syllable since then. Eckman and Iverson (1994), in their investigation of the production of single coda consonants by Japanese, Korean and Cantonese learners, in general confirmed their prediction that the more marked (less sonorant) obstruent codas would be more difficult than the nasal codas, which would in turn be more difficult than the liquids. However, there were some exceptions to this tendency, in particular regarding relative difficulty of the liquids, which were attributed to L1 influence. Thus, while the importance of sonority was generally supported by this study, it seems that transfer can overrule the predictions made by this phonological universal. No such transfer was found by Carlisle (1988, 1991b, 1992) to diminish the effect of sonority sequencing within complex onsets on the frequency of prothesis by Spanish-speaking learners of English. On the other hand, the study reported in Rebello (1997) and Rebello and Baptista (this volume) found the transfer of a Brazilian Portuguese (BP) voicing assimilation process to be crucial: it apparently was responsible for the equal or greater importance of markedness in regard to voicing, compared to markedness in terms of sonority within the same complex onsets. Ross (1994), in his study of the production of English codas by Japanese learners, found a very complex interaction of sonority and L1 influences, in addition to the influence of an extra variable – pitch contour of the target syllable. Cebrian (2000), on the other hand, found universals to take precedence over transfer in the production of English word-final obstruents by Catalan speakers. Catalan has obstruent voicing and devoicing processes, but the latter process is transferred to English at a much greater rate, due to its universal preference. Environment was also shown to be important, voiced obstruents being produced more frequently before vowels and other voiced consonants. Eckman and Iverson (1994) mention another factor affecting markedness, which, if true, would conflict with other claims related to sonority. They claim that among the obstruents, the most marked in final position are the affricates, followed by the fricatives, and then by the stops. Since stops are generally considered to be less sonorous than fricatives in those hierarchies that put them at different levels, this is contrary to the claim made by Hooper (1976), Selkirk (1984) and others that the less sonorant the consonant, the more marked it would be in syllable-final position. Eckman and Iverson do not test this claim in their study, however. Yavas (1994, 1997) examines still another factor affecting markedness in final position. He suggests (1994) that the tendency for velars to be the last among final stops to be voiced in L1 acquisition, due to the smaller supraglottal area formed by velar closure, would be found also in L2 acquisition. He then provides evidence
The influence of voicing and sonority relationships
(1997) for this tendency, strongest after high vowels, among Mandarin, Japanese, and Portuguese speakers learning English. Three recent studies follow new tendencies in the investigation of L2 codas. Hancin-Bhatt (2000), based on optimality theory (OT), also found sonority and the voicing of obstruents to be important factors in both the perception and production of English codas by Thai learners, the influence of the former variable being tempered by the influence of L1, as exemplified by the difficulty of coda liquids. OT was found to be quite useful in expressing the different levels of importance of the different phonological constraints. Abrahamsson (2003), in a study of Chinese-Swedish interphonology, investigated another kind of context first applied to L2 syllable structure by Weinberger (1987, 1994): information given in the rest of the discourse. Where the informational context made quite clear the meaning of the word with the target coda, the preferred strategy of his participants would be deletion. However, where deletion might cause lack of clarity, his participants chose paragoge. Finally, Hansen (2004), in a longitudinal investigation of Vietnamese-English interlanguage, used VARBRUL to investigate a multitude of conditioning factors in the choice of paragoge or feature change as strategies to facilitate the production of difficult codas. She found non-linear development, influenced in particular by cluster length (but not across cluster types), in addition to an influence of both the preceding vowel and following segment, stress, and even morphemic structure. Thus, universal markedness regarding cluster length, sonority, and voicing have been found to be important constraints in the production of L2 syllable structure, at times tempered and at times intensified by the influence of L1 transfer. Other variables found to be important are preceding and following phonological environment, manner and place of articulation, stress, pitch contour, and even the non-phonological variables morphemic structure and discourse context. What has not previously been considered in IL studies is the influence of difference in sonority across syllables, that is, between the target final consonant and the onset consonant of the following syllable, in spite of the importance of this variable found in diachronic language change more than two decades ago (Murray & Vennemann 1983).
. Single-consonant codas in Portuguese-English interlanguage This paper examines the production of English single-consonant codas by native speakers of BP, which permits only /s/ and /r/ to be realized phonetically as syllablefinal consonants. Other syllable-final consonant phonemes occurring in the underlying representation are modified in some way in the phonetic realization: (a) obstruents cause resyllabification by vowel paragoge, as in apto (apt), realized as
Barbara O. Baptista and Jair L. A. da Silva Filho
[Áapitu]; (b) the liquid /l/ is weakened to the offglide /w/, as in sol (sun), realized as [sfw]; and (c) the nasals /m/ and /n/ are omitted after their nasality is assimilated to the preceding vowel, as in viagem (trip), realized as [viaÁŠ˜e]. The interlanguage codas produced by the subjects in this study are examined first in relation to relative markedness of the target segment, based on sonority, as well as voicing, place and manner of articulation, and second in relation to the environment. The relative markedness of the target consonant was considered in terms of the four aspects mentioned in the above discussion: (a) the voicing distinction, where voiced consonants were predicted to cause more frequent paragoge than their voiceless counterparts; (b) relative sonority, where obstruents, based on Eckman and Iverson (1994), were expected to cause more paragoge than nasals; (c) relative markedness within the class of obstruents, where, also based on Eckman and Iverson, it was hypothesized that affricates would cause more paragoge than fricatives, which would cause more than stops; and (d) relative markedness among voiced stops by place of articulation, where, based on Yavas (1994), voiced velars should cause more paragoge than alveolars, which should cause more than bilabials. The environment was considered in terms of (a) the possible facilitating effect of a following vowel, as compared to a following consonant or a pause (as in Carlisle 1991a, 1992), and (b) the differences in sonority (or consonantal strength) between the target consonant and the onset consonant of the following syllable. This latter consideration is based on Hooper’s (1976: 220) syllable structure condition (SSC), which “requires that a syllable-initial C be stronger than the immediately preceding syllable-final C”, and on Murray and Vennemann’s (1983) syllable contact law (SCL), which carries Hooper’s condition a step farther by referring to relative strength differences and to diachronic sound change: THE SYLLABLE CONTACT LAW (SCL): The preference for a syllabic structure A$ B, where A and B are marginal segments and a and b are the Consonantal Strength values of A and B respectively, increases with the value of b minus a. COROLLARY: The tendency for a syllabic structure A$ B to change, where A and B are marginal segments and a and b are the Consonantal Strength values of A and B respectively, increases with the value of a minus b. (Murray & Vennemann 1983: 520)
In other words, the probability that resyllabification will occur increases with the degree to which contiguous syllables violate the SSC, this violation being quantified as the difference obtained by subtracting the consonantal strength value of the first consonant of the second syllable from the value of the final consonant of the first syllable. The related hypothesis for this study was that the SCL and its corollary would apply also to ILs, where the greater the value of a (the strength of the target consonant) minus b (the strength of the first segment of the following
The influence of voicing and sonority relationships
word), the greater would be the frequency of paragoge. Conversely, the smaller this difference, the less frequently paragoge would occur. Rather than the consonantal strength values from Hooper’s (1976: 206) universal strength hierarchy, which was not intended for interphonology research, the following sonority hierarchy was used, following Dziubalska-Kolaczyk (1997), which does not distinguish between voiced/voiceless segments of each category and includes the affricates between the fricatives and stops (the weight values go in the same direction, in spite of the opposite terminology): glide
liquid
nasal
fricative
affricate
stop
1
2
3
4
5
6
. Method Participants were six students (three male and three female) between the ages of 19 and 29, all native speakers of Brazilian Portuguese, two each from three different levels (first, second, and eighth semesters) in the undergraduate course in English of a Brazilian university. The more advanced level was included in case the beginning students produced so much paragoge as to be random, but this was not found to be the case. Although paragoge is known to be a typical characteristic of Brazilian interlanguage (Major 1986), the participants of this study produced rates of prothesis which were much lower than expected: 16% and 8% by the firstsemester students, 9% and 26% second semester, and 5% and 9% eighth semester. This probably indicates that it is a very salient characteristic of the Brazilian accent, which is noticed in spite of the low frequency because it interferes with the rhythm of the utterance, and thus often with intelligibility (Garcia 1991). None of the participants spoke a language other than BP at home, only one had been to an English-speaking country (England), but only for one month, and all claimed not to have regular contact with native English speakers. The instrument consisted of 432 sentences, each containing a monosyllabic word ending in a single consonant. Consonants which are either infrequent in final position – /Š/ – or which are known to cause articulatory difficulty – /θ/, /ð/, /r/, and /l/ (which, in final position, is consistently pronounced by Brazilians as the glide /w/) – were excluded from the study, leaving 16 target final consonants: /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /s/, /z/, /w/, /tw/, /dŠ/, /m/, /n/, /]/. The velar nasal was later excluded from the analysis because most students produced a [g] after the nasal due to spelling (e.g., [s%]g] for sung), leaving 15 target consonants. Because of the desire to investigate the effect of environment on final consonant production, it was important for all consonants to occur in all the same environments.
Barbara O. Baptista and Jair L. A. da Silva Filho
Thus, each of the consonants appeared in 27 sentences: once in sentence-final position, once followed in the next word by each of the 19 context consonants /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /θ/, /ð/, /s/, /w/, /tw/, /dŠ/, /m/, /n/, /l/, /r/, /h/, once followed by each of the glides /j/, /w/, and once followed by each of the five vowels /7/, /æ/, /"/, /f/, /o~/. Participants were recorded individually in a quiet office on campus on a small cassette-tape recorder. They knew that their pronunciation was being evaluated, but not the specific aspect of focus. Since they were recorded individually, it was possible to ask them to repeat whenever they paused or misread a sentence. Target and context words were transcribed by the second author, who noted the clear occurrences of a prothetic vowel. Doubtful cases were heard also by the first author, and agreement was always eventually reached.
. Results Consistent with the results of previous studies with Brazilians (Tarone 1980; Major 1996) and with our experience in teaching English pronunciation to Brazilian learners, paragoge was almost exclusively the strategy used by the participants in this study to simplify English syllable structure. There were only a handful of cases of devoicing and, apart from the nasals, there was only one final consonant omitted. Vowel-nasal sequences were frequently pronounced as nasal vowels without the final nasal consonant. Although this result was also predicted, nasal consonants were still included in the study because of the knowledge that they also sometimes cause paragoge instead of omission. Although a knowledge of BP phonology would have been sufficient to predict the preferred strategy of paragoge and the process of nasal assimilation and deletion, NL phonology alone could not have predicted which final consonants would cause greater difficulty or in which environments. For these two questions, it was necessary to consider markedness. . Markedness of the target segment The following discussion of the results is based on relative markedness of the target consonant in terms of (a) the voicing distinction, with a comparison of the production of voiced versus voiceless consonants; (b) relative sonority, with a comparison of the production of obstruents versus nasals; (c) relative markedness within the class of obstruents, with a comparison of affricates, fricatives and stops; and (d) relative markedness according to place of articulation, with a comparison of velars, alveolars and bilabials. Concerning voicing, previous studies, as mentioned above, have shown that speakers of languages without coda consonants often devoice target voiced conso-
The influence of voicing and sonority relationships
Table 1. Rates of paragoge after voiced/voiceless obstruents (percentages in parentheses)
Voiceless N Paragoge Voiced N Paragoge Total N Paragoge
Bilabial Stop
Alveolar Stop
162 17 (10.5)
156 16 (10.3)
162 17 (10.5) 324 34 (10.5)
Velar Stop
Lab-dent Fricative
Alveolar Fricative
Alveopal Affricate
162 21 (13.0)
162 30 (18.5)
162 10 (6.2)
162 18 (11.1)
966 112 (11.6)
162 27 (16.7)
162 34 (21.0)
162 30 (18.5)
162 27 (16.7)
162 48 (29.6)
972 183 (18.8)
318 43 (13.5)
324 55 (17.0)
324 60 (18.5)
324 37 (11.4)
324 66 (20.4)
1938 295 (15.2)
Total
nants in this position, even if there is no devoicing rule in the NL and if a voicing distinction exists in initial position. Thus, the markedness of voiceless consonants in this position seems to be extremely important, regardless of the NL. Although devoicing was not found to be a strategy of the subjects in this study, voicing did appear to play a role in determining the frequency of paragoge. Table 1 shows the rates of paragoge of each voiced/voiceless pair of obstruents included in the study. The alveopalatal fricative /w/ was not included in this comparison because its voiced counterpart /Š/ was excluded from the study for lack of frequency. The paragoge rates overall were lower than expected, considering that four of the six subjects were in the first year of their university course in English. Even with the low rates, however, a clear tendency for more frequent paragoge with voiced consonants is evident. Overall, the voiced obstruents were epenthesized more often than the voiceless obstruents, yielding a very significant chi-square with the Yates correction factor for degrees of freedom of only one (χ2 (1, N = 1938) = 18.53, p < .001). It was pointed out by Koerich (2002) that spelling could have biased the results with the affricates and the labio-dental fricatives, as the voiced pair of each, when word-final, is always spelled with a final “e”. To compensate for this possible influence, calculations were made without these two pairs of obstruents, yielding 10.0% paragoge for the voiceless obstruents and 16.5% for the voiced, still significant (χ2 (1, N = 1344) = 19.53, p < .001) with the Yates correction factor. This tendency was consistent for four of the six voiceless/voiced pairs; in fact, for two of the four, the rate for the voiced consonants is more than twice that of the voiceless. Not a single pair had a higher rate for voiceless, but curiously, the two labial pairs had exactly identical rates for voiced and voiceless. A possible explanation for this can be found in Yavas’s claim (1994) and findings (1997) mentioned above, that bilabials are the least frequently devoiced of the stops, because of
Barbara O. Baptista and Jair L. A. da Silva Filho
Table 2. Error rates for obstruents versus nasals (percentages in parentheses)
N Paragoge Assimilation/deletion Combined error
Nasals
Obstruents
324 12 (3.7) 12 (3.7) 24 (7.4)
1938 295 (15.2) – – 295 (15.2)
their greater supraglottal area. This means that they are not much more difficult to pronounce in final position than their voiceless counterparts. Thus, not only would labial obstruents be less frequently devoiced than other voiced obstruents by speakers who use the devoicing strategy, but there would be no more need for vowel paragoge for these than there would be for the voiced member of each pair. For the second aspect of relative markedness of the target consonant – markedness dependent on sonority – nasals were the only sonorants compared with obstruents because, as pointed out in section 3, the liquids are known to cause other problems instead of paragoge. Although syllable-final nasals cause two alternate pronunciation strategies to be employed by BP speakers – paragoge and assimilation/deletion – the results do support the hypothesis based on Eckman and Iverson. As illustrated in Table 2, the nasals /m/ and /n/ together in this study caused paragoge in 3.7% of the productions and assimilation/deletion in 3.7% of them, giving a combined error rate of 7.4%, less than half the paragoge rate of the obstruent pairs transferred from Table 1 (15.2%). Thus, although the difference is small, the less marked nasals did yield a lower combined error rate than the obstruents, with a significant chi-square with the Yates correction factor (χ2 (1, N = 2262) = 13.34, p < .001). This result corroborates the results with the Japanese, Korean and Cantonese learners of Eckman and Iverson’s study. Also as in Eckman and Iverson, however, the importance of NL transfer is apparent in the alternate pronunciation strategies employed by the learners. The analysis of markedness within the class of obstruents, the third aspect concerning the target consonant itself, also produced results generally consistent with Eckman and Iverson’s claim regarding the markedness in final position of affricates, followed by fricatives and then stops. Returning to Table 1, it can be seen that the alveopalatal affricates, on the whole, had the highest rate of paragoge (20.4%) of the six classes included in the table; the next highest rate was for the labiodental fricatives (18.5%), and the rates of all three classes of stops (10.5%, 13.5%, and 17%) were lower than the affricates or the labiodental fricatives. However, it should be noted that among the voiceless obstruents the labiodental fricatives (18.5%) and the velar stops (13.0%) had higher rates than the
The influence of voicing and sonority relationships
affricates (11.1%). Since it has already been pointed out that the voiced affricate is always spelled with the silent “e” in word-final position, the generalization may not be valid after all. Also among the voiced obstruents, the velar stops were more frequently epenthesized (21%) than the labiodental fricatives (18.5%), possibly because of place of articulation. Thus, it appears that no tendency regarding difficulty of obstruents by manner of articulation is strong enough to prevail among other interfering variables, and this hypothesis cannot be supported. The especially low overall rate of paragoge of the alveolar fricatives (11.4% overall), especially the voiceless ones (6.2%), has an obvious explanation: they are the only consonants tested in this study which are phonetically realized in final position in BP, although this realization can be as [s], [z], [w], or [Š], depending on dialect and environment. The voiceless alveopalatal fricative /w/ also had a low rate of paragoge (10.5%). What is interesting here is that the three sibilants were produced with any paragoge at all, since they are permitted in this position in BP. Again, spelling might have interfered to some degree, but not all cases of paragoge after sibilants involved the final silent “e”. Thus, even positive NL transfer was not sufficient to eliminate errors, as might have been predicted by contrastive analysis. Finally, the last kind of markedness, based on place of articulation of voiced syllable-final stops, also produced the expected results. Table 1 shows that while minimal differences in paragoge rates were obtained among the voiceless stops from one place of articulation to the other (10.5%, 10.3% and 13.0%), the rates among the voiced stops range from 21.0% for velars to 16.7% for alveolars and a mere 10.5% for bilabials, yielding a significant chi-square (χ2 (2, N = 486) = 6.7, p < .05). While there was no devoicing strategy among these subjects, the greater differences in paragoge rates among the voiced stops, increasing from the larger to the smaller supraglottal areas, are quite consistent with the differential rates of devoicing among the different places of articulation found by Yavas (1997). No NL factors appeared to be relevant for this comparison. In sum, of the four aspects considered in the beginning of this section regarding relative markedness of the target consonant, (a) markedness by voicing and (b) markedness among voiced stops by place of articulation gave the paragoge results predicted by markedness relationships alone; (c) sonority of the target segment gave error rate results in conformity with the predictions based on markedness relationships, but the choice of strategy to pronounce word-final nasals was influenced by NL transfer; and (d) class of obstruents by manner of articulation did not show any clear tendency, possibly because of intervening variables. These results suggest that the interaction between markedness relationships and NL transfer is somewhat more complicated than stated in Eckman’s MDH, where NL/TL differences plus markedness should determine the difficulty of a TL structure.
Barbara O. Baptista and Jair L. A. da Silva Filho
Table 3. Rates of paragoge by environment (percentages in parentheses)
N Paragoge
Consonant
Vowel
Pause
Total
2010 273 (13.6)
480 57 (11.9)
96 7 (7.3)
2586 337 (13.0)
. Influence of phonological environment The influence of the phonological environment is discussed here (a) in terms of the differential effects of a vowel, consonant or pause following the target syllable-final consonant; and (b) within consonantal environments, in terms of the differences in consonantal strength (or sonority) between the target coda consonant and the following onset consonant. The first comparison, illustrated in Table 3, revealed a very small difference among the rates of paragoge in the three environments, resulting in a nonsignificant chi-square (χ2 (2, N = 2586) = 3.88, p > .20). This difference was especially small between the rates of paragoge in vocalic (13.6%) and consonantal (11.9%) environments, and somewhat larger between these two and the rate of paragoge before a pause, which was only 7.3%. Thus, no strong claims can be made here concerning the influence of environment. However, where it might be expected that a continued flow of speech would be important for the avoidance of paragoge, the somewhat lower paragoge rate before a pause suggests that what these learners have not adequately learned is how to make the transition from the final consonant to the beginning of the following word. The second environmental factor, differences in sonority between the last consonant of the target word and the first consonant of the following word, was, as explained in section 2, based on Murray and Vennemann’s syllable contact law. For this analysis, first the segments in question were assigned a sonority value according to the hierarchy used by Dziubalska-Kolaczyk (1997). Then, according to the differences computed, each sentence was assigned a syllable contact number (SCN), the values ranging from –3 to +5. For example, for the target word escape followed by the context word latch we would have strength 6 for the /p/ minus strength 2 for the /l/, giving a SCN of 4. Those sentences with a negative SCN were expected to cause minimal difficulty because they contained the preferred structure, with the initial consonant of the context word less sonorant than the final consonant of the target word, while those with a positive SCN were expected to cause a greater rate of difficulty, with the difficulty rate increasing as a function of the value of the SCN. Contrary to the distribution of sentences within the consonant classes in Table 1, the number of sentences corresponding to each syllable contact number varied considerably. Table 4 shows the raw numbers and rates of paragoge by SCN. As
The influence of voicing and sonority relationships
Table 4. Paragoge rates by syllable contact number (percentages in parentheses) SCN N Par.
–3
–2
–1
0
1
2
3
4
5
108 7 (6.48)
216 22 (10.18)
258 22 (8.53)
480 45 (9.38)
252 37 (14.68)
372 60 (16.13)
156 30 (19.23)
96 10 (10.42)
72 18 (25.00)
shown in the table and visualized more clearly in Figure 1, the paragoge rates increase gradually from SCN –3 (6.48%) to SCN 5 (25%), with only slight deviations in the rates for SCN –2, and SCN 4. The deviations in the general tendency are probably due to several factors. The first might simply be the limited number of tokens to be divided into nine categories, statistical tendencies showing up better with larger numbers. A second is the impossibility of separating sonority from other markedness variables such as voicing and the place of articulation of target voiced stops. A third factor is certainly NL transfer, making the final /s/ easier, for example. Fourth is the influence of spelling, which affects some categories more than others. Fifth and more generally, a consensus as to the most suitable universal sonority hierarchy has yet to be found, although almost 30 years have passed since Hooper’s SSC and more than 20 since Murray and Vennemann’s SCL. Moreover, there is the question of whether learners would be more influenced by a universal hierarchy, if one in fact exists, or by the sonority hierarchy of their NL. English, for example, according to Giegerich (1992), does not distinguish the sonority level of stops and fricatives. Related to this lack of consensus is the fact that sonority hierarchies assume equal intervals
Figure 1. Paragoge rates by syllable contact number (SCN)
Barbara O. Baptista and Jair L. A. da Silva Filho
among the various levels, where it is possible that some intervals are larger than others, which in theory would affect the weighting of the SCNs. Considering all of these interfering factors, it is actually quite surprising to see a tendency as consistent as what appears in Table 4 and Figure 1. The general tendency for the difficulty of final consonant production to increase with the difference in sonority across syllables is apparent. It thus seems that Murray and Vennemann’s SCL may be valid not only for diachronic change in primary languages, but for ILs as well. This supports Eckman’s SCH, which claims that “the universal generalizations that hold for the primary languages hold also for interlanguages” (1991: 24). More generally, these results suggest that we need to look at environment not just across the board, but in the way that it interacts with the target segment itself. The difference between vowels and consonants, found by Carlisle (1991a, 1992, 1997, this volume) to be important for initial /s/ clusters, may not be the crucial difference for all targets. When the environment is another consonant, the difficulty in producing the target consonant may depend not so much on the class of the context consonant itself, but on the interaction between the class of the target final consonant and the class of the context syllable-initial consonant. This is certainly something meriting further investigation.
. Theoretical implications This study is small in scope and, for this reason, limited in the claims that can be made. Nevertheless, it has resulted in potentially important tendencies and has thus pointed to new directions for research. Regarding theory, the results give support, on the one hand, to a slightly new twist to Eckman’s SCH. Not only might it be said that universal generalizations that hold for primary languages hold for interlanguages, but also that universal generalizations regarding diachronic change of primary languages may hold for interlanguages, since interlanguages are, in principal, always in a state of change. On the other hand, the limitations to the evidence obtained for the SCH may be due to limitations in the precision of phonological theory. Thus, the study provides evidence for the need to perfect the concept of sonority. Certain natural classes may need to be reordered in the hierarchy, and the hierarchical classes may need refinement in terms of subcategories and in terms of the degree of the intervals between them. It is hoped that the results obtained in this study can make a valuable contribution to the discussion of the influence of markedness relations in SLA phonology, first in having provided further support for questions of markedness brought up by previous researchers, and more importantly, in having pointed to a new area of markedness to be taken into account – that of differences in sonority across syllables as an important environmental factor. There are several factors that limit
The influence of voicing and sonority relationships
the generalization of the results of this study. The most serious is the possibility of spelling interference. Second is the difficulty of separating the variables in order to make stronger claims about the influence of each one. Third is the question of the most appropriate sonority hierarchy, discussed at the end of section 4. Finally, there is the usual tradeoff between natural speech and guarantee of occurrence of the relevant segments and environments. In this study preference was given to the latter, but other investigations could try to prioritize the former. In spite of these limitations, some clear tendencies were noted, which deserve further attention in future research.
Acknowledgements We would like to thank the anonymous reviewer for valuable comments on the manuscript.
References Abrahamsson, N. (2003). Development and recoverability of L2 codas: A longitudinal study of Chinese-Swedish interphonology. Studies in Second Language Acquisition, 25, 313–349. Anderson, J. I. (1987). The markedness differential hypothesis and syllable structure difficulty. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 279–291). Cambridge, MA: Newbury House. Baptista, B. O. & Silva Filho, J. A. (1997). The influence of markedness and syllable contact on the production of English final consonants by EFL learners. In J. Leather & A. James (Eds.), New Sounds 1997: Proceedings of the Third International Symposium on the Acquisition of Second Language Speech (pp. 26–34). Klagenfurt: University of Klagenfurt. Broselow, E. (1983). Non-obvious transfer: On predicting epenthesis errors. In S. Gass & L. Selinker (Eds.), Language Transfer in Language Learning (pp. 269–280). Rowley, MA: Newbury House. Broselow, E. (1984). An investigation of transfer in second language acquisition. International Review of Applied Linguistics, 22, 253–269. Carlisle, R. S. (1988). The effect of markedness on epenthesis in Spanish/English interlanguage phonology. Issues and Developments in English and Applied Linguistics, 3, 15–23. Carlisle, R. S. (1991a). The influence of environment on vowel epenthesis in Spanish/English interlanguage phonology. Applied Linguistics, 12, 76–95. Carlisle, R. S. (1991b). The influence of syllable structure universals on the variability of interlanguage phonology. In A. D. Volpe (Ed.), The Seventeenth LACUS Forum 1990 (pp. 135–145). Lake Bluff, IL: Linguistic Association of Canada and the United States. Carlisle, R. S. (1992). Environment and markedness as interacting constraints on vowel epenthesis. In J. Leather & A. James (Eds.), New sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-Language Speech (pp. 64–75). Amsterdam: University of Amsterdam Press.
Barbara O. Baptista and Jair L. A. da Silva Filho
Carlisle, R. S. (1994). Markedness and environment as internal constraints on the variability of interlanguage phonology. In M. Yavas (Ed.), First and Second Language Phonology (pp. 223– 249). San Diego, CA: Singular. Carlisle, R. S. (1997). The modification of onsets in a markedness relationship: Testing the interlanguage structural conformity hypothesis. Language Learning, 47, 327–361. Carlisle, R. S. (2006). The sonority cycle and the acquisition of complex onsets. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English Interphonology. Amsterdam: John Benjamins. Cebrian, J. (2000). Transferability and productivity of L1 rules in Catalan-English interlanguage. Studies in Second Language Acquisition, 22, 1–26. Dziubalska-Kolaczyk, K. (1997). ‘Syllabification’ in first and second language. In J. Leather & A. James (Eds.), New Sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 69–78). Klagenfurt: University of Klagenfurt. Eckman, F. R. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330. Eckman, F. R. (1981). On predicting phonological difficulty in second language acquisition. Studies in Second Language Acquisition, 4, 18–30. Eckman, F. R. (1987). The reduction of word-final consonant clusters in interlanguage. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 143–162). Dordrecht: Foris. Eckman, F. R. (1991). The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–42. Eckman, F. R. & Iverson, G. K. (1994). Pronunciation difficulties in ESL: Coda consonants in English interlanguage. In M. Yavas (Ed.), First and Second Language Phonology (pp. 251– 265). San Diego, CA: Singular. Edge, B. A. (1991). The production of word-final voiced obstruents in English by L1 speakers of Japanese and Cantonese. Studies in Second Language Acquisition, 13, 377–394. Flege, J. E. & Davidian, R. D. (1984). Transfer and developmental processes in adult foreign language speech production. Applied Psycholinguistics, 5, 3223–3247. Flege, J. E., McCutcheon, M. J., & Smith, S. C. (1987). The development of skills in producing word final English stops. Journal of the Acoustical Society of America, 82, 433–447. Garcia, I. W. (1991). English as spoken by Brazilians: Discourse features impairing comprehension. In Anais do XI Encontro Nacional de Professores Universitários de Língua Inglesa – ENPULI (pp. 432–437). São Paulo: ENPULI. Giegerich, H. (1992). English Phonology: An introduction. Cambridge: CUP. Hancin-Bhatt, B. (2000). Optimality in second language phonology: Codas in Thai ESL. Second Language Research, 16, 201–232. Hansen, J. G. (2004). Developmental sequences in the acquisition of English L2 syllable codas: A preliminary study. Studies in Second Language Acquisition, 26, 85–124. Hooper, J. B. (1976). An Introduction to Natural Generative Phonology. New York, NY: Academic Press. Karimi, S. (1987). Farsi speakers and the initial consonant cluster in English. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 305–318). Cambridge, MA: Newbury House. Koerich, R. D. (2002). Perception and Production of Word-Final Vowel Epenthesis by Brazilian EFL Students. PhD dissertation. Universidade Federal de Santa Catarina, Brazil.
The influence of voicing and sonority relationships
Major, R. C. (1986). Paragoge and degree of foreign accent in Brazilian English. Second Language Research, 2, 53–71. Major, R. C. (1996). Markedness in second language acquisition of consonant clusters. In R. Bayley & D. R. Preston (Eds.), Second Language Acquisition and Linguistic Variation (pp. 75– 96). Amsterdam: John Benjamins. Major, R. C. & Faudree, M. C. (1996). Markedness universals and the acquisition of voicing contrasts by Korean speakers of English. Studies in Second Language Acquisition, 18, 69–90. Murray, R. W. & Vennemann, T. (1983). Sound change and syllable structure in Germanic phonology. Language, 59, 514–528. Rebello, J. T. (1997). The acquisition of initial /s/ clusters by Brazilian EFL learners. In J. Leather & A. James (Eds.), New Sounds 97: Proceedings of the Third International Symposium of the Acquisition of Second-Language Speech (pp. 336–432). Klagenfurt: University of Klagenfurt. Rebello, J. T. & Baptista, B. O. (2006). The influence of voicing on the production of initial /s/ clusters by Brazilian learners. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English interphonology. Amsterdam: John Benjamins. Ross, S. (1994). The ins and outs of paragoge and apocope in Japanese-English interphonology. Second Language Research, 10, 1–24. Sato, C. (1984). Phonological processes in second language acquisition: Another look at interlanguage syllable structure. Language Learning, 34, 43–57. Selkirk, E. (1984). On the major class features and syllable theory. In M. Aronoff & R. Oehrle (Eds.), Language Sound Structure (pp. 107–136). Cambridge, MA: The MIT Press. Tarone, E. (1980). Some influences on the syllable structure of interlanguage phonology. International Review of Applied Linguistics, 18, 139–152. Tropf, H. (1987). Sonority as a variability factor in second language phonology. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 173–192). Dordrecht: Foris. Weinberger, S. H. (1987). The influence of linguistic context on syllable simplification. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 401–417). Cambridge, MA: Newbury House. Weinberger. S. H. (1994). Functional and phonetic constraints on second language phonology. In M. Yavas (Ed.), First and Second Language Phonology (pp. 267–302). San Diego, CA: Singular. Yavas, M. (1994). Final stop devoicing in interlanguage. In M. Yavas (Ed.), First and Second Language Phonology (pp. 251–265). San Diego, CA: Singular. Yavas, M. (1997). The effects of vowel height and place of articulation in interlanguage final stop devoicing. International Review of Applied Linguistics, 35, 115–125.
Perception and production of vowel paragoge by Brazilian EFL students Rosana Denise Koerich Universidade Federal de Santa Catarina, Brazil
Based on Flege’s (1995) claim that inaccurate perceptual targets may be responsible for misproductions of L2 sounds, this study investigates the relationship between perception and production of vowel paragoge by Brazilian EFL learners. Production data was obtained through the reading of sentences containing monosyllabic target words of the form CVC followed by monosyllabic context words or silence. Perception data was obtained through an oddity discrimination test. The study involved twenty undergraduate students at the beginning level of English. Support for the perception-production relationship was provided by statistically significant correlation results. This study is innovative in that previous investigations focusing on the issue have dealt with distinctions between phonemes, whereas this deals with syllable structure and the addition of an extra phone.
.
Introduction
Investigations of L2 phonological performance typically involve systematic deviant productions – in general, the inappropriate use of phonological processes such as vowel lengthening, consonant devoicing or substitution, and resyllabification with vowel or consonant deletion or addition. Resyllabification with the addition of a vowel – whether word-initial (prothesis), word-medial (epenthesis) or word-final (paragoge) – is a frequent tendency when the two languages in contact differ in their syllabic structures, which is the case with English and Brazilian Portuguese (BP). The striking differences in the permissible structures of each system, in particular regarding codas, where there are crucial disparities, lead to the expectation that this may be an area of difficulty. Whereas General American English (GA) allows all consonants except [h] in codas, most varieties of BP allow only [r], [s] and [z] phonetically ([l] is vocalized and the nasal consonants are deleted with nasalization being transferred to the previous vowel), bringing the pattern closer to the optimal universal CV syllable.
Rosana Denise Koerich
The resistance of BP speakers to producing sequences of obstruents within words and single word-final obstruents is apparent in the way certain L1 words, acronyms and loanwords are dealt with. Vowel epenthesis is a productive phonological process in BP – ritmo (“rhythm”) is realized as [| ritimu] – and foreign words are often borrowed into BP with paragoge, like English club, which becomes clube, with a paragogic vowel in the pronunciation and the corresponding modification in the spelling. The same resyllabification strategy is applied to English word-final obstruents and sequences of obstruents within words or at word boundaries in contact, in order to make them more pronounceable. This strategy has been attested in a growing number of studies where frequency of resyllabification has been shown to be affected by level of L2 proficiency, transfer, and linguistic constraints such as markedness relations and phonological context (e.g., Baptista & Silva Filho, this volume; Fernandes 1997; Major 1986, 1987; Monahan 2001; Tarone 1987). In line with work by Flege and colleagues (e.g., Flege 1995; Flege, MacKay, & Meador 1999), the present study approaches this area of deviant L2 production by investigating the relationship between perception and production of wordfinal consonants by Brazilian learners of English. The investigation of the role of perception in L2 pronunciation stems from research indicating that, whereas L1 mispronunciations tend to be caused by motoric difficulties, L2 mispronunciations seem to be greatly influenced by the learner’s perception being linked to the phonetic parameters of the L1 – a phenomenon referred to by Strange (1995) as perceptual foreign accent. The complexity of the relationship between speech perception and production is acknowledged throughout the literature, and research attempting to unravel this complexity has provided inconclusive evidence pointing in three directions: (a) perception outperforms production (e.g., Flege 1984, 1988, 1995; Flege & Hillenbrand 1984); (b) production outperforms perception (e.g., Flege & Eefting 1987; Sheldon 1985; Sheldon & Strange 1982); (c) perception and production are interdependent (e.g., Best 1995; Flege 1999; Flege et al. 1999). The lack of consensus in the literature and consequent impossibility of making generalizations are probably attributable to the methodological diversity among different research programs (see Flege 1999). A considerable number of these programs have been strongly influenced by the speech learning model – SLM (Flege 1995) – a theoretical model of L2 speech learning and perception. Hypothesis 1 (H1) of the SLM, of particular interest to the present study because of its relationship to phonotactic interpretations, says that L2 and L1 sounds are perceptually related at a “position-sensitive allophonic level”, not at a phonemic level. Concerning the perception and production of word-final consonants, Flege says that this hypothesis
Perception and production of vowel paragoge
leads to the prediction that speakers of an L1 without word-final stops will not relate English word-final stops perceptually to word-medial or word-initial stops in their L1. If so, then we might expect them to eventually produce word-final stops in English accurately. This is because if H1 is correct, L1 phonetic structures should not interfere with the establishment of new phonetic categories. (Flege 1995: 261)
The author admits, however, that, contrary to H1, research has shown that L2 learners of various L1s without word-final consonants resort to remedial measures such as stop devoicing, vowel paragoge and consonant deletion to facilitate the pronunciation of word-final consonants. SLM predictions are basically concerned with the segmental dimension of speech (phoneme-sized units); however, as Flege himself comments, “nonsegmental (i.e., prosodic) dimensions are an important source of foreign accent” (Flege 1995: 233). At a nonsegmental level, a unit of speech perception extended to sequences of phones (syllables) may account for foreign accent in terms of paragoge production. Embedded in this discussion is the one pivotal question raised in psycholinguistic research that began to direct close attention to speech processing in the last decades – the definition of the unit of speech perception (e.g., Cutler, Mehler, Norris, & Segui 1986; Cutler & Otake 1994; Jenkins & Yeni-Komshian 1995; Kuhl 1993; Kuhl & Iverson 1995; Leather & James 1996; Pisoni & Luce 1986, 1987). In regard to this issue, the study of vowel paragoge in the contact of two languages that differ as markedly in the constituency of the coda as English and BP might make an important contribution to the discussion of the status of the syllable as a unit of perceptual analysis in speech.
. Method In the investigation of the relationship between perception and production of word-final consonants, the question under study was whether paragoge production would correlate with discrimination of CVC and CVCV sequences where the final vowel is an /i/, the hypothesis pointing in the direction of a positive correlation, that is, that participants who produced more paragoge would be those who failed more frequently to discriminate the sequences. The data for the study was originally collected from 34 Brazilian learners of English from the first and second semesters of undergraduate university English courses, selected among 76 students according to pre-established criteria using a profile questionnaire. The set of criteria aimed at selecting a group of adult learners pedagogically characterized as false beginners. The students selected shared the following characteristics: (a) previous experience with EFL in high school, where
Rosana Denise Koerich
instruction was centered on reading skills; (b) from 8 to 12 months of continuous instruction at private language schools and/or at the university they were attending; and (c) little (up to one month) or no experience in an English-speaking country. Of these 34 students, twenty (13 female, 7 male, aged between 17 and 23) were chosen at random in order to keep the task for the listeners within reasonable proportions. One native speaker of American English served as control in the production test, to guarantee that the native listeners were really distinguishing between paragoge and lack of paragoge, and three native speakers served as control in the perception test, to guarantee that it was easy for native speakers. The perception and production data were collected in a language laboratory in a single session. . Production test The production test consisted of the reading of a randomized set of 264 sentences containing a monosyllabic (C)CVC word ending in one of the target obstruents /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /w/, /tw/, /dŠ/, followed by a monosyllabic CVC(C)(C) or VC(C) word or by a pause (Appendix A). Thus, each target final consonant appeared before the consonantal contexts /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /s/, /w/, /tw/, /dŠ/, /m/, /n/, /θ/, /ð/, /r/, /l/, and /w/; in three vocalic contexts – one vowel from each of the pairs /7-æ/, /"-%/, and /f-o~/; and in sentence-final position. The consonants /θ/, /ð/, /s/, /z/, and /Š/ were not included among the targets because they cause articulatory difficulties (/θ/, /ð/), are confused with each other (/s/, /z/), or are infrequent in final position (/Š/). The glide /j/ was not included among the contexts because it is frequently mispronounced by Brazilian learners as /i/ or /I/, which are the epenthetic vowels often inserted in BP and thus would be difficult to separate in case of paragoge. The limitation in terms of vocalic contexts was due to the fact that this study did not set out to investigate the influence of vowel quality differences, but only to contrast the effect of vowels and consonants as phonological contexts in which paragoge production occurs. The sentences varied in length from five to seven words, and, owing to the students’ beginning level of English, an effort was made to limit them to the vocabulary characteristic of basic course books. The order of presentation was randomized so that each participant’s list had a different order. Although reading was self-paced, a patterned rhythm was guaranteed by the use of the card to cover the sentences, which the participant slid down as the reading proceeded. The use of the card was also intended to prevent visual preparation for reading the next sentence, and the inadvertent skipping of any sentence. Participants were instructed to record each sentence once. The speech production data (the sentences read by the 20 participants) was treated at the Speech and Hearing Laboratory of the University of Alabama at
Perception and production of vowel paragoge
Birmingham. The target and context words or single words, when the context was a pause, were edited from the sentences and digitized at 22.05 kHz with 16-bit amplitude resolution, using a Sony DAT tape recorder (model TCD D28) and an editing software (Cool Edit 2000), and then normalized to 50% for peak intensity. This material was then ordered for presentation to native listeners using the UAB Software.1 The stimuli were presented from a notebook computer, over headphones (Sony MDR 7506 Dynamic Stereo) in a sound booth in individual sessions. Three monolingual native speakers of American English, unfamiliar with linguistic studies, rated the speech samples (chunks, as for example TOP BED, from the sentence The top bed is Mary’s) for the presence or absence of the vowel [i] following the target consonant (/p/ in the chunk TOP BED), by pointing the cursor to one of two buttons labeled “yes” or “no” on the notebook screen using a mouse. A training session familiarized the three judges with the procedures. The judgments were assigned the values 1 for “yes” responses (there is an [i]), and 0 for “no” responses. The sequence was considered to contain a paragogic vowel if at least two judges gave a “yes” response. The resulting data was then arranged for statistical treatment in terms of the rates of paragoge produced by each subject. . Perception test The perception test was an oddity format task in which subjects had to discriminate CVC from CVCV sequences where the final vowel was /i/. The test followed the design and procedures of the tests used in Flege, Munro and Fox (1994) and Flege et al. (1999), called in the latter the categorial discrimination test – CDT. It consisted of 72 trials of 3 two-word phrases formed by a proper name or nickname (diminutive) and a verb in the present tense (Appendix B). The names were either a (C)CVC or a (C)CVCV word where the last C was one of the 15 consonants /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /s/, /z/, /w/, /tw/, /dŠ/, /m/, /n/. The verbs – (C)VC(C)(C) words – provided 24 different contexts: the consonants /p/, /b/, /t/, /d/, /k/, /g/, /f/, /v/, /s/, /w/, /tw/, /dŠ/, /m/, /n/, /θ/, /ð/, /r/, /l/, /h/, /w/, and the vowels /7/, /æ/, /"/, and /o~/. Three types of trials were designed: (a) different trials – 36 trials containing an odd item in one of the positions (e.g., Bobbie needs/Bob needs/Bob needs; Dottie cheats/Dot cheats/Dottie cheats); (b) catch trials – 12 trials where there was no odd item, that is, where all three stimuli consisted of the same phrase (e.g., Bob needs/Bob needs/Bob needs; Dottie cheats/Dottie cheats/Dottie cheats); and (c) dis. The UAB Software (Version 1997) was developed by S. C. Smith for the Dept. of Rehabilitation Sciences of the University of Alabama at Birmingham, Birmingham, AL.
.
Rosana Denise Koerich
tractor trials – 24 trials where the odd item constituted a contrast in a non-target vowel or consonant of the verb (e.g., Cat bands/ Cat bends/ Cat bends), rather than the target contrast in the noun. In line with the test developed by Flege and colleagues (e.g., Flege et al. 1999; Flege et al. 1994), catch trials were included to encourage subjects to disregard irrelevant variations and respond to the phonetically relevant differences among the stimuli. Although not present in the original design of the CDT, which dealt with individual vowels, the inclusion of distractor trials was necessary here to divert participants’ attention from the objective of the test, and thus avoid biased results. In all trials each phrase in the sequence of three was spoken by a different talker, all native speakers of American English. A number of measures were adopted in the quantification of the three types of trials, such as the structure and position of the odd item, in order to counterbalance methodological variables that could interfere with the results. The audio stimuli were recorded in a sound-treated room, using ProTools 24 hardware and Sound Forge software with 16-bit resolution at a 44.1-kHz sampling rate. The audio-signal followed a path from a condenser microphone (Audio-Technica AT4033a), through an amplifier (dbx 286) to a ProTools 24 table (MACKIE 1604VLZ). The stimuli were low-pass filtered at 4.8 kHz and normalized for peak intensity. The material was digitally edited, the inter-trial interval set at 2.8 s, and the inter-stimulus interval at 1.3 s, following Flege et al. (1994). In the different and distractor trials, an odd item appeared in one of the three positions in the trial, and thus the correct answer to be marked on the answer sheet would be 1, 2, or 3. In the catch trials, there was no odd item, so the correct answer would be 0. Participants were instructed to respond to all trials, guessing if unsure. A training sequence of eight trials was presented before the test itself, consisting of four different trials, two catch trials, and two distractor trials. The variable investigated in the perception test was the A’ (A prime) score for each participant listener, considered by Flege et al. (1999) to be “an unbiased measure of phonetic sensitivity” because the percentage of correct answers in the different trials (hits – H) is counterbalanced by the percentage of errors in the catch trials (false alarms – FA), that is, the percentage of times subjects indicated an odd item out when there was none. The distractors were disregarded in the analysis.
. Results and discussion The reliability of the tests was guaranteed by (a) the almost perfect (99.7%) judgment of the native speaker’s production (control participant) as being without a paragogic vowel; (b) the low rate of disagreement in the judgments of paragoge production by the Brazilian participants (4.2%); and (c) the three control native
Perception and production of vowel paragoge
speakers’ performance in the perception test (M = 99%, range = 97.2–100%), in the different trials, and (M = 100%) in the catch trials. Before addressing the prediction that there would be a relationship between speech perception and production, that is, that subjects who produced more paragogic vowels would fail more frequently to discriminate CVC and CVCV sequences, it is important to compare the overall rates of paragoge in this study – 44.45% – with those of Baptista and Silva Filho (this volume) – 15.2% for the words followed by an obstruent (the most difficult context) – a considerable difference, probably related to the students’ level of L2 proficiency. Two factors might have contributed to this difference. First, whereas Baptista and Silva Filho included participants in the eighth semester, besides participants in the first and second semester, the present study involved participants only in the first and second semester. Second, the participants in Baptista and Silva Filho were from a major university English course with 90 hours of English instruction per semester and an emphasis on oral expression, whereas the participants of the present study were from the same university and from two smaller colleges, where each semester comprised 60 hours of instruction, two semesters were needed to complete the same course book completed at the major university in one semester, and the written language occupied a good part of the course. The effect of instruction on paragoge production has been noted by Fernandes (1997) and Major (1986, 1987). Table 1 displays participants’ rates of paragoge in the production test and the perception results of the CDT: the rates of correct discrimination of odd items in different trials (HIT); the rates of false alarms (FA) in catch trials, that is, indication of odd items when there were none; and the resulting A’ scores. The table shows that, in general, A’ scores were extremely low. A’ scores range from 0 to 1.0. A value of 0.5 occurs when the HIT and FA rates are equal, characterizing chance performance. A value lower than 0.5 represents lack of sensitivity, and a value of 1.0 represents perfect sensitivity (Snodgrass, Levy-Berger, & Haydon 1985; Flege et al. 1999; Werker & Tees 1984). Fourteen of the twenty participants had A’ scores lower than .50, characterizing lack of sensitivity. Two participants scored exactly .50, and only four obtained A’ scores higher than that. Of these four, only two participants had A’ scores approximating the perfect sensitivity value (1.0): participants 1 and 7, who scored .82 and .81, respectively. The low A’ scores mean not only that participants had difficulty in identifying the odd items (in the different trials), but also that they believed they heard an odd item when the three tokens of the stimulus were the same (in the catch trials), characterizing lack of perceptual sensitivity. Figure 1 suggests the expected negative correlation between A’ scores and rates of paragoge; however, the statistical tests showed that the correlation was only weak to moderate. A –.15 correlation coefficient was found with the Pearson test, and a –.33 coefficient with the Spearman.
Rosana Denise Koerich
Table 1. Rates of paragoge in the sentence reading test and A’ scores from the CDT test Subject
% Parag
% HIT
% FA
A’
1 2 5 7 11 12 16 18 19 24 25 26 27 30 38 42 44 45 46 47
.40 .50 .36 .60 .54 .44 .56 .47 .62 .53 .43 .35 .39 .32 .42 .44 .36 .38 .40 .38
.86 .39 .17 .69 .33 .31 .14 .33 .17 .28 .22 .17 .47 .17 .31 .08 .19 .11 .25 .44
.42 .67 .17 .25 .58 .17 .42 .58 .50 .67 .33 .17 .58 .25 .42 .50 .42 .33 .42 .25
.82 .28 .50 .81 .30 .66 .25 .30 .23 .22 .38 .50 .40 .30 .39 .18 .30 .27 .34 .67
A¢
Rates of paragoge
Figure 1. Relationship between A’ scores in the CDT and paragoge rates in the sentencereading production test
A¢
Perception and production of vowel paragoge
Rates of paragoge
Figure 2. Relationship between A’ scores in the CDT and paragoge rates in the production test (outliers excluded)
Since the Spearman correlation yielded a larger coefficient, and thus a greater statistical significance (p = .26 Pearson, compared to p = .07 Spearman, approximating alpha level of .05), it was reasoned that there might be outliers in the data, interfering with the Pearson correlation. The outliers were identified based on the standard deviation values in the A’ scores. The two participants spotted as outliers were the ones whose A’ scores fell 2 or more SD from the mean, who were also two of the four participants scoring above the “lack of sensitivity” level (S1 and S7). The identification of outliers allowed for a further analysis of the data to examine whether the degree of association between the two abilities would be affected by computing more uniform data. The results of the second analysis, which can be seen in Figure 2, yielded the following coefficients: r (18) = –.49, p = .01, and rho (18) = –.54, p = .01; that is, the degrees of association increased, and reached statistical significance (p < .05) with both tests. Taken together, the results of the correlation analyses show that the association is clearly negative, taking the data with and without outliers. In other words, these results show that, as participants’ scores in the perceptual test decrease, their rates of paragoge tend to increase. As pointed out in the introduction, research
Rosana Denise Koerich
has variously supported three hypotheses concerning the direction of the relationship between L2 speech perception and production – that perception outperforms production, that production outperforms perception, and that the two abilities develop in parallel. This study indicates that, with this particular group of adult beginning L2 learners, there was a certain degree of alignment of the two abilities, in the sense that students who more frequently produced the paragogic vowel [i] were the ones who more frequently failed to discriminate CVC from CVCV. According to Flege (1995, 1999), accurate segmental perception is one condition for production. Since English final obstruents do not pose specific articulatory difficulties for BP speakers, the difficulty which the participants in this study had in producing them and their tendency to add the paragogic vowel [i], combined with their lack of perception of the difference between CVC and CVCV sequences where the final vowel was /i/, seems to be strictly related to a lack of sensitivity to the contrast.
. Conclusion Differential performance according to L2 proficiency was reported by Flege and Schmidt (1995), who found that, whereas proficient subjects’ perception and production correlated significantly, that of less proficient participants did not, proficiency being measured by overall degree of perceived foreign accent in English sentences. Other studies (e.g., Bohn & Flege 1990; Flege 1993; Flege et al. 1999; Piske, Flege, MacKay, & Meador 2000) show differences in performance between experienced and non-experienced or early and late L2 speakers; however, in general, language proficiency itself is not assessed in these studies. Unlike most previous studies on the relationship between perception and production, which focused on segments and involved L2-users in a naturalistic setting, the present study investigated larger units, and involved FL learners in an instructional setting, thus with considerably less exposure to the spoken language. The production findings showed that this group of adult L2 false beginners, speakers of a language in which there are severe restrictions on word-final obstruents, consistently used vowel paragoge (44.45%) as a remedial measure to deal with these sounds when reading a series of sentences in the L2. Since vowel insertion is the strategy most widely used in these speakers’ L1 to deal with “offending” syllablefinal obstruents, especially in loanwords and acronyms, the use of paragoge to cope with the same type of L2 sequences seems to characterize transfer-related foreign accent. The perception findings showed that the learners consistently failed to discriminate between L2 sequences ending in an obstruent and sequences with the addition of /i/ following the obstruent, the vowel employed in paragoge produc-
Perception and production of vowel paragoge
tion. Only four of the twenty participants reached levels of perception above the threshold score characterized as lack of sensitivity to the contrast. Might this be a manifestation of “perceptual foreign accent”, as characterized by Strange (1995), in the processing of the final consonant? Could these learners be interpreting the L2 input through the “grid” or “sieve” of the L1 phonetic system? Taking L1 phonological processes into consideration, it seems likely that this is the case. L1 speech perception is strongly characterized by the ability to deal with irrelevant contextual and allophonic variants, so that variation does not interfere with the processing. In unstressed word-final syllables in BP, there is great individual, dialectal, and stylistic variation in the production of the consonant plus high-mid front vowel sequence (Câmara Jr. 1970; Cristófaro Silva 1998; Battisti & Vieira 1999). In particular, dialectal features and speech rate account for productions varying from [e] to [I], [i], and to vowel devoicing, that is, its obscuration, in fast speech. The saliency of the vowel in this context is completely irrelevant, so that obscuration does not prevent the sequence from being recognized as a realization of CV, the pattern of the participants’ L1. Various studies (e.g., Cutler et al. 1986; Cutler, Mehler, Norris, & Segui 1992; Cutler & Otake 1994; Jenkins & Yeni-Komshian 1995) have indicated that L2 speakers with different L1s tend to rely on different representational levels, according to the characteristics of their L1s. In contrast to the view of a universal unit of speech perception, it seems to be the case that both the nature and the operation of different levels of representation vary according to the specific characteristics of the languages in contact and according to the target sound(s) in question. As observed by Pisoni and Luce (1987) for L1 and by Baptista (2004) for L2, speakers may have a repertoire of mental representations comprising units at different linguistic levels – at the level of segments or sequences of segments (i.e., the syllable, the onset, or the coda, or rhyme). The findings of this study, which suggest that perceptual encoding of the L2 final consonant by BP speakers conforms to the CV syllable pattern for BP, lend support to this point of view.
Acknowledgments This research was partially funded by a grant from the Divisão de Bolsas e Auxílios no Exterior of CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), of the Brazilian Ministry of Education. I would like to thank James E. Flege for his input, and both James E. Flege and the Division of Speech and Hearing Sciences of the Department of Rehabilitation Sciences of the University of Alabama in Birmingham, Alabama, for the opportunity to carry out the data analysis in the laboratory of this institution. I would also like to thank the editors of this volume and an anonymous reviewer for valuable comments on the manuscript.
Rosana Denise Koerich
References Baptista, B. O. (2004). Categorias fonéticas na aprendizagem de língua estrangeira. Revista de Estudos da Linguagem, 12, 475–489. Baptista, B. O. & Silva Filho, J. L. A. (2006). The influence of voicing and sonority relationships on the production of English final consonants. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English interphonology. Amsterdam: John Benjamins. Battisti, E. & Vieira, M. J. B. (1999). O sistema vocálico do português. In L. Bisol (Ed.), Introdução a Estudos de Fonologia do Português Brasileiro (2nd ed.) (pp. 159–194). Porto Alegre, Brazil: EDIPUCRS. Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 171–206). Timonium, MD: York Press. Bohn, O.-S. & Flege, J. E. (1990). Interlingual identification and the role of foreign language experience in L2 vowel perception. Applied Psycholinguistics, 11, 303–328. Câmara Jr., J. M. (1970). Estrutura da Língua Portuguesa. Petrópolis: Vozes. Cristófaro Silva, T. (1998). Fonética e Fonologia do Português. São Paulo: Contexto. Cutler, A., Mehler, J., Norris, D., & Segui, J. (1986). The syllable’s differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385–400. Cutler, A., Mehler, J., Norris, D., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381–410. Cutler, A. & Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824–844. Fernandes, P. R. C. (1997). A epêntese vocálica na interfonologia do português/inglês. MA Thesis, Universidade Católica de Pelotas, Brazil. Flege, J. E. (1984). The detection of French accent by American listeners. Journal of the Acoustical Society of America, 76, 692–707. Flege, J. E. (1988). Factors affecting degree of perceived foreign accent in English sentences. Journal of the Acoustical Society of America, 84, 70–79. Flege, J. E. (1993). Production and perception of a novel, second-language phonetic contrast. Journal of the Acoustical Society of America, 93, 1589–1608. Flege, J. E. (1995). Second language speech learning: Theory, findings, and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 233–272). Timonium, MD: York Press. Flege, J. E. (1999). The relation between L2 production and perception. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. C. Bailey (Eds.), Proceedings of the XIVth International Congress of Phonetic Sciences, Vol. 2 (pp. 1273–1276). Berkeley, CA: University of California. Flege, J. E. & Eefting, W. (1987). The production and perception of English stops by Spanish speakers of English. Journal of Phonetics, 15, 67–83. Flege, J. E. & Hillenbrand, J. (1984). Limits on pronunciation accuracy in adult foreign language speech production. Journal of the Acoustical Society of America, 76, 708–721. Flege, J. E., MacKay, I. R. A. & Meador, D. (1999). Native Italian speakers’ perception and production of English vowels. Journal of the Acoustical Society of America, 106, 2973–2987. Flege, J. E., Munro, M. J., & Fox, R. A. (1994). Auditory and categorical effects on cross-language vowel perception. Journal of the Acoustical Society of America, 95, 3623–3641.
Perception and production of vowel paragoge
Flege, J. E. & Schmidt, A. M. (1995). Native speakers of Spanish show rate-dependent processing of English stop consonants. Phonetica, 52, 90–111. Jenkins, J. J. & Yeni-Komshian, G. (1995). Cross-language speech perception: Perspective and promise. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in crosslanguage research (pp. 463–480). Timonium, MD: York Press. Kuhl, P. K. (1993). Early linguistic experience and phonetic perception: Implications for theories of developmental speech perception. Journal of Phonetics, 21, 125–139. Kuhl, P. K. & Iverson, P. (1995). Linguistic experience and the “perceptual magnet effect”. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 121–154). Timonium, MD: York Press. Leather, J. & James, A. (1996). Second language speech. In W. C. Ritchie & T. K. Bhatia (Eds.), Handbook of Second Language Acquisition (pp. 269–316). San Diego, CA: Academic Press. Major, R. C. (1986). Paragoge and degree of foreign accent in Brazilian English. Second Language Research, 2, 53–71. Major, R. C. (1987). A model for interlanguage phonology. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 101–124). New York, NY: Newbury House. Monahan, P. J. (2001). Evidence of transference and emergence in the interlanguage. Rutgers Center for Cognitive Science (DOC 444-0701). Retrieved December 3, 2001, from http://roa.rutgers.edu. Pisoni, D. B. & Luce, P. A. (1986). Speech perception: Research, theory, and the principal issues. In E. C. Schwab & H. C. Nusbaum (Eds.), Pattern Recognition by Humans and Machines: Speech perception, Vol. 1 (pp. 1–50). Orlando, FL: Academic Press. Pisoni, D. B. & Luce, P. A. (1987). Acoustic-phonetic representations in word recognition. In U. H. Frauenfelder & L. K. Tyler (Eds.), Spoken Word Recognition (pp. 21–52). Cambridge, MA: The MIT Press. Piske, T., Flege, J. E., MacKay, I. R. A., & Meador, D. (2000). The production of English vowels by fluent early and late Italian-English bilinguals. Phonetica, 59, 49–71. Sheldon, A. (1985). The relationship between production and perception of the /r/-/l/ contrast in Korean adults learning English: A reply to Borden, Gerber, and Milsark. Language Learning, 35, 107–113. Sheldon, A. & Strange, W. (1982). The acquisition of /r/ and /l/ by Japanese learners of English: Evidence that speech production can precede speech perception. Applied Psycholinguistics, 3, 243–261. Snodgrass, J., Levy-Berger, G., & Haydon, M. (1985). Human Experimental Psychology. Oxford: OUP. Strange, W. (1995). Cross-language studies of speech perception: A historical review. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in cross-language research (pp. 3–45). Timonium, MD: York Press. Tarone, E. (1987). Some influences on the syllable structure of interlanguage phonology. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage phonology: The acquisition of a second language sound system (pp. 232–247). Cambridge, MA: Newbury House. Werker, J. F. & Tees, R. C. (1984). Cross-language speech perception: Evidence for perceptual reorganization during the first year of life. Infant Behavior and Development, 7, 49–63.
Rosana Denise Koerich
Appendix A Examples of sentences used in the production test.
TARGET
CONTEXT
SENTENCE
/p/
/b/ /%/ /∅/ /w/ /r/ /∅/ /tw/ /o~/ /l/
The top bed is Mary’s. She has to keep up with me. The baby fell from my lap. The big shoe is blue. The hog ran to the water. This is the Italian flag. The deaf child was in the class. Cliff owes me an excuse. He wears gold cuff links.
/g/
/f/
Appendix B Examples of phrases used in the perception test.
TARGET
CONTEXT
/f/ /p/ /tw/ /p/ /k/ /t/
/k/ /7/ /f/ /n/ /k/ /r/
PHRASES JEFF CUTS PUP EDITS FITCH FEELS CAP NEEDS JACK CASTS MATT ROBS
JEFFY CUTS PUPPIE EDITS FITCHIE FEELS CAPPIE NEEDS JACKIE CASTS MATTY ROBS
The sonority cycle and the acquisition of complex onsets Robert S. Carlisle California State University, Bakersfield, USA
The Sonority Cycle (Clements 1990), consisting of the Core Syllabification Principle (CSP) and Feature Dispersion Principle (FDP), is a model of syllable structure that systematically reveals the markedness relationships among syllable margins. This paper presents the results of two studies testing the Sonority Cycle. The first study examines the production of biliteral and triliteral onsets, the latter being more marked than the former. The second study examines the production of three biliteral onsets differing in their sonority profile. Of the three, /.st-/ is the most marked because it violates the CSP. In turn /.sn-/ is more marked than /.sl-/ because of its higher dispersion value according to the FDP. Both studies found that less marked onsets are correctly produced more frequently than are more marked onsets.
.
Introduction
Over the last three decades, initially due to Eckman’s (1977) Markedness Differential Hypothesis, researchers in interlanguage phonology have actively investigated the relationship between markedness and the acquisition of a second language phonology (Abrahamsson 1999; Anderson 1987; Carlisle 1988, 1997; Eckman 1987, 1991; Eckman & Iverson 1993; Major 1996; Major & Faudree 1996; Rebello & Baptista this volume; Weinberger 1987). Most of these studies have appealed to implicational universals to establish markedness relationships. Implicational universals involve two structures in a conditional relationship: if X then Y. For example, whereas some languages have onsets of the form obstruent + liquid and others have both obstruent + liquid and nasal + liquid onsets, no language has only nasal + liquid onsets. Thus, the presence of nasal + liquid onsets implies the presence of obstruent + liquid onsets, meaning that the latter onsets are less marked in relationship to the former. With the possible exception of two studies on the interlanguage phonology of native Brazilian-Portuguese speakers learning English, studies investigating the relationship between implicational universals and
Robert S. Carlisle
the acquisition of an L2 phonology have consistently found that less marked structures are modified less frequently than are more marked structures and that more marked structures do not reach a criterion level of acquisition before less marked structures do. Most statements of implicational relationships have come from large-scale comparative studies such as the one by Greenberg (1978). Unfortunately, languages have other preferences among syllable margins that are not revealed in these comparative studies, and researchers have appealed to other notions such as the sonority hierarchy for statements to express these other preferences (Carlisle 1991b; Tropf 1987). In other words, the field lacked a unified approach for ranking all onsets and codas in a principled and unified manner. This infelicitous situation essentially ended with the introduction of the Sonority Cycle (Clements 1990). The Sonority Cycle is a model of syllable structure that systematically ranks syllable margins by their length and sonority profile by assigning them a numerical value, the higher the value the more marked the margin. Perhaps not surprisingly, the rankings produced by the Sonority Cycle are very similar to the markedness relationships expressed in implicational statements. The purpose of this paper is to present the findings of two studies examining the production of onsets in a markedness relationship based on the Sonority Cycle. The first study is a synthesis of longitudinal research extending over a period of over four years that examines the acquisition of biliteral and triliteral onsets. The results from the three different times of data collection have been reported in three previous articles (Carlisle 1997, 1998, 2002). The second study examines the production of three biliteral onsets in a markedness relationship based on their sonority profile rather than length. The results of this study have not been published previously.
. Background on the syllable . The length of margins and markedness Theoretical studies of the syllable find that the markedness of margins (both onsets and codas) increases with length (Battistella 1990; Blevins 1995; Cairns & Feinstein 1982; Clements & Keyser 1983; Greenberg 1978; Kaye & Lowenstamm 1981; Kiparsky 1979; Vennemann 1988). This fact is captured by the observation that the presence of onsets or codas of length n implies the presence of at least one subsequence n – 1 in the corresponding positions (Greenberg 1978; Kaye & Lowenstamm 1981), with the exception that the presence of CV does not necessarily imply the presence of V (a syllable with a zero onset). Vennemann (1988: 13) explicitly captures this regularity in Part A of his Head Law (where head is synony-
The sonority cycle
mous with onset): “A syllable head is the more preferred: (a) the closer the number of speech sounds is to one.” In turn, Part A of the Coda Law states the following: “A syllable coda is the more preferred: (a) the smaller the number of speech sounds in the coda.” Thus, a single C is the optimal onset and a zero C is the optimal coda, meaning of course that the CV syllable is the most preferred syllable type in languages (i.e., the unmarked syllable type that appears in all languages), and that increases in the length of either margin render corresponding relative increases in markedness. Research in historical linguistics has demonstrated that margins are generally shortened rather than lengthened, and that “if a change worsens syllable structure, it is not a syllable structure change, ... but a change on some other parameter which merely happens also to affect syllable structure” (Vennemann 1988: 2). Looking specifically at the onset, Vennemann notes a number of historical cases in which complex onsets were reduced to simple onsets as in the examples from Pali below (Vennemann 1988: 15): (1)
krayavikraya ambra srotas svapna syandana
→ → → → →
kayavikkaya amba sota soppa sandana
‘commerce’ ‘mango’ ‘stream’ ‘sleep’ ‘wagon’
These examples from Pali demonstrate that the biliteral onsets /.kr-/ and /.br-/ were reduced to the simple onsets /.k-/ and /.b-/ respectively and that /.sr-/, /.sv-/, and /.sy-/ were each reduced to the simple onset /.s-/.1 . Sonority sequencing Linguists have long been aware that even though margins of syllables may be equal in the number of segments composing them, some orders of segments occur much more frequently than others and have therefore been regarded as preferred, simpler, or less marked (Battistella 1990; Blevins 1995; Cairns & Feinstein 1982; Clements 1990; Clements & Keyser 1983; Green 2003; Greenberg 1978; Kaye & Lowenstamm 1981; Hulst & Ritter 1999; Morelli 2003; Selkirk 1984; Vennemann 1988). As discussed by Clements (1990), since the late nineteenth century, linguists have attempted to describe these preferred segmental orders in terms of sonority. In general, preferred syllables display both a continuous rise in sonority from the most peripheral member of the onset through the nucleus and a continuous fall in . How the onsets were reduced is also of importance. In most cases a sonorant was eliminated, leaving a simple onset consisting of an obstruent, the most universally preferred of all simple onsets, as will be demonstrated in the discussion of the Sonority Cycle.
Robert S. Carlisle
sonority from the nucleus through the most peripheral member of the coda. This Sonority Sequencing Principle (SSP) is not without exception, especially in codas where inflectional morphology may override sonority preferences, but it does capture a very strong cross-linguistic generalization, given that most languages have only margins that abide by the SSP. Margins violating the SSP exist but are relatively rare, and languages having them also have a much greater number of margins that do not violate the principle. Though linguists may disagree on the exact number of segmental categories that may be identified according to the criterion of sonority, they agree on the following minimal sequence, which indicates the rise in sonority from the obstruents through the vowels: (2) obstruents < nasals < liquids < glides < vowels
Syllable margins may violate the SSP in two ways. First, members of an onset or coda may have the same sonority rank as in the English words /spik/ and /æpt/, which contain sonority plateaus, both segments being obstruents. Onsets and codas can more seriously violate the SSP, and consequently be more marked, if a segment closer to the nucleus is less sonorant than a more peripheral segment as in the French word /tabl/. Such a violation is a sonority reversal, and they occur less frequently in languages than do sonority plateaus. Whether a violation of sonority is classified as a plateau or a reversal depends on how many segmental classes are recognized in the sonority scale. For linguists who accept that plosives are less sonorant than fricatives (i.e., dividing obstruents into two groups – Morelli 2003; Selkirk 1984, among many others), a word such as /spik/ contains a sonority reversal in the onset rather than a plateau. However, Clements (1990) believes that large descriptive studies such as the one by Greenberg (1978) reveal that consistent cross-linguistic generalizations are obtained by regarding the obstruents as a single class of segments, rendering further subdivisions superfluous.2 Syllable preference depends on more than whether onsets and codas abide by the SSP or not. Onsets and codas can abide by the SSP, yet some are still preferred over others. For example, though biliteral onsets consisting of an obstruent + liquid and those consisting of an obstruent + nasal both adhere to the SSP, the former is universally preferred to the latter as demonstrated by their implicational . Actually, and as Clements recognizes, several of Greenberg’s implicational statements depend upon the class of obstruents being divided into the sub-classes of stops (S) and fricatives (F). For example, one of Greenberg’s implicational statements is that if a language has an onset of the form SS, then it will have sequences of FS and/or SF. However, Clements notes that languages prefer to have obstruent clusters that differ on the feature of [continuant], a claim that has nothing to do with the sonority of the segments involved.
The sonority cycle
relationship (Greenberg 1978). The Sonority Cycle (Clements 1990) accounts for these preferences by systematically ranking onsets and codas by sonority. . Sonority cycle Clements’s sonority cycle is a model that accounts for the sonority-based constraints found in the world’s languages. It consists of two principles – Core Syllabification and Feature Dispersion – which together rank onsets and codas based upon preferred sonority relationships among their segmental members.3 .. Core syllabification principle Clements (1990: 299) has formally expressed the Core Syllabification Principle as follows: (3) The Core Syllabification Principle (CSP) a. Associate each [+syllabic] segment to a syllable node. b. Given P (an unsyllabified segment) preceding Q (a syllabified segment), adjoin P to the syllable containing Q iff P has a lower sonority rank than Q (iterative). c. Given Q (a syllabified segment) followed by R (an unsyllabified segment), adjoin R to the syllable containing Q iff R has a lower sonority rank than Q (iterative).
As stipulated in (3a) the CSP begins with the nucleus of the syllable; whatever segment has the feature [+syllabic] becomes associated with a syllable node. The onset position is next and segments before the nucleus are associated with the syllable node as long as each successive segment from the nucleus has a lower sonority value. The segments constituting the coda are the last to be associated with a syllable node. Each successive segment after the nucleus is associated with a syllable node as long as each has a lower sonority value than the previous segment. In accordance with the CSP the syllabification of crimp would be achieved in the following steps: (4) a.
C C V C C k r
I
m p
. Because this paper concentrates on onsets, little more will be presented about codas.
Robert S. Carlisle s
b.
C C V C C k r
I
m p
s
c.
C C V C C k r
I
m p
s
d.
C C V C C k r
I
m p
s
e.
C C V C C k r
I
m p
s
f.
C C V C C k r
I
m p
Essentially a formalized statement of the SSP, the CSP creates two broad categories of margins – those that adhere to the SSP and those that do not. Those that do not adhere to the principle consist of sonority plateaus or reversals as discussed previously. Margins violating the CSP are marked in relationship to those that do not, a point reinforced by the observation that all languages having sonority plateaus or reversals also have margins that adhere to the CSP. However, the converse is not true; languages may have only margins that adhere to the CSP, such as Spanish. In contrast, English has a number of onsets that violate the CSP including the three biliteral onsets that begin with the obstruent /s/ and are followed by a voiceless stop: /.st-/, /.sp-/ and /.sk-/. In addition, all triliteral onsets in English violate the CSP because they also begin with /s/ followed by a voiceless stop as in /.str-/, /.spl-/, and /.skr-/.
The sonority cycle
Clements’s claims are independently corroborated by the comparative analysis of Greenberg (1978), who described preferences for onsets in implicational relationships, where L = liquid, O = obstruent, G = glide, and N = nasal. (5) a. LO → OL b. GO → OG c. LN → NL
These three statements all contain sonority reversals to the left of the arrows and onsets that abide by the CSP to the right. Any language having any one of the onsets on the left has the corresponding onset to the right. In other words, the former implies the latter, meaning that the latter is less marked. .. Feature dispersion The ranking of syllables based upon preferred sonority profiles is much more complex than can be captured by the CSP. As mentioned previously, different margins may adhere to the CSP yet still be in markedness relationships to one another. For example, both obstruent + nasal and obstruent + liquid onsets abide by the CSP, yet universally the latter is preferred over the former (Greenberg 1978). Feature Dispersion – the second principle of the Sonority Cycle – accounts for these preferences by applying only to those strings of segments that adhere to the CSP and ranking them according to preferred sonority profiles. As discussed by Clements (1990), the Feature Dispersion Principle (FDP) applies to demisyllables. Each syllable has two demisyllables, each of which shares the nucleus. The initial demisyllable consists of the onset and the nucleus and the final demisyllable consists of the nucleus and the coda. For example, the two demisyllables of the word blimp – /blImp/ – are [blI] and [Imp]. Because the nucleus occurs in both demisyllables, the FDP expresses itself in terms of maximal rises and falls in sonority. According to Clements (1990: 303), the optimal initial demisyllable is “one with the maximal and most evenly-distributed rise in sonority”. In other words, if we examine three-member demisyllables, OLV would be optimal in the world’s languages. We can easily see this by assigning a number to the five segmental classes in the sonority scale as shown in (6): (6) Classes: obstruents < nasals < liquids < glides < vowels Values: 5 4 3 2 1
If the optimal three-member demisyllable indeed has a maximal and evenly distributed rise in sonority, then it must begin with an obstruent (a value of 5) and end with the nucleus (which is a vowel having a value of 1). The medial segment would have to be a liquid because it is equally distant (a distance of two) from both the obstruent and the vowel. Initial three-member demisyllables will be increasingly less preferred the more that their sonority profiles differ from the optimal
Robert S. Carlisle
profile above. In the slightly less preferred ONV demisyllable, the distance between O and N is one and the distance between N and V is three, so the segments in the two pairs are not equally distant from each other; this demisyllable therefore has a less preferred sonority profile and consequently a higher value for dispersion. The least preferred three-member demisyllable among those adhering to the CSP is the LGV. It has a distance of only one between L and G and between G and V, so even though the distance is equal, it is not maximal as it is in the OLV syllable. To rank all demisyllables that abide by the CSP, Clements incorporates a formula that assigns a value for dispersion to all margins as displayed in (7): (7) D =
i–1 m
1/di2
The formula considers the number of pairs of segments in each demisyllable (represented as m) and the sonority distance between each pair including all the non-adjacent ones (represented in the formula as d). Clements’s formula to calculate dispersion (D) produces the following rankings for initial demisyllables, the lower the value of D the more preferred the demisyllable: (8) Rankings for initial two-member demisyllables OV 0.06 NV 0.11 LV 0.25 GV 1.00 (9) Rankings for initial three-member demisyllables OLV .56 ONV, OGV 1.17 NLV, NGV 1.36 LGV 2.25 (10) Rankings for initial four-member demisyllables ONGV 2.53 OLGV, ONLV 2.67 NLGV 3.61
As displayed in the rankings above, the value of D increases linearly, with the length of onset. All triliteral onsets (those in four-member demisyllables) have higher values for D than do biliteral onsets; in turn, biliteral onsets have higher values for D than do simple onsets, with the one exception that a simple onset consisting of just a glide (D = 1.00) is actually less preferred than a biliteral onset consisting of an obstruent and a liquid (D = .56). In addition, onsets of the same length are also ranked according to their values for D. For example, among the biliteral onsets the optimal combination is OL,
The sonority cycle
which has the lowest value for dispersion (.56). As explained above, other onsets have higher values for D either because the sonority distance between the segmental members of the three-member demisyllable is not equal as in ONV or not maximal as in LGV. Again, the rankings produced through the use of Clements’s formula closely correspond to large comparative analyses such as the one by Greenberg (1978), which shows preference in terms of implicational statements. The following are a few of Greenberg’s implicational statements that correspond to Clements’s rankings: (11) a. NLV → OLV b. ONV → OLV
The demisyllables to the left of the arrows are the more marked, and if they are less preferred according to Clements’s formula for dispersion, they should have a higher value for D than do those on the right. NLV has a value of 1.36 and ONV has a value of 1.17; in turn, OLV, which appears to the right in both implicational statements, has a value of .56. So according to both the FDP and Greenberg’s implicational statements, OLV demisyllables are preferred over both NLV and ONV demisyllables.
. Study one . The reduction of margins Although quite a bit of research exists comparing the rate of modification of margins of different lengths, most of it has examined codas rather than onsets (Anderson 1987; Benson 1988; Eckman 1987, 1991; Hancin-Bhatt 2000; Hansen 2001, 2004; Sato 1984; Weinberger 1987). Findings from these studies have been uniform in finding either that longer codas are more frequently modified than are shorter codas or that longer codas do not reach a criterion level of acquisition before shorter codas do. Only a few studies have examined onsets by length. Anderson (1987) examined the frequency with which 20 speakers of Egyptian Arabic and 20 speakers of Amoy and Mandarin Chinese modified English onsets, finding that all groups of participants made significantly more modifications of onsets as their length increased. Arabic speakers did not modify simple onsets at all, but they did modify over 7% of the biliteral onsets. Unfortunately, no comparison could be made between biliteral and triliteral onsets because not enough data was available. Eckman (1991) examined the reduction of three triliteral onsets in four elicitation tasks by 11 participants from three different language backgrounds: Japanese,
Robert S. Carlisle
Cantonese, and Korean, none of which allow complex onsets. Unlike Anderson (1987), Eckman did not compare the frequency with which biliteral and triliteral onsets occurred relative to one another. Instead, Eckman used a criterion measure, 80%, to determine the presence or absence of a particular structure. For example, if a participant produced onsets of the form /.spr-/ correctly 80% of the time, the structure was regarded as present in the interlanguage phonology. If either or both of the two subsequences /.sp-/ and /.pr-/ were correctly produced 80% of the time, then they were also present and the hypothesis was confirmed. The hypothesis could have been falsified if the triliteral onset was present and both of the two-member subsequences were absent according to the 80% criterion. Eckman examined three triliteral onsets across 11 participants and four tasks and found one falsification; that is, in one case, a triliteral onset was present at the criterion level, but both two-member subsequences were absent. However, given that only one falsification occurred in approximately 130 cases examined, this study provides very strong evidence that shorter onsets are acquired before longer onsets. Finally, in a longitudinal case study, Abrahamsson (1999) tracked the production of /.sC(C)-/ onsets in Swedish by a native Spanish speaker. Abrahamsson’s participant was a beginning learner of Swedish who was taped nine times over a ten-month period. During that time he modified .77 of the triliteral onsets that he produced and .59 of the biliteral onsets, a statistically significant difference (p < .01). . Hypotheses The study tested three hypotheses: 1. Participants will modify triliteral onsets more frequently than biliteral onsets. 2. Prothesis will occur more frequently after word-final consonants than after word-final vowels. 3. Participants will not acquire more marked onsets before less marked onsets. . Method .. Participants4 The selection of participants for the study was based on two general criteria. First, all the participants had to be adult native Spanish-speakers. Spanish speakers were chosen because Spanish does not have any onsets of the form /.sC(C)-/ (Harris 1983), and Spanish speakers modify such onsets in a uniform manner, the use . Because the selection procedure has been described in detail in Carlisle (1997, 1998), it will just be summarized here.
The sonority cycle
of prothesis. That Spanish speakers transfer a prothesis rule from their native language has been well documented in several studies of Spanish speakers acquiring /.sC(C)-/ onsets in English (Carlisle 1991a, 1991b, 1998). Other studies have examined native Spanish speakers acquiring complex onsets beginning with /s/ or /w/ in other languages such as Swedish (Abrahamsson 1999; Hyltenstam & Lindberg 1983), German (Tropf 1987), and Italian (Schmid 1997); all these studies found that when the target onsets were modified at all, they were modified nearly exclusively by prothesis. The second selection criterion was that the participants had to have similar English proficiency. All the participants were studying ESL at a community college in California at Time I. All of them were enrolled in intermediate classes, and were placed there according to their scores on the Secondary Level English Proficiency Test. Because this test did not measure pronunciation, however, the participants selected were only those who displayed the use of an intermediate amount of prothesis before the onsets under study. Participants had to modify at least 21% of the onsets but no more than 79%. Finally, all prospective participants who had taken a pronunciation course were eliminated. Of 17 prospective participants at Time I, 11 were judged as suitable according to the three selection criteria. Ten participants were still available at Time II, and only four remained at Time III. .. Instrumentation The instrument consisted of a list of 176 topically unrelated sentences. Half of the sentences contained a word beginning with the triliteral onsets /.spr-/ or /.skr-/, and the other half of the sentences contained a word beginning with the biliteral onsets /.sp-/ or /.sk-/. For the data-gathering instrument to be valid, two factors had to be controlled: the phonological environments before the onsets and the sonority relationships among the segments in the onsets. The environments were controlled because previous research has demonstrated that native Spanish speakers insert a prothetic vowel before /.sC(C)-/ onsets significantly more frequently after a consonantal environment than after a vocalic environment in both English (Carlisle 1991a, 1991b, 1992) and Swedish (Abrahamsson 1999). As a consequence, 22 different environments, 13 consonants and 9 vowels, occurred twice before each of the four target onsets. The sonority profiles among the segments in the target onsets were also held constant so that the true parameter being investigated was length. Because all triliteral onsets violate the CSP and because previous research has demonstrated that onsets violating the CSP are modified significantly more frequently than those that do not (Carlisle 1991b; Tropf 1987), the biliteral onsets in the study also had to violate the CSP. The two biliteral onsets in the study – /.sp-/ and /.sk-/ – differ from the triliteral onsets – /.spr-/ and /.skr-/ – only in length.
Robert S. Carlisle
.. Procedure At Times I and II the participants were still students at Bakersfield College and were consequently taped individually in the language laboratory of the college or in a secluded office. At Time III, because the participants were no longer taking courses at Bakersfield College, it was necessary to tape them at different locations, and in less ideal situations, than previously. Two were taped in a private office at Bakersfield College; one was taped at a private office at California State University, Bakersfield, and the last was taped in her home. All participants were taped on either a Marantz or a SONY TC-D5PROII portable recorder connected to a high quality microphone. Recording at Time II took place 10 months after that of Time I, and Time III taping took place three years and five months after that of Time II. .. Transcribing and reliability At Time I, two faculty members independently analyzed the tapes, transcribing the environments, the onset, and the presence or absence of a prothetic vowel. As previous research has revealed that prothesis is nearly the sole means that Spanish speakers utilize to modify onsets of the form /.sC(C)-/, only prothesis was regarded as a modification. Original interrater reliability at Time I was 89%. The two transcribers then jointly listened to all of the original disagreements again in an attempt to come to agreement on them. Of the original 1,760 possible test items at Time I, 25 were discarded: 9 because the transcribers could not reach agreement and 16 because the participants misread or skipped them. The method for transcribing the data at Time II was similar to the method just described. Two transcribers independently evaluated the tapes of the 10 participants and original reliability was 87%. The two then independently reevaluated all of the items on which they had originally disagreed. After this procedure, the transcribers still disagreed on 55 items (3% of the data), and a third transcriber then resolved those differences. Of the 1,760 original items, 6 were eliminated from the study because the onsets were misread by the participants. At Time III, two transcribers independently evaluated the tapes of the four remaining participants and disagreed on 45 items, for a total reliability of 93.6%. Original reliability was unusually high because one participant no longer used any prothesis at all, and the two transcribers agreed on all 176 items. Reliability for the remaining three participants was 91.5%. While evaluating the tapes a second time, they reached agreement on the 45 items on which they had originally disagreed. Of the 704 original items, only one was eliminated from the study because the onset was misread by one of the participants. .. Analysis The first two non-null hypotheses were tested against only Time I data. The analysis consisted of a 2 X 2 ANOVA with repeated measures, environment
The sonority cycle
and onset being the independent variables and frequency of prothesis the dependent variable. The third hypothesis was tested by using a criterion level of 80% correct production. If an onset reached that criterion level, it was considered acquired, a procedure used in previous research (Eckman 1991). The hypothesis can be falsified if the more marked onset reaches the criterion level before the less marked onset. Support for the hypothesis occurs if the less marked onset reaches the criterion level first. Finally, findings are consistent with the hypothesis if both onsets reach the criterion level or if both do not. . Results .. Time I ... Influence of environment. At Time I the mean proportion of prothesis was .51 after word-final consonants and .35 after word-final vowels; this difference resulted in a significant main effect for environment: F(1,20) = 18.50, p = .0003. This finding is congruent with those of previous studies revealing that Spanish speakers use prothesis significantly more frequently after word-final consonants than after word-final vowels before onsets of the form /.sC(C)-/ (Abrahamsson 1999; Carlisle 1991a, 1991b, 1992). ... Markedness relationships. The mean proportion of prothesis was .38 before /.sC-/ onsets and .48 before /.sCC-/ onsets, a difference producing a significant
Figure 1. Percentage of /.sk-/ and /.skr-/ onsets correctly produced at Time I
Robert S. Carlisle
Figure 2. Percentage of /.sp-/ and /.spr-/ onsets correctly produced at Time I
main effect: F(1,20) = 7.60, p < .01. As hypothesized, the more marked onsets were modified significantly more frequently than were the less marked onsets. As demonstrated in Figures 1 and 2, these statistical findings were not due to a few participants modifying an unusually large percentage of the more marked onsets. Figure 1 compares the percentage of correct production of /.sk-/ and /.skr-/, revealing that nine of the 10 participants who participated at both Time I and II correctly produced a greater percentage of the less marked onset than they did the more marked onset. The same result was obtained for /.sp-/ and /.spr-/ as indicated in Figure 2; nine of the 10 participants correctly produced more of the less marked onsets than the more marked onsets. .. Time II As revealed in Figure 3, only Participant 6 produced the onsets /.sk-/ and /.skr-/ above the criterion level of 80% correct production. Consequently, the results are consistent with the hypothesis. More importantly, none of the participants produced a more marked onset at or above the criterion level while producing the less marked onset below the criterion level, a result that would have falsified the hypothesis. Figure 4 shows the results for the correct production of /.sp-/ and /.spr-/. As indicated in the figure, Participant 6 produced both onsets above the criterion level, a finding consistent with the hypothesis. Participant 2 produced /.sp-/ above the criterion level, but not /.spr-/, a finding that supports the hypothesis. None of the participants produced counterevidence.
The sonority cycle
Figure 3. Percentage of /.sk-/ and /.skr-/ onsets correctly produced at Time II
Figure 4. Percentage of /.sp-/ and /.spr-/ onsets correctly produced at Time II
In the Time I results, nine of ten participants produced a greater percentage of less marked onsets correctly than more marked onsets, a pattern that is maintained at Time II. As indicated in Figure 3, eight of ten participants correctly produced a greater percentage of /.sk-/ onsets than /.skr-/ onsets. In addition, as indicated in Figure 4, nine of the ten participants correctly produced a greater percentage of /.sp-/ onsets than /.spr-/ onsets. This pattern demonstrates why the hypothesis is satisfied: if less marked onsets are more frequently produced correctly, the more marked onsets will not reach the criterion level first.
Robert S. Carlisle
.. Time III Figure 5 presents the percentage of correct production for /.sk-/ and /.skr-/. In both cases Participant 2 did not modify either onset; thus, both were over the criterion level. None of the onsets produced by the other three participants reached the criterion level, making the findings consistent with the hypothesis. Figure 6 presents the percentage of correct production for /.sp-/ and /.spr-/. Again, Participant 2 no longer modified either onset, so both onsets passed the cri-
Figure 5. Percentage of /.sk-/ and /.skr-/ onsets correctly produced at Time III
Figure 6. Percentage of /.sp-/ and /.spr-/ onsets correctly produced at Time III
The sonority cycle
terion level of 80 percent correct production. None of the other three participants produced either onset at the criterion level. Again, the pattern of correct production supplies some indirect evidence for the hypothesis. Note that Participants 1, 3 and 4 correctly produced the biliteral onsets more frequently than they did the triliteral onsets. If this pattern were to continue, then the less marked onset would reach the criterion level before the more marked onset. That the less marked onset would reach the criterion level first is more than just an assumption; other studies that have examined the production of onsets by length have found that shorter onsets are modified significantly less frequently than are longer onsets (Abrahamsson 1999; Anderson 1987). . Discussion According to the Sonority Cycle, triliteral onsets are more marked than biliteral onsets. However, because all triliteral onsets in English violate the CSP, care must be taken to compare them only with biliteral onsets that also violate the CSP. This study compared onsets of the form OOL and OO, meaning that the only variable differentiating the two was the presence of L as the third segment in the triliteral onsets. The findings of this longitudinal study generally support the hypotheses that were investigated. At Time I the participants modified biliteral onsets significantly less frequently than they did triliteral onsets, a finding that corroborates those of previous studies that have compared onsets differing in length (Abrahamsson 1999; Anderson 1987). This pattern of correct production was further supported by examining the frequency of correct production against the criterion level of 80% correct production. At Times 1, 2 and 3 participants never produced a more marked onset at the criterion level unless they had also produced the corresponding less marked onset at the criterion level; counterevidence never occurred in the study. Finally, at Time I the participants inserted a prothetic vowel significantly more frequently after a word-final consonant than after a word-final vowel, a finding also consistent with those of other studies examining native Spanish speakers acquiring Swedish or English (Abrahamsson 1999; Carlisle 1991a, 1991b, 1992). A stronger test of the Sonority Cycle can be conducted by holding the length of the onsets constant, examining a ranking based on both the CSP and the FDP.
Robert S. Carlisle
. Study two . Background As predicted by the Sonority Cycle, and confirmed by the findings of the first study, when sonority profiles are held constant, shorter onsets are less frequently modified than are longer onsets; length is therefore a factor in determining the markedness relationships among onsets. In addition, the Sonority Cycle clearly reveals that onsets of the same length can also be ranked according to their sonority profiles. Taking into consideration just biliteral onsets, those that violate the CSP are more marked than those that do not, so onsets of the form /.st-/ and /.sp-/, which constitute sonority plateaus, are less preferred universally. Biliteral onsets not violating the CSP are also ranked according to the FDP so that biliteral onsets of the form OL are optimal and have the lowest value for D. Any biliteral onset other than OL will have a higher value for D and consequently be more marked. . Studies testing sonority sequencing .. Sonority plateaus and reversals Several studies have provided evidence that biliteral onsets abiding by the Sonority Sequencing Principle are modified less frequently than those that do not, whether plateaus or reversals. The first study with onsets was conducted by Tropf (1987), who examined German onsets produced by 11 native Spanish-speaking adults. The data came from about one hour of taped conversations with each of the participants. Though the results are difficult to interpret because Tropf did not take environment into account, perform statistical analyses, or separate the findings for biliteral and triliteral onsets, his results suggest that onsets abiding by the SSP are modified less frequently than those that do not. In a later study that attempted to avoid the problems evident in the Tropf study, Carlisle (1991b) examined the production of /.sl-/ and /.st-/ onsets by 11 native Spanish-speaking adults; the two onsets differ in that /.sl-/ conforms to the CSP, and /.st-/ does not. Because the latter onset is more marked than the former, it should be modified more frequently. Each participant read a reading instrument consisting of 290 sentences, each sentence containing one occurrence of a wordinitial /.sl-/ or /.st/; environment was strictly controlled before the target onsets. The frequency of prothesis was .36 before /.st-/ and .25 before /.sl-/, a significant difference at p < .0004. Thus, the frequency of modification of the onset that violated the CSP was significantly greater than the frequency of modification of the onset that did not violate it. Two studies of Portuguese speakers learning English (Major 1996; Rebello & Baptista, this volume) produced findings that seem to contradict those just
The sonority cycle
discussed. Major found that four native speakers of Portuguese learning English modified onsets abiding by the CSP more frequently than those that did not, the four participants modifying 45.7% of /.sl-/ onsets, but only 18.3% of /.st-/, /.sp-/, and /.sk-/ onsets. In the second study, Rebello and Baptista compared the production of three biliteral onsets abiding by the CSP – /.sl-/, /.sm-/, and /.sn-/ – against three that did not – /.st-/, ./sp-/ and /.sk-/ – and found that their six participants modified 63% of the less marked onsets as opposed to 54% of the more marked onsets. Possible reasons for these seemingly aberrant findings will be given in the Discussion. Another possible exception comes from a longitudinal case study of a native Spanish speaker learning Swedish. Abrahamsson (1999) found that his participant modified 75% of 44 /.sl-/ onsets and 59% of 291 /.sC1 -/ onsets (where C1 is a voiceless stop). However, these findings may be attributable to the small number of /.sl-/ onsets in the study and to a possible confounding effect of environment, for although Abrahamsson took environment into account and found that prothesis occurred significantly more frequently after word-final consonants than after word-final vowels as had been found in previous research (Carlisle 1991a, 1991b, 1992, 1997), he did not perform a sub-analysis of the environments before just the two onsets in question. Consequently, if a greater percentage of word-final consonants appeared before /.sl-/ than before /.sC1 -/, then a greater frequency of prothesis would be expected before /.sl-/ than before /.sC1 -/, a result attributable to environment rather than to the markedness relationship between the onsets. .. Preferred biliteral onsets Two studies have found that less marked biliteral onsets are modified less frequently than are more marked biliteral onsets. In the first study, Carlisle (1988) examined the frequency of prothesis before the OL and ON onsets. As noted by Greenberg (1978), the presence of ON onsets implies the presence of OL onsets, meaning that the latter is less marked than the former. In addition, OL onsets have a lower value for D (.56) than do ON onsets (1.17). To test the possible influence of this markedness relationship, Carlisle (1988) examined the frequency of prothesis before the onsets /.sl-/, /.sm-/, and /.sn-/, the hypothesis being that prothesis would occur less frequently before the OL onset than the ON onsets. For the study, 14 native Spanish speakers read a list of 435 topically unrelated and randomly ordered sentences, 145 sentences each for /.sl-/, /.sm-/, and /.sn-/. Environment was strictly controlled to determine whether the frequency of prothesis was influenced by different word-final environments. The mean proportions of prothesis before the three onsets were .29 for /.sl-/, .38 for /.sm-/, and .33 for /.sn-/; an ANOVA produced a significant difference among the three means. Pairwise comparisons revealed that the mean frequency of prothesis before /.sl-/ was significantly less than before /.sm-/ and /.sn-/ as hy-
Robert S. Carlisle
pothesized. In addition, /.sm-/ was also more frequently modified than was /.sn-/, although the two onsets are not in any known implicational relationship. However, a possible explanation may be found in Clements’s Sequential Markedness Principle (1990: 313) stated below: (12) For any two segments A and B and any given context X_Y, if A is simpler than B, then XAY is simpler than XBY.
Given that anterior coronals are less marked than are labials, then the sequence /.sn-/ is less marked than /.sm-/ and should therefore be modified less frequently. The second study that found that complex onsets with a lower value of D are modified less frequently than those with a higher value for D was conducted by Eckman and Iverson (1993). Eckman and Iverson investigated the production of onset clusters grouped into three rankings of markedness: (13) most marked Voiceless stop + glide ↑ Voiced stop + liquid Voiceless fricative + liquid least marked Voiceless stop + liquid
As displayed in (13) the least marked target structure is the OL onset with a D value of .56. In contrast, the most marked onset is an OG with a D value of 1.17. Eckman and Iverson derived the two intermediate onsets in the ranking – voiced stop + liquid and voiceless fricative + liquid – by appealing to Clements’s Sequential Markedness Principle as stated in (12). Given that stops are less marked than fricatives and voiceless stops are less marked than voiced stops, onsets of the form voiceless stop + liquid should be less marked than voiced stop + liquid and voiceless fricative + liquid. Eckman and Iverson gathered data for their study by interviewing eleven adult participants, three native speakers of Cantonese and four speakers each of Japanese and Korean. Participants had to produce a minimum number of four tokens of each target onset. To test their predicted markedness ranking, the researchers measured their participants’ production against a criterion threshold of 80% correct production. If participants produced 80% of a given onset correctly, it was considered to be present in the interlanguage; if the frequency of correct production was less than 80%, the onset was considered absent from the interlanguage. The researchers hypothesized that more marked onsets would reach the criterion level only if the corresponding less marked onsets also reached the criterion level. Counterevidence would occur if a more marked onset reached the criterion level and a corresponding less marked onset did not. The study contained 55 potential tests of the hypothesis, but five of the tests could not be conducted because some participants did not produce the minimum number of tokens of one of the target onsets. Of the 50 remaining tests 46 supported the general hypothesis that a more marked onset would not reach the criterion threshold unless a corresponding less
The sonority cycle
marked onset had also reached the criterion level. Two participants produced four instances of results failing to support the hypothesis. In contrast to the findings of the previous studies, Abrahamsson (1999) found that his participant actually modified /.sl-/ onsets more frequently than he did /.sn-/ onsets, though the result was not statistically significant. Abrahamsson is careful to note that the corpus of data contained only 44 cases of /.sl-/ and 67 cases of /.sN-/, (where N is a nasal), so the results may be attributable to the small sample or to the possible confounding effect for environment discussed previously. In general, results from past research support Clements’s Sonority Cycle. Onsets violating CSP are modified more frequently than those that do not violate it, and onsets with a higher value for D are modified more frequently than those with a lower value for D. Possible exceptions come from two studies of Portuguese speakers learning English (Major 1996; Rebello & Baptista, this volume) and a longitudinal case study of a Spanish speaker learning English (Abrahamsson 1999). . Hypotheses The following three non-null hypotheses were tested in this study based on the markedness relationships among the three target onsets: /.st-/ is the most marked of the three because unlike the other two it violates the CSP; in turn, /.sn-/ is more marked than /.sl-/ because it has a higher value for D, (1.17) as opposed to (.56). 1. Participants will correctly produce /.sl-/ significantly more frequently than they will produce /.sn-/ and /.st-/. 2. Participants will correctly produce /.sn-/ significantly more frequently than they will produce /.st-/. 3. Onsets after a word-final vowel will be correctly produced significantly more frequently than will those after a word-final consonant. The first two hypotheses are more specific statements of the general hypothesis that the frequency of correct production will depend on the degree of markedness – the less marked the structure the greater the frequency of correct production. The first two hypotheses are in non-null form because of the significant results of the previous research comparing /.sl-/ and /.st-/ (Carlisle 1991b) and /.sl-/ and /.sN-/ (Carlisle 1988). The last hypothesis is also in non-null form because of the significant findings in five previous studies (Abrahamsson 1999; Carlisle 1991a, 1991b, 1992, 1997).
Robert S. Carlisle
. Method .. Participants To be included in the current study, the participants had to fulfill a number of criteria. First, all the participants had to be adult native Spanish-speakers for reasons already discussed in the method section of the first study. The second criterion was that all of the participants had to be enrolled in intermediate levels of ESL courses. This was done to assure that all the participants would be able to read the sentences on the data gathering instrument. At the time of data gathering, all of the participants were enrolled in one of three levels of intermediate ESL courses at Bakersfield College. To be placed into ESL classes at the community college, all international students take the Secondary Level English Proficiency Test (SLEP) which consists of two forty-five minute subsections, one for listening comprehension and the other for reading comprehension; students are never placed according to their proficiency in pronunciation. Because the students are not placed according to their pronunciation, the third criterion was that the participants’ overall frequency of correct production of the onsets had to fall within a certain range. All of the participants had to produce at least 21% of the onsets correctly, but no more than 79%. This range can be defended because previous research has used 80% correct production of any particular structure as the criterion level for acquisition (Andersen 1978; Cancino, Rosansky, & Schumann 1975; Carlisle 1997; Eckman 1991). That is, if L2 learners produce a certain structure correctly 80% of the time, then that structure is considered acquired. Because L2 learners who have acquired a certain structure can no longer be considered intermediate, they were eliminated from the study. Also eliminated from the study were beginning students, defined as those who could not produce over 20% of the onsets correctly. To be accepted for the study, all participants had to fall within the range just described; the participants ranged between 24.3% and 79.0% for correct production. Of the 34 potential participants, half did not qualify as they did not fulfill one or more of the criteria discussed above. Of the 17 participants who qualified for the study, 13 were females and 4 were males; they came from four foreign countries – Mexico, Bolivia, Guatemala, and Peru. The final participants varied quite a bit in their age of arrival in the United States (between 11:6 and 33:2) and in their length of residence (between 1:5 and 17:8). Nevertheless, all participants were intermediate students based on their class placement and their range of correct production. .. Instrumentation The data gathering instrument consisted of 375 randomly ordered sentences, 125 each for /.sl-/, /.sn-/ and /.st-/. For reasons presented in the previous study, the environments before the three onsets were strictly controlled; 25 environments (17
The sonority cycle
consonantal and 8 vocalic) appeared exactly five times before each onset. Twentyfive sentences occurred on each sheet for the participants to read. .. Procedure The 34 original participants were individually recorded between May 03, 2001 and May 09, 2001 in the language laboratory at Bakersfield College. Participants were recorded on a Sony TC-D5PROII tape-recorder with Sony ECM-530 microphone. The participants read at different rates, but most finished the reading task in about 14 to 17 minutes. .. Transcribing and reliability The researcher first transcribed the tapes in June 2001; three items were transcribed: the quality of the environment, the presence of the prothetic vowel, and the target word. Items were eliminated from the study for any of three reasons. First, they were eliminated if the participants changed the environment before the target onset. As mentioned before, Spanish speakers use prothesis significantly more frequently after word-final consonants than after word-final vowels; therefore, if the participants read in such a way that a vowel was changed to a consonant or a consonant to a vowel, the entire item was eliminated. For example, one participant read the following sentence, “I know statistics is difficult” as “I had statistics is difficult,” which caused the elimination of the entire item from the study. Items were not eliminated if participants devoiced an obstruent, because the environment remained an obstruent. The second reason for removing items was if participants misread the word containing the target onset. For example, one participant read “sled” as “led.” Finally, items were eliminated if the participants skipped them. In total, 157 items were eliminated from the study for the reasons just given. After transcribing the tapes in June, the researcher removed the participants who produced less than 20% of the items correctly or more than 79% of the items correctly. This left the sample of 17 eligible participants. A second person independently transcribed the remaining tapes in July 2002. The second transcriber is a native English speaker, has an MA in applied linguistics and has had training in phonetic transcription. To assure the reliability of the study, the researcher and the second transcriber worked together in four one-hour training sessions, in which they listened to tapes from previous studies and transcribed them. All items that caused disagreement between the two transcriptions were removed from the current study, a total of 418. The interrater reliability coefficients between the two transcriptions ranged between .893 and .973 with the average being .934.
Robert S. Carlisle
.. Analysis The data was analyzed with a 2 x 3 ANOVA with repeated measures, onset and environment being the independent variables, and the frequency of correct production being the dependent variable. Tukey pairwise comparisons were also calculated. . Results Table 1 indicates the frequency with which the three onsets were correctly produced; 64.4% of the /.sl-/ onsets, 56.2% of the /.sn-/ onsets, and 46.4% of the /.st-/ onsets. These differences resulted in a significant main effect for onset: F(2, 48) = 32.13, p < .0001. Tukey results revealed that a significantly greater percentage of /.sl-/ onsets were produced correctly than /.sn-/ and /.st-/ onsets, and a significantly greater percentage of /.sn-/ onsets were produced correctly than /.st-/ onsets. The frequency of correctly produced onsets after consonants was 45.8% and after vowels 65.6%; this difference also produced a significant main effect for environment: F(1, 48) = 117.20, p = .0001. The interactive effect between the two independent variables was not statistically significant: F(2, 48) = 1.36, p = .267. A non-significant finding indicates that the pattern of prothesis occurring more frequently after consonants than after vowels was the same before the three onsets (see Table 2). As indicated in Table 2, the percentage of correct production was consistently greater after word-final vowels than after word-final consonants. In addition, the percentage of correct production linearly increased as the markedness of the onsets decreased, a pattern evident after both environments. Table 1. Mean percentage of epenthesis before each onset and after environment Onset
Mean
SD
/.sl-/ /.sn-/ /.st-/ Environment
64.4 56.2 46.4
19.44 24.54 24.08
-C## -V##
45.8 65.6
22.11 23.26
Table 2. Percentage of correctly produced onsets after consonants and vowels Environment -C## -V##
/.sl-/
Onsets /.sn-/
/.st-/
54.9 73.8
47.9 64.5
34.5 58.4
The sonority cycle
. Discussion .. Environment The results of this study reveal that a significantly greater percentage of onsets were produced correctly after a word ending in a vowel than after a word ending in a consonant. This finding was highly significant and consistent across all three onsets as indicated in Table 2. In all three cases the frequency of correct production was at least 17 percentage points higher for onsets after a word-final vowel than after a word-final consonant. As indicated in Figures 7 and 8, the findings were highly consistent across the 17 participants. No crossovers occurred in the study and only two participants (2 and 11) had a nearly identical frequency of correct production of onsets after the two environments. These findings are also in complete accordance with those of other studies that have examined the influence of environment on the production of word-initial /.sC(C)-/ onsets by native Spanish speakers. As indicated in Table 3, at least five other studies have examined the influence of environment before /.sC(C)-/ onsets, and all have found that onsets after a word-final vowel are produced correctly with significantly higher frequency than the same onsets after a word-final consonant. Four of the studies involved native Spanish speakers learning English (Carlisle 1991a, 1991b, 1992, 1997); the other involved native Spanish speakers learning Swedish (Abrahamsson 1999).
Figure 7. Percentage of onsets correctly produced after consonants and vowels: Participants 1–8
Robert S. Carlisle
Figure 8. Percentage of onsets correctly produced after consonants and vowels: Participants 9–17
Table 3. Percentage of the correct production of word-initial /sC(C)-/ onsets after consonantal and vocalic environments from five independent studies Studies
Environments N
Carlisle, 1991a /st/ /sp/ /sk/ Carlisle, 1991b /sl/ /st/ Carlisle, 1992 /sl/ /sN/* Carlisle, 1997 /sC/ /sCC/ Abrahamsson, 1999 /sC/ /sCC/
-V##
-C##
39 35 37
21 27 24
82 73
71 59
77 72
69 61
71 58
53 45
66 40
8 3
9
11
14
11
1
*/sN/ = /s/ followed by a tautosyllabic nasal.
The sonority cycle
.. Onsets The participants in this study correctly produced a significantly greater percentage of less marked onsets than more marked onsets, meaning that /.sl-/ onsets were produced with a greater frequency of correctness than /.sn-/ onsets, which in turn were produced with a greater frequency of correctness than /.st-/ onsets. These findings support the first two hypotheses of the study and reveal that onsets with a lower dispersion value are more frequently produced correctly than are onsets with a higher dispersion value. These findings can be further substantiated by examining the results for all 17 participants (see Figures 9 and 10). If some onsets are truly preferred over others, then individual participants should display that preference; in other words, the results should not depend on a small number of extreme participants who may skew the statistical results, but rather on the group. Figures 9 and 10 display the frequency of correct production for the 17 participants. As indicated in the figures, all 17 produced a higher frequency of /.sl-/ than /.st-/ onsets correctly. Any violations of the expected pattern involved /.sn-/, which is not surprising given that it should have a percentage of correct production between those of /.sl-/ and /st-/. The only true crossover was by participant 11, who produced /.sn-/ at a slightly higher frequency of correct production than /.sl-/. A few other participants produced /.sl-/ and /.sn-/ correctly with a similar frequency (participants 1, 7, 9 10, and 12), though the correct production for /.sl-/ was usually a little higher as expected. In turn, three participants produced /.st-/ and /.sn-/ correctly with a similar frequency (participants 14, 16, and 17), though as expected the correct production of /.sn-/ was usually a little higher than that of /.st-/.
Figure 9. Percentage of /sl/, /sn/, and /st/ onsets correctly produced: Participants 1–8
Robert S. Carlisle
Figure 10. Percentage of /sl/, /sn/, and /st/ onsets correctly produced: Participants 9–17
The findings that less marked onsets are correctly produced more frequently than are more marked onsets are in accordance with most other research on this topic (Carlisle 1988, 1991b; Eckman & Iverson 1993; Tropf 1987). However, as mentioned previously, two studies of Brazilian Portuguese speakers learning English (Major 1996; Rebello & Baptista, this volume) found that onsets violating the CSP were modified less frequently than those that did not. While these studies may seem exceptional, evidence presented by the researchers strongly suggests that the seemingly aberrant findings may be attributed to the positive transfer of two interacting rules, prothesis and voicing assimilation. First of all, in underlying representation, Brazilian Portuguese does not have any word-initial onsets of the form /.sC-/; instead, the underlying form is /s.C/, where /s/ is an extrasyllabic consonant that must resyllabify at some point in the derivation to surface form (Clements & Keyser 1983). Portuguese has a rule of prothesis triggered by the extrasyllabic consonant that inserts /i/ before the /s/ to which the latter sound resyllabifies, resulting in structures such as /is.C/. In addition, Portuguese also has a rule that voices word-final /s./ when it is followed by a voiced consonant, so underlying structures such as /s.n/ will be pronounced as [zn], but underlying /s.k/ will never be pronounced as *[zk]. Furthermore, in running speech the prothetic vowel is frequently devoiced and subsequently deleted, but only when /s/ has not been voiced, so a word beginning with /sk/ would go through the following derivation:
The sonority cycle
(14) /s.k/ underlying representation (/s/ is extrasyllabic) [s.k] voicing assimilation (does not apply) [is.k] prothesis [is.k] resyllabification (prothetic vowel syllabifies with extrasyllabic consonant) [is.k] devoicing of the prothetic vowel ˚ [sk] vowel deletion and subsequent resyllabification (in running speech)
The derivation above reveals that even though Portuguese does not allow /.sk-/ onsets in underlying representation, they may occur on the surface because of the devoicing of the prothetic vowel and its subsequent deletion in running speech. The derivation is quite different for initial clusters in which the second member is voiced as indicated in (15): (15) /s.n/ underlying representation (/s/ is extrasyllabic) [z.n] voicing assimilation [iz.n] prothesis [iz.n] resyllabification (vowel syllabifies with extrasyllabic consonant) [iz.n] devoicing of vowel (does not apply) [iz.n] vowel deletion (does not apply)
As revealed in (15), Portuguese cannot have onsets of the form /.sC-/ (where C is voiced) in either underlying representation or surface representation. Given these rules of Portuguese, the apparently aberrant findings in Major’s and Rebello and Baptista’s studies that onsets violating sonority sequencing are modified less frequently than those that do not become explicable. In Portuguese the prothetic vowel would never be deleted before clusters such as /s.l/ because the extrasyllabic consonant voices before /l/, consequently bleeding the environment for the application of the vowel devoicing rule and the subsequent vowel deletion. In contrast, the extrasyllabic /s/ of [s.k] would never be voiced. Thus the prothetic vowel would undergo devoicing and subsequent deletion, resulting in a marked surface form that is not found in the underlying representation. Compounding the effect of the interactive rules is Rebello and Baptista’s finding that in interlanguage prothesis is much more frequent when /s/ becomes voiced. The participants in their study voiced 58% of the occurrences of /s/ in /s/+sonorant clusters, and prothesis occurred before 79% of those. In contrast, prothesis occurred before only 39% of the clusters in which the first member was not voiced. Given this account, the results of the two studies do not constitute evidence invalidating the Sonority Cycle.
Robert S. Carlisle
. Conclusion Over the last 30 years an important question in research in interlanguage phonology has been the extent to which markedness affects acquisition. Much of this research has specifically dealt with syllable margins. A problem has been that a uniform manner for defining markedness and consequently for unambiguously ranking margins did not exist, at times researchers appealing to the implicational statements found in comparative studies such as that by Greenberg (1978) and at other times to notions such as the SSP. The Sonority Cycle, in contrast, provides a unitary approach for ranking all margins found in the world’s languages. The first principle of the Sonority Cycle, the CSP, recognizes that margins that do not abide by the SSP are more marked than those that do. Thus, margins consisting of sonority reversals or plateaus are more marked than those margins strictly adhering to the SSP. In turn, sonority reversals are more marked than are sonority plateaus, the former being more deviant from the SSP. The FDS in turn evaluates only those margins which abide by the SSP by assigning a value for D – essentially based on the number of pairs in a demisyllable and the sonority distance between the members of each pair – and consequently ranks them according to their complexity: the higher the value of D, the more marked the demisyllable and the margin associated with it. Working in conjunction with the Sonority Cycle is the Sequential Markedness Principle, which can further rank margins even when they have the same value for D. Somewhat surprisingly, only one previous study in interlanguage phonology has explicitly utilized the Sonority Cycle in its research (Eckman & Iverson 1993). This dearth of studies may be due to the fact that the predictions of the Sonority Cycle are quite similar to those made by implicational statements emanating largely from the work of Greenberg (1978). However, as mentioned previously, Greenberg’s implicational statements reveal the markedness relationships only among the pairs in the statements. His work is not a model for systematically ranking all margins. This paper presented two studies testing the Sonority Cycle. The first study summarizes the findings of a longitudinal study previously published. The results were presented again because the study offers a test of the rankings made by the Sonority Cycle. The study examined four biliteral and triliteral onsets in English, all four of which violate the CSP because they begin with plateaus of OO. The triliteral onsets are more complex, and therefore more marked, by the presence of a third segment, L, closest to the nucleus. The results of the study confirm the markedness predictions of the Sonority Cycle. The less marked onsets were correctly produced significantly more frequently than were the more marked onsets, and more marked onsets did not reach the criterion level of acquisition unless the less marked onset had also reached the criterion level. These findings cor-
The sonority cycle
roborate those of previous studies that have compared onsets differing in length (Abrahamsson 1999; Anderson 1987). The second study offers a stronger test of the rankings because the length of the onsets was held constant. Of the three biliteral onsets examined, /.st-/ was the most marked because it violated the CSP; and /.sn-/ was more marked than /.sl-/ because it had a higher value for D. Results again showed that less marked onsets were produced correctly significantly more frequently than were more marked onsets. These findings are also in accordance with most other research on this topic (Carlisle 1988, 1991b; Eckman & Iverson 1993; Tropf 1987). The two studies of Brazilian Portuguese speakers learning English (Major 1996; Rebello & Baptista, this volume) apparently do not provide evidence against the Sonority Cycle because of the transfer and interaction of two rules from Portuguese. To assure that the Sonority Cycle was validly being tested in the second study, the possibly confounding variable of environment was strictly controlled. The results for environment showed that prothesis occurred significantly more frequently after a word-final consonant than after a word-final vowel, a finding that corroborates those of at least four other independent studies examining the production of /.sC(C)/ onsets in English by non-native English speakers. From the results of these studies, the Sonority Cycle appears to be a valid and reliable model for ranking margins and consequently revealing their markedness relationships.
References Abrahamsson, N. (1999). Vowel epenthesis of /.sC(C)-/ onsets in Spanish/Swedish interphonology: A longitudinal study. Language Learning, 49, 473–508. Andersen, R. (1978). An implicational model for second language research. Language Learning, 28, 221–281. Anderson, J. (1987). The markedness differential hypothesis and syllable structure difficulty. In G. Ioup & S. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 279–291). Cambridge, MA: Newbury House. Battistella, E. (1990). Markedness: The evaluative superstructure of language. Albany, NY: The State University of New York Press. Benson, B. (1988). Universal preference for the open syllable as an independent process in interlanguage phonology. Language Learning, 38, 221–242. Blevins, J. (1995). The syllable in phonological theory. In J. Goldsmith (Ed.), The Handbook of Phonological Theory (pp. 206–244). Cambridge, MA: Blackwell. Cairns, C. & Feinstein, M. (1982). Markedness and the theory of syllable structure. Linguistic Inquiry, 13, 193–225. Cancino, H., Rosansky, E., & Schumann, J. (1975). The acquisition of the English auxiliary by native Spanish speakers. TESOL Quarterly, 4, 421–430.
Robert S. Carlisle
Carlisle, R. S. (1988). The effect of markedness on epenthesis in Spanish/English interlanguage phonology. Issues and Developments in English and Applied Linguistics, 3, 15–23. Carlisle, R. S. (1991a). The influence of environment on vowel epenthesis in Spanish/English interphonology. Applied Linguistics, 12, 76–95. Carlisle, R. S. (1991b). The influence of syllable structure universals on the variability of interlanguage phonology. In A. D. Volpe (Ed.), The Seventeenth LACUS Forum 1990 (pp. 135–145). Lake Bluff, IL: Linguistic Association of Canada and the United States. Carlisle, R. S. (1992). Environment and markedness as interacting constraints on vowel epenthesis. In J. Leather & A. James (Eds.), New Sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-Language Speech (pp. 64–75). Amsterdam: University of Amsterdam Press. Carlisle, R. S. (1997). The modification of onsets in a markedness relationship: Testing the interlanguage structural conformity hypothesis. Language Learning, 47, 327–361. Carlisle, R. S. (1998). The acquisition of onsets in a markedness relationship: A longitudinal study. Studies in Second Language Acquisition, 20, 245–260. Carlisle, R. S. (2002). The acquisition of two and three member onsets: Time III of a longitudinal study. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech (pp. 42–47). Klagenfurt: University of Klagenfurt. Clements, G. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M. Beckman (Eds.), Papers in Laboratory Phonology I: Between the grammar and physics of speech (pp. 283–333). Cambridge: CUP. Clements, G. & Keyser, S. (1983). CV Phonology: A generative theory of the syllable. Cambridge, MA: The MIT Press. Eckman, F. R. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330. Eckman, F. R. (1987). The reduction of word-final consonant clusters in interlanguage. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 143–162). Dordrecht: Foris. Eckman, F. R. (1991). The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–41. Eckman, F. R. & Iverson, G. (1993). Sonority and markedness among onset clusters in the interlanguage of ESL learners. Second Language Research, 9, 234–252. Green, A. D. (2003). Extrasyllabic consonants and onset well-formedness. In C. Féry & R. van de Vijver (Eds.), The Syllable in Optimality Theory (pp. 238–253). Cambridge: CUP. Greenberg, J. (1978). Some generalizations concerning initial and final consonant clusters. In J. Greenberg, C. Ferguson, & E. Moravcsik (Eds.), Universals of Human Language, Vol. 2 (pp. 243–279). Stanford, CA: Stanford University Press. Hancin-Bhatt, B. (2000). Optimality in a second language phonology: Codas in Thai ESL. Second Language Research, 16, 201–232. Hansen, J. (2001). Linguistic constraints on the acquisition of English syllable codas by native speakers of Mandarin Chinese. Applied Linguistics, 22, 338–365. Hansen, J. (2004). Developmental sequences in the acquisition of English L2 syllable codas. Studies in Second Language Acquisition, 26, 85–124. Harris, J. (1983). Syllable Structure and Stress in Spanish: A non-linear analysis. Cambridge, MA: The MIT Press.
The sonority cycle
Hulst, H. van der & Ritter, N. A. (1999). Theories of the syllable. In H. van der Hulst & N. A Ritter (Eds.), The Syllable: Views and facts (pp. 13–52). Berlin: Mouton de Gruyter. Hyltenstam, K. & Lindberg, I. (1983). [Immigrants’ Swedish. A critical examination of the material in the project Swedish of Immigrants (Josefson 1979), with special reference to the usefulness of the collected material] SSM Report 9, Studium av ett Invandrarsvenskt Sprakmaterial [A study of Immigrant Swedish Language Material], 5–51. Stockholm: Stockholm University, the Department of Linguistics. Kaye, J. & Lowenstamm, J. (1981). Syllable structure and markedness theory. In A. Belleti, L. Brandi, & L. Rizzi (Eds.), Theory of Markedness in Generative Grammar (pp. 287–315). Pisa: Scuola Normale Superiore. Kiparsky, P. (1979). Metrical structure assignment is cyclic. Linguistic Inquiry, 10, 421–441. Major, R. (1996). Markedness in second language acquisition of consonant clusters. In D. R. Preston & R. Bayley (Eds.), Variation and Second Language Acquisition (pp. 75–96). Amsterdam: John Benjamins. Major, R. & Faudree, M. (1996). Markedness universals and the acquisition of voicing contrasts by Korean speakers of English. Studies in Second Language Acquisition, 18, 69–90. Morelli, F. (2003). The relative harmony of /s+stop/ onsets: Obstruent clusters and the sonority sequencing principle. In C. Fery & R. van de Vijver (Eds.), The Syllable in Optimality Theory (pp. 356–371). Cambridge: CUP. Rebello, J. T. & Baptista, B. O. (2006). The influence of voicing on the production of initial /s/clusters by Brazilian learners. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English interphonology. Amsterdam: John Benjamins. Sato, C. (1984). Phonological processes in second language acquisition: Another look at interlanguage syllable structure. Language Learning, 34, 43–57. Schmid, S. (1997). Phonological processes in Spanish-Italian interlanguages. In J. Leather & A. James (Eds.), New sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 286–293). Klagenfurt: University of Klagenfurt. Selkirk, E. (1984). On the major class features of syllable theory. In M. Aronoff & R. Oehrle (Eds.), Language Sound Structure (pp. 107–136). Cambridge, MA: The MIT Press. Tropf, H. (1987). Sonority as a variability factor in second language phonology. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 173–191). Dordrecht: Foris. Vennemann, T. (1988). Preference Laws for Syllable Structure and the Explanation of Sound Change. Berlin: Mouton de Gruyter. Weinberger, S. (1987). The influence of linguistic context on syllable simplification. In G. Ioup & S. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 401–417). Rowley, MA: Newbury House.
The influence of voicing on the production of initial /s/-clusters by Brazilian learners* Jeanne Teixeira Rebello and Barbara O. Baptista Universidade Federal de Santa Catarina, Brazil
This paper examines the production of English initial /s/-clusters by Brazilian learners, considering as possible variable constraints cluster length, sonority relations within the syllable, and environment, based on studies by Carlisle with Spanish learners and on Eckman’s markedness differential hypothesis (MDH) and structural conformity hypothesis (SCH). Results were inconclusive regarding the influence of cluster length, and, due to transfer of native-language voicing assimilation, universal markedness regarding voicing appeared to play a greater role in determining the frequency of syllable simplification than markedness regarding sonority, both within the syllable and in relation to the environment. Thus, the study complements previous research by demonstrating the importance of L1 transfer in determining which universals will have the greatest influence on L2 syllable production.
.
Introduction
Since Eckman first proposed his markedness differential hypothesis (MDH, 1977), which predicts that second/foreign language (L2) learners will have difficulty with those structures which are both different from and more marked than corresponding structures in the native language (L1), there has been an abundance of studies on the learning of L2 syllable structure. These studies have dealt with three main aspects of phonological structure affecting production of L2 syllables: (a) the complexity of the syllable, that is, the existence or not of a coda and the number of elements in onsets and codas (Anderson 1987; Carlisle 1997, 1998, 2002; Eckman 1986, 1991; Ross 1994; Tarone 1980); (b) the actual constituents of the onsets and codas and relationships among them (Carlisle 1988, 1991b, 1992, this vol* A previous analysis of the same data was reported in Rebello (1997).
Jeanne Teixeira Rebello and Barbara O. Baptista
ume; Tropf 1986); and (c) the environments in which the onsets occur (Carlisle 1991a, 1992, 2002, this volume). The present study examines the influence of all three of these factors on the production of English initial /s/-clusters by Brazilian learners of English as a foreign language (EFL).1 The analysis was carried out from the phonological perspectives of universals of syllable structure (Greenberg 1965) and of sonority relations within and across syllables (Clements 1990; DziubalskaKolaczyk 1997; Hooper 1976;2 Murray & Vennemann 1983; Selkirk 1984), both of which imply markedness relationships. Prediction of difficulty in all cases is based on markedness, in accordance with the MDH, which considers transfer as well, and with the structural conformity hypothesis (SCH, Eckman 1991), which claims that those universals that hold for natural languages also hold for interlanguages. Portuguese syllable structure differs from that of English not only in terms of the number of elements which can occupy onsets and codas, but also in terms of the sequences which are permissible in each of these positions. For instance, onset clusters in Portuguese consist of no more than two consonantal segments while in English they can have up to three. In this respect, the Portuguese syllable is less marked than the English syllable, since it complies more closely with the universal CV syllable structure. Moreover, some English initial clusters (/sp, st, sk, spr, str, skr, spl, skw/) violate the sonority sequencing principle (SSP), according to which there should be a gradual increase in sonority from the margins to the nucleus of the syllable (see Clements 1990; Dziubalska-Kolaczyk 1997; Hooper 1976; Selkirk 1984). According to Hooper (1976), this makes them more prone to modification than clusters which do not violate this principle.3 Vowel insertion is a productive process in Brazilian Portuguese (BP) phonology. Within-word epenthesis is used to make otherwise non-conforming Portuguese syllables conform to BP syllable structure constraints (e.g., advogado is commonly pronounced with a vowel between the /d/ and the /v/). Analyses in the generative tradition mostly attribute the initial [i] or [I] in words such as escola (“school”) and espírito (“spirit”) to the process of prothesis, although there is an initial vowel in the spelling. Finally, foreign . In this study a distinction is made between English as a foreign language (EFL), where the language is learned in the classroom and not generally spoken among the residents of the country, and English as a second language (ESL), where the language is learned, in the classroom or not, and used in everyday life by the residents of the country. . Hooper actually used the opposite term consonantal strength in her hierarchy, where the strongest consonants would be the least sonorant. . According to most sonority hierarchies (e.g., Dziubalska-Kolaczyk 1997; Hooper 1976; Selkirk 1984), fricatives are higher in sonority than stops, which would mean that /s/+stop clusters follow an inverse sequence to that of the SSP. However, even considering hierarchies such as Clements’ (1990), which give fricatives and stops equal sonority levels, /s/+stop clusters do not increase in sonority toward the nucleus, as preferred by the SSP.
The influence of voicing
borrowings ending in obstruents such as clube (“club”) routinely undergo paragoge, with the appropriate changes in spelling. All three are known to be strategies used to deal with difficult L2 syllables. L2 syllable structure was investigated in this study based on universal markedness in relation to the three aspects of phonological structure mentioned above. First, regarding cluster length, it was hypothesized that, due to the influence of universal syllable structure (Greenberg 1965), learners would have greater difficulty in producing the more marked initial three-member /s/-clusters than the less marked two-member ones. Results in conformity with this prediction have been found by Anderson (1987) for both initial and final English clusters produced by Egyptian Arabic, Mandarin Chinese, and Amoy Chinese speakers of English; by Carlisle (1997, 1998, 2000) for English initial /s/-clusters produced by Spanish speakers; and by Eckman (1986, 1991) for English final clusters produced by Cantonese, Japanese, and Korean speakers. In regard to the second aspect – sonority relations among segments within the syllable – two hypotheses guided the analysis, the first concerning the comparison of obstruents with sonorants and the second comparing two kinds of sonorants. First, it was hypothesized that learners would have greater difficulty in producing initial /s/-clusters which violate the SSP (the /s/+stop clusters) than those clusters which do not (the /s/+sonorant clusters). This hypothesis has been supported by Carlisle (1991b, this volume) who found significantly more frequent prothesis in /st/ clusters than in /sn/ clusters. Another previous study which found internal markedness relationships within the syllable to have an influence on L2 production was Tropf (1986), who found more frequent modification in general of onsets and codas that violated the SSP than of those that did not. Major (1996), on the other hand, in a study on the production of various English clusters by Brazilians, found /sl/ clusters to cause greater error rates than /s/+stop clusters, which he attributed to the existence of /s/+stop clusters in Portuguese, while a sibilant followed by a liquid in Portuguese is always voiced ([zl]). Neither cluster occurs phonologically in word-initial position in Portuguese, but the /s/+stop clusters in fluent speech are frequently realized with deletion of the preceding vowel because of the voiceless stop (e.g., estou –> [sto~]). This would not occur with /s/+sonorant clusters because they are voiced (e.g., eslavo –> [izlavu]). The second hypothesis regarding the components of the cluster was based on the classification of liquids as being more sonorant than nasals (just about all sonority hierarchies agree on this, e.g., Dziubalska-Kolaczyk 1997; Hooper 1976; Selkirk 1984; Clements 1990). This implies that onset /s/+liquid clusters have a greater increase in sonority toward the nucleus than /s/+nasal clusters, making the former more acceptable to the SSP. Greenberg (1965) also classifies the /s/+liquid
Jeanne Teixeira Rebello and Barbara O. Baptista
clusters as the more preferred of the two in an implicational relationship4 and Carlisle (1988, this volume) found significantly more prothesis by Spanish speakers in the production of /s/+nasal clusters than /s/+liquid clusters. Thus, it was also hypothesized that, among the /s/+sonorant clusters, the /s/+liquid would yield a lower error rate than the /s/+nasal clusters. The third aspect of phonological structure investigated was the environment preceding the initial /s/-cluster. Following Carlisle (1991a, 1992, 1997, 2002), the following hypotheses were proposed: (a) consonantal environments would yield a greater rate of syllable simplification than vocalic environments, because a preceding vowel allows for resyllabification; (b) obstruents as context would yield more frequent simplification than sonorants, based on Murray and Venneman’s (1983) syllable contact law for diachronic change, which predicts more frequent resyllabification when the final consonant of the preceding syllable is less sonorant than the initial consonant of the following syllable; (c) voiced obstruents as context would yield more frequent simplification than voiceless obstruents, because voiced obstruents are more marked than voiceless obstruents in general, and especially in syllable-final position. Consonants in the preceding environment would be expected to be even more relevant for BP speakers learning English than for Spanish speakers because of the Brazilian tendency to produce epenthetic vowels after final consonants.
. Method Six Brazilian learners of English, one female and five males, from the extracurricular courses of a major Brazilian university, took part in this experiment. Their ages ranged from 19 to 31 and two participants were chosen from each of the third, sixth, and ninth semesters (lower intermediate, intermediate, and upper intermediate, respectively), each semester consisting of 45 class-hours, in order to look for tendencies that would be valid across levels. They were chosen according to the evaluation of their teachers, who considered each of them to be average within their class in regard to the accuracy of their pronunciation. However, proficiency level was not considered to be a variable in the study because there was no correspondence between their semester of study and rate of error (prothesis or other cluster-simplification strategies) in initial /s/-clusters. This can be seen in Table 1, where individual error rates vary from 45% to 75%, with the three lowest rates dis. Greenberg, based on an enormous survey of more than 100 languages, came up with many generalizations, or universals, concerning the likelihood of occurrence of various consonant clusters in the world’s languages. These generalizations are expressed in implicational terms; i.e., if A occurs in a language, then B also occurs, meaning that A is more marked than B.
The influence of voicing
Table 1. Rate of error in initial /s/-clusters per participant Semester Participant
P1
3rd P2
P3
6th P4
P5
9th P6
Total
Error N Rate
135 284 48%
212 281 75%
139 282 49%
193 280 70%
127 281 45%
167 285 59%
973 1693 57%
tributed among the three semesters. It is important to point out that, as opposed to the participants of most previous studies on L2 syllable structure, these students had little or no opportunity to speak English outside the classroom, and pronunciation was not dealt with on a systematic basis in any semester of the course. Also, the teachers of all semesters of the course were native speakers of BP, who may not have been good models concerning initial /s/-clusters. Each participant was recorded reading 312 sentences, each sentence containing a word with an initial /s/-cluster. There were 26 sentences for each of the 12 clusters /sp, st, sk, sm, sn, sl, sw, spr, str, skr, spl, skw/, in order to include each cluster in the following preceding phonological contexts: (a) each of the 21 consonants /p, t, k, b, d, g, f, θ, s, w, v, ð, z, Š, tw, dŠ, m, n, ], r, l/, (b) 4 of the 7 vowels and diphthongs /i, eI, u, o~, aI, a~, fI/ varied at random, and (c) sentence-initial (see Appendix for a sample of the test – the 26 sentences for the cluster /sp/). The /sw/ cluster was excluded from the analysis, as out of a total of 154 valid tokens, this cluster yielded only 2 occurrences of prothesis and no other syllable-simplification strategy, probably due to the occurrence of /sw/ sequences phonetically in BP.5 Excluding this cluster and the 23 sentences which were inadvertently omitted or whose target word was misread, a total of 1693 tokens were analyzed. Each participant read the sentences in a different order and sentences containing the same cluster did not occur sequentially. The recordings were made individually on a small Sony cassette tape recorder in a quiet room on campus, with three 2-minute breaks to avoid fatigue. Participants were asked to repeat the entire sentence whenever they paused or wanted to correct their pronunciation, in an attempt to ensure that the intended contexts actually acted as contexts. The researcher did not interrupt, however, to make sure that participants actually repeated problematic sentences, and many pauses did occur. The target word and the context word were transcribed by the first author, a native speaker of BP herself with experience in phonetic transcription and in teaching English to BP speakers. Approximately 10% of the items were also tran. Although the sequence su- in words such as sueco, suino, suor (“Swedish, pig, sweat”) is generally analyzed phonologically as a separate syllable – /su.7.ku/, /su.i.nu/, /su.fr/ – obeying a BP constraint on rising diphthongs (those that begin with the glide), it is often realized phonetically as a glide in fluent speech – [sw7.ku], [swi.nu], [swfr] (Collischonn 1999; Cristófaro Silva 2000).
Jeanne Teixeira Rebello and Barbara O. Baptista
scribed by the second author, a native speaker of American English who also has experience teaching English pronunciation to speakers of BP. Each transcriber listened to each sentence as many times as necessary. As there were no discrepancies between the two transcribers in the 10% of the items heard by both, the other 90% were transcribed only by the researcher. Prothesis was, as expected, the principal strategy used to simplify difficult clusters, but two alternative but infrequent cluster-simplification strategies were noted: omission of the /s/, which occurred only 9 times, and resyllabification by tacking the /s/ onto the previous syllable and pausing before the second member of the cluster (e.g., [m7gs . kr7twId] for Meg scratched) – 10 times. These strategies were included in the error count in most of the analyses, but the frequently occurring voicing of the /s/ was not, as it does not simplify the cluster. Prothesis was considered to have occurred only when a voiced vowel could be clearly heard before the cluster, since voiceless vowels are almost impossible to distinguish from aspiration. Likewise, the /s/ was considered to have been voiced only when it was clearly voiced from beginning to end. The context segment, when produced differently than expected (e.g., [wId] for with), was classified accordingly in the error count, accounting for some discrepancies in category totals. All pauses long enough to interfere with a smooth transition between the context segment and the /s/-cluster were transcribed and counted separately in the analysis of environment.6 In the other analyses pauses were not taken into consideration.
. Results . Influence of length of cluster To test the hypothesis regarding syllable length – that three-member clusters would cause more difficulty than two-member clusters – a comparison was made of each pair of /s/+stop and /s/+stop+approximant clusters, which yielded inconsistent . It was pointed out by an anonymous reviewer that epenthetic vowels are often devoiced in BP. This is true, in particular, for epenthetic vowels following voiceless consonants. Given the difficulty of identifying voiceless vowels, in particular in recorded material, an attempt was made to compensate for this difficulty by separating, in the analysis of context influence, all productions with even minor pauses before the cluster, with the idea that even a voiceless vowel will take time and interfere with the smooth transition between the context segment and the target /s/-cluster. While this does not change the fact that imperceptible voiceless vowels may not have been counted, it should at least keep them from interfering in the comparison of vowel versus consonant contexts. When only the cluster itself was being considered, tokens with pauses were included, as it was assumed that all clusters had the same chance of having imperceptible voiceless epenthetic vowels.
The influence of voicing
Table 2. Frequencies and rates of error of two-member vs. three-member obstruent clusters
/sp/ vs. /spr, spl/ /st/ vs. /str/ /sk/ vs. /skw, skr/ Total
N
/sC/ Errors
%
N
/sCC/ Errors
%
152 152 156 460
80 77 93 250
53 51 60 54
308 154 306 768
194 90 169 453
63 58 55 59
results, as shown in Table 2. The three-member clusters /spC/ and /stC/ yielded higher rates of syllable simplification than two-member /sp/ and /st/ respectively, as expected, but chi-square tests with the Yates correction factor7 showed only the difference between the first pair to be significant (χ2 (1, N = 460) = 9.474, p < .01). The third pair unexpectedly yielded a higher rate of error for the two-member cluster /sk/, though this result was also insignificant. Taking all /s/+obstruent clusters together, the longer clusters did yield more simplification, but the same chi-square test showed this overall difference to be insignificant as well (χ2 (1, N = 1228) = 2.341, p > .20). Although the only significant results, those for the /sp/-/spC/ pair, do support the hypothesis that longer clusters are more difficult to produce than shorter ones, cluster length does not seem to have had much influence on the results for the other two pairs. This is not in accord with results obtained in the studies reviewed in the introduction (Carlisle 1997, 1998, 2000; Anderson 1987; Eckman 1986, 1991), all of which showed cluster length to be important. The lack of significance could be related to the small number of subjects, but this was compensated for by a large number of tokens, which should have allowed tests to show significance, and several of the previous studies were carried out on few subjects as well. The proficiency level could be a factor, but it was not much different from that of the participants in Carlisle’s studies. An important difference between this study and the previous ones is the fact that the participants were all classroom EFL learners not living in an English-speaking country. It is quite likely that the classroom learners of this study were less fluent and therefore spoke more slowly, which may somehow have reduced the influence of cluster length. Also, the artificial nature of the foreign language classroom might have reduced the influence of markedness factors by not giving the learners the opportunity to tap into universal grammar.
. In this and all subsequent 2 by 2 comparisons the chi-square was carried out with the Yates Correction Factor.
Jeanne Teixeira Rebello and Barbara O. Baptista
Table 3. Frequencies and rates of error of two-member /s/+sonorant and /s/+obstruent clusters
N Errors %
/sm/
/sn/
Total /sN/
/sl/
Total /s/+son
/sp/
/st/
/sk/
Total /s/+obs
156 107 69
153 96 63
309 203 66
156 96 61
465 299 63
152 80 53
152 77 51
156 93 60
460 250 54
. Influence of the sonority sequencing principle With regard to sonority sequencing within the syllable, the two-member /s/+obstruent clusters (considered to be in violation of the SSP) were analyzed in relation to the /s/+sonorant clusters, which are all two-member clusters not in violation of the principle. In addition, within the second cluster type, /s/+nasals were analyzed in relation to the /s/+liquid clusters. Table 3 shows that all three clusters not in violation (/sl, sm, sn/) actually caused higher error rates than all three clusters in violation (/sp, st, sk/). A chi-square comparing the totals of the two types of cluster shows the difference of nine percentage points to be significant (χ2 (1, N = 925) = 9.08, p < .01). While the results regarding the /s/+obstruent versus /s/+sonorant clusters were contrary to expectations, Table 3 shows the expected tendency for a greater error rate for the /s/+nasal clusters versus the /s/+liquid clusters. However, this difference yielded a non-significant chi-square (χ2 (1, N = 465) = .610, p> .20). Thus, it cannot be said that these results support either of the hypotheses within the perspective of the SSP (although they are not inconsistent with the second hypothesis) or that they corroborate the results of Carlisle’s studies. While no obvious explanation can be given for the lack of significance regarding the /s/+nasal clusters versus the /s/+liquid clusters, the results regarding /s/+obstruent versus /s/+sonorant should not actually be surprising, especially if we recall Major’s (1996) results with Brazilian learners. The explanation is quite likely to be related to the voicing of /s/ in /s/+nasal and /s/+liquid clusters. In BP, both within-word and cross-word anticipatory assimilation determines the voicing specification of sibilants followed by consonants, giving [ezmera~da]8 as the pronunciation for the word esmeralda (“emerald”) and a contrasting pronunciation of the article in pairs of phrases such as os gatos (“the cats”) – [uzgatus] – and os carros (“the cars”) – [uskaxus] (Cristófaro Silva 2000; Mateus & d’Andrade 2002). Although the BP voicing assimilation process is applied to syllable-final . The BP sibilants are being transcribed as alveolar, for ease of comparison with English, but in many dialects they are palatalized and/or the two points of articulation can be in free or positional variation.
The influence of voicing
sibilants and the English sibilants are phonetically syllable-initial, many recent generative analyses (Kenstowicz 1993; Roca & Johnson 1999) consider the /s/ of English initial /s/-clusters to be extrasyllabic, which would allow it to undergo processes normally limited to coda sibilants. In addition to the markedness of the resulting voiced cluster, the initial vowel is frequently deleted before the voiceless /s/+obstruent clusters in quickly spoken BP, whereas vowel deletion does not occur before the voiced /s/+sonorant cluster. Although Spanish also has within-word and across-word anticipatory voicing assimilation of the sibilant in dialects without aspiration of the /s/ (e.g., [lohÁgatoh] for los gatos would inhibit this voicing),9 there are several differences between the two languages regarding this process. First of all, in BP the voicing assimilation causes a neutralization of a phonological voicing distinction between the sibilants /s/ and /z/, which normally contrast in onset position in words such as saga (“saga”) and zaga (“fullback position”). Where voicing assimilation occurs in Spanish, the resulting voiced [z] is a positional allophone of the only sibilant phoneme /s/, the words saga and zaga (same meanings as in Portuguese) being homophones pronounced [saga], except in European varieties where the “z” is pronounced [θ]. Second, while sibilant voicing assimilation before voiced consonants is considered to be absolute in BP, it is considered by Bradley (2006) and Bradley and Del Forge (2006) to be “gradient and variable” in “conservative dialects” (those without aspiration of the sibilant) of Spanish. Third, across-word voicing assimilation in BP is also applied before vowels (e.g., [aÁzazas] for as asas – “the wings”), whereas in most dialects of Spanish this does not occur, causing a lack of distinction between pairs such as has ido (“you have gone”) and ha sido (“has been”), both being produced as [aÁsido] (examples from Bradley 2006). Whereas Major (1996) reports voicing of the sibilant in English /sl/ clusters by his Brazilian participants, Carlisle does not report voicing assimilation in any of his studies with Spanish speakers. Table 4 shows the percentages of voiced and voiceless sibilants and their respective rates of prothesis. The other cluster simplification strategies were not included in this error count because an omitted /s/ can obviously not be voiced and an /s/ separated from the following sonorant by a pause would be unlikely to undergo assimilation to that sonorant (and none did). As the table shows, more than half of the /s/+sonorant clusters underwent voicing assimilation and those that did yielded a rate of prothesis more than twice that of those that did not, resulting in a very significant chi square (χ2 (1, N = 461) = 73.843, p < .0001). Thus, it seems quite clear that there is a relationship between voicing and prothesis.
. As helpfully pointed out by an anonymous reviewer.
Jeanne Teixeira Rebello and Barbara O. Baptista
Table 4. Frequencies and rates of prothesis (excluding four tokens with other simplification strategies) of /s/+sonorant clusters with and without voicing of /s/
N Prothesis %
Voiced
Voiceless
Total
268 212 79
193 76 39
461 288 62
The apparent influence of voicing can be related to Greenberg’s implicational universals regarding consonant clusters. According to Greenberg (1965), in addition to the general preference for voiceless clusters over voiced clusters, there is an implicational relationship in initial systems favoring voiceless obstruent + nasal over voiced obstruent + nasal and favoring voiceless obstruent + semivowel over voiced obstruent + semivowel. His data were insufficient to make a similar generalization about obstruent + liquid, but this combination is likely to follow the tendency of voiceless clusters being favored over voiced clusters. The greater markedness of voiced clusters resulting from the voicing assimilation produced by BP speakers would account for the fact that the vowel is not deleted from these clusters in BP, and thus also for the greater rate of prothesis for these structures in the English of Brazilian learners. The frequent deletion of initial vowels before /s/+obstruent clusters that occurs in BP is not a common occurrence in Spanish because of its syllable-timed rhythm. The influence of this markedness by voicing might neutralize or even take precedence over the influence of markedness by sonority for BP learners of English, thus leading to a greater frequency of prothesis in the clusters predicted by the SSP to be easier and found by Carlisle (1991b) to be easier for Spanish speakers. These results highlight the fact that markedness is not an isolated influence in L2 phonology, but rather is a factor that at times interacts with transfer from the L1, especially where two kinds of markedness are in conflict. Thus, although the results do not support the influence of the SSP, they can be considered to be in accordance with both the MDH and the SCH (although the SCH does not consider transfer), since the more marked voiced clusters caused more prothesis than the less marked voiceless clusters. . Influence of environment The hypotheses regarding the environment led to three different comparisons of context: (a) vowel versus consonant versus sentence-initial position, as in Carlisle (1991a, 1992, 2002, this volume); (b) obstruent versus sonorant; and (c) voiced versus voiceless obstruent. Because of the frequency with which the participants paused during their reading, the category pause was added to the first comparison. In fact, the frequency of pauses before the initial /s/-cluster was so great (approxi-
The influence of voicing
Table 5. Frequency and rate of pauses by type of cluster /s/+obs
/s/+son
/sCC/
Total
460 113 25
465 123 26
768 188 24
1693 424 25
N Pauses %
Table 6. Frequencies and rates of error in the contexts of sentence-initial, pause, vowel, and consonant
N Errors %
Initial
Pause
Vowel
Consonant
Total
66 34 52
424 331 78
222 128 58
981 498 51
1693 991 59
mately one-fourth of all tokens) that it is perhaps important to examine this result first, to see whether there is a pattern to the pauses, since they could be caused by the difficult clusters themselves, unfamiliar vocabulary, or simple lack of fluency. Table 5 shows that the pauses were very equally distributed among the three types of cluster (/s/+obstruent, /s/+sonorant, and /s/+obstruent+sonorant) resulting in a non-significant chi-square (χ2 (2, N = 1693) = 1.014, p> .20), leaving vocabulary and/or lack of fluency as the most likely explanations. Having ruled out a link between cluster type and pauses, a comparison can now be made among the four environments sentence-initial, pause, vowel, and consonant. Table 6 shows very little difference in the error rates for these contexts with the exception of pauses, which yielded a much higher percentage. A comparison of the three non-pause environments yields a non-significant chi-square (χ2 (2, N = 1269) = 3.69, p> .20), while a comparison of pauses with non-pauses gives a very significant chi-square (χ2 (1, N = 1693) = 87.87, p < .0001). The former result, like the result regarding cluster length, is contrary to results of previous studies such as Carlisle (1991a, 1994), which found the frequency of prothesis to be higher after word-final consonants than after word-final vowels; Tarone (1980), who found a higher frequency of paragoge before word-initial consonants than word-initial vowels; and Baptista and Silva Filho (this volume), whose results were similar to Tarone’s. The extremely high error rate before pauses probably indicates only that when learners are having fluency difficulties, they are more likely to fall back on nativelanguage syllable structure. The reason for the discrepancy in the results of this study regarding vowels versus consonants is not exactly clear, but it might have to do with the choice of vowels, especially if compared with Carlisle’s studies on the same onsets. In the present study, the vowel contexts were all either tense vowels or diphthongs, mostly within monosyllabic content words which would most likely
Jeanne Teixeira Rebello and Barbara O. Baptista
Table 7. Frequencies and rates of error in the environments of sonorant consonants and voiced and voiceless obstruents
N Errors %
+vd obstruents
–vd obstruents
Obstruents
Sonorants
340 236 69
401 130 32
741 366 49
240 132 55
be stressed, such as go, do or boy. A stressed vowel seems like an unlikely candidate to resyllabify a complex onset of another stressed syllable, whereas an unstressed reduced vowel appears to lend itself to resyllabification, as epenthetic vowels are normally unstressed. On the other hand, a phrase like go skateboarding flows quite naturally with the insertion of an additional vowel – [go~IskeItbordI]] – compared to the phrase the skateboard, in which an additional vowel would cause an awkward sequence of two unstressed vowels – [ð6IskeItbord]. In addition to the awkwardness of the latter phrase, it would be much more difficult for a transcriber to perceive an additional unstressed vowel between a reduced vowel and the onset, which is why tense vowels and diphthongs were chosen for this study in the first place. Thus, if Carlisle did not avoid reduced vowels in his studies investigating the influence of environment, this would be a likely explanation for the difference in findings between this study and his. The results of the second and third comparisons regarding phonological environment – obstruent versus sonorant and voiced versus voiceless obstruent – can both be seen in Table 7. In relation to the former comparison, there was a small difference between the error rate obtained for the obstruents and that for the sonorants, resulting in a non-significant chi-square (χ2 (1, N = 981) = 2.062, p > .20). In relation to the latter comparison, voiced obstruents yielded a much higher error rate than voiceless obstruents, resulting in a very significant chi-square (χ2 (1, N = 1142) = 29.825, p < .001). Returning to the question of vowels (Table 6), although they yielded a higher error rate than voiceless obstruents and even than sonorants, this rate was significantly smaller than that of voiced obstruents (χ2 (1, N = 961) = 4.900, p < .05), which highlights the difficulty of the latter context. In Carlisle’s (1991a) study neither the obstruent versus sonorant nor the voiced versus voiceless comparison yielded significance, and thus neither could be considered a variable constraint influencing the rate of prothesis. In this study, however, while sonority of the context segment was not shown to be relevant, corroborating Carlisle, voicing of the context segment did act as an important variable constraint. Since Baptista and Silva Filho also found Brazilians to produce more frequent paragoge after final voiced than voiceless obstruents, initial /s/-clusters in this environment actually have two triggers for the insertion of a vowel – the previous consonant and the cluster itself. In sum, environment was shown to exert
The influence of voicing
influence on the production of syllable simplification in this study only in terms of pause versus non-pause and in terms of voicing, but not in terms of sonority.
. Conclusion This study has yielded some expected and some unexpected results regarding what can be considered variable constraints in the production of English initial /s/-clusters by the Brazilian learners of this study. First, unexpectedly, the results were inconclusive regarding the importance of cluster length, possibly because of the greater importance of other variables or simply because the fluency level of the participants of this study was insufficient for cluster length to make a difference. Second, relative error rates for /s/+obstruent clusters, expected to be higher than those for /s/+sonorant clusters because of the SSP, were shown to be significantly lower. This was apparently because of the transfer of voicing assimilation, which resulted in voiced clusters, considered to be universally more marked than voiceless clusters. Third and also related to the SSP, /s/+nasal clusters predictably yielded more frequent errors than /s/+liquid clusters, but the difference was not statistically significant, showing again the limited influence of the SSP for the Brazilian learners. Finally, of the three variables concerning phonological environment expected to play an important role in determining frequency of syllable simplification – vowel versus consonant, obstruent versus sonorant, and voiced versus voiceless obstruents – only the prediction regarding the third variable was borne out: preceding voiced obstruents did yield the expected higher frequency of prothesis in relation to voiceless obstruents. Regarding the first environment variable, the expected lower error rate for preceding vowels was not found. In fact, the only context segments associated with a greater error rate than vowels were the voiced obstruents. If this surprising result was due to the choice of stressed tense vowels in the context, as conjectured, this would indicate that it is not all vowels that facilitate the pronunciation of difficult initial clusters. As to the third environment variable, it was surprising to find that preceding sonorants yielded more frequent error than obstruents, though not significantly so, and this was shown not to be true for the voiced obstruents. Thus, both within the syllable and across syllables, the voicing feature has been shown to be the most important variable constraint for the production of initial /s/-clusters by these Brazilian learners. The results of this study support the MDH and the SCH in part, but neither of these hypotheses could have predicted the interaction between universals and transfer found in this study, where transfer of a native-language process actually mediated in the determination of which markedness relationships would be important. For Brazilian learners the SSP was shown to have limited influence on
Jeanne Teixeira Rebello and Barbara O. Baptista
L2 difficulty, as were sonority relations across syllables. However, implicational universals regarding voicing, both in the onset cluster itself and in the context segment, were shown to be of paramount importance. Perhaps an addendum could be added to the SCH, that the primary language universals that will be relevant to a particular interlanguage may be determined by processes which are productive in the native language.
Acknowledgements We would like to thank an anonymous reviewer for valuable comments on the manuscript.
References Anderson, J. I. (1987). The markedness differential hypothesis and syllable structure difficulty. In G. Ioup & S. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 279–291). Cambridge, MA: Newbury House. Baptista, B. O. & Silva Filho, J. L. A. (2006). The influence of markedness and syllable contact on the production of English final consonants by EFL learners. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English interphonology. Amsterdam: John Benjamins. Bradley, T. G. (2006). Sibilant voicing in highland Ecuadorian Spanish. Lingua(gem), 2 (2), 9–42. Bradley, T. G. & Del Forge, A. M. (2006). Systemic contrast and the diachrony of Spanish sibilant voicing. In R. Gess & D. Arteaga (Eds.), Historical Romance Linguistics: Retrospectives and perspectives (pp. 19–52). Amsterdam: John Benjamins. Carlisle, R. S. (1988). The effect of markedness on epenthesis in Spanish/English interlanguage phonologoy. Issues and Developments in English and Applied Linguistics, 3, 15–23. Carlisle, R. S. (1991a). The influence of environment on vowel epenthesis in Spanish/English interphonology. Applied Linguistics, 12, 76–95. Carlisle, R. S. (1991b). The influence of syllable structure universals on the variability of interlanguage phonology. In A. D. Volpe (Ed.), The Seventeenth LACUS Forum 1990 (pp. 135–145). Lake Bluff, IL: Linguistic Association of Canada and the United States. Carlisle, R. S. (1992). Environment and markedness as interacting constraints on vowel epenthesis. In J. Leather & A. James (Eds.), New sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-Language Speech (pp. 65–75). Amsterdam: University of Amsterdam. Carlisle, R. S. (1994). Markedness and environment as internal constraints on the variability of interlanguage phonology. In M. Yavas (Ed.), First and Second Language Phonology (pp. 223– 249). San Diego, CA: Singular. Carlisle, R. S. (1997). The modification of onsets in a markedness relationship: Testing the interlanguage structural conformity hypothesis. Language Learning, 47, 327–361. Carlisle, R. S. (1998). The acquisition of onsets in a markedness relationship: A longitudinal study. Studies in Second Language Acquisition, 20, 245–260.
The influence of voicing
Carlisle, R. S. (2002). The acquisition of two and three member onsets: Time III of a longitudinal study. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech (pp. 42–47). Klagenfurt: University of Klagenfurt. Carlisle, R. S. (2006). The sonority cycle and the acquisition of complex onsets. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English Interphonology. Amsterdam: John Benjamins. Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M. Beckman (Eds.), Papers in Laboratory Phonology I (pp. 183–333). Cambridge: CUP. Collischonn, G. (1999). A sílaba em português. In L. Bisol (Ed.), Introdução a Estudos de Fonologia do Português Brasileiro (pp. 91–123). Porto Alegre: EDIPUCRS. Cristófaro Silva, T. (2000). Fonética e Fonologia do Português. São Paulo: Contexto. Dziubalska-Kolaczyk, K. (1997). ‘Syllabification’ in first and second language. In J. Leather & A. James (Eds.), New sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 69–78). Klagenfurt: University of Klagenfurt. Eckman, F. R. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330. Eckman, F. R. (1986). The reduction of word-final consonant clusters in interlanguage. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 143–162). Dordrecht: Foris. Eckman, F. R. (1987). Markedness and the contrastive analysis hypothesis. In G. Ioup & S. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 55–69). Cambridge, MA: Newbury House. (Reprinted from Language Learning, 27, 315–330, 1977.) Eckman, F. R. (1991). The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–41. Greenberg, J. (1965). Some generalizations concerning initial and final consonant clusters. Linguistics, 18, 5–34. Hooper, J. (1976). Introduction to Natural Generative Phonology. New York, NY: Academic Press. Kenstowicz, M. (1993). Phonology in Generative Grammar. Oxford: Blackwell. Major, R. C. (1996). Markedness in second language acquisition of consonant clusters. In R. Bayley & D. R. Preston (Eds.), Second Language Acquisition and Linguistic Variation (pp. 71– 96). Amsterdam: John Benjamins. Mateus, M. H. & d’Andrade, E. (2002). The Phonology of Portuguese. Oxford: OUP. Murray, R. W. & Vennemann, T. (1983). Sound change and syllable structure in Germanic phonology. Language, 59, 514–528. Rebello, J. T. (1997). The acquisition of English initial /s/-clusters by Brazilian EFL learners. In J. Leather & A. James (Eds.), New sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 336–342). Klagenfurt: University of Klagenfurt. Roca, I. & Johnson, W. (1999). A Course in Phonology. Oxford: Blackwell. Ross, S. (1994). The ins and outs of paragoge and apocope in Japanese-English interphonology. Second Language Research, 10, 1–24. Selkirk, E. (1984). On the major class features and syllable theory. In M. Aronoff & R. T. Oehrle (Eds.), Language Sound Structure (pp. 107–136). Cambridge, MA: The MIT Press. Tarone, E. (1980). Some influences on the syllable structure of interlanguage phonology. International Review of Applied Linguistics, 18, 139–152.
Jeanne Teixeira Rebello and Barbara O. Baptista
Tropf, H. (1986). Sonority as a variability factor in second language phonology. In A. James & J. Leather (Eds.), Sound Patterns in Second Language Acquisition (pp. 173–191). Dordrecht: Foris.
Appendix Sentences for the cluster /sp/. /i/ /eI/ /u/ /o~/ /p/ /t/ /k/ /s/ /w/ /f/ /tw/ /θ/ /b/ /d/ /g/ /z/ /Š/ /v/ /ð/ /dŠ/ /m/ /n/ /]/ /r/ /l/ Ø
He speaks with the girls all the time. They spoilt everything. How do you spell your name? No spitting on the floor. That map specially attracted me. Do not speed up, please. The black spades are in the big box. His father is a famous speech therapist. Your rush spoilt everything. People’s life span in Brazil is not very long. I don’t like such spoilt children. The twentieth Spanish person on the list is Carlos. The whole mob spoke to the minister. A mad sponsor decided to pay for the conference. The big spatula is for the icing of the cake. You should always specify what you want. She bought a beige spaniel. Lots of spectators were standing by the gate. The children loathe spinach. We don’t give the judge special privileges. There are some special effects in that film. This has been done specially for you. They sang spectacular songs. The beggar spelt his name. Several species of mammal are in danger of extinction. Spaghetti is my favorite dish.
Production of English initial /s/-clusters by speakers of Brazilian Portuguese and Argentine Spanish Andréia Schurt Rauber Universidade Federal de Santa Catarina, Brazil
The objective of this study was to compare the influence of cluster length, the sonority sequencing principle (SSP), and environment on the production of English initial /s/-clusters by Argentine Spanish and Brazilian Portuguese speakers. Findings suggest an interaction of universal markedness and native language (NL) transfer, the latter mediating the type and degree of influence of the former. The Spanish and Portuguese speakers were shown to follow similar tendencies regarding the influence of cluster length, which was weak for both groups; quite different tendencies concerning the influence of the SSP, which was tempered for the Portuguese speakers by the transfer of a NL voicing assimilation process; and quite different tendencies regarding the influence of the environment.
.
Introduction
Several studies have investigated the production of English initial /s/-clusters by Spanish (Carlisle 1991a, 1991b, 1992 , 1997, this volume) and Portuguese speakers (Rebello 1997; reanalyzed in Rebello & Baptista this volume). In these studies, the addition of an extra vowel to the initial clusters (prothesis) was the usual strategy for dealing with syllable structure difficulty, given that in both Portuguese and Spanish /s/-clusters are invariably preceded by a vowel. However, different instruments were used by Carlisle and Rebello, making comparison of the results for the two native languages difficult. Thus, the objective of this study was to investigate how intermediate Brazilian Portuguese (BP) and Argentine Spanish (AS) speakers of English as a foreign language (EFL) produce words containing initial /s/-clusters, taking into account the variables of length and structure of the clusters, as well as the phonological context. Based on the results obtained by Carlisle and by Rebello and Baptista, the hypotheses for this study were the following: (a) AS speakers would tend to modify longer clusters, while for BP speakers,
Andréia Schurt Rauber
length would have minimal and inconsistent influence; (b) AS speakers would tend to modify structures violating the sonority sequencing principle (SSP) more frequently, while BP speakers would modify those structures less frequently; (c) both AS and BP speakers would produce more prothesis with the more marked /s/+nasal clusters than with the less marked /s/+lateral; (d) AS speakers would tend to produce prothesis most frequently in the context of consonants, followed by a vocalic context, and least frequently after silence (sentence-initial), while the only significant variable concerning environment for the BP speakers would be voicing. The remainder of this article is divided into four sections. The next section presents a brief review of the literature on the syllable structures of English, Portuguese, and Spanish, then briefly summarizes the findings of Carlisle and of Rebello and Baptista on the production of interlanguage complex onsets. The third section describes the method adopted to collect and analyze the data. The fourth section provides the analysis and discussion of the results. Finally, the last section summarizes the conclusions of the various analyses of this study and discusses their theoretical implications.
. Review of the literature The definition of the syllable has been a much-debated issue over the last three decades. In order to define the syllable as a phonological unit, Hooper (1976: 197) proposed a universal condition on syllable structure, called the syllable structure condition (SSC), which determines that an obligatory vowel must occupy the syllable peak or nucleus, and that the strongest consonantal elements are to be found at the margins of the syllables. More recent work refers to degree of sonority, rather than consonantal strength, the terms sonority and strength being in opposition, and Hooper’s SSC is more often referred to as the sonority sequencing principle (SSP, Clements 1990). One exception to the SSC or the SSP is the sequence of consonants formed by a sibilant plus a stop found in the onsets of English syllables. In these clusters, the /s/ occupies a marked position, since the stops are less sonorous than the sibilant, so the cluster decreases in sonority from the margin toward the peak. The /s/+stop clusters are exceptional even in English: they are the only instances of onsets where the second consonant may be an obstruent and where the onset may be formed by three consonants instead of one or two. This exception partly explains the difficulty Brazilian Portuguese and Spanish speakers often have to produce English initial /s/-clusters, since the syllabic structures of Portuguese and Spanish permit no violations of the SSP. According to Brinton (2000: 65), the English syllable may be represented as (C)(C)(C)V(C)(C)(C)(C), which means that the onset may contain up to three consonants and the coda may be formed by one to four consonants. One example
Production of English initial /s/-clusters
of a CCCVCCCC English syllable is the word strengths [stre]kθs]. Thus, the great variety of English syllable structure types would be expected to cause difficulty for EFL learners whose NL syllable structure is less flexible. The possible sequences of segments within the syllable structures of both Portuguese and Spanish are more limited than in English. The phoneme /s/, for instance, occurs in syllable-initial position in all three languages; however, it forms initial clusters only in English. When the /s/ is followed by a consonant in Portuguese or Spanish, it is always preceded by a vowel, making the /s/ the coda of the first syllable and the following consonant the onset of the following syllable, as in escola/escuela (school) – [iskfla] and [eskwela] respectively. This is why both BP and Spanish speakers tend to include an extra vowel before English initial /s/-clusters. This prothetic vowel may differ according to the native language. BP speakers, for instance, may produce [iskul] or [6skul] for school whereas Spanish speakers generally produce [eskul] or [7skul]. The addition of a prothetic vowel as the preferred strategy to simplify these difficult clusters can be considered to result from native language (NL) transfer, and difficulty itself can be predicted by the markedness differential hypothesis (MDH), proposed by Eckman (1987). According to the MDH, language universals and NL transfer predict difficulties in target language (TL) learning. This means that TL structures which are different from and more marked than the corresponding NL structures will be difficult to learn and that “the relative degree of difficulty ... will correspond to the relative degree of markedness” (1987: 61). Katamba (1989: 98) defines markedness in terms of naturalness: “what is natural can be said to be unmarked, and what is not natural can be said to be marked, i.e. in some sense unusual.” While the MDH makes predictions on the basis of both universals and differences between the NL and the TL, a more recent theory proposed by Eckman (1991), the structural conformity hypothesis (SCH), makes predictions only on the basis of universals; that is, it considers the tendency interlanguages have to follow the same universal principles that primary languages do. Both the MDH and the SCH were used in this study to provide an explanation for the difficulty BP and Spanish speakers have in accurately producing /s/-clusters. In order to investigate the linguistic variables affecting the frequency of prothesis production by Spanish-speaking learners in English initial /s/-clusters, Carlisle carried out several studies involving native Spanish-speaking learners of English as a second language, who were asked to read a number of topically unrelated and randomly ordered sentences containing initial /s/-clusters in different environments. He examined (a) the influence of environment on the production of English /sk/, /st/ and /sp/ (1991a); (b) the interaction of the influence of sonority sequencing and environment on the production of /st/ and /sl/ (1991b); (c) the influence of sonority sequencing on the production of prothesis before the wordinitial onsets /sl/, /sm/ and /sn/ (1992); and (d) the influence of cluster length
Andréia Schurt Rauber
on the production of /sC/ versus /sCC/ clusters (1997). All studies controlled the environments before the onsets and the sonority relationships among the consonants in the onsets. The results of these studies revealed that (a) vowel prothesis was significantly more frequent after consonants than after vowels (1991a); (b) prothesis was more frequent in the more marked /st/ cluster than in the less marked /sl/ (1991b); (c) prothesis was more frequent in the more marked /s/+nasal clusters than in the less marked /sl/; and (d) the more marked tri-literal clusters were more frequently modified than the bi-literal clusters (1997), thus confirming that language universals “influence the structuring of interlanguage phonology” (1997: 327). Rebello (1997; Rebello & Baptista this volume) adopted a type of instrumentation similar to that of Carlisle, but investigated BP-speaking classroom learners of English. In her instrument, their included both bi-literal and tri-literal English initial /s/-clusters and obtained results which were less conclusive and partly contrary to those of Carlisle (1991a, 1992, 1997). Concerning the phonological environment, the only significant finding was that prothesis was more frequent after unplanned pauses, indicating mostly processing difficulty. The three environments vowel, consonant, and sentence-initial yielded insignificant differences in rate of prothesis. Their results concerning cluster length were inconsistent among the three /sC/ versus /sCC/ pairs, the /sp(C)/ pair being the only one to obtain significantly more prothesis for the longer cluster. The other two pairs yielded insignificant results, the /st(C)/ pair in the expected direction (more prothesis for the longer cluster) and the /sk(C)/ pair in the unexpected direction (more for the shorter one). As to the structure of the word-initial onsets, Rebello and Baptista found that bi-literal clusters not in violation of the SSP (/s/+sonorant) were more frequently modified than bi-literal clusters in violation, a difference that yielded statistical significance. These findings are contrary to predictions based on the SSP, but are explained by Rebello and Baptista as being a result of transfer of the voicing assimilation process from BP (or of the already voiced NL /VzC(C)/ chunks): participants tended to voice the /s/ in /s/+nasal and /s/+liquid clusters, which resulted in voiced obstruent + sonorant clusters. Since voiced obstruents are more marked than voiceless obstruents in any position, these clusters are more marked than voiceless obstruent + obstruent clusters (Greenberg 1965). They also point out that in BP the initial vowel preceding /s/+obstruent clusters can be deleted in rapid speech, while this vowel deletion does not occur before /s/+sonorants because of their voiced status. Thus, they considered these results to conform to the MDH and the SCH, since it was the more marked voiced cluster that obtained a greater rate of prothesis. Rebello and Baptista also found more prothesis with /s/+nasal clusters than with the /s/+lateral, but this difference was not significant. The difference between Carlisle’s and Rebello and Baptista’s results was one of the motivations for the present study, in which the same instrument was used for both BP and Spanish
Production of English initial /s/-clusters
speakers, in an attempt to confirm whether the NL does have such an influence on TL production as to result in opposite findings.
. Method . Participants The native AS-speaking group consisted of two men and seven women with a mean age of 20, all undergraduate English majors in their first or second year at a university in Argentina. The native BP-speaking group consisted of two men and eight women with a mean age of 28, all second or third year undergraduate English majors at a Brazilian university. Participants in both countries had an English proficiency level approximately equivalent to Cambridge First Certificate, or ALTE level 3 (the Argentine university required greater proficiency on entry, accounting for the difference in year of university study), and all of them reported that they did not have opportunities to speak English outside the classroom. One of the Brazilian participants had lived in the USA for 5 months and another for 6 months. Two of the Brazilian participants spoke a language other than Portuguese at home (one Italian and the other Spanish). None of the Argentine participants had either lived abroad or spoke a language other than Spanish at home. . Material As in Carlisle and Rebello, the participants were asked to read topically unrelated sentences in the language laboratory of their respective universities. Sentence reading was preferred to ensure that all relevant phonological contexts were included in the corpus. The instrument included 13 sentences for each of the bi-literal and tri-literal /s/-clusters /sp, st, sk, sw, sm, sn, sl, spr, str, skr, spl, skw/, each cluster preceded five times by vowels, five times by consonants, and three times by silence (sentence-initial, the null context). This gave a total of 156 target sentences, to which were added 24 distractor sentences, making 180 in all. Each subject read the sentences in a different order to prevent a possible ordering effect. Some of the sentences used in this study were taken with permission from Rebello’s instrument. . Transcription Only the part of each sentence considered relevant to this study was transcribed. Three aspects were focused on: the absence or presence of the prothetic vowel, the phonetic realization of the preceding environment, and the phonetic representation of the onsets. The relevant sections of each sentence were transcribed by two
Andréia Schurt Rauber
judges with experience in phonetic transcription and only the 94.74% of items on which there was agreement were included in the statistical analysis. Two types of modification besides prothesis occurred, although less frequently: deletion of one of the members of the cluster (in 1.81% of the items), and substitution of one of the members of the cluster for another phoneme (in 0.71% of the items). These items were excluded from the analysis. In order to avoid pauses between the target word and the word preceding it, the participants were asked to reread the sentence whenever there was hesitation, and only the sentences with no pauses were transcribed.
. Results and discussion The comparison of the results for BP- and AS-speaking learners of English are reported and discussed below, with analysis of the influence of the variables cluster length, structure of cluster (/s/+sonorant versus /s/+obstruent and /s/+nasal versus /s/+lateral) and environment. . Length of cluster Hypothesis 1 predicted that AS speakers would tend to modify longer clusters more frequently than shorter ones, whereas BP speakers would modify both types of cluster at similar rates. Tables 1 and 2 show that for both groups of speakTable 1. BP speakers’ rates of prothesis production for /sC/ versus /sCC/ clusters N /sp/ vs. /spr, spl/ /st/ vs. /str/ /sk/ vs. /skw, skr/ Total
125 129 115 369
/sC/ Prothesis 33 47 34 114
%
N
*26.40 36.43 29.57 30.90
248 129 250 627
/sCC/ Prothesis 94 50 98 242
% *37.90 38.76 39.20 38.60
* Significant at p < .05
Table 2. AS speakers’ rates of prothesis production for /sC/ versus /sCC/ clusters N /sp/ vs. /spr, spl/ /st/ vs. /str/ /sk/ vs. /skw, skr/ Total
114 110 109 333
/sC/ Prothesis 30 41 29 100
%
N
26.32 37.27 26.61 30.00
211 113 218 542
/sCC/ Prothesis 63 49 77 189
% 30.81 43.36 35.32 34.90
Production of English initial /s/-clusters
ers and for all three pairs of /sC/ versus /sCC/ clusters, the longer clusters were consistently modified at a higher rate than the shorter ones. However, chi-square analyses yielded statistical significance for only one pair – the /sp/ versus /spC/ clusters produced by the Brazilian learners (ó2 (1, N = 373) = 4.39, p < .04). Hypothesis 1, thus, is not supported. These results are similar to those of Rebello and Baptista (this volume) in showing very little influence of length of cluster on the frequency of prothesis production by Brazilian learners, although the results of the present study were more consistent. Surprisingly, they do not corroborate Carlisle (1997), who found significantly more frequent prothesis in Spanish speakers’ production of the longer clusters. As suggested by Rebello and Baptista, possibly the fact that these were classroom learners made them somewhat less influenced by universals. . Internal structure of cluster: /s/+obstruent versus /s/+sonorant Hypothesis 2 predicted that AS speakers would tend to modify structures violating the sonority sequencing principle (SSP) more frequently, while BP speakers would modify those structures less frequently. For this analysis, bi-literal /s/+obstruent clusters (in violation of the SSP) were compared to bi-literal /s/+sonorant clusters (in conformance with the SSP). Table 3 shows more frequent prothesis for the /s/+obstruent clusters for both groups of learners; however, these results yielded a non-significant chi-square for the Brazilian learners (ó2 (1, N = 866) = 1.31, p > .25), and a highly significant chi-square for Argentines (ó2 (2, N = 774) = 12.25, p < .0005). Thus, Hypothesis 2 is supported only for the AS-speaking learners. Rebello and Baptista’s (this volume) findings showing more prothesis produced by Brazilians for the /s/+sonorant clusters are not corroborated, but nor are they clearly contradicted. On the other hand, Carlisle’s results yielding more prothesis for /s/+obstruent clusters produced by Spanish speakers are corroborated. . Internal structure of cluster: /s/+nasal versus /s/+lateral Hypothesis 3 predicted that both AS and BP speakers would produce more frequent prothesis before the more marked /s/+nasal clusters than before the Table 3. Rates of prothesis for /s/+sonorant versus /s/+obstruent clusters as produced by BP and AS speakers Group BR AR
/s/+son clusters /sm, sn, sl, sw/ N Prothesis % 497 441
137 86
27.57 19.50
/s/+obstr clusters /sp, st, sk/ N Prothesis % 369 333
144 100
30.89 30.03
Andréia Schurt Rauber
Table 4. Rates of prothesis for /s/+nasal versus /s/+liquid clusters as produced by BP and AS speakers Group BR AR
N 250 222
/s/+nasal clusters Prothesis 92 57
%
N
36.80 25.68
123 108
/sl/ cluster Prothesis 45 22
% 36.59 20.37
/s/+lateral. Although Table 4 shows somewhat more frequent prothesis before /sm/ and /sn/ in both NL groups, the difference was not significant for either the Argentine learners (ó2 (1, N = 330) = .85, p > .35) or the Brazilians (ó2 (1, N = 373) = .005, p > .90), the rates being almost identical for the latter group. Thus, the hypothesis cannot be supported and neither can Carlisle’s findings be corroborated. The results for both groups of learners are more similar to those of Rebello and Baptista, who also found only a weak tendency toward an influence of markedness within the group of /s/+sonorant clusters. As with the limited influence of cluster length, the only possible explanation appears to be the lesser influence of markedness on classroom learners. . Voicing assimilation of /s/ + sonorant clusters by BP speakers As stated in Section 2, Rebello and Baptista attributed the higher rate of prothesis obtained with /s/+sonorant clusters compared to /s/+obstruent clusters to the transfer of the NL processes of assimilation and vowel deletion before initial /s/+obstruent clusters. Although in the present study the Brazilian participants tended in the same direction as the Argentines, that is, with higher prothesis rates for the /s/+obstruent clusters, the difference between the two cluster types for the Brazilians was not significant, whereas for the Argentines it was. Thus, it cannot be claimed that the Brazilians prothesize more with the clusters in violation of the SSP. The Brazilian participants’ results in the present study were similar to those in Rebello and Baptista, however, in the high frequency of voicing assimilation, resulting in the clusters [zm, zn, zl] instead of [sm, sn, sl]. For this reason, a comparison was made, among the /s/+sonorant clusters, between the rates of prothesis in the tokens with voicing assimilation and in the tokens without voicing assimilation. As can be seen in Table 5, the /s/ was voiced in more than half of the tokens produced (55.60%) for /s/+nasal clusters. Out of these, 56.38% were produced with a prothetic vowel, compared to only 11.71% when there was no voicing, resulting in a very significant chi-square (ó2 (1, N = 250) = 52.10, p < .0001). Similarly, the majority of the /sl/ clusters were also produced with voicing (59.35%). Out of these, 53.42% were produced with a prothetic vowel, compared to 12% of the
Production of English initial /s/-clusters
Table 5. Voicing assimilation and rates of prothesis by BP speakers Cluster
N
/s/+nas /s/+lat /s/+son
139 73 212
Voicing of /s/ Prothesis 79 39 118
%
N
56.83 53.42 55.66
111 50 161
No voicing of /s/ Prothesis % 13 06 19
11.71 12.00 11.80
N 250 123 373
Total Prothesis 92 45 137
% 36.80 36.58 36.73
tokens without voicing, also resulting in a significant chi-square (ó2 (1, N = 123) = 20.19, p < .0001). These figures show evidence of the same strong relationship between voicing assimilation and prothesis found by Rebello and Baptista (this volume). This relationship did not cause a greater rate of prothesis for the /s/+sonorant clusters as found in their study, possibly because of the difference in English proficiency. The participants of the previous study were only from lower to upper intermediate and produced much greater rates of epenthesis overall than those of the present study. The voiced clusters that result from the assimilation process are, according to Greenberg (1965: 29), more marked than clusters containing voiceless obstruents in any position. Thus, what we have here is a sort of conflict between two types of markedness: that related to the SSP and that of voicing. This conflict seems to have caused a neutralization of the influence of each in this study; that is, it resulted in very little difference in frequency of prothesis between clusters not in violation and clusters in violation of the SSP. The significant difference found by Rebello and Baptista, with more prothesis for the clusters in violation of the SSP, seemed to indicate not just a neutralization of the two kinds of markedness, but a greater importance of markedness by voicing. Since the participants of the previous study were less proficient in English, the difference in the results of the two studies can be explained by Major’s ontogeny model (1986; later the ontogeny phylogeny model, 2001), according to which NL processes prevail in earlier stages of language development, gradually replaced by universal processes, which ultimately give way to TL processes. Thus, the less proficient learners in Rebello and Baptista would still be more influenced by the NL process (voicing), while the participants of the present study would be more or less equally influenced by both NL voicing and universal sonority sequencing. . Phonological environment Hypothesis 4, based on findings by Carlisle (1991a, 1991b, 1992, 1997) and Rebello and Baptista (this volume), predicted that AS speakers would tend to produce prothesis most frequently in the context of consonants, followed by a vocalic context, and least frequently after silence (sentence-initial), while for BP speakers the environment would be less important. As can be observed in Table 6, the AS speak-
Andréia Schurt Rauber
Table 6. Rates of prothesis production by AS and BP speakers in different environments Group AR BR
N 314 344
Silence Prothesis 53 75
%
N
16.88 21.80
499 570
Vowels Prothesis 113 232
%
N
22.65 40.70
502 579
Consonants Prothesis 199 186
% 39.64 32.12
ers produced significantly more prothesis after consonants, followed by the vocalic context and then by silence (ó2 (2, N = 1,315) = 60.48, p < .0001). As for BP speakers, these participants added a prothetic vowel most frequently after vowels, less frequently after consonants and least frequently after silence, resulting in a significant chi-square (ó2 (2, N = 1,493) = 34.98, p < .0001). These results support Hypothesis 4 in regard to the AS speakers, and thus corroborate Carlisle’s study. As for the BP speakers, environment did exert significant influence in this study, as opposed to the insignificant results obtained by Rebello and Baptista. The vocalic environment yielded the highest rate of prothesis in the present study, which is consistent with the tendency observed in Rebello and Baptista, the difference being that in the present study this tendency was significant. Rebello and Baptista attributed the lack of relative facility of the vocalic environment to the fact that the context vowels in that study were all tense vowels in content words, which would have a tendency to take stress, thus favoring the insertion of a vowel for the double purpose of maintaining the alternation of strong and weak syllables and allowing the resyllabification of the cluster. The present study did not control for the tense/lax vowel distinction, but since some of the sentences were borrowed from the previous study, tense vowels were the vast majority – 90%. The fact that the difference between consonants and vowels in the preceding context was significant in the present study could be due to the greater proficiency level of these participants. Again referring to Major’s ontogeny model, these participants may have reached a stage where there is a greater effect of TL processes, allowing the TL rhythm to increase the influence of the alternation of strong and weak syllables on resyllabification in the environment of a tense vowel. The AS-speaking participants, although equally proficient in English, were apparently not influenced by the TL rhythm, probably due to the fact that their NL is a more syllable-timed language than BP. Thus, without the difference in prominence between the tense and lax/reduced vowels, any preceding vowel would allow for resyllabification, making the vocalic context easier.
Production of English initial /s/-clusters
. Conclusion The analysis of the production of initial /s/-clusters by BP and AS speakers led to the following conclusions: (a) Participants of both NLs modified longer clusters more frequently than shorter clusters, but mostly without a statistically significant difference; (b) participants of both NLs inserted a prothetic vowel more frequently before clusters in violation of the SSP (/s/+obstruents) than before clusters not in violation (/s/+sonorants), but this difference was only significant for the AS speakers; (c) there was a tendency, but not significant, for the AS speakers to produce more prothesis in the more marked /s/+nasal clusters than in the /s/+lateral clusters, whereas the rates were almost identical for the BP speakers; (d) different results were obtained for the speakers of the two NLs concerning the influence of phonological context, the Brazilian hierarchy of difficulty being vowels > consonants > silence, and the Argentine hierarchy being consonants > vowels > silence. The results concerning the production of /s/-clusters by AS speakers corroborate Carlisle’s (1991a, 1991b, 1992, 1997) findings concerning the influence of the SSP and of the environment, and lend weak support for his findings concerning cluster length and /s/+nasal versus /s/+liquid. As for the production of /s/-clusters by BP speakers, the results concerning cluster length were consistent with Rebello and Baptista in showing minimal importance of this variable, but the tendency for greater length to equal greater difficulty was more consistent in the present study. Although the results for the BP speakers regarding the influence of the SSP were not statistically significant in this study, they were consistent with Rebello and Baptista in not supporting the greater difficulty of the clusters in violation of the principle. This study, as the previous one, found significantly more prothesis where there was voicing assimilation, indicating that the heavy influence of the SSP seen with the Spanish-speaking learners is either neutralized (present study) or even overridden (Rebello & Baptista) by the markedness of the voiced obstruent clusters resulting from the transfer of the L1 voicing assimilation. The degree of influence of voicing assimilation was attributed to proficiency level, which would determine the relative importance of NL and universal processes, following Major’s ontogeny model. The findings for the BP speakers lend no support whatsoever to the influence of the SSP within the class of /s/+sonorant clusters, the /s/+nasals and /s/+liquid obtaining almost identical rates of prothesis. Finally, concerning phonological environment, not only are the results for the BP speakers of this study consistent with Rebello and Baptista in not supporting the greater difficulty of preceding consonantal contexts (the results for the two contexts were similar in their study), but they actually follow the opposite pattern from that of the Spanish speakers: significantly more frequent prothesis for clusters following vowels. Again the difference was attributed to the
Andréia Schurt Rauber
greater proficiency level of the participants of this study compared to those of Rebello and Baptista, which may have led to a greater influence of the TL alternation of strong and weak syllables, with the prothetic vowel serving to separate the tense vowel from the following stressed syllable. In spite of the only partial corroboration of Carlisle’s findings and those of Rebello and Baptista respectively by the AS- and BP-speaking English learners in this study, the results for the two groups of learners, tested by means of the same instrument, demonstrate that native speakers of two closely related languages can exhibit very different strategies in their attempts to produce difficult clusters in a foreign language, as claimed by Rebello (1997) and Rebello and Baptista (this volume). Thus, the present study has contributed additional evidence to support an interaction between NL transfer and markedness in interphonology, where the NL can determine both the type of markedness which is most relevant to the production of TL speech and the degree to which each kind of markedness affects this production. It also further supports Major’s ontogeny model, demonstrating that markedness related to NL voicing assimilation was more important in the learners’ production in Rebello and Baptista, whereas markedness related to the universal SSP was equally important to the production of the more proficient learners of the present study.
Acknowledgements This research was funded by a grant from CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), of the Brazilian Ministry of Education. I would also like to thank the editors of this volume and an anonymous reviewer for valuable comments on the manuscript.
References Brinton, L. J. (2000). The Structure of Modern English: A linguistic introduction. Amsterdam: John Benjamins. Carlisle, R. S. (1991a). The influence of environment on vowel epenthesis in Spanish/English interphonology. Applied Linguistics, 12, 76–95. Carlisle, R. S. (1991b). The influence of syllable structure universals on the variability of interlanguage phonology. In A. D. Volpe (Ed.), The Seventeenth LACUS Forum 1990 (pp. 135– 145). Lake Bluff, IL: Linguistic Association of Canada and the United States. Carlisle, R. S. (1992). Environment and markedness as interacting constraints on vowel epenthesis. In J. Leather & A. James (Eds.), New Sounds 92: Proceedings of the 1992 Amsterdam Symposium on the Acquisition of Second-Language Speech (pp. 64–75). Amsterdam: University of Amsterdam.
Production of English initial /s/-clusters
Carlisle, R. S. (1997). The modification of onsets in a markedness relationship: Testing the interlanguage structural conformity hypothesis. Language Learning, 47, 327–361. Carlisle, R. S. (2006). The sonority cycle and the acquisition of complex onsets. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin Beat: Studies in Portuguese/Spanish – English Interphonology. Amsterdam: John Benjamins. Clements, G. N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M. Beckman (Eds.), Papers in Laboratory Phonology I (pp. 183–333). Cambridge: CUP. Eckman, F. R. (1987). Markedness and the contrastive analysis hypothesis. In G. Ioup & S. H. Weinberger (Eds.), Interlanguage Phonology: The acquisition of a second language sound system (pp. 55–69). Cambridge, MA: Newbury House. (Reprinted from Language Learning, 27, 315–330, 1977.) Eckman, F. R. (1991). The structural conformity hypothesis and the acquisition of consonant clusters in the interlanguage of ESL learners. Studies in Second Language Acquisition, 13, 23–41. Greenberg, J. H. (1965). Some generalizations concerning initial and final consonant sequences. Linguistics, 18, 5–34. Hooper, J. (1976). Introduction to Natural Generative Phonology. New York, NY: Academic Press. Katamba, F. (1989). An Introduction to Phonology. London: Longman. Major, R. C. (1986). The ontogeny model: Evidence from L2 acquisition of Spanish r. Language Learning, 36, 453–504. Major, R. C. (2001). Foreign Accent: The ontogeny and phylogeny of second language phonology. Mahwah, NJ: Lawrence Erlbaum Associates. Rebello, J. T. (1997). The acquisition of English initial /s/-clusters by Brazilian EFL learners. In J. Leather & A. James (Eds.), New sounds 97: Proceedings of the Third International Symposium on the Acquisition of Second-Language Speech (pp. 336–342). Klagenfurt: University of Klagenfurt. Rebello, J. T. & Baptista, B. O. (2006). The influence of voicing on epenthesis production by Brazilian EFL learners. In B. O. Baptista & M. A. Watkins (Eds.), English with a Latin beat: Studies in Portuguese/Spanish – English interphonology. Amsterdam: John Benjamins.
Prosodic-level studies: Stress and rhythm
Variability in the use of weak forms of prepositions Michael Alan Watkins Universidade Federal do Paraná, Brazil
This study investigated the influence of phonological environment on variability in the production of the weak forms of four prepositions by advanced Brazilian speakers of English. Results of a VARBRUL analysis indicated that four phonological factors were exerting a significant effect: (a) the preposition itself; (b) whether or not there was a preceding syllable in the same intonation group; (c) the initial segment of the following word; (d) the metrical status of the following syllable. The relative amount of output by the speaker during the 30-minute recording also proved to have a significant effect. Since none of these factors appeared to be exerting a particularly powerful effect on its own, there may be some psycholinguistic factors also operating.
.
Vowel reduction
With regard to English, the term reduced vowel is used in this paper solely for the lax, central, unrounded vowel (schwa) in weak syllables. English schwa is a completely neutral vowel, which cannot be related phonetically to any of the full vowels (although it is part of a native speaker’s knowledge of the language that there are correspondences between schwa and one of the full vowels in many related forms such as oppose/opposition, atom/atomic, suppose/supposition). The key point as far as this study is concerned is that in English, except in open word-final syllables, there is a binary choice between a full vowel and schwa, with schwa (or a close variant) being selected automatically in unstressed syllables (Fear, Cutler, & Butterfield 1995), although reduction is blocked under some circumstances in those syllables closed by a non-coronal obstruent, for example the second syllable of expectation (Burzio in press). The variation in height between reduced vowels in word-final vs. pre-final position, noted in Scottish speakers by Giegerich (1992), and measured experimentally for Californian speakers by Flemming and Johnson (in press), is considered to be a systemic distinction, since in open word-final syllables there is a possible contrast with unstressed /i/ (and also with /o~/ in the view
Michael Alan Watkins
of Hayes 1995, who uses flapping of a preceding /t/ as a criterion for lack of stress), whereas in non-final position this distinction is generally neutralized (although a few pairs such as except-accept may be distinguished by some speakers). However, in the case of the monosyllabic function words investigated in this study any such realizational differences were considered to be non-contrastive. While vowel reduction in English tends to be binary and automatic, in Brazilian Portuguese (BP) it is to some extent related to style, and there is never more than partial neutralization, so that while there are different sets of vowels available depending on the metrical status of a syllable and its position within the word, it never comes down to a straight binary choice. Reduction thus follows the centrifugal pattern, as opposed to the centripetal pattern of English. Both types of reduction achieve the same effect but by different means, in that they both diminish the amount of phonetic information in the speech signal by reducing spectral complexity (Harris 2005). While centrifugal reduction consists of dispersal to the corner vowels (which have a relatively low degree of spectral complexity, allowing a clear three-way contrast to be maintained in weak syllables), centripetal reduction involves centralization to the neutral mid position and the loss of all informational content (Harris 2005; Crosswhite 2004). Major (1981) found that shortening (by means of raising, monophthongization, and syllabicity shifts, as well as deletion) tends to occur in BP, especially in casual speech, and is more likely to affect post-tonic syllables, although in the most casual speech pretonic shortening occurs also. This greater tendency for shortening and deletion to occur towards the more informal end of the stylistic continuum led Major to hypothesize that Brazilian Portuguese is in the process of changing from a syllable-timed to a stress-timed language. The fact that reduction is more extreme in post-tonic syllables subsequently led Major (1992, cited in Crosswhite 2004) to suggest a hybrid situation for BP, with the pretonic section of polysyllabic words syllable-timed while the post-tonic section is stress-timed. Another characteristic which Brazilian Portuguese shares with English, according to Major (1981), is the progressive shortening of stressed syllables as the number of intervening unstressed syllables increases. Massini-Cagliari (1992) mentions studies which suggest that there is considerable inter-speaker variation with regard to rhythm, with the rhythm of some speakers being predominantly syllable-timed, while that of others is predominantly stress-timed. She found the Portuguese spoken in Rio Grande do Sul to be in general more syllable-timed, closer to Spanish, than the speech of São Paulo. Major found that formal Brazilian Portuguese sounded more syllable-timed than less formal styles, in which post-tonic shortening regularly occurred, and even pretonic shortening in the most casual speech. The connection between casual style and reduction is interesting, as it may be that Brazilian learners of English unconsciously associate vowel reduction with less careful speech styles, and feel that somehow it is “not quite correct”.
Variability in the use of weak forms of prepositions
Major (1985) reports a study in which Brazilians from three different regions (Minas Gerais, Paraná, and Bahia) were asked to say the nonsense word lalala in different environments. The duration ratio of pretonic, tonic and post-tonic syllables was found to be approximately 3:4:2 across the speakers. Duration was most consistently correlated with stress, while pitch and intensity varied considerably. Based on the sets of possible syllables for the three positions, Major concluded that three levels of stress (corresponding to the size of the subset of vowels permitted) could be distinguished in Brazilian Portuguese trisyllabic words, with the largest number of combinations being possible in tonic syllables, a smaller number in pretonic, and the most restricted set in post-tonic. He also reports that pretonic raising of unstressed vowels (/o/ → /u/, /e/ → /i/) only occurred in casual style, and even then not invariably, whereas post-tonic raising was obligatory in normal and casual styles, and optional in citation forms. Unstressed diphthongs are shortened to monophthongs in accordance with the same stylistic and positional patterns that characterize raising. The stronger tendency for post-tonic raising is independent of the metrical contour of the word: for example, the second and third vowels of tráfego and diálogo undergo raising, while the first and second of merecer and seleção do not. Wetzels (1992) found four different sets of vowels depending on metrical position. His “post-tonic/non-final” is the extra category, as his “unstressed wordfinal” corresponds to Major’s “post-tonic”. The largest set consists of seven vowels, in tonic syllables: /i e 7 a f o u/; pretonics have a set of five, without /7/ and /f/; in post-tonics the /o/ is missing from the set, leaving four possibilities; and in unstressed word-final syllables only three vowels are available, /i a u/. Massini-Cagliari (1992) confirms Major’s finding that the most consistent cue to stress in Brazilian Portuguese is duration, reinforced by lower intensity and vowel quality changes in post-tonic syllables. She insists that it is syllable duration (which Major in fact measured), rather than the vowel duration, which is relevant. Like Wetzels and Major, she found that /7/ and /f/ only occur in stressed syllables, while /e/, /a/ and /o/ become more central and raised in unstressed syllables, and that there is a hierarchy for likelihood of reduction: post-tonic > pretonic > tonic. In short, while vowel reduction occurs in Brazilian Portuguese, feature loss is not so extreme as in English, as deletion tends to occur before any vowel gets stripped down to a totally neutralized [6]. Moreover, reduction tends to be along a continuum, depending on style and position within the word, rather than categorical, as it generally is in English. Feature loss is never so heavy that a reduced vowel in Brazilian Portuguese loses all its identifying features. It is more a matter of certain contrasts being neutralized in unstressed syllables: there are more contrasts in some positions than others, and the polysystemic/subset type of analysis used by Major and Wetzels is a useful way of describing this pattern (Ogden 1999, explicitly uses this approach in his analysis of the strong and weak forms of En-
Michael Alan Watkins
glish auxiliaries). Another difference between English and Portuguese reduction is the much stronger tendency in Portuguese for reduction in word-final syllables than in pretonic syllables. In English, pretonic syllables are subject to reduction in a way which often contrasts with cognates in Brazilian Portuguese, which keep a full vowel, as in the initial syllables of phonetics∼fonética, tomato∼tomate, catastrophe∼catástrofe, America∼América, event∼evento. The fact that reduction appears to correlate with style in Brazilian Portuguese, and is a matter of gradual loss of contrasts along a continuum, may make Brazilian learners of English inclined to attach priority to the full vowel of the citation form of function words, and to regard reduction of this vowel as being optional and gradient, rather than a categorical binary choice between distinct forms of the word, one with the full vowel, and the other with a schwa, as in standard native English speech. The aim of this research was to find out to what extent variability in the use of weak forms of prepositions occurs in the spontaneous speech of advanced Brazilian users of English, and whether this variability is systematically conditioned by the phonological environment.
. Method . Data collection Data for this study was provided by sixteen advanced Brazilian speakers of English, with at least the Cambridge Certificate of Proficiency in English or a Master’s degree in English, all of whom had Brazilian Portuguese as their sole first language (L1) and had learned English in an instructional setting in Brazil. They were recorded in informal surroundings for thirty minutes, talking about a fixed sequence of everyday topics. Baseline data was obtained from two adult native speakers, both teachers of English as a foreign language living in Brazil, one American and the other English, performing the same task. The subjects’ ages ranged from 23 to 60 (average 44), and all took part in the research voluntarily. Although they guessed that their contributions were going to be subjected to some kind of linguistic analysis, none knew what the exact focus was to be. . Variables I describe first how the dependent variable (use of the weak form of the preposition with schwa, or the strong form with a full vowel) was judged, after which I describe the five factor groups used in the final run of the VARBRUL analysis, that is, those which were found to be contributing significantly to variability. The other five factor groups originally included, but eventually excluded as being non-
Variability in the use of weak forms of prepositions
significant, were the following: presence of an immediately preceding word, final segment of the immediately preceding word, type of vowel (full or schwa) in the preceding syllable, type of vowel in the following syllable, and metrical status of the preceding syllable. .. Dependent variable: Schwa or full vowel The researcher and a trained native-speaker assistant rated the tokens independently by ear for this variable, doubtful cases being discussed together. The relatively low percentage of inter-rater agreement after the independent ratings (overall 77.25%, ranging from 63% to 85% for individual speakers) simply reflects the inherent difficulty of the task, due to the fact that the vowels seemed to be located at many different points along a continuum between totally full and totally schwalike, making it impossible to formulate clear, objective criteria which would apply in all contexts across all speakers. The subsequent discussion phase, which consisted of extensive joint re-listening to the data in order to agree on how the broad overall criterion should be applied in the context of each individual speaker’s idiosyncratic characteristics, eventually resolved all doubts, and in the end no tokens had to be thrown out because of failure to reach an agreement. While it is accepted that this is a potentially weak point in the analysis, insofar as other pairs of raters might have agreed on different classifications, it is hard to see how to exclude subjectivity. Instrumental analysis would come up against exactly the same difficulty, since (apart from the problem posed by interspeaker differences) an arbitrary “human” decision would have to be taken regarding the delimitation of the area within which a vowel would count as reduced. .. Factor Group 1: Target word Of the seven English prepositions with dual forms, only four occurred in the data with sufficient frequency to justify their inclusion in the analysis: to, of, at and for. Those excluded at an early stage were as, than and from. Once the range of target words was reduced to four, it was no longer necessary to include phonological features of these words (such as presence or absence of onset and coda) as separate variables, since the effect of phonological structure would already be clearly visible in the results. In the final analysis, at was also excluded since it distorted the results because of its lower frequency, which meant that only three words were left. .. Factor Group 2: Presence of an immediately preceding syllable in the same intonation group (IG) The inclusion of this variable was based on the hypothesis that there would be a stronger likelihood of a preceding syllable or segment affecting the target vowel if it belonged to the same intonation group. However, in practice it was not always easy to determine the exact point where one intonation group ended and another began
Michael Alan Watkins
in an unbroken stream of fast interlanguage speech, even following the criteria described by Cruttenden (1997), although it might nevertheless be quite clear that there were two intonation groups. .. Factor Group 3: First segment of the following word It was assumed that there would be more likelihood of an effect from the immediately following segment than from any preceding one, as in rapid speech the first segment of a complex onset might operate as a coda for the coda-less target words, and any following consonant might cause regressive assimilation with the coda of a preceding preposition, as well as increasing its weight. Since only prepositions immediately followed by another word were included as tokens in the analysis, there is no zero value for this variable group. The variables were /r/ and /h/ (as during the transcription it looked as if these might be individually relevant), G (the glides /j/ and /w/), C (any other consonant), and V (vowels). .. Factor Group 4: Metrical status of following syllable This variable is crucial as it defines the circumstances in which the strong forms of the prepositions in question are possible and when they are unacceptable in native speech. Strong forms in the contexts #__S (involving at and of only), #__W, and W__W are all attested in native speech. However, it is only in the first case that the preposition can have full vowel quality without being stressed; in the other two contexts the preposition must be stressed for the full form to occur, although it should be borne in mind that it is often very difficult to decide whether an initial function word is stressed or not when there is no preceding context. In such cases the metrical distinction seems to be partially neutralized, as the crucial cue of pitch prominence is largely concealed. To, of course, is a special case, as the full vowel is usual before a vowel or a pause in some of the major varieties of English, especially in careful speech. On the other hand, full vowels in the target words are not attested in standard native speech in the contexts S__S, W__S, and #__S (except occasionally for at and of ), unless someone is speaking abnormally slowly and emphatically, stressing the preposition as a separate foot. In such cases it could be argued, as Giegerich (1992), Burzio (1994) and Cummins and Port (1998) do, that there is an unrealized (phonetically empty) beat between the stresses. The important fact to bear in mind about the assignment of S or W to a syllable is that it denotes relative stress. S simply means “stronger than”, and W means “weaker than”. It follows that when there is no immediately preceding syllable, stress level is harder to determine. Apart from judging preceding and following syllables, this also had to be done when deciding which tokens were metrically weak and could therefore be included in the analysis in the first place. Cues used for judging metrical level, in addition to vowel quality (which might lead to circularity), were relative pitch and duration (in combination). In practice, the only
Variability in the use of weak forms of prepositions
difficult cases to judge were those already mentioned, when a phrase began with a preposition, after a pause, for example #of course. The usual cue of vowel reduction cannot be relied on in L2 speech, where very often a syllable is relatively weak but with the full vowel, and in such cases there is less context within which to interpret the other cues of pitch and duration. .. Factor Group 5: Speaker’s category by amount of output The issue of inter-speaker variation is discussed more fully below, but in principle the sixteen subjects were considered to constitute a homogeneous group as regards their overall level of proficiency in English, although Saito’s (1999) misgivings regarding such assumptions in the use of VARBRUL were kept in mind. Since, however, it was felt that rate of speech might be connected in some way with vowel reduction in the case of L2 speakers, and since there turned out to be quite substantial variations in the amount of output among the subjects over the thirty-minute period of the recording (the most talkative producing about 60% more speech than the least talkative), it was decided to group subjects into two categories according to the amount of text which resulted from the transcriptions. Two sets formed quite naturally, with seven in the more talkative group, and nine in the less talkative group. There is no implied connection between these groupings and proficiency – they simply reflect the amount of language produced during the recording, irrespective of its quality or whether the lower output was due to an overall slower rate of utterance, or to longer pauses, or a combination of both.
. Results The full set of data consisted of all the metrically weak tokens produced by the Brazilian informants which would (on the evidence of the baseline data) have been reduced by a native speaker: that is, a total of 2,743 words. The input (the likelihood of any token being reduced, regardless of conditioning factors) was .80. The results for the five significant factor groups after the second run were as in Table 1. The pi value represents the probability weight, which indicates “the strength of the influence of that factor in comparison to other factors in the same factor Table 1. Factor Group 1: Target word
to of for
pi
%
N
.57 .38 .38
80 74 68
1,610 555 331
Michael Alan Watkins
group” (Young & Bayley 1996: 280). A value above .50 is interpreted as a positive influence, a value below .50 as a negative influence, while values very close to .50 are having little influence in either direction. The % column shows the percentage of occurrences of each preposition which were reduced, while N indicates the total number of occurrences of the item in the data which qualified as tokens. The pi values in Table 1 show that when the target word was to, vowel reduction was more likely to occur than not, while the other two target words were more likely not to be reduced. In fact, as the pattern for to differed so markedly from that of the other two factors, an analysis combining of and for was tried. However, as there was no improvement in log-likelihood or chi-square values, or any clear theoretical justification for this amalgamation, the more transparent three-factor analysis is the one reported. It can be seen in Table 2 that the presence of a preceding syllable within the same intonation group slightly favoured reduction, while there was quite a strong tendency for tokens in IG-initial position to resist reduction, maybe because of the extra degree of prominence associated with that position. Table 3 shows that an onsetless following syllable had a clear positive effect on reduction, while an /h/ had a very strong inhibitory effect. Possible reasons for the effect of /h/ will be considered below in the General Discussion. It is all the more remarkable in that it is the only type of following segment which actually inhibited reduction. An /r/ had a slightly positive effect, while the other consonants had no significant influence either way on reduction. Because of the principle of alternation (Selkirk 1984), it had been expected that the weights of Factor Group 4 would show the reverse trend, with reduction being more probable before a stressed syllable. The figures in Table 4 therefore came as something of a surprise, showing a very slight inhibitory effect by a strong Table 2. Factor Group 2: Presence of an immediately preceding syllable in the same IG
Y N
pi
%
N
.55 .38
81 67
1,784 712
Table 3. Factor Group 3: First segment of the following word
Vowel Glide /r/ /h/ Any other C
pi
%
N
.67 .50 .55 .13 .51
83 75 84 36 79
169 108 62 125 2,032
Variability in the use of weak forms of prepositions
Table 4. Factor Group 4: Metrical status of following syllable
S W
pi
%
N
.47 .62
74 85
1,938 558
Table 5. Factor Group 5: Speaker’s category by amount of output
A B
pi
%
N
.56 .40
81 70
1,585 911
syllable, while a following weak syllable had quite a clear positive influence on reduction. In Table 5, A refers to the more talkative group, and B to the less talkative. The influence of this variable is not dramatic, but it is nevertheless significant. It is clear that “talkativeness” had a slightly positive influence on reduction, while membership of the “less talkative” category was a factor inhibiting reduction. This result really needs to be followed up by means of a more controlled experiment, in order to discover if speech rate consistently correlates with a higher rate of vowel reduction, as variations in the amount of output may be due to differences in the actual rate of speech, or to the length and frequency of pauses. The problem with the way in which the participants were selected was that I only set a minimum level of proficiency, with no clearly specified upper limit. This put them into quite a broad band, and at such a high level output can vary in more ways than in the case of speakers who are less proficient, because of the range of their knowledge and experience of the language. Furthermore, some people simply talk faster than others, even in their L1. In retrospect, it seems obvious that if I wanted to use spontaneous speech, some variation in speech rate was inevitable.
. General discussion Variability in the use of weak forms of the three prepositions included in the final analysis was shown to be systematically affected by the linguistic environment. It was also systematically affected by the amount of speech produced during the thirty-minute recording. On the other hand, no single factor appears to have been exerting a dramatically strong effect, which may mean that some factors not included in the research were also influencing variation, or that some of the variation was not systematic – or a combination of both. There is no way of knowing how much variation is unaccounted for other than by considering the weights calcu-
Michael Alan Watkins
lated by the program for each factor group. This is what the first part of this discussion will consist of, the second being an attempt to relate these results to some theoretical issues. The identity of the word itself was a significant factor. While the effect of to was weak, it was positive in all the analyses. Of and for had an inhibitory effect on reduction for both groups. None of the weights are at a great distance from the “no-difference” level, indicating that this variable alone does not account for a large proportion of the variation. However, it is clear that, overall, of and for were less likely to be reduced than to. The higher rate of reduction of to may be because of its lack of coda, the positive effect of which was a clear finding in the pilot study. However, this cannot be the only reason, as for is also often fully codaless before a consonant onset, both in native dialects and Brazilian interlanguage. It may also have something to do with the fact that the vowel of to is [+high], so that centralization involves a relatively small adjustment of the articulatory setting: no more than the unrounding of the lips, and a slight lowering of the tongue. To also enters into different types of syntactic relationships from the others, notably the infinitive construction, when it is not in fact a preposition according to the usual criteria (Huddleston & Pullum 2002). Because all uses of to were considered in this analysis, regardless of syntactic function, its occurrence was much greater than either of the other words, accounting for nearly two thirds of the data. It may be that it is learned initially in rhythmic units to a greater extent than the other prepositions: firstly in infinitive constructions, and then in larger structures such as like to go, want to have. In short, there are a number of factors – syllable structure, vowel quality, frequency, and learning context – which distinguish to from the other two prepositions and which may, singly or in conjunction, be influencing its probability weight. The presence of a preceding syllable had a significant influence on vowel reduction. This was not surprising, as an IG-initial syllable tends to have a certain prominence, and this may be the reason for the extra tendency for non-reduction of these syllables, even though they were not stressed (all tokens which were clearly stressed, with pitch prominence, having been excluded from the data). With regard to the initial segment of the following syllable, the most striking finding was the strong inhibitory effect of /h/. It is hard to think of any obvious reason for such a marked difference between this and all the other consonants (although glides also had a strong inhibitory effect for the B group). An acceptable approximation to English /h/ is not difficult for Brazilians, as word-initial /r/ in Brazilian Portuguese is auditorily similar (although stricture is higher, uvular rather than glottal), so one would not expect speakers to need to slow down to prepare themselves specially for it, unless they have a subconscious fear of confusing it with /r/ – which is actually a real possibility in the case of Brazilians. At a certain stage of learning, pairs like red and head, role and hole, are easily con-
Variability in the use of weak forms of prepositions
fused by Brazilian learners of English, and it is not beyond the realms of possibility that they remain permanently traumatized by humiliating errors involving these sounds in oral tests. Of the 125 tokens preceding /h/, 105 were to, and a cross-tabulation of the two factors showed that only 34% of these were reduced (compared with 80% of all occurrences of to). This shows very clearly that the overall weighting of /h/ was due to its strong inhibitory effect on the reduction of to, though why this should be so is not obvious. One possibility is that to is usually pronounced with the full (lip-rounded) vowel /tu/ when followed by a word which begins with a vowel, and that /h/ was being treated (variably) as if it were a voiceless form of the following vowel, so that the following syllable was considered to be onsetless. This is quite plausible, as (a) initial orthographic “h” is always silent in Portuguese, (b) a number of English dialects do not permit syllable-initial /h/ at all, and (c) the initial /h/ in metrically weak pronouns and auxiliaries is dropped after consonants in all dialects in informal speech. The words involved were quite restricted: in 66 of the 105 cases with to, the following word was have. However, this apparently interesting fact sheds no light on the matter, as one might equally well have expected frequency of co-occurrence to be conducive to reduction, rather than the contrary. Of the other consonants, only /r/ had any influence at all on variation, slightly favoring reduction, but the absence of an onset in the following syllable had a strong facilitating effect for B speakers (although it must be remembered that only of and for were involved in this environment). The most likely explanation for this would be that the absence of a following consonant allows the resyllabification of the coda of the preposition, turning it into an open syllable like to. Another finding restricted to the B group was the strong inhibitory effect of a following glide (/w/ or /j/). This is not quite as surprising as the effect of /h/, since the two glides do not occur word-initially in Portuguese, and cause particular difficulty in English when followed by vowels with similar features (as in words like wood and year). However, the effect of a glide for the A group was markedly different, being slightly on the positive side. There would appear to be no obvious reason for this difference. The metrical status of the following syllable had an influence on variation, but not in the way which might have been expected. A following weak syllable favored reduction, resulting in two successive weak syllables. If the preceding syllable is strong, and the following syllable is an article (as in went to the shops), then no other metrical pattern is available, and a ternary foot must result. In this case it is to be expected that vowel reduction would be favored. In other words, there may be some interaction with the preceding syllable, although on its own the metrical level of the preceding syllable had no significant effect and was excluded from the analysis. Although five factor groups were found to have a significant effect on variation, the strongest effects were associated with a rather small number of factors:
Michael Alan Watkins
the token to, IG-initial position, a following vowel (although this cannot co-occur with reduced to), a following weak syllable, and a relatively high rate of speech (broadly defined) were all facilitating factors, while reduction was inhibited when the token was of or for, and especially so when to was followed by an /h/.
. Conclusion This paper described a VARBRUL analysis of variability in the use of weak forms of English prepositions by Brazilian speakers. The results confirmed that there was a considerable amount of variability, and showed a systematic effect of some aspects of the phonological context: whether the token syllable was initial or not in the intonation group, whether the following syllable began with a vowel, a consonant, or an /h/, and whether the following syllable was metrically strong or weak. The effects of these phonological factors, though significant at p < .05, were not overwhelmingly strong, however, suggesting that they do not account for all the variation. Whether the residue was free variation or systematically conditioned by variables not included in the analysis is a question that could only be answered after exhaustive research, examining every possible variable. However, it seems intuitively likely that there were complex interactions between linguistic and psycholinguistic factors determining the exact output form in each case, as suggested by Pennington (2002) when she speaks of the physiological and psycho-social aspects of L2 performance.
Acknowledgement This research was funded by a grant from CAPES (Coordenação de Aperfeiçoamento de Pessoal de Nível Superior), of the Brazilian Ministry of Education.
References Burzio, L. (1994). Principles of English Stress. Cambridge: CUP. Burzio, L. (in press). Phonology and phonetics of English stress and vowel reduction. Language Sciences. Crosswhite, K. M. (2004). Vowel reduction. In B. Hayes, R. Kirchner, & D. Steriade (Eds.), Phonetically Based Phonology (pp. 191–231). Cambridge: CUP. Cruttenden, A. (1997). Intonation (2nd ed.). Cambridge: CUP. Cummins, F. & Port, R. (1998). Rhythmic constraints on stress timing in English. Journal of Phonetics, 26, 145–171.
Variability in the use of weak forms of prepositions
Fear, B., Cutler, A., & Butterfield, S. (1995). The strong/weak syllable distinction in English. Journal of the Acoustical Society of America, 97, 377–393. Flemming, E. & Johnson, S. (in press). Rosa’s roses: Reduced vowels in American English. Journal of the International Phonetics Association. Giegerich, H. (1992). English Phonology: An introduction. Cambridge: CUP. Harris, J. (2005). Vowel reduction as information loss. In P. Carr, J. Durand, & C. J. Ewen (Eds.), Headhood, Elements, Specification and Contrastivity: Phonological papers in honour of John Anderson (pp. 119–132). Amsterdam: John Benjamins. Hayes, B. (1995). Metrical Stress Theory. Chicago, IL: Chicago University Press. Huddleston, R. & Pullum, G. K. (Eds.). (2002). The Cambridge Grammar of the English Language. Cambridge: CUP. Major, R. (1981). Stress-timing in Brazilian Portuguese. Journal of Phonetics, 9, 343–351. Major, R. (1985). Stress and rhythm in Brazilian Portuguese. Language, 61, 259–282. Major, R. (1992). Stress and rhythm in Brazilian Portuguese. In D. A. Loike & D. P. Macedo (Eds.), Romance Linguistics (pp. 3–30). Westport: Bergin and Harvey. Massini-Cagliari, G. (1992). Acento e Ritmo. São Paulo: Contexto. Ogden, R. (1999). A declarative account of strong and weak auxiliaries in English. Phonology, 16, 55–92. Pennington, M. (2002). Equivalence classification in L2 phonology: In search of the mechanisms. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech (pp. 280–289). Klagenfurt: University of Klagenfurt. Saito, H. (1999). Dependence and interaction in frequency data analysis in SLA research. Studies in Second Language Acquisition, 21, 453–475. Selkirk, E. (1984). Phonology and Syntax: The relationship between sound and structure. Cambridge, MA: The MIT Press. Wetzels, W. (1992). Mid vowel neutralization in Brazilian Portuguese. Cadernos de Estudos Lingüísticos, 23, 19–55. Campinas, Brazil: UNICAMP. Young, R. & Bayley, R. (1996). VARBRUL analysis for second language acquisition research. In R. Bayley & D. Preston (Eds.), Second Language Acquisition and Linguistic Variation (pp. 253– 306). Amsterdam: John Benjamins.
Perception of double stress by Spanish learners of English* Ma Luisa García Lecumberri Universidad del País Vasco, Spain
A perception experiment required Spanish learners of English (NNLs) and native English listeners (NLs) to identify the position and relative prominence of stresses in polysyllabic English words and compounds. The results indicated that both groups recognized stress shift and lack of shift very accurately. The NLs showed a stronger tendency to perceive prominence shift in simple words than NNLs, who were more likely to hear simple words as containing one stress. For compounds, differences between listener groups were not significant. It was concluded that, in these experimental tasks, native competence did not provide a strong advantage for stress identification. This result may be partly due to differences in metalinguistic awareness between the English and Spanish participants.
.
Introduction
The aim of this research was to compare Spanish learners and native speakers of English with regard to their ability to identify the location of primary and secondary stresses in English words and compounds, both in their usual form and when altered by stress shift. It was expected that the native speakers would perform better, because of their superior linguistic competence, but it was also predicted that the Spanish participants’ greater metalinguistic awareness of stress, resulting from the demands of Spanish orthography, would compensate to some extent for their lack of native competence. Since various authors (e.g., Kenworthy 1987; Flege & Bohn 1989; Mairs 1989; Archibald 1998; Peng & Ann 2002)1 have identified the * This paper is a revised version of García Lecumberri (2002). . Archibald (1998) discusses the transfer of L1 stress patterns to L2 and also mentions the possible influence of word structure awareness (p. 190) and of cognates in the L1 (p. 188).
Ma Luisa García Lecumberri
difficulty of foreign learners in hearing English stress patterns correctly as a possible cause of inaccurate production, and since incorrect stress production can result in the quality of the vowels being markedly different from that expected by listeners, thereby affecting comprehensibility, it was felt that a clearer understanding of Spanish learners’ actual performance in stress identification could provide useful information that might help in the development of classroom procedures and materials for reducing problems in this area. Polysyllabic Spanish words have primary stress on one of the three final syllables, with no other clearly salient prominences. The lack of an equivalent in Spanish to English secondary stress has to do with the syllable-timed nature of Spanish as opposed to the stress-timed rhythm of English. This means that all Spanish syllables retain their full vowel quality even when unstressed, whereas the vowels in unstressed English syllables are shortened and centralized, resulting in a much sharper distinction between stressed and unstressed than in Spanish. Since the great majority of English lexical words are either stressed monosyllables or begin with a stressed syllable (Cutler 1992), assuming a stressed syllable to be the start of a new word is a reliable strategy when parsing continuous speech. The more predictable metrical structure of words in Spanish, in conjunction with a higher degree of morphological transparency (redundancy in the form of suffixes which English has largely done away with), means that the information needed for identifying word boundaries, which is largely provided in English by the stressedunstressed distinction, is obtained by other means in Spanish. There is a trade-off in English between loss of phonemic distinctions (for example, modernity and maternity are usually distinguishable by just a single feature, the voicing of the alveolar onset of the second syllable, with the initial vowels reduced to schwa in both words) and enhancement of the stressed-unstressed distinction, while Spanish retains all the vowel contrasts, but at the expense of information that could be derived from greater rhythmic variation. The crucial point about stress is that it is a relative feature, its purpose being to make the more information-rich syllables stand out against the others. Different languages may achieve this effect by different means. In English, unstressed syllables constitute a relatively neutral background against which stressed syllables stand out far more clearly than in Spanish. Apart from having shorter duration and lacking pitch prominence, English unstressed syllables typically have a nucleus consisting of a short central vowel, usually schwa, but also /I/ and sometimes /~/. These syllables occupy the first line on a metrical grid such as (1) below. A syllable containing any other vowel will be heard as stressed, unless there is some other cue (lower pitch, reduced duration and/or volume, a preceding flapped /t/) that overrides the vowel quality cue (Hayes 1995). All stressed syllables are marked with an asterisk on the second line of the grid. Where an English word has two (or more) stressed syllables, one must be primary (receiving a third level asterisk),
Perception of double stress
and this is usually the last one in citation form, while the earlier one is secondary, as in the examples ÀfundaÁmental or ÀafterÁnoon. This late main stress pattern is traditionally known as end-stress or double-stress. Less commonly the reverse pattern occurs, with the main stress coming at the beginning of the word, as in ÁinterÀview and ÁanteÀlope, and with certain verb suffixes such as -ate and -ize, for example ÁsubliÀmate, ÁstandarÀdize. This early main stress pattern is known as front-stress. Compound nouns (except for a couple of types formed from phrasal verbs, such as Àpasser-Áby and Àsumming-Áup, formations like Àmumbo-Ájumbo and Àhelter-Áskelter, and some other relatively unproductive types) follow the same pattern, with primary stress on the first constituent (compound stress), as in Áwindow-Àcleaner and Áchewing-Àgum. It is precisely this stress pattern that marks them as being single lexical items, as opposed to separate constituents of a phrase, as in Àblack Ábird (phrase) vs. ÁblackÀbird (compound). However, although practically all compound nouns follow the front-stress pattern, many types of compound adjective (such as Àbrand-Ánew, Àhome-Ámade and Àbroken-Áhearted) have primary stress on the second word, and this meant that it was possible to introduce a morphological variable in the present study by including compounds with the same secondary-primary stress pattern as the majority of simple words with two stresses. As mentioned above, one of the characteristics of a clearly stress-timed language such as English is rhythmic alternation, whereby strong beats tend to be separated by weak beats (Selkirk 1984; Couper-Kuhlen 1993). When two syllables with primary stress are adjacent there is a stress clash, shown on a metrical grid as adjacent level 3 asterisks, to avoid which the first of the level 3 stresses automatically shifts onto a preceding syllable with secondary stress, if there is one, a process referred to by various terms such as stress shift, rhythm rule, or iambic reversal (Liberman & Prince 1977; Halle & Vergnaud 1987; Nespor & Vogel 1989; Roca & Johnson 1999). For example, this rule would cause the primary stress on the final syllable of afternoon to be shifted back to the first syllable when immediately followed by the tonic stress on tea (indicated by the fourth-level asterisk): (1) * * * * * * * * * * afternoon tea
>
* * * * * * * * * * afternoon tea
Ma Luisa García Lecumberri
In Spanish, on the other hand, shift of primary stress onto any other syllable for rhythmic purposes does not occur in everyday speech, although it exists for emphatic purposes in the speech of journalists and politicians (Hualde 2005).2 Of particular relevance to the present study is the matter of transfer or L1 interference, which has been incorporated into the models of Flege (1995) and Best (1994, 1995). Although the traditional view of L1 transfer as the main cause of learner errors (Stockwell & Bowen 1965) has been strongly contested in recent years, many authors (Eckman 1977; Ioup 1984; Flege 1992, 1999; Ellis 1994; García Lecumberri & Cenoz 1997; Major 2001) agree that phonetic/phonological mistakes are very often due to the influence of a learner’s L1 on the language being learned, even more so than errors at other linguistic levels. The purpose of this study was therefore to test the hypothesis that L1 influence would cause Spanish learners of English (NNLs) to perform less well than native listeners (NLs) at identifying the stress patterns in polysyllabic English words and compounds, especially when there was stress shift, since this results in a pattern that does not occur in Spanish.
. Method The experiment was designed as an open identification test in which listeners were asked to underline the stressed syllable or syllables in each stimulus item. The stimuli were randomized and recorded on audiotape by a speaker of British English with an educated mainstream accent. The stimuli were chosen with several criteria in mind. All the items selected had two potential stresses, with the stronger one being the later one in citation form. The main research variable was metrical. For this reason items which are susceptible to stress shift were selected and presented in two different contexts: utterance-final (no shift-triggering context), and followed immediately by a stressed word so that stress shift was induced. An additional variable was morphological. There were two types of stimulus: simple words and compounds (almost all adjectives) with an end-stress pattern. In an attempt to minimize direct transfer from the NNLs’ L1, there were equal numbers of cognates, in most of which the stress did not fall on the same syllable as in Spanish, familiar non-cognate words, and non-cognate words which were unlikely to be known to the NNLs (and which would thus function rather like nonsense stimuli). Stimuli were also chosen in such a way that the number . As mentioned by an anonymous reviewer, Spanish does employ strategies such as stress re-assignment and/or deletion to avoid stress clash at some morphological levels, such as the formation of diminutives, as in papél [paÁpel] vs. papelito [ÀpapeÁlito].
Perception of double stress
of syllables with respect to the stress positions would vary. Thus there were twosyllable words with two potential and contiguous stresses, three-syllable words in which the second stress was word-final, and four-syllable words in which the second stress was not word-final. The full list can be found in the appendix. All the possible combinations of the different numbers of syllables with the three-way cognateness distinction and the two-way simple vs. compound distinction resulted in 18 categories, with two items being used for each category, although two had to be excluded from the final analysis when it was realized that they did not fully meet the stress criteria. Each of the remaining 34 items appeared in two different contexts (unshifted and shifted), resulting in a total of 68 target sentences. Frame sentences were made up of unstressed words plus the target item in the case of non-shift contexts. For shift-provoking contexts, the trigger word was the only one stressed apart from the target item. Shift-provoking words were stressed on the first syllable to make the clash more evident. Twenty-eight distractor sentences made up of 14 items, each in two different contexts, were interspersed with the other stimuli. Below are two examples each of frame sentences for single words and compounds in the unshifted and shifted conditions: (2)
They were Chinese They were fundamental They were brand-new They were semi-finals They were hyperactive
They were Chinese cats They were fundamental days They were brand-new cars They were semi-final games They were hyperactive boys
The participants heard each sentence twice in succession over headphones, and recorded their responses on pre-prepared answer sheets by underlining the stressed syllables of target words once or twice according to the degree of prominence they perceived. Two different groups of listeners took part in the test. One consisted of 12 firstyear university students of English Philology with Spanish as their L1. Although their English competence varied, all had at least upper-intermediate level. All the students had completed English Phonetics (a compulsory first-year subject), but stress types and shifts had not been dealt with during that course. The other group contained 12 native speakers of English, and was made up of first-year Speech Science students and administrative staff at a British university. It was expected that, while neither group would achieve 100% accuracy in the tasks, L1 influence would cause the NNLs to demonstrate less accurate perception of stress patterns than the NLs, particularly in the case of the shifted condition, which never occurs in Spanish, but that the NNLs’ greater metalinguistic awareness might to some extent counter this influence and reduce the difference in performance between the two groups.
Ma Luisa García Lecumberri
Table 1. Shift perceptions in percentages Simple words NNLs NLs a. Shift b. Addition c. Change d. No Change e. Wrong
10.3 78.9 89.2 6.4 4.4
28.9 56.9 85.8 7.8 6.4
Compounds NNLs NLs 29.9 50.5 80.4 13.7 5.9
32.4 49.0 81.4 10.3 8.3
All items NNLs NLs 21.1 64.7 84.8 10.0 5.1
30.6 52.9 83.5 9.1 7.4
. Results . Perception of stress shift In this section the focus is on that half of the items tested in which there was stress shift. The data are comparative, in that shift is classified in relation to the main prominence perceived by each individual for the same stimulus in the unshifted condition. Table 1 displays results divided into five categories, according to whether listeners perceived (a) a shift of primary stress (shift) – e.g., ÁfundaÀmental; (b) an additional prominence, with or without a shift of primary stress onto it from the citation form (addition) – e.g., ÁfundaÁmental; (c) the previous two categories added together (change); (d) no change in primary stress (no change); or (e) wrong stress placement (wrong). Results are presented as percentages of the total number of responses, with the third category being the sum of the first two, and the last three categories adding up to the total. In this table the following points can be noted: 1. The percentages of perceived change were really high for both listener groups. Most of them perceived a difference between the item when it appeared with end-stress and when it appeared in a shifted context, which indicates that listeners were not being led solely by their expectations. Given that the unshifted pattern is the usual one, listeners apparently acted on their perceptions, since they had no explicit knowledge of the phenomenon of shift and might have assumed the same pattern for both occurrences of each item. However, the differences between the two listener groups were not significant due to wide inter-speaker variation. 2. Both groups perceived stress shift more often in compounds than in simple words (t = 3.23, p < .005). A repeated measures ANOVA analysis of shift perception with one within-subjects factor (simple vs. compound) and one between-subjects factor (nativeness) confirmed the effect of morphological type (F(1,22) = 12.9, p < .005, η2 = .37) and an interaction between morphological type and listener group (F(1,22) = 6.3, p < .05, η2 = .22). We can
Perception of double stress
see that for simple words there was a significant difference between NLs and NNLs (F(1,22) = 5.4, p < .05, η2 = .197), but not for compounds. Additionally, there was a significant difference between the NNLs’ treatment of compounds and of simple words (F(1,22) = 18.6, p < .001, η2 = .46), whereas in the case of the NLs this did not occur (F(1,22) = .57, p > .10). The NNLs’ perception of shift was thus very different for simple and compound words, whereas the NLs treated simple words and compounds more uniformly. A repeated measures ANOVA comparison of the perceptual strategies stress shift vs. stress addition, with nativeness as between-subjects factor, showed that there was a difference in the use of the strategies (F(1,22) = 14.2, p < .001, η2 = .39). Stress addition was favored over actual shift, with NNLs showing a stronger tendency in this direction than NLs. This predominance of stress addition over stress shift may have been due to the first constituent (in the case of compounds) or syllable (in the case of simple words) in the unshifted context not being perceived as having any degree of stress, so that in the shifted context early stress was perceived as having been added rather than strengthened. This topic will be discussed further in the following section, as will the question of whether these front-stressed structures were perceived as containing one or two stresses, and what the perceived relative strength was for each. To explore these issues the perceived distribution of stress strength will be examined. . Perception of stress strength In this section, the results for end-stress and front-stress will be presented and discussed individually. Tables 2 and 3 below show the perceived distribution of stress strength for items in the two conditions: unshifted (end-stress, Table 2), and shifted (front-stress, Table 3). The following categories have been included: (a) equal strength stresses – e.g., ÁfundaÁmental; (b) first stress stronger – e.g., Table 2. Perception of stress strength in end-stressed structures in percentages End-stress a. Equal strength b. First stronger c. Second stronger d. First only e. Second only f. Wrong place g. Total 2 stresses (a + b + c) h. Total end-bias (c + e) i. Total phrase-stress (a + c + e)
Simple words NNLs NLs 1.0 1.0 9.8 2.5 82.8 2.9 11.8 92.6 93.6
0.5 0 33.3 0 61.8 4.4 33.8 95.1 95.6
Compounds NNLs NLs 10.8 2.9 25.5 3.9 52.0 4.9 39.2 77.5 88.3
5.9 2.5 34.8 0.5 50.5 5.9 43.2 85.3 91.2
All items NNLs NLs 5.9 2.0 17.6 3.2 67.4 3.9 25.5 85.0 90.9
3.2 1.2 34.1 0.2 56.1 5.1 38.5 90.2 93.4
Ma Luisa García Lecumberri
Table 3. Perception of stress strength in front-stressed structures in percentages Front-stress a. Equal strength b. First stronger c. Second stronger d. First only e. Second only f. Wrong place g. Total front-bias (b + d) h. Total 2 stresses (a + b + c)
Simple words NNLs NLs 2.0 27.0 1.5 64.7 2.5 2.5 91.7 30.5
8.3 34.3 2.9 49.0 2.9 2.5 83.3 45.5
Compounds NNLs NLs 9.8 49.0 2.9 33.8 1.5 2.9 82.8 61.7
11.3 42.6 0.5 43.1 0.5 2.0 85.7 54.4
All items NNLs NLs 5.9 38.0 2.2 49.3 2.0 2.7 87.3 46.1
9.8 38.5 1.7 46.1 1.7 2.2 84.6 50.0
ÁfundaÀmental; (c) second stress stronger – e.g., ÀfundaÁmental; (d) first stress only – e.g., Áfundamental; (e) second stress only – e.g., fundaÁmental; (f) wrong placement. These six categories add up to the total number of responses. Additionally there are categories which are composites of those above and which overlap with one another. .. End-stress Table 2 shows results for end-stress perception in simple words and compounds. Results are given in percentages of the total number of responses for the unshifted condition. If we consider simple words and compounds together, both listener groups heard what I have called a phrase-stress pattern (i.e., strongest or only prominence at the end of the word or two equal strength prominences) in very similar and very high proportions (over 90% overall). The differences between the two listener groups were not statistically significant. A repeated measures ANOVA analysis of stress strength with the two withinsubjects factors item type (simple vs. compound) and type of stress perception (one end-stress only vs. double stress) and one between-subjects factor (nativeness) showed a significant three-way interaction between stress, item type and nativeness (F(1,22) = 7.2, p < .05, η2 = .25). It can be seen that both listener groups tended to hear one end-stress only (NNLs 67.4% overall, NLs 56.1% overall) more often than a double stress pattern (NNLs 25.5% overall, NLs 38.5% overall), whatever the relative prominence within the latter category. This was true for each of the morphological types, although the result was only significant in the case of simple word perception by Spanish listeners (F(1,22) = 26.9, p < .001, η2 = .551). This may have been due to L1 influence since, as mentioned above, most Spanish words have only one clearly prominent syllable. This might also help to explain why, in the perception of shift shown in Table 1, early stress addition was perceived more often than inversion
Perception of double stress
of primary stress, a tendency which was again more marked in the case of simple words and NNLs. Thus, the prominence in front-stressed items was generally heard correctly, but since most frequently the end-stressed words were considered to have only late stress, front prominence in the shift condition was most often heard as the addition of a stress at the beginning of the word, rather than real shift. If we compare simple words with compounds we find that compounds were perceived as having a double-stress pattern considerably more often than simple words, suggesting that listeners were carrying out some kind of morphological analysis of the structures; that is, they may have realized that there were two words to stress. There were significant differences between simple vs. compound items in the NNLs’ perception of one end-stress only (F(1,22) = 36.9, p < .001, η2 = .627) and of double stress (F(1,22) = 27.9, p < .001, η2 = .560), and less strong but still significant differences in the NLs’ perception of one end stress only (F(1,22) = 4.9, p < .05, η2 = .183) but not for double stress. The NNLs showed a stronger tendency to perceive simple words as having only one stress – possibly, as suggested above, because of L1 influence – but there appears to have been less influence from the L1 stress system in the case of compounds: when two stresses were perceived in compounds, they were judged to be of equal strength more often by NNLs than by NLs, which could be an indication that the NNLs had a stronger tendency to analyze compounds as two words. There was a significant difference between the two listener groups in perception of just an end-stress in simple words but not in compounds (F(1,22) = 4.3, p < .05, η2 = .163), and in perception of double stress (F(1,22) = 5.6, p < .05, η2 = .203). Results for the NLs show less difference in their perception of the number of stresses in simple words vs. compounds. Although they apparently considered that compounds were more likely to bear two stresses (probably because of morphological transparency), simple words were also considered capable of having two stresses, but not as often as one might expect given the high frequency of this pattern in English. In addition, when two stresses were perceived in compounds, the NLs were better at judging the stronger prominence of the second stress than the NNLs, for whom the two stresses more often sounded equivalent, as seen in Table 2. .. Front-stress The results for front-stressed words will now be examined. Again, the numbers in Table 3 represent percentages of the total number of listener responses for the stress-shift condition. A repeated measures ANOVA analysis of stress strength with the two withinsubjects factors item type (single vs. compound) and stress perception (one frontstress only vs. double stress) and one between-subjects factor (nativeness) showed
Ma Luisa García Lecumberri
a significant 3-way interaction between stress perception, item type and nativeness (F(1,22) = 11.3, p < .005, η2 = .34). It can be seen that, for simple words, both groups of listeners heard just one early stress more often than two stresses, but this bias was statistically significant only for NNLs (F(1,22) = 4.6, p < .05, η2 = .173). This again is consistent with the fact that in Spanish most words have only one clear stress, whereas English allows two or more, so that NLs were more ready to hear simple words with two stresses, the same tendency seen above for end-stress. It seems clear that the NNLs were more likely to hear simple stress-shifted words as having only one stress even where no early stress was perceived in the other condition. For compounds, both listener groups showed an inverse trend to that for simple words since in this case they tended to hear two stresses. This may be an indication that they considered compounds to be a combination of two words rather than as a new whole, as noted earlier for end-stress. However, there was a stronger tendency for the NLs than for the NNLs to perceive the second word as unstressed. There was a significant difference in the way NNLs heard the two stress patterns both in single words (F(1,22) = 36.6, p < .001, η2 = .618) and compounds (F(1,22) = 38.7, p < .001, η2 = .638), whereas there was no significant difference in the case of NLs. Therefore, although the general trend in both listener groups was to hear simple words as having early stress only and compounds as having two stresses, the two groups differed in that these two trends were more marked for the NNLs, who treated simple words and compounds quite differently, whereas the NLs perceived the stress patterns of simple words and compounds as being more alike, with a greater tendency to hear two stresses in simple words and one in compounds.
. General discussion The results show that both types of stress (end-stress and front-stress) were perceived with considerable accuracy by both groups, particularly when categories such as end-bias and front-bias are taken into account, and that the NLs were only slightly better than the NNLs at perceiving stress under each condition. It had been predicted that the NNLs’ lesser linguistic competence might be offset by their greater metalinguistic knowledge, and this appears to have been the case. Since the NNLs in this experiment were all philology students, who had been required to take several courses on Spanish grammar before entering university, and were currently taking a course on general linguistics which included morphology, phonetics and syntax, the metalinguistic knowledge which they brought to bear on the research task must indeed have been much greater than that of the NL group, who did not have this background in linguistics.
Perception of double stress
Another factor which would in any case have given the NNL group a metalinguistic advantage over the NLs is the fact that, in Spanish orthography, stressed syllables need to be marked with a diacritic under certain conditions, which means that literate Spanish speakers are quite familiar with the task of thinking about and identifying stressed syllables, an alien task for most L1 English speakers. It has been suggested that metalinguistic awareness is related to the speakers’ native language and to its orthographic system (Liow & Poon 1998; García Lecumberri & Gallardo 2003; Leather 2003), and the present results could be interpreted as supporting this claim. The NNLs were relatively successful at carrying out the task involved in this experiment because their L1 required some metalinguistic awareness of stress marking. The clear advantage the NLs would otherwise have had by virtue of their native competence was thus partially neutralized. Nevertheless, there is evidence of the effect of the NNLs’ L1 in their greater tendency to assign only one stress to double-stressed simple words. In the case of compounds, the influence of L1 stress patterns was not so clear, probably due to their treating compounds as two words because of morphological transparency. However, the stress patterns in the corresponding Spanish words were not transferred directly to the perception of English words. This may have been due to the fact that a sufficient number of non-cognate and rare words (which acted as pseudo-nonsense words) were included, which would lead to auditory rather than L1-influenced phonological analysis of the structures, and hence less L1 interference.
. Conclusion Both groups of listeners showed a high level of accuracy in determining the position and relative strength of prominences in the two stress structures under study. The most striking differences between the NNLs and NLs were manifested in the perception of one vs. two stresses, particularly regarding simple words vs. compounds: in both conditions (end and front stress), NNLs showed a stronger tendency than NLs to hear simple words as having one stress only, whereas in compounds NNLs and NLs performed more similarly. Nevertheless, in the frontstressed condition NNLs heard compounds as having two stresses more often than NLs. These results could indicate that NNLs were more aware of the morphological composition of the two types of structure and were being influenced by L1 norms (one word = one stress, two words = two stresses, assuming that they were analyzing compounds as two words). The NLs displayed a more balanced treatment of simple words vs. compounds as far as the number of prominences was concerned. In this respect, the NLs seemed to overlook the morphological makeup of the words more often than NNLs did, and to have fewer inhibitions about
Ma Luisa García Lecumberri
assigning two stresses to a simple word or one stress to a compound (a two-word structure). Taking stress bias as the measure of correct/incorrect perceptions, NLs were slightly more accurate than NNLs in all but one (front-stressed simple words) of the four structures involved. Nevertheless, in the present experiment, nativespeaker competence did not prove to be a strong indicator of possible advantage in stress identification. It is suggested that this result may be partly due to differences in metalinguistic awareness between English and Spanish listeners – at least the ones who participated in this particular study.
Acknowledgments This research was funded by a grant from the Universidad del País Vasco, no 105.130-HA 136/97. I am grateful to staff and students of this university and of the Department of Phonetics, University College, London, for their cooperation. I would also like to thank the editors of this volume and an anonymous reviewer for valuable comments on the manuscript.
References Archibald, J. (1998). Second Language Phonology. Amsterdam: John Benjamins. Best, C. T. (1994). The emergence of native-language phonological influences in infants: a perceptual assimilation model. In J. C. Goodman & H. C. Nusbaum (Eds.), The Development of Speech Perception (pp. 167–224). Cambridge, MA: The MIT Press. Best, C. T. (1995). A direct realist view of cross-language speech perception. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Theoretical and methodological issues (pp. 171– 204). Timonium, MD: York Press. Couper-Kuhlen, E. (1993). English Speech Rhythm. Amsterdam: John Benjamins. Cutler, A. (1992). Auditory lexical access: where do we start? In W. Marslen-Wilson (Ed.), Lexical Representation and Process (pp. 342–356). Cambridge, MA: The MIT Press. Eckman, F. (1977). Markedness and the contrastive analysis hypothesis. Language Learning, 27, 315–330. Ellis, R. (1994). The Study of Second Language Acquisition. Oxford: OUP. Flege, J. E. (1992). Speech learning in a second language. In C. Ferguson, L. Menn, & C. StoolGammon (Eds.), Phonological Development: Models, research and applications (pp. 565– 604). Timonium, MD: York Press. Flege, J. E. (1995). Second-language speech learning: Findings and problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Theoretical and methodological issues (pp. 233–273). Timonium, MD: York Press. Flege, J. E. (1999). Age of learning and second language speech. In D. Birdsong (Ed.), Second Language Acquisition and the Critical Period Hypothesis (pp. 101–132). Mahwah, NJ: Lawrence Erlbaum.
Perception of double stress
Flege, J. E. & Bohn, O.-S. (1989). An instrumental study of vowel reduction and stress placement in Spanish-accented English. Studies in Second Language Acquisition, 11, 35–62. García Lecumberri, M. L. (2002). An experiment on stress-shift perception by English FL vs. NL speakers. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech (pp. 142–147). Klagenfurt: University of Klagenfurt. García Lecumberri, M. L. & Cenoz, J. (1997). L2 perception of English vowels: Testing the validity of Kuhl’s prototypes. Revista Alicantina de Estudios Ingleses, 10, 55–68. García Lecumberri, M. L. & Gallardo, F. (2003). English FL sounds in school learners of different ages. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 115–135). Clevedon: Multilingual Matters. Halle, M. & Vergnaud, J. R. (1987). An Essay on Stress. Cambridge, MA: The MIT Press. Hayes, B. (1995). Metrical Stress Theory. Chicago, IL: Chicago University Press. Hualde, J. I. (2005). The Sounds of Spanish. Cambridge: CUP. Ioup, G. (1984). Is there a structural foreign accent? A comparison of syntactic and phonological errors in second language acquisition. Language Learning, 34, 1–17. Kenworthy, J. (1987). Teaching English Pronunciation. London: Longman. Leather, J. (2003). Phonological acquisition in multilingualism. In M. P. García Mayo & M. L. García Lecumberri (Eds.), Age and the Acquisition of English as a Foreign Language (pp. 23– 58). Clevedon: Multilingual Matters. Liberman, M. & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8, 249– 336. Liow, S. J. R. & Poon, K. K. L. (1998). Phonological awareness in multilingual Chinese children. Applied Psycholinguistics, 19, 339–362. Mairs, J. L. (1989). Stress assignment in interlanguage phonology. In S. Gass & J. Schachter (Eds.), Linguistic Perspectives on Second Language Acquisition. Cambridge: CUP. Major, R. C. (2001). Foreign Accent: The ontogeny and phylogeny of second language phonology. Mahwah, NJ: Lawrence Erlbaum. Nespor, M. & Vogel, I. (1989). On clashes and lapses. Phonology, 6, 69–116. Peng, L. & Ann, J. (2002). Stress and duration in three varieties of English. In A. James & J. Leather (Eds.), New Sounds 2000: Proceedings of the Fourth International Symposium on the Acquisition of Second-Language Speech (pp. 271–279). Klagenfurt: University of Klagenfurt. Roca, I. & Johnson, W. (1999). A Course in Phonology. Oxford: Blackwell. Selkirk, E. O. (1984). Phonology and Syntax. Cambridge, MA: The MIT Press. Stockwell, R. P. & Bowen, J. D. (1965). The Sounds of English and Spanish. Chicago, IL: University of Chicago Press.
Ma Luisa García Lecumberri
Appendix Stimuli used in the study
1. À – Á –
2. À – – Á –
3. À – – Á – –
Unfamiliar words
Familiar non-cognates
Cognates
Scafell Umpteen Gung-ho Rough-hewn Bakerloo Assamese Cloven-hoofed Deckle-edged Balaclava Surreptitious Topsy-turvy Helter-skelter
Eighteen Undue Brand-new Next-door Afternoon Impolite Navy-blue Under-done
Chinese Princess Tex-mex
Cinderella Broken-hearted Single-decker
Kangaroo Absentee Inter-state Infra-red Fundamental Adolescent Hyper-active Semi-final
The production of compound stress by Brazilian learners of English L. Armando Silveiro and Michael Alan Watkins Universidade Federal de Santa Catarina, Brazil / Universidade Federal do Paraná, Brazil
This paper describes an analysis of the stress patterns used by advanced Brazilian learners in the production of English compound nouns. It was predicted that the lack of distinction between composite nominals and compound nouns in Portuguese would strongly influence the participants to assign primary stress to the final constituent, regardless of the syntax. Data were collected by means of a reading activity which included the target constructions in unrelated sentences. Results confirmed a strong tendency for participants to give greater prominence to the second constituent of compounds, although exceptions to this tendency indicated that the participants’ choice of stress pattern may also have been influenced by the length of constituents and relative familiarity of the lexical items involved.
.
Compound and phrasal stress
Either constituent in a construction consisting of adjective + noun (Adj+N) or noun + noun (N+N) in English can be given greater metrical prominence, depending on the syntactic and semantic relationship between them: a simple, everyday example is English book, in which the first word English is understood as being an adjective (referring to the book’s place of origin) if primary stress is on book, or a noun (referring to the language) if primary stress is on the word English itself. The former, phrasal stress (PS) pattern (Liberman & Prince 1977) with greater prominence on the final word, giving a weak-strong pattern (W-S), is the unmarked form when the first word is functioning in an attributive relationship to the second. Syntactically, the first word can be an adjective, as in heavy metal, the present or past participle form of a verb, as in running water or unmade beds, or a noun, as in ham sandwich. When it is an adjective or participle, it may have a transparently descriptive function, as in light color, or a transparently adverbial function in the case of a deverbal head noun, as in light sleeper (someone who sleeps lightly). When the modifier is a noun, it denotes some inherent charac-
L. Armando Silveiro and Michael Alan Watkins
teristic of the head noun, such as the material or chief ingredient from which it is made, as in cheese omelette. In all these cases the sequence constitutes a noun phrase, or what Huddleston and Pullum (2002: 448) term a “composite nominal”: a construction consisting of two independent words. However, in N+N or Adj+N combinations where the first item does not stand in a clearly attributive or adverbial relationship to the second (e.g., computer screen, surfboard, blackbird), the two words form a compound noun, a lexical category and not a phrase, on which the compound stress (CS) rule (Liberman & Prince 1977) imposes a strong-weak pattern (S-W) as the unmarked form. The crucial factor which appears to determine the choice is the non-attributive function of the first item. This is very often clear, as in the distinction between rubber Átire and Árubber industry, but in many other cases where there is compound stress, such as cheesecake, blackbird, green card, the first constituent has at least traces of an attributive function. The fact that semantic “attributiveness” and “compoundness” are to some extent a continuum means that the neatness of the CS-PS distinction is slightly blurred by a certain amount of variation and some inexplicable exceptions: for example, some people say Áice-cream, others ice-Ácream (mother tongue and hot dog are also examples where both patterns are heard), and Cruttenden (1997) mentions the inconsistency between ÁChristmas cake and Christmas Ápudding, ÁOxford Street and Oxford ÁRoad. Giegerich (1992: 258) also notes these cases, and concludes that the prominence pattern alone is not in itself an indicator of compound or phrase status (so that every construction with final stress is a phrase and every one with nonfinal stress a compound). If that were so then London ÁRoad would be a syntactic phrase and ÁLondon Street a compound, Christmas Ápudding a phrase and ÁChristmas cake a compound – hardly a categorisation that could be justified on syntactic grounds! We are left with the somewhat unsatisfactory situation, then, that the prominence patterns of compound words idiosyncratically conform with either those of words or those of phrases.
Huddleston and Pullum (2002: 450) reach a similar conclusion, after a comprehensive discussion of the various possible criteria for distinguishing compounds from phrases: “. . . we take the view, here as in so many other areas of grammar, that the existence of borderline cases does not provide a reason for abandoning a distinction that can be recognised in a great range of clear cases.” Chapter 19 of this same work includes an impressively thorough and illuminating analysis of types of compound. The existence of such variation and exceptions simply increases the learning difficulty for native speakers of languages such as Spanish and Portuguese, which do not have the compound stress pattern. The equivalent construction in these languages for the cases mentioned above would consist of N + Adj, N2 + N1 , or
The production of compound stress
N2 of N1, with greater prominence on the final lexical item, that is, phrasal stress. For example, gold Áring (W-S) would be anel de Áouro (W-S) in Portuguese, while Ágold mine (S-W) would be mina de Áouro (also W-S), that is, with the same stress pattern, neutralizing the distinction made in English. This uniformity of stress pattern strongly predisposes Brazilian speakers of English (whatever their level of proficiency) to put stronger stress on the second element of compounds, using the same metrical pattern as in their L1. Such errors in stress placement are non-trivial, as they can have a direct effect on parsing: for example, the common mispronunciation newsÁpaper (W-S instead of S-W) makes news sound attributive in relation to paper, and sorting this out demands some extra processing on the part of native listeners. This study focused on two-word constructions of both types (compound and phrasal), with constituents restricted to words of one or two syllables. Below is a list, with examples, of all the metrical patterns included in the study, the numbers beside the examples referring to the number of syllables per constituent (hereafter referred to as length categories): (1)
1-1 (S-W) Átoy shop 1-1 (W-S) toy Ácar 1-2 (S-W) Álight socket 1-2 (W-S) light Ásleeper
2-1 (S-W) Ácattle ranch 2-1 (W-S) apple Ápie 2-2 (S-W) Átable tennis 2-2 (W-S) heavy Ásleeper
Given the communicative importance of the distinction between the two stress patterns in English, and our impression that the W-S (phrase stress) pattern tends to be heavily overused by Brazilian learners of English, even at quite advanced levels, regardless of meaning, the objectives of this study were twofold: (a) to compare the frequency with which each of the two stress patterns was produced correctly; (b) to investigate whether the length of the two-word sequence (in terms of number of syllables) was a significant variable affecting correct stress placement. Hypothesis 1: The participants in this study mis-stress English two-word sequences more frequently when the compound (S-W) stress pattern is required. Hypothesis 2: Difficulty is affected by the number of syllables per constituent, the pairs with one syllable in each constituent (1-1) being more easily produced, because of the narrower range of choices of stress pattern and the fact that many of them are written as one word. No specific order is predicted among the other three length categories.
L. Armando Silveiro and Michael Alan Watkins
. Method . Participants Fifteen Brazilian learners of English – 9 males and 6 females ranging from 16 to 40 years of age – volunteered to participate in the study. Twelve of them were regular students from the extracurricular language program at a public Brazilian university, while the other three studied at a private language school in the same town. By the time data collection took place, all of them had received at least 420 hours of English instruction. The choice of higher-level students was based on two main factors: having had longer L2 experience, advanced learners are more likely than beginners to have developed intuitions and strategies about word stress placement. In addition, the findings of a small-scale pilot study conducted in 2002 revealed that (a) beginners tend to assign stress in a more random manner, and (b) a sentence-reading activity proved to be too demanding for students with limited exposure to the L2; the large number of pauses and mispronounced segments prevented accurate assessment of the strategies that might otherwise have been employed. . Production task In the present study, the participants’ production was assessed by means of a sentence-reading activity, which, in spite of its inherent limitations, was chosen as a way of ensuring uniform production on the part of the participants. The task consisted of 420 unrelated sentences: 280 containing the 140 pairs being tested (occurring twice each) and 140 distractor sentences. The target constructions were placed at least two words before the end of each sentence to avoid the possible influence of utterance-final position. All the sentences were then checked for accuracy by an educated speaker of General American English, then computerrandomized and divided into blocks of 20 sentences per page. . Procedures Data collection took place in individual or small-group sessions during June and July 2003 in the language laboratory of the public university of 12 of the participants. The participants recorded the sentences by using individual recording units – Sony Educational Recorder, model ER-5030 – and Sony headsets – model HS-95. Each learner received an envelope containing written instructions in English, the blocks of sentences to be recorded, and an audiocassette. Although the students knew they were participating in a research project, none of them were
The production of compound stress
aware of the exact focus of the research being done, nor were they given time in advance to become familiar with the test contents. . Data analysis The analysis of the target compounds consisted of two distinct steps. The first step involved listening to and transcribing all the recorded tokens, with stress being confirmed by a comparison of the acoustic measurements of amplitude (of the higher peaks) and duration of the two constituents of each sequence. This procedure dispensed with the need for a second listener and also enabled the use of all intelligible tokens. In the second step, correct tokens were marked as OK, while unintelligible tokens were labeled as such and eliminated from the analysis. In addition, specific labels were created to account for the classification of the incorrect productions: inversion, pause, mispronunciation, and wrong syllable. Inversion (Inv) refers to the switch between primary and secondary stresses within the token, that is, when primary stress was placed on the second component of a CS sequence – for example, ÀdeadÁline instead of ÁdeadÀline – or on the first component of a PS sequence – for example, ÁQueen ÀMother for ÀQueen ÁMother. Errors were classified as pause when there was a period of silence between the two components (a) which was perceived in the first listening step as long enough to break up the structure (so that the second component no longer belonged to the same intonation group), thus preventing a comparison of the stress level of the two components; or (b) whose duration was found to be longer than one second in the sound peak observation step. Mispronunciation (Misp) is a blanket label that encompasses the addition or omission of segments or syllables in either component, to the extent of modifying the length of the sequence – for example, [Àkri] Ákr7ks] for cream crackers, or [Àk"f ÁmeIkû] for coffeemaker. Wrong syllable (Wrong Syl) refers to the placement of primary or secondary stress on a syllable other than the expected one in either component, causing a disruption of the rhythmic pattern – for example, [Àn7t pr"Áfit] instead of [Àn7t Ápr"fIt] for net profit, or [pæÀsiv ÁvfIs] instead of [ÀpæsIv ÁvfIs] for passive voice. Wrong pronunciation of vocalic and consonantal segments – for example, [Àh7vi Ábr7dû] for heavy breather – did not invalidate the tokens for analysis as long as the rhythm was not altered.
. Results Figure 1 shows that, irrespective of the type of stress pattern or the length of the words, the correct pattern – either CS or PS – was produced in 51.08% of cases. Inversion occurred in 36.40% of cases, and the other error types occurred in smaller proportions: pause 7.56%, mispronunciation 3.06%, and wrong syllable 1.90%.
L. Armando Silveiro and Michael Alan Watkins
Figure 1. Total results of the production test (percentages)
. Stress pattern Hypothesis 1 predicted that incorrect use of the W-S (PS) pattern would be the most frequent error type produced by the participants, and this was confirmed by the results for stress pattern on its own, displayed in Table 1, as PS sequences were given the correct pattern in far more cases than the CS sequences, the difference in the scores being highly significant (χ2 (1, N = 4,115) = 1122.90, p < .001). Table 1. Production test: Total scores per stress pattern Stress Pattern
No. Valid Tokens
No. Correct Answers
% Correct Answers
CS PS Total
2238 1877 4115
608 1494 2102
27.17 79.60 51.08
. Constituent length Hypothesis 2 predicted that (1-1) constructions would yield the highest rate of correct placement, while variability was expected for the other sub-categories. Indeed, Figure 2 shows a very clear difference between the (1-1) type and the others, with 72.48% correct productions. The second highest-scoring sub-category was (2-1), with 55.25% of correct answers, followed by (1-2) and (2-2), with similar percentages – 44.10% and 45.74%, respectively. The comparison between the length sub-categories yielded highly significant chi-square results (χ2 (3, N = 4,115) = 156.41, p < .001), supporting the second hypothesis, at least when the different length sub-categories are considered separately. Further analysis investigated whether the greater facility of (1-1) would prevail when the other variables came into play.
The production of compound stress
Figure 2. Production test: Scores per length sub-category
Figure 3. Production test: Scores according to length and stress pattern
. Stress pattern and length of token While the difference in the overall scores between CS and PS was highly significant, a different picture emerged when stress pattern and sequence length were examined together (Figure 3). The reciprocal influence of these variables occurred most noticeably in CS sequences, where the rate of correct answers showed variability across the length sub-categories, yielding a significant chi-square (χ2 (3, N = 2,238) = 317.63, p < .001). However, the scores for PS sequences were all very close, and the results were non-significant (χ2 (3, N = 1,877) = 3.85, p > .25). In short, the interaction between stress pattern and length seemed to hold true for CS but not PS constructions. The following hierarchy of difficulty emerged: CS (1-2) > CS (2-2) > CS (2-1) > CS (1-1) > PS (all). It is clear that (1-1) was the easiest length category for CS constructions, but length appeared to be irrelevant
L. Armando Silveiro and Michael Alan Watkins
for PS ones. Therefore, the hierarchy predicted by Hypothesis 2 was confirmed for CS (1-1) constructions only. The low scores for the CS (1-2) and (2-2) length subcategories may have been caused by a tendency for the longer (two-syllable) second element to exaggerate an already existing preference for stress on the final word or by the difficulty to reduce the stress of an element with more than one syllable. . Discussion of results . Compounds The general results presented in 3.1 show that when the correct stress placement was not produced, inversion was the most frequent error type. When those figures are considered separately for each type of construction, Figure 4 shows that inversion (W-S instead of S-W) was by far the most common pattern given to compounds, corroborating the findings in Staub (1973) and Baptista (1989), and confirming Hypothesis 1 of this study. With regard to rate of correct responses according to number of syllables, (1-1) CS structures showed a much higher rate than the other length types. This difference may have been exaggerated to some extent by the fact that some of the items (football, bedroom, blacklist, freshman, darkroom, and deadline) have become fully lexicalized and may have been learned as single words rather than analyzed as compounds, due to their high frequency. On the other hand, some other highfrequency compounds such as airbags, headache and green card did not receive so many correct responses. Airbags and green card are commonly used in Brazilian Portuguese as loanwords, with a W-S stress pattern, which is likely to have influenced participants’ production. Besides, green card is the only compound in its respective sub-category that is written as two words, which may be an indication that, when (1-1) compounds are written as one word, they are more likely to be
Figure 4. Total results for compounds (CS structures)
The production of compound stress
treated as a single word and thus assigned S-W stress. Because of the relatively high level of proficiency of the participants in this experiment, they would probably have internalized this as the default metrical pattern for bi-syllabic nouns. The highest rates of inversion occurred in the (1-2) length sub-category, which suggests that this type of structure is the one most susceptible to L1-influenced phrasal stressing. Overall, (1-2) constructions showed a much higher rate of errors than any other category. The (1-2) compounds dishwashing, speed-reading, fox hunting, windsurfing, handwriting and the (2-2) compound window-shopping, all of which have V-ing forms as second element, had the highest rates of inversion. Other (2-2) compounds with high rates of inversion were drinking water, wrapping paper and dining table, which include V-ing forms as the first element, a position in which they often function attributively. There might even have been some effect of the similarity of this second group to a verb phrase (verb + object). A possible reason for the difficulty of (1-2) items is that in many cases they are written as a single word, which may have influenced Brazilian learners to apply the default Portuguese word stress rule and assign the main stress to what looks like the penultimate syllable (i.e., the first syllable of the second constituent). Most of the (2-1) and (2-2) compounds tested, on the other hand, are written as two separate words, without a hyphen, suggesting a noun phrase, thus also automatically attracting main stress to the second rather than the first word. Taken together, the results for the length categories with bi-syllabic second elements clearly suggest that the final constituent attracts primary stress. Mispronunciations included both the addition and omission of syllables: in the case of drilling platform, many participants inserted an epenthetic [a] at the end of the first syllable of platform, resulting in [Áplataffrm], which shows clear influence of the Portuguese cognate plataforma, since transfer of the usual BP phonological process of epenthesis causes insertion of the vowel [i] or [6]. Mispronunciation was also quite common for (1-2) compounds with the –er suffix, which was sometimes deleted in the pronunciation of northwester, cream crackers and freethinker, resulting in (1-1) structures instead, for example, [Àkri] Ákr7ks]. In the (2-2) compound coffeemaker, the second syllable of coffee was deleted, resulting in [Àk"f ÁmeIkû]. . Phrasal-stressed tokens The (1-1) PS constructions were generally assigned the correct stress pattern (79.19%), as expected, the highest rates being for hometown, meat pie and front door. Death row and smash hits had the lowest rates of correct responses – 36.66% and 43.33% respectively – mainly due to the large number of pauses. It should also be noted that death row follows the typical pattern for the names of thoroughfares and other geographical locations (like Oxford Road, mentioned above,
L. Armando Silveiro and Michael Alan Watkins
Figure 5. Total results for PS constructions (percentages)
and numerous others, such as Canary Wharf, Times Square, Sunset Boulevard – all, in fact, except those with Street). Perhaps it was not recognized as belonging to this semantic category. In general, pause was a more frequent source of incorrect responses than inversion in PS constructions, contributing heavily to the major exceptions to the correct productions of (1-2), (2-1), and (2-2) length sub-categories: 20% for rubber bands, 23.33% for paper towels. To some extent, the pauses may have been caused by the difficulty a sentence-reading activity itself imposes on L2 learners, by the eccentricities of English spelling, or by the informants’ lack of familiarity with some of the words. The somewhat uncommon laughing jackass seems to have been excessively challenging for some of the participants, resulting in pauses and mispronunciations – for example, [Àl7fti] Ádj7ks] – and stress on the wrong syllable [Àl7fi] dj7Àk7z]. Spitting image had six instances of inversion, but the most striking outcome was the placement of primary stress on the second syllable of image, pronounced as [iÁmeIdŠ] in thirteen instances – a possible transfer from the Portuguese cognate imagem. It may also have to do with the word age, and the fact that the final “e” often signals a tense vowel in English. Moreover, the fact that most of the compounds where a pause occurred are written as separate words might be an indication that they were possibly not recognized as compounds by the participants. Figure 5 shows the total results for phrase-stressed constructions.
. Conclusions The data obtained from the sentence-reading task confirmed the hypothesis that Brazilian learners of English have a strong overall tendency to assign phrase stress (W-S) to two-word expressions, regardless of whether they are phrases or in fact compounds, which should have the reverse (S-W) pattern. This was clear from the
The production of compound stress
low rates of correct answers for CS compounds. The PS pattern is clearly a natural choice for Portuguese-speakers, being obligatory in their L1, but the participants’ performance appears to have been influenced by other factors as well. The number of syllables in the words, with its effect on metrical structure, appeared to be especially important: on the whole, CS (1-2), (2-1) and (2-2) pairs were much more likely to be produced with the PS pattern than CS (1-1) pairs, in which the choice of syllable to carry primary stress is more restricted. Moreover, many of the CS (1-1) compounds included in the test are written as one word, and it is possible that they were treated as single words instead, receiving primary stress on the first syllable because of awareness of the tendency for early stress in English. Furthermore, familiarity with the item (although this was not strictly controlled in the study) may also have been an important factor, as it could explain the exceptions to inversion of CS compounds in the other length sub-categories. The most important pedagogical implication of the results described in this report probably lies in awareness-raising. It seems plausible to assume that learners’ awareness of the structure and of their own errors can help them improve performance (Pennington 1998). In this respect, Schmidt’s (1990) noticing hypothesis is useful to help understand the importance of explicit correction and planned teaching. According to Schmidt (1990), when learners notice the mismatches between their IL output and L2 input, they are likely to reduce their misproductions and overcome their difficulties. Compound stress is not frequently taught in the classroom; thus, it is quite likely that many advanced students have no conscious knowledge about it. Two major limitations of the study concern the method adopted for data collection. The selection of items included had to be arbitrary to some extent, because we were trying to test the different subcategories of length and stress patterns in approximately the same proportions, but this procedure unfortunately resulted in little control of word familiarity. Another limitation is the disadvantage of a reading activity for the purpose of data collection, in particular because some of the items were written as two separate words, some with a hyphen, and some as a single word, a factor which may well have had some influence on the choice of stress pattern. In spite of these limitations, however, the results of this study send out a clear message to teachers and materials writers to give more attention to a frequent and persistent error in the English of speakers with L1s that do not have the compound stress pattern. Teachers have a tendency to focus on segmental errors, whereas prosodic-level errors can often have a wider and more damaging effect on comprehensibility.
L. Armando Silveiro and Michael Alan Watkins
Acknowledgement This research was funded by a grant to the first author from the CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico), of the Brazilian Ministry of Science and Technology.
References Baptista, B. O. (1989). Strategies for the prediction of English word stress. International Review of Applied Linguistics, 28, 99–115. Cruttenden, A. (1997). Intonation, 2nd ed. Cambridge: CUP. Giegerich, H. (1992). English Phonology: An introduction. Cambridge: CUP. Huddleston, R. & Pullum, G. (2002). The Cambridge Grammar of the English Language. Cambridge: CUP. Liberman, M. & Prince, A. (1977). On stress and linguistic rhythm. Linguistic Inquiry, 8, 249– 336. Pennington, M. (1998). The teachability of phonology in adulthood: A re-examination. International Review of Applied Linguistics, 36, 323–341. Schmidt, R. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158. Staub, A. (1973). A acentuação do inglês e do português: Algumas observações contrastivas. Letras de Hoje, 12, 35–52.
Appendix Examples of sentences used in the production test Compound Stress sequences 1. 2. 3. 4. 5. 6. 7. 8.
Cars that have airbags are more expensive. You must have a green card to work in the US. They could not meet the deadline set by the managers. My favorite newspaper is the Washington Post. It’s difficult to read the handwriting of most doctors. I was brought up by my grandparents in Brazil. Accidents with drilling platforms are pretty frequent. I can’t find the coffeemaker anywhere!
Phrase Stress sequences 1. 2. 3. 4. 5. 6. 7. 8.
I really miss my hometown when I’m abroad. Kimberly is tired of blind dates that get her nowhere. The company’s net profit has exceeded expectations. Living on London Road is a sign of wealth. My grandfather was a town crier in his native Italy. English subjects held the Queen Mother in high esteem. Please include paper towels on the shopping list. We heard a laughing jackass when we were in Australia.
Author index
A Abrahamsson, N. , , , , , , , Anderson, J. I. , Archibald, J. B Baptista, B. O. –, , , , , , , , , , , , , , –, Best, C. T. , Boersma, P. Bohn, O.-S. , , , , Bradley, T. G. Bradlow, A. R. Brinton, L. J. Broadbent, D. Burzio, L. , Butterfield, S. C Carlisle, R. , , , , , , , , , , –, , –, , , , Cebrian, J. Cenoz, J. , Clements, G. N. , –, –, , Collischonn, G. Cook, C. Cooke, M. Crosswhite, K. M. Cruttenden, A. , Cummins, F. Cutler, A. , D Del Forge, A. M. Derwing, T. Dziubalska-Kolaczyk, K.
E Eckman, F. R. –, , –, , , , , , , , , , , –, , Edman, T. R. Eefting, W. Escudero, P. , , , , F Fear, B. Fernandes, P. R. C. Flege, J. E. –, , –, , , , , , –, , , , , , Flemming, E. Fox, R. A. , Fullana, N. , G Gallardo del Puerto, F. , García Lecumberri, M. L. , , Garcia, I. W. Gick, B. Giegerich, H. , , , Gómez Lacabex, E. G. Goodel, E. Graddol, D. , Greenberg, J. , , , , , , , , H Hancin-Bhatt, B. Hansen, J. G. Harris, J. , Hayes, B. , Hecht, B. F. Hoefnagel-Höhle, M. , Hooper, J. B. , , Hualde, J. I. Huddleston, R.
I Ioup, G. Iverson, G. , , ,
J James, A. Jang, S. , Jenkins, J. J. Johnson, S.
K Katamba, F. Koch, K. Koerich, R. D. ,
L Ladefoged, P. Leather, J. Lenneberg, E. Lindblom, B. , Logan, J. S. Long, M. H. Luce, P. A.
M MacKay, I. R. A. , , , Major, R. C. , , , , , , , , , , , , , , –, , Massini-Cagliari, G. , McRoberts, G. W. Meador, D. Morais, C. A. Morrison, G. S. Mulford, R. Munro, M. J. , , , , , , Murray, R. W. , , –
Author index
O Ogden, R. Ohnishi, M. , , P Pennington, M. , Piske, T. Pisoni, D. B. , Port, R. Pullum, G. K. R Rebello, J. T. , , , , , , , , , , – Ross, S.
S Saito, H. Schmidt, A. M. Schmidt, R. , Selkirk, E. O. Silva Filho, J. A. , , , Singleton, D. , Snow, C. , Staub, A. Stevens, K. N. Strange, W. , , , T Tarone, E. , Thornbury, S. Tropf, H. , , , ,
V Vennemann, T. , , –, ,
W Weinberger, S. H. , Werker, J. F. Wetzels, W. Wilson, I.
Y Yavas, M. , , , Young, R.
Subject index
A acoustic space affricate , – age , , – alveolar , , , alveopalatal – assimilation , , , , , , –, , , , , , B bilabial , , , C Categorial Discrimination Test cluster, initial /s/ , , , –, , –, cluster length , , , , , , , , , coda , , , –, , , , , , –, , , , consonantal strength , , , , Core Syllabification Principle (CSP) , –, , –, , , , critical period hypothesis , , D demisyllable –, E epenthesis –, , , , , , , , , equivalence classification , , , ,
F Feature Dispersion Principle (FDP) , , , foot , , final consonant –, , , – foreign accent , , , , , , , fricative –, –, , , ,
Markedness Differential Hypothesis (MDH) , , , , , , , , , , , mental representation , , metalinguistic awareness , , , , , metrical pattern , , , metrical status , , , ,
G glide , , , , , , , , , ,
N nasal , , , , –, , , , , Noticing Hypothesis ,
I implicational markedness implicational relationship , , , , Interlanguage Structural Conformity Hypothesis (SCH) –, , , , , , , , intonation , , L L1 category , , , , , , L1 influence , , , , , , labio-dental fricative lateral , lenition liquid , , , , , , , M markedness , , , , –, –, , , , –, –, , , , , –, , , , -, ,
O omission , onset , , –, –, , , –, , Ontogeny Model , – Ontogeny Phylogeny Model , , P paragoge , , , –, – Perceptual Assimilation Model (PAM) , , perceptual contrast perceptual distance , perceptual foreign accent , phonetic category , , , –, , , , , , , , , , , , , , phonetic system , plosives , prothesis , , , , –, –, , , , , , –, –, –, –
Subject index
R resyllabification , , , , , , , , , , , rhythm , , , stress-timed , , , syllable-timed , , , S schwa , , , sibilant , , , , , sonority –, –, –, –, , , , , , , , , , , –, , Sonority Cycle , , , , , , , – sonority hierarchy , , , sonority sequencing –, , , , , , , , –, , Sonority Sequencing Principle (SSP) , , , , , spectral cues , , –, , –, , Speech Learning Model (SLM) , , –, –, , , , , , , , , –, , , stress , , , , , , –, , , , –, – compound stress , , , , , ,
double stress , , end-stress , , – front-stress , – phrasal stress (PS) , –, – primary stress , , –, , , , , – Structural Conformity Hypothesis (SCH), see Interlanguage Structural Conformity Hypothesis (SCH) syllable margin –, –, , , , structure , , , , , , , , –, – Syllable Contact Law (SCL) , , –, Syllable Structure Condition (SSC) , , tonic syllable , , T transfer , , , –, , , , , , , , , , , , , U universal markedness , , , , universals , , , , , , , , ,
V VARBRUL , , , , velar , – voice voice onset time (VOT) , , voiced cluster , , , , , voiced obstruent , , , , –, , , , , , voiced stop , , , voiceless obstruent , , , , , , , , , , voiceless stop , , , , , , voicing –, –, , , , , , –, –, , , , , vowel discrimination , , , –, , , , distinction , , , , , duration –, – length , , , , , reduction , , –, – space , , , , , , , system , , , , , , , , , ,
In the series Studies in Bilingualism (SiBil) the following titles have been published thus far or are scheduled for publication: 32 Kondo-Brown, Kimi (ed.): Heritage Language Development. Focus on East Asian Immigrants. 2006. ix, 281 pp. 31 Baptista, Barbara O. and Michael Alan Watkins (eds.): English with a Latin Beat. Studies in Portuguese/ Spanish – English Interphonology. 2006. vi, 214 pp. 30 Pienemann, Manfred (ed.): Cross-Linguistic Aspects of Processability Theory. 2005. xiv, 303 pp. 29 Ayoun, Dalila and M. Rafael Salaberry (eds.): Tense and Aspect in Romance Languages. Theoretical and applied perspectives. 2005. x, 318 pp. 28 Schmid, Monika S., Barbara Köpke, Merel Keijzer and Lina Weilemar (eds.): First Language Attrition. Interdisciplinary perspectives on methodological issues. 2004. x, 378 pp. 27 Callahan, Laura: Spanish/English Codeswitching in a Written Corpus. 2004. viii, 183 pp. 26 Dimroth, Christine and Marianne Starren (eds.): Information Structure and the Dynamics of Language Acquisition. 2003. vi, 361 pp. 25 Piller, Ingrid: Bilingual Couples Talk. The discursive construction of hybridity. 2002. xii, 315 pp. 24 Schmid, Monika S.: First Language Attrition, Use and Maintenance. The case of German Jews in anglophone countries. 2002. xiv, 259 pp. (incl. CD-rom). 23 Verhoeven, Ludo and Sven Strömqvist (eds.): Narrative Development in a Multilingual Context. 2001. viii, 431 pp. 22 Salaberry, M. Rafael: The Development of Past Tense Morphology in L2 Spanish. 2001. xii, 211 pp. 21 Döpke, Susanne (ed.): Cross-Linguistic Structures in Simultaneous Bilingualism. 2001. x, 258 pp. 20 Poulisse, Nanda: Slips of the Tongue. Speech errors in first and second language production. 1999. xvi, 257 pp. 19 Amara, Muhammad Hasan: Politics and Sociolinguistic Reflexes. Palestinian border villages. 1999. xx, 261 pp. 18 Paradis, Michel: A Neurolinguistic Theory of Bilingualism. 2004. viii, 299 pp. 17 Ellis, Rod: Learning a Second Language through Interaction. 1999. x, 285 pp. 16 Huebner, Thom and Kathryn A. Davis (eds.): Sociopolitical Perspectives on Language Policy and Planning in the USA. With the assistance of Joseph Lo Bianco. 1999. xvi, 365 pp. 15 Pienemann, Manfred: Language Processing and Second Language Development. Processability theory. 1998. xviii, 367 pp. 14 Young, Richard and Agnes Weiyun He (eds.): Talking and Testing. Discourse approaches to the assessment of oral proficiency. 1998. x, 395 pp. 13 Holloway, Charles E.: Dialect Death. The case of Brule Spanish. 1997. x, 220 pp. 12 Halmari, Helena: Government and Codeswitching. Explaining American Finnish. 1997. xvi, 276 pp. 11 Becker, Angelika and Mary Carroll: The Acquisition of Spatial Relations in a Second Language. In cooperation with Jorge Giacobbe, Clive Perdue and Rémi Porquiez. 1997. xii, 212 pp. 10 Bayley, Robert and Dennis R. Preston (eds.): Second Language Acquisition and Linguistic Variation. 1996. xix, 317 pp. 9 Freed, Barbara F. (ed.): Second Language Acquisition in a Study Abroad Context. 1995. xiv, 345 pp. 8 Davis, Kathryn A.: Language Planning in Multilingual Contexts. Policies, communities, and schools in Luxembourg. 1994. xix, 220 pp. 7 Dietrich, Rainer, Wolfgang Klein and Colette Noyau: The Acquisition of Temporality in a Second Language. In cooperation with Josée Coenen, Beatriz Dorriots, Korrie van Helvert, Henriette Hendriks, Et-Tayeb Houdaïfa, Clive Perdue, Sören Sjöström, Marie-Thérèse Vasseur and Kaarlo Voionmaa. 1995. xii, 288 pp. 6 Schreuder, Robert and Bert Weltens (eds.): The Bilingual Lexicon. 1993. viii, 307 pp. 5 Klein, Wolfgang and Clive Perdue: Utterance Structure. Developing grammars again. In cooperation with Mary Carroll, Josée Coenen, José Deulofeu, Thom Huebner and Anne Trévise. 1992. xvi, 354 pp. 4 Paulston, Christina Bratt: Linguistic Minorities in Multilingual Settings. Implications for language policies. 1994. xi, 136 pp. 3 Döpke, Susanne: One Parent – One Language. An interactional approach. 1992. xviii, 213 pp. 2 Bot, Kees de, Ralph B. Ginsberg and Claire Kramsch (eds.): Foreign Language Research in CrossCultural Perspective. 1991. xii, 275 pp. 1 Fase, Willem, Koen Jaspaert and Sjaak Kroon (eds.): Maintenance and Loss of Minority Languages. 1992. xii, 403 pp.