English Pages 205 [216] Year 1982
English intonation from a Dutch point of view
Netherlands Phonetic Archives The Netherlands Phonetic Archives (NPA) are a modestly priced series of monographs or edited volumes of papers, reporting recent advances in the field of phonetics and experimental phonology. The archives address an audience of phoneticians, phonologists and psycholinguists. Editors: Marcel P.R. van den Broecke University of Utrecht
Vincent J. van Heuven University of Leyden
Nico Willems
English intonation from a Dutch point of view
1982 FORIS PUBLICATIONS
Dordrecht - Holland/Cinnaminson - U.S.A.
Published by: Foris Publications Holland P.O. Box 509 3300 AM Dordrecht, The Netherlands Sole distributor for the USA and Canada: Foris Publications U.S.A. P.O. Box C-50 Cinnaminson N.J. 08077 U.S.A.
ISBN 90 70176 73 4 (Bound) ISBN 90 70176 72 6 (Paper) © 1982 Foris Publications - Dordrecht. No part of this book may be translated or reproduced in any form, by print, photoprint, or any other means, without written permission from the publisher. Printed in the Netherlands by Intercontinental Graphics, H.I. Ambacht.
Acknowledgement
This research was funded by the Faculty of Arts of the University of Utrecht on the basis of a two and a half year contract. I would like to express my sincerest gratitude to all people who in some way or other contributed to this study. In particular, however, I would like to thank the following persons and institutions:
Prof. Dr. A. Cohen, for supporting and encouraging me during all stages of this study and for offering very helpful guidance and valuable suggestions.
Dr. M.P.R. van den Broecke, for many acute criticisms and essential suggestions in connection with all aspects of this study.
Prof. Dr. R. Collier, Prof. Dr. T.J.M. van Els, Dr. P. Groot, J. 't Hart and Drs. J. Posthumus for their assistance in various matters.
Ing. C.G. van den Berg, for his unfaltering enthusiasm and exceptional ability in setting up new equipment in less than no time and for writing special computer programmes.
Dr. V.J.J.P. van Heuven, for reading and skilfully criticizing this study at very short notice. I am very grateful to him for several substantial improvements.
Drs. J.R. de Pijper, for his friendship and useful discussions, notably during our joint experimental sessions in England and for a profound study of typically English institutions.
Drs. B.A.G. Elsendoorn, for his moral support, his reliable transport and his close editorial reading of the final manuscript.
Drs. J.C.T. Ringeling for his excellent performance as a hi-fi vocoder.
Drs. F. Staatsen and her colleagues for giving me the opportunity to make recordings of Dutch students.
All my colleagues at the Institute of Phonetics of the University of Utrecht for their friendship, their help and their 'most stimulating discussions'.
The Institute of Perception Research in Eindhoven for giving me the opportunity to use its technical equipment.
Mr. A.J.G. O'Connor, manager of the Medical Equipment Company Ltd. (M.E.L.) in Crawley, for his kind arrangements with respect to our experiments in England. Likewise Dr. C.J. Darwin of the Psychology Department of Sussex University at Brighton for his invaluable help in this matter.
Mr. and Mrs. Lennie (Brighton) and Ms. M.J.C. Burggraaff (London) for their hospitality.
Drs. C.A.G.M. Tempelaars for his comments on statistical matters.
All known (in particular Ron) and unknown subjects for enduring long experimental sessions. Their highly valued contribution was essential to this study.
And last but not least my wife Hetty for not complaining too much about my continual absence and for taking over all tasks for which I could not find the time.
Contents

1 DROP YOUR FOREIGN ACCENT
1.0 Introduction
1.1 The importance of intonation in speech
1.1.1 Intonation and prosodic continuity
1.1.2 Intonation and boundary marking
1.1.3 Intonation and pitch accents
1.1.4 Intonation and meaning
1.1.5 Concluding remarks
1.2 Principles of an experimental-phonetic approach
1.2.1 Conclusions
1.3 Aims of this study
1.4 Brief outline of the study

2 FROM TUNES TO PERCEPTUALLY RELEVANT PITCH MOVEMENTS
2.0 Introduction
2.1 The British English school
2.1.1 The tune approach
2.1.2 The tone group approach
2.2 The American English school
2.3 Two recent phonological approaches to intonation
2.3.1 The autosegmental theory
2.3.2 The metrical theory
2.4 The Dutch school
2.4.1 Declination
2.4.2 Perceptually relevant pitch movements
2.4.3 From blocks to pitch contours
2.4.4 Intonation patterns
2.4.5 Prospects for English intonation
2.5 A final note on the notational systems
2.6 Contrastive English-Dutch intonation studies
2.7 Concluding remarks

3 FIRST EXPLORATIONS
3.0 Introduction
3.1 Characteristics of the production of a non-native pronunciation of English intonation by Dutch speakers: First experiment
3.1.1 Introduction
3.1.2 Method
3.1.3 Results
3.1.4 Discussion
3.1.5 Conclusions
3.2 Discrimination of Dutch and English pitch contours: Second experiment
3.2.1 Introduction
3.2.2 Method
3.2.3 Results
3.2.4 Conclusions
3.2.5 Objective versus subjective analysis
3.2.6 Discussion
3.3 Concluding remarks

4 PERCEPTUAL TOLERANCES OF SOME PROPERTIES OF PITCH MOVEMENTS IN ENGLISH
4.0 Introduction
4.1 The experiment
4.1.1 Stimuli
4.1.2 Subjects
4.1.3 Procedure
4.2 Results
4.2.1 The method of successive interval scaling
4.2.2 Analysis of the results
4.2.3 Reliability of the subjects
4.2.4 Testing for order effects
4.2.5 Testing for differences between the two groups of judges
4.3 Conclusions
4.4 Discussion

5 A COMPARISON OF PITCH MOVEMENTS IN ENGLISH PRODUCED BY NATIVE SPEAKERS OF ENGLISH AND OF DUTCH
5.0 Introduction
5.1 Method
5.1.1 Test material
5.1.2 Informants
5.1.3 Procedure
5.2 Results
5.2.1 Categorization of instrumentally measured F0 curves
5.2.2 Categorization of configurations
5.2.3 Precursive rise
5.2.4 Pitch accent assignment
5.3 Discussion
5.3.1 A comparison of instrumentally measured F0 curves
5.3.2 Comparison of configurations
5.3.3 Precursive rise
5.4 Summary and conclusions

6 A PERCEPTUAL EVALUATION OF DEVIATIONS IN PITCH
6.0 Introduction
6.1 First experiment: the perceptual relevance of nine deviations
6.1.1 Introduction
6.1.2 Method
6.1.3 Results
6.1.4 Discussion
6.2 Second experiment: a continued search for deviations
6.2.2 Method
6.2.3 Results
6.2.4 Discussion
6.3 Third experiment: first attempts to evaluate 'duration'
6.3.1 Introduction
6.3.2 Method
6.3.3 Results
6.3.4 Conclusions
6.4 General discussion

7 PRELIMINARIES TO A COURSE IN ENGLISH INTONATION
7.0 Introduction
7.1 Method
7.1.1 Stimuli
7.1.2 Subjects and procedure
7.2 Results
7.3 Discussion

8 CONCLUDING CONSIDERATIONS AND SUGGESTIONS FOR APPLICATIONS
8.0 Introduction
8.1 General discussion of the results
8.1.1 Experimental techniques
8.1.2 Discussion of the results
8.2 Viewing the aims of the study
8.3 Suggestions for further research
8.3.1 A continued search for contrastive precepts
8.3.2 A melodic grammar for British English intonation
8.3.3 Rules for pitch accent assignment
8.3.4 The final result: an intonation course
8.4 Testing the intonation course
8.5 Suggestions for applications
8.6 Some theoretical implications of a descriptive model
8.7 Some preliminary pronunciation precepts
Postscript

APPENDICES
Appendix A: survey of stylized pitch contours (3)
Appendix B: averaged data of pitch movement on tonic (3)
Appendix C: raw frequency values Brighton-Crawley (4)
Appendix D: scale values Brighton-Crawley (4)
Appendix E: scale values combined groups (4)
Appendix F: dialogue used in the production test (5)
Appendix G: averaged values of production test (5)
Appendix H: scale values Brighton-Crawley (6)
Appendix I: scale values combined groups (6)
Appendix J: scale values second experiment (6)
Appendix K: frequency count second experiment (6)
Appendix L: survey of F0 curves of precepts test (7)
Appendix M: acceptability scores precepts test (7)

Summary

References
Chapter 1
Drop your foreign accent
1.0 Introduction
When English native speakers listen to a native speaker of Dutch who speaks English with a reasonable proficiency, they will generally perceive that his speech, apart from apparent mistakes, sounds somehow 'different' from the speech of a native speaker of English expressing himself in his native tongue. This difference is often referred to as 'foreign accent', although the notion accent is somewhat confusing, as it is also used to indicate salient points in speech. This 'foreign accent', or non-native pronunciation as we prefer to call it, will hamper a complete mastery of the foreign language, as may appear from a quotation from a Dutch remedial course in English, the title of which was borrowed as the heading for this chapter: "Maar wie een vreemdelingenaccent heeft, blijft voor altijd als onbeholpen gebrandmerkt" (anyone who displays a foreign accent will be stigmatized as clumsy forever; Nolst Trenite, revised by van Eyseren, 1967: p.4). As one of the objectives of Dutch foreign language teaching is to achieve as correct a pronunciation as possible, much attention has been and is still being paid to this aspect of language teaching. Until recently many assumed that this non-native pronunciation was primarily due to deviations at the segmental level, that is to say in the quality and occasionally in the duration of vowels and consonants, as becomes apparent from the afore-cited course: "De bewerker acht het van groot belang eerst de klinkers te leren beheersen en die daarna in het raam te zetten van de medeklinkers, die ook weer hun speciale aandacht vragen" (the reviser attaches great importance to first learning to master the vowels and then inserting these in the frame of the consonants, which in their turn require special attention; p.5).
Whereupon this textbook draws the students' attention to a multitude of differences in sounds between English and Dutch, and next maintains a stony silence. Unquestionably a command of the correct sounds is indispensable for a foreign learner, but this approach may lead a student to think that, if he masters these various sounds, he has a complete command of the pronunciation of the English language. Fortunately this course is not representative, since for a considerable time attention in teaching has also been given to other aspects of the target language, such as intonation, which is the subject of our study. As early as 1635 the Dutch linguist Montanus (in Scott Sheldon, 1976) mentioned the importance of intonation in the instruction of pronunciation. More recently O'Connor and Arnold (1973) state in their intonation course for British English that the use of a wrong tune in English can lead to misunderstandings and possible embarrassments. However, up to now the teaching of intonation has mostly been restricted to a repeated imitation of instances of intonation usage by native speakers of that particular language, as explicit descriptions and rules for the generation of acceptable intonation contours are still lacking for most languages. This so-called 'drill method' is properly characterized by O'Connor and Arnold as: "A repetition of the sound features of the language over and over again, correctly and systematically, until they can be said without any conscious thought at all, until the learner is incapable of saying them in any other way" (p.73). A major drawback of this method is that students are incapable of generating new instances not yet encountered, since they have no conscious knowledge of any underlying system or rules of the target language, nor for that matter of the system in their mother tongue, so that they are incapable of making comparisons between the two, which could help them to consciously produce an appropriate contour.
Instead of these drill methods it could be profitable to apply cognitive methods to the teaching of pronunciation in general and of intonation in particular. Such a method aims at making students
conscious of underlying structures of the target language by providing them with an explicit description of this particular aspect of the language, for instance in the form of an intonation course or grammar, and by simultaneously training them in analytic listening with respect to pitch phenomena. One of the prerequisites for such a method is the availability of a rule-based intonation course. Unfortunately the formulation of appropriate rules for the generation of acceptable intonation contours of one of the major languages of the world has not been achieved yet. This lack of rules is partly due to the low accessibility of this intricate component of speech for inspection or registration and partly to the usual linguistic classification of speech into phonemic entities, as a result of which properties of segments received most attention in the past. Gradually the tide is turning as, over the last few decades, new attention has been focussed on so-called prosodic components of speech, especially by experimentally oriented phoneticians. Before we go into this we will first give some definitions of the components of speech we will talk about. A message or idea a speaker wants to convey to a listener is transformed into a concrete realization (speech utterance) at the phonetic level as a result of higher order mental processes derived from the linguistic competence of the speaker. This utterance can be thought of as being made up of segmental components, generally referred to as phonemes, which constitute syllables and words, and a number of non-segmental components. The latter may be subdivided into prosodic components, such as intonation and duration, and paralinguistic components, such as voice quality. See figure 1.1. By prosodic components we understand those elements of speech which generally extend over more than one segment - hence they are also referred to as suprasegmental features - and which can only be described in relation to one another.
Thus it is meaningless to speak of a 'high' pitch, unless there is a low(er) pitch present to relate it to. The most important prosodic elements are considered to be temporal factors (duration, speech rate, rhythm), pitch phenomena (intonation) and intensity (loudness).
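The relational character of pitch can be illustrated with a small sketch. It expresses the distance between two fundamental-frequency values in semitones, a common logarithmic unit; the choice of this unit, the function name `semitone_interval` and the example frequencies are illustrative assumptions and are not taken from this chapter.

```python
import math


def semitone_interval(f0_hz, reference_hz):
    """Return the distance in semitones from reference_hz up to f0_hz.

    Hypothetical helper for illustration: a positive result means f0_hz
    is higher than the reference, a negative result means it is lower.
    """
    if f0_hz <= 0 or reference_hz <= 0:
        raise ValueError("frequencies must be positive")
    # One octave (a doubling of frequency) corresponds to 12 semitones.
    return 12.0 * math.log2(f0_hz / reference_hz)


# The same 150 Hz value counts as 'high' relative to a 100 Hz reference
# and as 'low' relative to a 200 Hz reference (assumed example values).
print(round(semitone_interval(150.0, 100.0), 2))
print(round(semitone_interval(150.0, 200.0), 2))
```

The sign of the result changes with the reference, which is the point made above: a pitch value is only 'high' or 'low' with respect to another value.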
Of the paralinguistic components we only mention voice quality, which is a particular property of the voice of a speaker resulting from the anatomical structure of the vocal organs and the articulatory habits of this specific speaker. We will return to this in section 1.1.1. Intonation, which is the subject of our study, can be regarded as a mental pattern at the abstract linguistic level. It is part of the linguistic competence of a speaker/listener and adds communicative properties to a message. On the concrete phonetic level we would like to restrict its definition to the pitch contour or speech melody of an utterance, reflected in the perception of the physical correlate of pitch: the fundamental frequency, which is generated by the vocal cords.
Figure 1.1: Schematic and simplified diagram of the status of the prosodic components of speech in the phonetic representation. (Diagram labels: message; mental conception; linguistic competence (lexicon, syntax, etc.); phonological rules; phonetic structure; phonemes, words, etc.; prosodic components (temporal structures, pitch contours, intensity); paralinguistic components (voice quality).)
Simultaneously with this aforementioned new interest in prosodic components in general and (English) intonation in particular one must recognize that, apart from differences in quality of segmental
features, deviations from the native norm in prosodic phenomena may also contribute appreciably to the perception of a non-native pronunciation. Consequently in recent courses in English which are widely used at Dutch secondary schools, one will come across as yet brief instructions to improve upon prosodic aspects of the target language. Generally the main emphasis lies on intonation again. Instructions are usually derived from a few well-known English pedagogical intonation courses, such as those by Halliday (1970) and O'Connor and Arnold (1973). Unfortunately these courses are, in their admirable effort to give a full account of English intonation, so complex, or rather detailed, that the problems of Dutch students grow rather than diminish. This is all the more so since a great diversity of highly complex notational systems, which will be discussed in chapter 2 of our study, is employed. However, their main deficiency lies in the fact that they are essentially based on impressionistic observations, which means that the descriptions are mostly compiled by virtue of the internal knowledge of the author(s) or perhaps based upon observations by others. This implies that experimental evidence is lacking. And, although listeners are mostly quite able to distinguish small differences in pitch, a correct analysis 'by ear' of a fundamental frequency contour is extremely difficult even for trained listeners. As a result internal inconsistencies can arise from this approach, such as O'Connor and Arnold's 'high fall', a pitch movement with a range from a 'high' via a 'mid' to a 'low' level, whereas their 'high rise' has a range from only 'mid' to 'high' level (cf. Crystal, 1969: p. 211). Apart from this it is not made clear how high, for instance, 'high' is, as quantitative data are lacking.
Furthermore the distribution of the intonational units, in most courses called tone groups, is often not clear from a phonetic viewpoint: subjects often have great difficulties in distinguishing between, for instance, realizations of tone 1 and 5 taken from the tape accompanying Halliday's intonation course, which are supposed to be separate tone groups (Collier, 1977). The conclusions of Breitenstein et al. (the Dutch Didactics Committee; 1966), which are based on the work of Kingdon (1958), demonstrate that the ear of the 'impressionistic' linguist is often fooled by the direction
of pitch movements. They claim that a rising pitch movement is more common in English than in Dutch, whereas, in contrast to this, phonetic experiments point out that English intonation is more justly characterized by falling pitch movements and Dutch intonation by rising pitch movements (cf. Cohen and 't Hart, 1972). Moreover, from recent experiments it appears as an even more serious drawback that the basic unit of these descriptions, the tone group, does not answer to its definition. Contrary to the general assumption that there is only one point of prominence in a tone group, the so-called tonic or nucleus, which carries the main pitch movement(s) in this group, Currie (1979) and Brown, Currie and Kenworthy (1980) have demonstrated that tone groups often include two equivalent prominent points. In their experiments even trained phoneticians, who were well acquainted with the notion 'tonic', disagreed widely on the position of the tonic in a tone group. How can we then expect phonetically naive Dutch students of English to be able to distinguish these basic units, used in most pedagogical descriptions, which is at present a prerequisite to be able to study English intonation? Over and above that, the definition of the tone group appears to stand in some relation with the abstract English metrical concept of the 'foot' (Abercrombie, 1967), of which every English native speaker will undoubtedly have an intuitive knowledge, but which is a vague and undefinable notion for Dutch students of English. All things considered we think that there are good reasons for trying to come to a preferably simple, systematic and experimentally verified intonation course, including an account of similarities and differences between English and Dutch intonation. The present investigation was started as a first attempt on the way to finally providing the means for a comparative study of British English and Dutch intonation for the benefit of Dutch students of English.
But as a preliminary to the outcome of this attempt a more general question will have to be answered first, viz. whether it is altogether worthwhile to learn to master the intonation system of a foreign language by looking into the importance of intonation for the perception of speech.
1.1 The importance of intonation in speech
In this section we will discuss contributions of prosodic cues in general and of intonation in particular to optimal speech perception. We will attempt to argue that incorrect intonation contours, such as may be produced by Dutch students of English, can have deleterious effects on the intelligibility and more specifically on the interpretation of the message by a native English listener. Traditionally most emphasis has always been put on distinctive linguistic functions of intonation. As it is difficult, if not impossible, to draw a sharp line between the various functional aspects of intonation, little agreement can be found in the literature about the terms in which these properties are to be described. To give a few examples: Halliday (1970) refers to the distinctive functions as being 'grammatical', O'Connor and Arnold (1973) speak of 'attitudinal', Ladd (1980) prefers 'pragmatic', Stockwell (1960) and Hirst (1977) regard these functions as being 'syntactic' and finally Pilch (1970) uses the term 'morpho-syntactic'. These functional approaches have led to a number of assumptions in the past, such as the coupling of a particular pitch movement to certain sentence types, for instance a final rise with question intonation, a final fall with declarative intonation, or the primacy of a sentence final intonation in general. Sufficient evidence has been found to argue that these distinctive functions of intonation are often only marginal. Although some contours which were used in their experiments were questionable, Hadding-Koch and Studdert-Kennedy (1964) demonstrated that listeners use the entire pitch contour rather than just the final pitch movement to make a distinction between questions and declarative sentences. Fries (1964) showed that, contrary to the general assumption, 62% of the yes-no questions in his - American English - corpus were realized with a final fall.
Also according to Crystal (1969), Cruttenden (1970), Collier (1972), Ladd (1980) and Lee (1980) the choice of a particular intonation pattern does not seem to be correlated with these widely accepted linguistic categories. It cannot be denied that particular pitch contours often tend to accompany specific
functional distinctions, for instance the before mentioned rise and a yes-no question, but the same movement or contour may also be used in quite different contexts so that it cannot exclusively specify this particular function. One of the reasons for the discrepancies in the descriptions and the terms used is that specific functions or rather meanings have been attributed to different forms of intonation (tone groups, tunes, etc.) before the systematic properties of these forms were understood. As long as the systematic properties of the pitch contours themselves have not been properly described this confusion of tongues will continue and there will not be much point in proposing distinctive functions or meanings. In view of this we will condense the role of intonation in speech as 'communicative' and only discuss those aspects of the communicative function for which some experimental evidence has been found in the physical or perceptual properties of the speech signal. First of all intonation or rather pitch contours appear to act as a 'carrier' of the speech signal by linking the consecutive elements of speech into a coherent stream, in this way providing speech with continuity. Secondly specific pitch movements can, but do not necessarily have to, provide additional cues for the detection of boundaries in the speech stream, that will often but not always coincide with syntactic boundaries. By additional we mean in coherence with other factors such as syntactic cues (word order), syllable lengthening, pauses, etc. Thirdly specific pitch movements have such qualities as to be able to highlight particular important points of information. To Bolinger (1958) we owe the notion 'pitch accents' for these particular pitch configurations. Finally we will mention some experiments on semantic or, as the authors concerned call it, attitudinal aspects of intonation.
1.1.1 Intonation and prosodic continuity
A well-known phonetic experiment with assembled speech will immediately clarify the importance of prosodic cues such as intonation for a coherent perception of speech signals: words originally
spoken in isolation by one speaker can be concatenated without pauses in such a way that a meaningful utterance is obtained. Listeners to this concatenated speech signal will have difficulties in understanding the words and they will get the impression that the words were spoken by different speakers or they may even perceive a different word order. If one inserts pauses and provides the sentence with an acceptable intonation contour (cf. Olive and Nakatani, 1974, and Young and Fallside, 1980) intelligibility will improve dramatically. This disruption of the speech signal, which is often referred to as auditory stream segregation, is not only due to interruptions in the prosodic components but also to other factors such as spectral continuity and amplitude. Experiments by Brokx (1979) suggest that the coherence in speech is determined by spectral continuity in intervals shorter than about 150 ms, whereas longer stretches of speech are perceived as coherent by the continuous prosodic components, notably the pitch contour. In a dichotic shadowing experiment Darwin (1975) showed that prosodic cues aid a listener in attending to one particular speaker and help him ignore other interfering sources. Listeners were asked to shadow dichotic and pairwise presented utterances of two read-out passages under four conditions: (a) a normal condition with the two original passages paired together, (b) a semantic change condition, in which one passage half-way switched to the second half of the other passage, (c) an intonation change condition and (d) a condition in which both semantics and intonation changed. Subjects were asked to shadow the speech signal on one ear. When the prosodic cues were switched to the other unattended ear listeners showed a strong tendency to switch along and to follow the wrong semantic information, whereas a switch in semantics only resulted in omission of words.
The possibility for a listener to concentrate on a particular speaker by means of coherent prosodic features is further reinforced by the paralinguistic phenomenon of voice quality (see figure 1.1): a listener is able to identify a particular speaker solely on the basis of the laryngeal fundamental frequency (Abberton and Fourcin, 1978). The authors argue that this faculty of identification brings about normalisation to a particular speaker, that is to say an adaptation to his specific pitch range, which enables a listener to categorize pitch contours relative to each other.
Finally we will mention an experiment by Wingfield (1975), the results of which support the view that prosodic components are probably the aspect of the speech waveform most resistant to noise conditions. He found in an experiment with highly time compressed, i.e. speeded-up speech, that, even when individual words became unrecognizable, prosodic features continued to be perceived in the same way as at normal speech rates. To summarize we may safely assume that one of the most important functions of intonation to the proper perception of speech is to act as a connecting element between separate units of speech. A listener gets an impression of listening to a coherent stream and is thereby able to direct his attention to one speaker in particular, even under noisy conditions, or when the speaker is surrounded by several persons speaking simultaneously.
1.1.2 Intonation and boundary marking
Some indications have been found (Darwin, 1975; Lehiste, Olive and Streeter, 1976; Streeter, 1978; O'Shaughnessy, 1979) that listeners use intonation information in making decisions about sentence structuring. Darwin suggests that a listener modifies his hypothesis while listening to a speech utterance, that is to say he backtracks throughout an as yet incomplete clause to delimit syntactic units. By virtue of the choice of specific pitch contours a speaker may be able to additionally mark important syntactic structures, generated with the help of his internal knowledge of language patterns. These contours provide the listener, who shares this knowledge, with sufficient acoustic cues to correctly interpret the sentence structure. An example of a boundary marker in Dutch intonation is a continuation movement, a non-prominence lending rise in pitch, which often marks the end of a syntactic phrase and simultaneously draws the listener's attention to the fact that part of the message is still to follow. See figure 1.2.
Figure 1.2: An example of a prosodic boundary marker in Dutch: a so-called continuation movement consisting of a non-prominence lending rise in pitch. (Pitch contour of the utterance 'Beter een vogel in de hand dan tien in de lucht', literally 'Better one bird in the hand than ten in the air'.)
In a dichotic listening experiment Wingfield (1975) constructed pairs of utterances with an identical word sequence, but with different positions of the major syntactic break. As an example he mentions: 'to avoid any attempts to influence voting, machines were installed,' and 'due to our new mayor's influence, voting machines were installed.' The underlined parts of the sentences were subsequently interchanged, which means that the intonation contours were in anomalous positions with respect to the syntactic boundaries. These sentences were, together with 'normal' examples, presented to listeners in one ear and would then be switched without warning to the other. Listeners were asked to repeat the whole sentence and to indicate the turn-over point. In the case of non-conflicting intonation, 93% correct judgments were made when the switch occurred at the major syntactic boundary and 59-62% correct when this was not the case. In the anomalous intonation there was no significant difference. However, localization of the turn-over point was significantly more accurate when the switch occurred at
the intonationally marked syntactic boundary, associated with the major syntactic break of the other version of a pair of utterances (77% correct). Wingfield concludes that prosodic cues - in this case mainly intonation - aid in marking syntactic boundaries. Although prosodic boundary markers tend to coincide with major syntactic boundaries a speaker is not obliged to mark a specific boundary. According to experiments for Dutch by Collier and 't Hart (1975) speakers have a certain amount of freedom in choosing which syntactic boundaries are to be intonationally marked and which are not. It is not clear yet on which grounds a choice is made, as boundary marking is a complex and still little understood process. Moreover, other interacting factors such as temporal phenomena in the form of a preceding syllable lengthening and an optional pause in the speech signal, play at least an equally important part. For instance Streeter (1978) has found evidence that both durational and pitch cues provide additive information for listeners to parse sentences. An extensive discussion of the problem of boundary marking can be found in Nooteboom, Brokx and de Rooij (1978).
1.1.3 Intonation and pitch accents
In languages like English and Dutch, pitch contours bring about accentuation in an utterance according to the phonological stress rules of the language by giving concrete form to abstract lexical word stresses. By assigning pitch accents to particular syllables a speaker provides a listener with acoustic cues which enable the listener to detect points of interest within the utterance. In this way a listener's attention is focussed on particular words the speaker wants to emphasize. These cues are generally (a) steep pitch movement(s) and a specific position of the pitch movement(s) with respect to the vowel onset (cf. Nooteboom et al., 1978). A speaker seems to be able to mark a predicted lexical accent by choosing a particular variant from the intonation patterns allowed in the language. For Dutch an exhaustive inventory has been made (Collier, 1972) of the characteristics of those pitch movements or
combinations thereof that can make a syllable accented. However, the main problem lies in the fact that we still do not know on which grounds a speaker does or does not assign pitch accents to particular words. In our final chapter we will come back to this issue. At any rate it seems likely that choosing an incorrect intonation pattern, caused by an insufficient knowledge of the intonation patterns of a target language, can result in (an) unintended prominent syllable(s), which may lead to a misinterpretation of the message.
1.1.4 Intonation and meaning
Although we have observed before that it is a hazardous task to differentiate the 'meanings' reflected by intonation into different kinds of 'meaning', that is to say, a semantic or pragmatic and an extrasemantic or attitudinal meaning, we will at this point adopt the terminology of the authors concerned. By pragmatic is generally meant the meaning conveyed by a particular choice of a pattern or tone group which affects the 'normal' semantic interpretation of the context, whereas the attitudinal aspects are mostly regarded as expressing the emotions of a speaker reflected in the choice of variants of a particular pattern or tone group. Despite the fact that not all linguists include attitudes in linguistic descriptions, on account of the unpredictability of this phenomenon, it is generally agreed that attitudes are reflected by intonation. As we are not aware of any major experimentally oriented approach aiming to establish the relation between intonation patterns and specific semantic meanings, we will only briefly refer to a few experimental studies in the past which had a bearing on 'attitudinal' aspects. Many authors have tried to establish relations between particular patterns and attitudinal meanings (for instance Uldall, 1960; Lieberman and Michaels, 1962; Hadding-Koch and Studdert-Kennedy, 1964; Crystal, 1969, and Urbain, 1972), but the results are in most cases discouraging, mainly on account of the difficulties involved in assessing satisfactory 'labels' for the categorization of the attitudes. As general agreement on this categorization is still
lacking, most labels cannot help but be highly subjective. Subjects only appear to reach reasonable agreement on the labels when a very limited number of very distinct meanings is used, as is the case in the experiment by Hadding-Koch and Studdert-Kennedy (1964). Individual subjects even appear to vary their interpretation of the same label within the course of an experiment. In view of the marginal and discouraging results it does not seem appropriate to deal with attitudes, or for that matter semantics, in our study, as a discussion of these aspects of intonation would only be highly speculative and would lead us far from our actual goal. Nevertheless we are aware that 'meaning' is an important aspect of intonation.
1.1.5 Concluding remarks
We have seen that evidence exists that intonation performs important functions in the correct perception of speech signals. On the one hand it secures a coherent stream in any speaker's output message, which enables a listener to concentrate on a particular speaker and to perceive a message under noisy conditions (prosodic continuity). On the other hand intonation provides the listener with additional means to parse a sentence into syntactic categories: major boundaries can be marked with particular pitch movements. Pitch accents, which highlight important words, enable him to extract important information about the message. Moreover it is certainly true that intonation has communicative values which can modify or add to information provided by a speaker by means of choice of words, word order etc. Insufficient knowledge of intonation patterns of a target language may lead to a wrong interpretation of the message on the part of the native speaker, if the chosen contour is incompatible with the underlying lexical meaning, for instance in the case of 'tags' produced by native speakers of Dutch. Consequently we feel that the teaching of correct intonation patterns to students of a foreign language may enhance intelligibility under specific conditions.
We think that Halliday (1970) is basically correct when he says: 'Naturally in speaking English it is an advantage to be able to produce the various pitch contours accurately and correctly. But this is not simply a matter of making oneself understood, or of sounding less 'foreign' to an English speaker. The importance of intonation is not so much that it is part of a good accent, or of the right way of speaking, although it is true, of course, that a good pronunciation always includes correct intonation as well as correct articulation and rhythm. The importance of intonation is also that it is a means of saying different things' (p.21).
1.2 Principles of an experimental-phonetic approach
In an extensive survey by Gibbon (1976) on the possibilities of combining phonetic descriptions of intonation with functional linguistic proposals or theories, under which head he also includes pedagogical courses, into an overall systematic description, the author concedes that this attempt does not yield fruitful results as yet. He concludes that this is caused by the fact that experimental phonetics is still a relatively young science and is thus still dealing with 'smaller details'. In our opinion his attempt is bound to fail as he decides to neglect in his study the wide range of results already available in the field of experimental phonetics. He admits however that 'the quantitative methods will undoubtedly provide the key to a good deal of progress in intonation research in the future' (p.288). We hold the view that before we can fruitfully tackle functional aspects of intonation in general and before we can ultimately attain a comprehensive systematic description of all intonational differences between English and Dutch in particular, we should start with a purely melodic, experimentally verifiable description of distinctions in permissible contours in English. However, we are aware that in investigations with a pedagogical goal such as ours functional aspects cannot be avoided at all times. For purposes of language teaching we believe that one of the best approaches to the study of intonation may be the one in which the directly observable physical data, the instrumentally extracted
fundamental frequency curves, are quantitatively analysed and described. Fundamental frequency measuring devices have improved to such an extent that results are quite reliable at present. However, if only fundamental frequency measurements are taken as input to the investigation of intonation, one is still faced with the problem of how to bridge the large gap between the unlimited number of detailed and in several respects redundant frequency curves at the acoustic level and the supposedly limited number of abstract intonation patterns at the competence level. See figure 1.1. A powerful means to accomplish this was the introduction of the experimentally verifiable notion of 'perceptual equivalence' in the study of Dutch intonation by Cohen and 't Hart (1967), which was based on an analysis-by-synthesis system used in intonation studies by Mattingly (1966) and Isacenko and Schädlich (1970). In an extensive investigation of Dutch intonation Collier (1972), 't Hart and Cohen (1973) and 't Hart and Collier (1975) demonstrated that it is possible to unravel the fundamental frequency curves, observable at the acoustic level, into two components: voluntarily produced perceptually relevant pitch movements which move up and down between a low and high line of slowly declining pitch - hence declination line - and involuntary minor pitch fluctuations known as microintonation. Perceptually relevant pitch movements should be conceived as representing the discrete fundamental frequency movements produced by the speaker, i.e. active commands to change the vocal cord frequency. They can be defined by the fact that they are experimentally verifiable in listening tests. If such a movement is left out of a pitch contour the perception of the pitch contour as a whole changes.
In contrast to this, the remaining minor pitch phenomena are not actively controlled by a speaker, but are the result of segmental influences and may be left out without consequences to the perception of the overall contour. The perceptually relevant movements, superimposed on the downward drifting declination line, can be combined to form complete pitch contours according to rules in an intonation grammar. If the positions of pitch accents are given, any utterance in Dutch can be provided with an appropriate pitch contour.
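The decomposition just described can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, a declination line that falls linearly on a semitone scale, straight-line pitch movements specified by onset, duration and excursion, and a microintonation term that is simply added on top. The function names and all numerical values (130 Hz starting pitch, -1.5 semitones per second, 6-semitone excursions) are hypothetical and are not taken from the studies cited.

```python
def declination_hz(t, start_hz=130.0, slope_st_per_s=-1.5):
    """Baseline pitch (Hz) at time t (s), declining linearly in semitones.

    start_hz and slope_st_per_s are illustrative assumptions.
    """
    return start_hz * 2 ** (slope_st_per_s * t / 12.0)


def movement_offset_st(t, movements):
    """Distance above the baseline (semitones) at time t.

    Each movement is (onset_s, duration_s, excursion_st); a rise has a
    positive excursion, a fall a negative one. Movements are straight lines.
    """
    total = 0.0
    for onset, duration, excursion in movements:
        if t <= onset:
            continue  # this movement has not started yet
        fraction = min((t - onset) / duration, 1.0)
        total += fraction * excursion
    return total


def contour_hz(t, movements, micro_st=0.0):
    """Fundamental frequency (Hz): baseline + movements + microintonation."""
    offset = movement_offset_st(t, movements) + micro_st
    return declination_hz(t) * 2 ** (offset / 12.0)


# Hypothetical contour: a 6-semitone rise followed later by a 6-semitone fall.
MOVEMENTS = [(0.2, 0.1, 6.0), (0.5, 0.1, -6.0)]

if __name__ == "__main__":
    for t in (0.0, 0.25, 0.35, 0.55, 1.0):
        print(f"t={t:.2f}s  f0={contour_hz(t, MOVEMENTS):.1f} Hz")
```

Calling `contour_hz` with `micro_st=0.0` gives the stylized contour without the minor fluctuations; removing an entry from `MOVEMENTS` changes the shape of the contour as a whole, which is the property the listening tests are meant to verify.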
Moreover, from recent experiments on English intonation by de Pijper (1979, 1980 and forthcoming) it has become clear that the method just discussed can be fruitfully applied to English intonation as well. Consequently, as this method provides an excellent means for verification of the perceptual relevance of observed differences between Dutch speakers of English and native English speakers by means of listening tests, we will make use of it in our experimental study. We will discuss this approach more extensively in chapter 2.
1.2.1 Conclusions
On the basis of the afore-mentioned considerations we will study in our investigation the result of the mental processes of the speaker at the acoustic phonetic level in the form of fundamental frequency curves, which (a) are readily accessible for investigation, (b) show a direct correlation with abstract intonation patterns through the intermediate step of perceptually relevant pitch movements and (c) are also the signals the listener has to direct his attention to in order to be able to interpret the communicative intonational aspects of the message. In order to obtain simple and clear-cut instructions for Dutch learners of English we will start at the 'lowest' level of description, viz. an instrumental analysis of fundamental frequency curves. These findings will in the course of our study be supplemented by perceptual evaluations by native speakers of English, excluding as much as possible the communicative functions of the material used.
1.3 Aims of this study
Using an experimental phonetic approach this study aims at developing precepts and combinatory rules for English intonation that are as simple and systematic as possible, which may eventually take the form
of a contrastive English-Dutch intonation course for the sake of Dutch students of English. The necessity for a simple description of intonational features is prompted by the fact that in the most common language learning situation, namely the classroom, only limited time can be spent on this particular aspect of pronunciation. In all probability lack of simplicity in existing English courses has greatly restricted the application of these courses in language teaching in the Netherlands. By systematic we understand a limited set of unambiguous precepts and rules that enable a Dutch student to provide the majority of English utterances with an acceptable intonation contour. At the same time these precepts should include descriptions of clear distinctions from and similarities with the native language's intonational system. We suggest that this set of precepts be used in a cognitive way, that is to say a student has to be made aware of his normally implicit knowledge of intonational features. This implies that it should be most efficient for a student to first of all become aware of his existing implicit knowledge of pitch phenomena in his native language and subsequently to become familiar with the as yet unknown patterns of the target language. We do not entertain illusions about presenting a cut and dried intonation course for Dutch students, but we are convinced that this objective can be achieved in the near future. The reasons for this assumption are on the one hand that an intonation course for foreign learners of Dutch has already been accomplished (Collier and 't Hart, 1981), on the other that it has been established that it is possible to work along the same lines with respect to English intonation (de Pijper, 1979, 1980 and forthcoming).
We will try to achieve our goal by combining instrumental measurements of fundamental frequency curves with perceptual evaluation experiments: by means of an analysis of comparable pitch movements produced by native speakers of Dutch and of English on identical English utterances we hope to trace apparently important differences in the speech melody. In subsequent perception experiments, in which we will use (re)synthesized and thus experimentally controllable pitch contours, we will try to establish the perceptual relevance of the deviations by presenting these utterances
with manipulated contours to native speakers of English for acceptability judgments. In this way we are able, by neglecting irrelevant details, to incorporate only those deviations from the English 'norm' in our inventory that contribute substantially to the perception of a non-native pronunciation. Summarizing, we will try to answer the following questions:
1) Which factors contribute most to the perception of a non-native 'accent' in English intonation by native speakers of English in utterances produced by native speakers of Dutch?
2) Is it possible to provide contrastive English-Dutch precepts in the form of specified properties of pitch movements which enable Dutch students of English to consciously generate appropriate English pitch contours, resulting from a process of becoming aware of the properties of the pitch contours in their native language?
1.4 Brief outline of the study
In the next chapter we will give a short survey of several approaches to the study of intonation in the literature, such as the level (mainly American) and the configuration (mainly British) approaches, with an emphasis on pedagogically oriented studies. The main points of two main phonological streams will be presented. A more extensive description will be given of previous work in the field of experimental phonetics. Finally a few contrastive English-Dutch studies will be discussed. In chapter three we will start the survey of our experimental work with an introduction to the research techniques used in our production experiments and we will describe two preliminary experiments: a first search for deviations at the production level and a small perception experiment, using normal speech, on the discriminability of Dutch and English pitch contours in general. In chapter four we will discuss the results of our efforts to establish perceptual tolerances of some properties of prominence lending pitch movements, notably magnitude of the excursion and position in the syllable, using synthesized speech.
Chapter five reports on our main production experiment, in which almost 1300 pitch movements or combinations of pitch movements, produced by native speakers of Dutch and of English, were analysed and compared. The sixth chapter makes use of the data from the preceding production experiment in a perceptual evaluation of what were found to be nine major deviations. In this experiment we manipulated the fundamental frequency of our test utterances by means of speech resynthesis systems. The techniques used will be described in short. Furthermore we will report on a second perception experiment intending to assess the importance of minor deviations. In a final pilot experiment the usefulness of spectrally rotated speech for experiments on deviations in durations of the pitch movements was put to the test. In chapter seven we conclude the description of our experimental work with a report of a perceptual estimation experiment on the potential effectiveness of a small number of preliminary precepts for the correction of pitch movements produced by native speakers of Dutch. In our eighth and final chapter we will attempt to relate the conclusions obtained from our experimental work to the problems stated in the introduction. We will examine relevant relationships with previous investigations in this field and we will consider the prospects for application of our findings in second-language teaching. This chapter will be concluded with a survey of preliminary precepts on behalf of Dutch students of English.
Chapter 2
From tunes to perceptually relevant pitch movements
2.0 Introduction
Most studies of English intonation in the past may in effect be divided into two geographically oriented main streams, namely a British and an American school. The British school is primarily characterized by its preoccupation with intonation courses on behalf of foreign learners, although some linguists such as Crystal (1969, 1975) and Halliday (1970) have set forth more elaborate theories. Representatives of this school above all try to establish relations between the meaning of a sentence and variations in intonation contours in the form of tunes or tone groups, on the strength of the linguist's impressionistic observations and intuitive knowledge of his native language. Generally the tunes or tone groups are subdivided according to sentence mode (question, statement, etc.). Initially the tune approach caught on. It described intonation in terms of overall units of a sentence and did not bother to determine the internal composition of a contour. Most approaches of this type are based on studies by Jones (1918) and an influential course by Armstrong and Ward (1926). In later years the tone group approach began to prevail in most British English studies, following the work of Palmer (1933). In the tone group approach the overall sentence contour is subdivided into a number of tone units, centred around a main or nuclear pitch movement on an accented syllable. In the British approach lexical stresses are considered to be effected by pitch accents, that is to say pitch movements having particular qualities. Consequently stress is not regarded as an independent component. Because of their pedagogical preoccupation most studies aim at an instantly usable description, so that they pay little attention to combining tone groups into larger units. As a result this approach cannot
effectively handle matters beyond the tone group. In contrast to this the American school is characterized by its emphasis on the necessity of obtaining an elaborate theory of intonation. It owes its reputation particularly to the work of Trager and Smith (1951), but the foundations were already laid by Bloomfield (1933) and Wells (1945). In analogy to the phonematic description of segmental speech sounds this approach tries to describe intonation in terms of phonemes (pitch movements) which constitute morphemes (intonation contour), as intonation carries meaning. In most descriptions the intonation contours are represented by four pitch levels. Stress is, contrary to the British school, supposed to be completely independent of pitch and is regarded as a function of intensity. It is likewise represented by four levels. This approach aims at describing all linguistically relevant distinctions which may be signalled by pitch levels. As a large number of recent theoretical studies of intonation are based on this level approach we will come back to this in section 2.3. Most intonation studies in the past have been performed by linguists who based their findings on perceptual impressions of their own or at best on direct observations of other people's speech. Accordingly the main objection against these studies is that their outcome is more often than not irreproducible, as a listener is easily misled by particular aspects of pitch phenomena, notably by the direction of pitch movements. As a response to this criticism a tendency has grown among linguists to study the fundamental frequency curves at the acoustic level with objective measuring devices (e.g. Lieberman, 1960; 1967; Ohala, 1978; O'Shaughnessy, 1979).
Although these instruments yield reliable results at present, this method of analysis has as a major disadvantage that the profusion of details makes it extremely difficult to isolate and describe linguistically relevant pitch phenomena. Since the first experiments by Cohen and 't Hart (1965, 1967) on Dutch intonation a 'Dutch school' has developed over the last twenty years that tries to get around the afore-mentioned problems by means of a method of analysis-by-synthesis. In this approach native listeners are asked to evaluate systematically manipulated artificial pitch contours, which are superimposed on resynthesized 'carrier phrases'. In this manner one is able to select and describe perceptually relevant pitch movements. The result, an inventory of relevant pitch movements of a language, has the advantage of being objective, reproducible and thus reliable, as it is not solely based on the internal knowledge of the investigator. At the same time the notational system used, that is to say combinations of straight lines representing the artificial pitch movements, is not interpretative but purely descriptive. It appeared to be possible to apply (variations of) this method to other languages as well, e.g. to American English (Maeda, 1976), to French (Delgutte, 1976) and to British English (de Pijper, 1979; 1980 and forthcoming).
In this chapter we shall briefly discuss some influential studies in the English and American tradition (including two recent phonological approaches) and give a more extensive survey of the Dutch school, as it is in many respects basic to our present study. We will include a description of some representative notational systems. Finally we will review a few existing contrastive studies of English and Dutch intonation.
2.1 The British English school
2.1.1 The tune approach
Pitch is described as an overall structure functioning at sentence or rather intonation group level. Two basic configurations, Tune I and II, are considered to be sufficient to adequately describe English intonation. Minor variations of these tunes are possible, depending on whether they are emphatically used or not. By means of emphasis a special emotional meaning in the speaker's mind is expressed, which implies that it can be thought of as being equivalent to the usual term 'attitude'. The best known representatives of this approach are Jones (1918) and Armstrong and Ward (1926). For pedagogical reasons a notational system is used consisting of discrete symbols, although Armstrong and Ward recognize that a pitch contour is in fact continuous: dots to mark the relative
pitch of an unstressed syllable, dashes for stressed syllables and curved lines to denote pitch falls or rises. Tune I essentially starts at a mid level and continues on this level or may rise on a stressed syllable and ends in a fall. It is mainly used for 'decided' statements, wh-questions, commands and exclamations. Tune II starts at either a mid or a high level and gradually falls but generally ends with a rise in pitch. It is used for yes-no questions, requests and sentences in which something is implied. Examples of these basic, i.e. unemphatic tunes are presented in figure 2.1. Compound sentences are considered to be divided in sense groups, which are identical to the usual sentence intonation groups, and consequently each can comprise a tune. Armstrong and Ward also mention a kind of 'reset', called 'change of key', in long sentences consisting of only one sense group, in that the pitch level is raised on an important word to prevent the final pitch of the group from becoming too low.
[Figure 2.1 examples. Tune I: They 'came to 'call 'yesterday 'after'noon. Tune II: I 'wish I could 'speak 'English like 'that.]
Figure 2.1: Examples of the two unemphatically used basic tunes quoted from the course by Armstrong and Ward (1926). Tune I: p.4; Tune II: p.22.
The main drawback of this system is firstly that no description is given of the internal components of a contour, and secondly that the description is simplified for pedagogical reasons to such an
extent that all more complicated movements are omitted, thus leaving out possibly important gradations of English intonational usage.
2.1.2 The tone group approach
As we have mentioned before, the tone group approach has largely superseded the tune approach and is still widely used at present in British English intonation courses for foreign learners. In this approach intonation contours are decomposed into a number of tone groups or tone units, each of which consists of one main pitch movement, the 'nucleus', which may be preceded by a '(pre)head' and optionally followed by a 'tail'. We will discuss this in more detail below. The heads and tails function as connecting elements between separate nuclei in larger stretches of speech. Originally four nuclear tones and three different heads were distinguished by Palmer (1922), which were combined and expanded to six tone groups (Palmer, 1933). The number of tone groups was further enlarged and the description of the internal components was further refined by a large number of researchers such as Jassem (1952), Kingdon (1958), Crystal (1969), Halliday (1970) and O'Connor and Arnold (1973). Jassem added four new nuclear tones and five more heads (or 'prenuclear tones' as he calls them) to Palmer's inventory. Kingdon closely followed Palmer's system, but introduced a further division of the head into 'prehead' and 'body'. Whereas Halliday still emphasized, as was the case with most other studies, the grammatical functions of intonation, and also presented one of the first attempts in the British tradition to place intonation into a more general linguistic theory, O'Connor and Arnold shifted their attention to the attitudinal aspects of intonation. As we will use examples of O'Connor and Arnold's course in our main experiments and as it is a fairly recent and elaborate description we will give a short survey of their terminology, which is broadly representative of most tone group studies as far as the description of the units is concerned.
O'Connor and Arnold distinguish ten tone groups. The stressed syllable of the last accented word in a tone group carries the main pitch movement ('nucleus'). All syllables following the nucleus, which are consequently unstressed, are called the tail. Seven nuclear tones are distinguished, which are optionally preceded by a (pre)head. The head starts with the stressed syllable of the first accented word and ends with the syllable preceding the nucleus. The prehead consists of any number of unstressed syllables before the beginning of the head. Each tone group encompasses a word group, which is rather vaguely defined as a group of words having some intonationally marked grammatical function within an utterance. Word group and thus tone group boundaries are indicated by a single bar (/) in their system, whereas sentences are separated by double bars (//). This grammatical relation is not further elucidated, so that it may be difficult to explain to foreign learners why in the following examples (figure 2.2) the first sentence comprises only one tone group, whereas the second is divided into two groups. The examples also present a small selection of the many tonal marks used in their detailed notational system.
[Figure 2.2 examples. 1: //I really don't see why you're so pessimistic// (one tone group). 2: //D'you seriously think / English will be a world language one day?// (two tone groups). The examples are annotated with the labels prehead, head, nucleus and tail. Glosses of the tonal marks shown: high falling to very low; relatively high level; very low rising to medium; high level pitch; varying from low to medium; (/) = tone group boundary; (//) = tone group boundary and pause.]
Figure 2.2: Two examples from the course by O'Connor and Arnold (1973) showing their division into tone groups and their notational system (top: p.284; bottom: p.277).
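The definitions of prehead, head, nucleus and tail given above amount to a simple segmentation rule, which can be sketched as follows. This is only an illustration of the definitions as summarized here: the `Syllable` type, the function name and the example utterance with its accent marking are hypothetical and are not taken from O'Connor and Arnold's course.

```python
from dataclasses import dataclass


@dataclass
class Syllable:
    text: str
    accented: bool = False  # True for the stressed syllable of an accented word


def split_tone_group(syllables):
    """Divide one tone group into prehead, head, nucleus and tail.

    nucleus: stressed syllable of the last accented word
    tail:    all syllables after the nucleus
    head:    from the stressed syllable of the first accented word
             up to the syllable preceding the nucleus
    prehead: any syllables before the head (or before the nucleus
             when there is only one accent and hence no head)
    """
    accents = [i for i, s in enumerate(syllables) if s.accented]
    if not accents:
        raise ValueError("a tone group needs at least one accented syllable")
    first, last = accents[0], accents[-1]
    return {
        "prehead": syllables[:first],
        "head": syllables[first:last],
        "nucleus": syllables[last],
        "tail": syllables[last + 1:],
    }


# Invented example with an assumed accent marking on 'chil', 'play' and 'gar'.
EXAMPLE = [
    Syllable("The"), Syllable("chil", True), Syllable("dren"),
    Syllable("were"), Syllable("play", True), Syllable("ing"),
    Syllable("in"), Syllable("the"), Syllable("gar", True), Syllable("den"),
]

if __name__ == "__main__":
    parts = split_tone_group(EXAMPLE)
    print("prehead:", [s.text for s in parts["prehead"]])
    print("head:   ", [s.text for s in parts["head"]])
    print("nucleus:", parts["nucleus"].text)
    print("tail:   ", [s.text for s in parts["tail"]])
```

The sketch takes the tone group boundaries as given. Deciding where those boundaries fall is exactly the step that, as noted above, the course leaves largely unexplained.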
For pedagogical reasons all tone groups are described in relation to five sentence types - statement, wh-question, yes-no question, command and interjection - although they rightly point out that no tone group is used exclusively with a particular sentence type and also that no sentence requires the use of a particular tone group. However, they maintain that certain tone groups are more likely to be used in connection with particular sentence types. In an attempt to give a full account of all the attitudinal meanings of intonation they distinguish no less than 160 largely overlapping labels, which makes it almost impossible to interpret the meanings of these labels. In conclusion we may safely assume that, apart from shortcomings already mentioned in chapter 1 (cf. Currie, 1979), the tone group approach is generally too complex for foreign learners, as a vast amount of implicit knowledge is presupposed. When appropriately used as a drill method, it will hardly incite a deeper conscious understanding of English intonation.
2.2 The American English school
Following suggestions by Bloomfield (1933), the American school attempts to apply a segmental phonemic analysis to the description of intonation. In this approach stress and pitch are considered to be independent components. Stress is regarded as a function of perceived loudness, which in its turn is a function of the intensity of the acoustic signal. As intonation carries meaning, intonation contours must be pitch morphemes and consequently the various pitch movements, which constitute a contour, must be pitch phonemes. Pitch phonemes also mark the end of a sentence. Bloomfield distinguishes five pitch phonemes and three stress phonemes. Wells (1945) proposes four pitch levels, which are represented by the numerals 1 for lowest to 4 for highest level. Independently Pike (1945) suggests basically the same description - although the value of the levels is reversed - but he adds two pause markers to the analysis: a tentative pause (/), indicating uncertainty or non-finality and a final pause (//), which generally
signals the end of a sentence. Examples of Pike's notational system are presented in figure 2.3.
[Figure 2.3 examples. Top: 'Two times three plus two is ten.', annotated with numbered pitch levels, stress marks and the pause markers (/) and (//). Bottom: 'English lessons are easy.//' in the notation for foreign learners. Legend: (°) = beginning of primary contour; (1) = highest level; (4) = lowest level.]
Figure 2.3: Example of Pike's notational system for indicating pitch and stress levels (top: p.33) and an example of his notational system used for foreign learners (bottom: p.123).
An influential description comes from Trager and Smith (1951). They too agree on four levels (phonemes) that can be modified by pitch variation features within any level (allophones), in order to be able to account for pitch glides. Trager and Smith state that the primary function of intonation is to divide the sentence into constituents (abstract phonemic words) and that each constituent is always represented by a prosodic pattern (phonemic clause: pitch levels and a terminal juncture) that can be objectively established in the acoustic signal. Bolinger (1951) criticizes the level approach by arguing that the pitch levels are indistinguishable as they are relative and may overlap: a pitch morpheme, consisting of four different pitch levels, may be shifted in frequency range as long as the internal ratios are undisturbed. In this way a given absolute pitch can correspond to any of the four pitch phonemes, implying that a sequence of 1-2-3 would be identical to 2-3-4; 1-2-1 to 2-3-2 etc. Furthermore a pitch morpheme consisting of only one phoneme could be represented by any of the four levels. In a number of informal experiments Bolinger showed that subjects correctly identified instances
with dissimilar pitch configurations without regard for any supposedly identical pitch level sequence and also equated curves that had the same configurations but were different with respect to levels. The notational system according to Trager and Smith was tested with regard to its practical reliability in a by now classical experiment by Lieberman (1965). He asked two well-known linguists who were quite familiar with the system to describe a number of differently intoned sentences in terms of four pitch and four intensity levels. Their descriptions appeared to be different in 60% of the cases. Lieberman concluded that they imposed the Trager and Smith system onto the material. This conclusion was further corroborated by a second experiment in which delexicalized material was used. Both subjects were unable to describe more than two stress levels. Furthermore one of the subjects appeared to be far more consistent when using a configurational system instead of the level system. For a more extensive discussion of this levels versus configurations dispute the reader is referred to Ladd (1978, 1980).
2.3 Two recent phonological approaches to intonation
Following to a large extent the discrete Trager and Smith approach a large number of linguists, e.g. Stockwell (1960), Lieberman (1967), Chomsky and Halle (1968) - although mainly dealing with stress -, and Halle and Keyser (1971), have applied generative transformational principles to the study of intonation, mostly in trying to establish relations between intonation and syntax. As a survey of all these theoretical approaches would be far beyond the scope of our phonetically oriented study we will only give a survey of two recent theories. As the sound rules of a language are incorporated in the phonological component, intonation rules are generally included in phonological descriptions of languages. At present two main phonological streams can be distinguished that seem to be related to one another and that both try to represent intonation by units larger than segments: the autosegmental and the metrical theory. As the
metrical theory seems to be the more influential of the two we will pay more attention to this theory than to the autosegmental description of intonation.
2.3.1 The autosegmental theory
The autosegmental theory originates from the study of a number of African tone languages by Leben (1976) and Goldsmith (1976). They propose an independent level of suprasegmental features which can be segmented in their own right into a discrete number of segments, hence 'autosegmental'. These discrete 'segments' at the suprasegmental level run simultaneously in time with the segmental or phonemic level. Both levels are coordinated according to well-formedness conditions by the following association rules: 1) All vowels are associated with at least one tone; 2) All tones are associated with at least one vowel; 3) Association lines do not cross, in other words tune and text are directly related. The suprasegmental pitch 'contours' are represented in terms of a number of separate discrete pitch levels with one prominent member. Levine (1979) proposes an additional tone group hypothesis to restrict the number of possible intonation contours, as the autosegmental system allows more possible contours in a language than it needs to use. In English a sequence of an optional mid (M) followed by a high (H) and a low (L) level, (M) H L, generally used for declarative utterances, would then become a tone group. The concrete pitch contour is derived according to a principle of hierarchically organized cycles, in the first of which a concrete pitch level is assigned to each word based on its position in the hierarchy of abstract phonological tones. In the second step the concrete pitch levels are modified to particular pitch contours according to the previously observed lexical word pitches in the first cycle.
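The three association rules can be made concrete with a small check. The following Python sketch is only an illustration, not part of the autosegmental literature: the function name `is_well_formed` and the representation of association lines as (vowel index, tone index) pairs are assumptions made here for the purpose of the example.

```python
from typing import List, Tuple


def is_well_formed(n_vowels: int, n_tones: int,
                   links: List[Tuple[int, int]]) -> bool:
    """Check the three association rules for a set of association lines.

    Vowels and tones are numbered from 0 in their order of occurrence;
    each link is a hypothetical (vowel_index, tone_index) pair.
    """
    # Rule 1: every vowel is associated with at least one tone.
    if {v for v, _ in links} != set(range(n_vowels)):
        return False
    # Rule 2: every tone is associated with at least one vowel.
    if {t for _, t in links} != set(range(n_tones)):
        return False
    # Rule 3: association lines do not cross, i.e. an earlier vowel
    # may never be linked to a later tone than a later vowel is.
    for v1, t1 in links:
        for v2, t2 in links:
            if v1 < v2 and t1 > t2:
                return False
    return True


# Three vowels, two tones; the second tone spreads over two vowels.
print(is_well_formed(3, 2, [(0, 0), (1, 1), (2, 1)]))
# Two vowels, two tones, but the lines cross.
print(is_well_formed(2, 2, [(0, 1), (1, 0)]))
```

The first call satisfies all three rules; the second fails rule 3 because the association lines cross.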
2.3.2 The metrical theory
Although the idea was not entirely new, Liberman (unpubl. 1975, publ. 1978) and Liberman and Prince (1977) claim that all sequentially ordered human behaviour is metrically organized. Stress (a property of the text) and intonation, which following the American school are considered to be independent, are related to rhythm. This idea corresponds well with the use of the rhythmic 'foot' in most definitions of tone groups in the British tradition. On the basis of an analysis of vocative chants Liberman proposes completely independent levels of 'metrical patterns', i.e. patterns which impose metrical structure on complex events. Contrary to Goldsmith's view, however, the tonal level (tune) and the toneless linguistic level (text) are not matched in time by association rules, but are both independently related to this metrical pattern and only achieve an association with each other through this intermediary metrical structure. Lexical entries in English are not specified for tonal features, but tonal features have independent identity. The syntactic structure of a sentence (text) is mapped onto a hierarchically organized rhythmic structure of weak and strong pairs of stress levels, represented by a binary branching tree with relational node labels, S (strong) and W (weak). Trees are rooted (R), i.e. any tree is always a single constituent at the highest level, and oriented, that is constituents are ordered. See figure 2.4(a) for an example. As the tune of the vocative chant is represented by (L) H M and as the H(igh) tone always falls on the downbeat of the chant a tonal tree will become as shown in figure 2.4(b). For the downbeat Liberman gives the following rule: 'the downbeat or designated terminal element of a metrical node N is that terminal element dominated by N which is reached by a path starting from N that intersects no nodes labelled W' (p. 43). The underlying segments of tonal representations are static tones.
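The designated terminal element rule amounts to a simple tree walk: starting from a node N, always descend into the daughter that is not labelled W. The Python sketch below illustrates this under stated assumptions: the `Node` class and the function name are introduced here for illustration only, and the shape of the tree for 'Oh, Alicia' is an assumed one, chosen so that it agrees with the outcome described in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                      # "R" (root), "S" (strong) or "W" (weak)
    syllable: Optional[str] = None  # filled in for terminal nodes only
    children: List["Node"] = field(default_factory=list)


def designated_terminal(node: Node) -> Node:
    """Follow the path from `node` that intersects no node labelled W."""
    while node.children:
        strong = [child for child in node.children if child.label != "W"]
        if len(strong) != 1:
            raise ValueError("each branching node needs exactly one strong daughter")
        node = strong[0]
    return node


# Assumed metrical tree for 'Oh, Alicia' (not copied from figure 2.4).
tree = Node("R", children=[
    Node("W", "Oh"),
    Node("S", children=[
        Node("W", "A"),
        Node("S", children=[Node("S", "li"), Node("W", "cia")]),
    ]),
])

print(designated_terminal(tree).syllable)
```

With this assumed tree the walk ends on the syllable 'li', the syllable that carries the H(igh) tone of the chant.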
When the metrical patterns of tune and text are superimposed the tonal tree coincides with the encircled nodes of the textual tree: the L(ow) tone is superimposed on the syllable 'Oh', the M(id) tone is superimposed on 'cia' and the H(igh) tone on 'li', in accordance with the designated terminal element rule. By means of an assimilation or 'tone spreading' rule the initial syllable 'A' of 'Alicia' will be associated with the L(ow) pitch, yielding the correct tune. As a combination of two binary tone features [± High] and [± Low] results in four distinct tones in the phonology for American English the derived contour will ultimately become as shown in figure 2.5.

Figure 2.4: (a) Example of a hierarchically organized rhythmic structure of weak and strong pairs of stress levels: a textual tree. (b) When the metrical patterns of tune (vocative chant ((L) H M)) and text ('Oh, Alicia') are superimposed, coinciding with the encircled nodes of the textual tree, a tonal tree results.

Figure 2.5: Tonal features are assigned to the segments through the metrical pattern yielding a 'call contour'.
Contours are considered to be functionally distinctive. Liberman mentions the aforementioned 'call contour', a 'declarative contour', a 'contradiction contour' and a 'surprise-redundancy' contour. Selkirk (1981) proposes a hierarchically organized prosodic structure consisting of prosodic categories analogous to syntactic categories. See figure 2.6.
Figure 2.6: A hierarchically organized prosodic tree as proposed by Selkirk (1981) (example: 'Madame Tristan').
The topmost category in an underlying phonetic representation is an (U)tterance, consisting of a sequence of prosodic words, which stands in some relation to the terminal strings of the syntactic representation via intermediate well-formedness conditions. The utterance is parsed into one or more (I)ntonational phrases, each of which is a unit with at most one nuclear tone associated with its primary stressed element. Generally the rightmost intonational phrase will be (s)trong to reflect the greater pitch prominence of the sentence final nuclear tone. This intonational phrase is parsed into one or more (P)honological phrases in order to be able to express stress prominences within the intonational phrase. The rightmost of the phonological phrases will in general be strong, consequently the main stressed word of the strong phonological phrase bears the nuclear accent. Each phrase consists of a (number of) prosodic (W)ords, which in their turn consist of a (number of) stressed (F)eet. (S)yllables are the lowest category.
The relations between prosodic categories are subject to two types of well-formedness conditions: principles of structure and principles of prominence. An example of the former is that for instance a prosodic word, in English, groups feet in a right branching structure. The principle of prominence determines the relations among the nodes of the prosodic tree, e.g. in English a righthand node is labelled 's' if and only if it branches. There is no one-to-one relationship between the prosodic structure and the syntactic structure, firstly as the categories are not the same, secondly as the strong-weak relations do not exist in syntax and thirdly as there is no correspondence between the constituents of the syntax and those of the phonology. Following Liberman (1975) Selkirk too argues that prosodic structure mediates between syntax and phonetic realization. Although it is unlikely that all elements of the metrical structure have a correlate in the acoustic signal at the phonetic level it could be rewarding to try to relate these phonological principles to the phonetic structure. This additional step has been made by Pierrehumbert (1981), who tries to convert her underlying phonological descriptions into auditorily testable utterances by means of synthesis by rule. Her immediate goal is to find rules for improving upon the naturalness and the intelligibility of a text-to-speech system and ultimately she hopes to develop a full model of English intonation. In short her computer implemented rule system works as follows: the stress contour for a phrase is computed by constructing a syntactic tree for it. Strong stress is assigned to the righthand node of each pair of sister nodes by Chomsky and Halle's (1968) Nuclear Stress Rule; the left node is automatically weak. Focus may override these predictions. Absolute prominence is assigned to each node by counting up the total number of weak nodes dominating the words (unstressed function words are cliticized).
An example is presented in figure 2.7, which yields a sequence 2-1-1-0 (a*). By adding '1' to each node a '1' becomes the highest stress, yielding 3-2-2-1 (b*). This sequence is not equal to the outcome of Chomsky and Halle's rule as Pierrehumbert uses a Liberman and Prince 'strong-weak alternation' modification of this rule.
Figure 2.7: An example of stress assignment in Pierrehumbert's (1980) rule system for the phrase 'The region's weather was unusually dry' (a* and b*: see text).
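The counting step described above can be sketched in a few lines of Python. This is an illustration only: the nested-tuple representation, the function name `weak_counts` and the exact bracketing of the phrase are assumptions made here (the bracketing is a reading of figure 2.7, with the unstressed function words cliticized onto their neighbours), not a rendering of Pierrehumbert's program.

```python
def weak_counts(node, weak_above=0):
    """Return (word, count) pairs, where count is the number of nodes
    labelled W on the path from the root down to and including the word."""
    label, payload = node
    count = weak_above + (1 if label == "W" else 0)
    if isinstance(payload, str):          # terminal node: a (cliticized) word
        return [(payload, count)]
    result = []
    for child in payload:                 # branching node: recurse into daughters
        result.extend(weak_counts(child, count))
    return result


# Assumed tree: the right sister of every pair is strong (S), the left weak (W).
tree = ("R", [
    ("W", [("W", "the region's"), ("S", "weather")]),
    ("S", [("W", "was unusually"), ("S", "dry")]),
])

levels_a = [count for _, count in weak_counts(tree)]   # sequence (a*)
levels_b = [count + 1 for count in levels_a]           # sequence (b*)
print(levels_a, levels_b)
```

Under these assumptions the sketch reproduces the sequences quoted in the text, 2-1-1-0 for (a*) and 3-2-2-1 for (b*).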
For the assignment of pitch two levels are used: a high topline and a low baseline. In a neutral declarative utterance a '1' stress gets a target at the topline, a '2' stress a factor 0.7 down the topline and a '3' stress a factor 0.4 down the topline. Each (level) target has a duration of 60 ms. The last accent is the nuclear accent and is characterized by an early steep fall with a duration of 200 ms, which can continue below the low baseline. Targets are connected by means of 'sagging rules', which depend on the difference in height of the targets and the time in between two targets. The resulting contour is shown in figure 2.8. Contrary to Liberman she does not consider stress to be an independent component: in her rule system stresses are realized as pitch accents by assigning to them a target Fo peak. Moreover it may be beneficial for the progress of the investigation of intonation that she tries to relate her phonological theory with concrete phonetic realizations by providing utterances with actually derived pitch contours. Nevertheless her description would greatly gain value if she also included the final step, viz. a perceptual evaluation of her resynthesized utterances by a representative number of (American) English native speakers.
Figure 2.8: An example of a derived pitch contour in Pierrehumbert's rule system (p.988), for the utterance 'In november the region's weather was unusually dry'.
2.4 The Dutch school
This approach to intonation is based on the assumption that the intonation contours of a language are reflected in a limited number of recurrent discrete pitch movements, which are actively controlled by a speaker of that language. These pitch movements are interpreted as relevant by a listener and are therefore called the perceptually relevant pitch movements of that language. As yet this approach does not account for the distinctive linguistic functions of these pitch movements. The perceptual relevance of the pitch movements can be experimentally verified in listening tests, which means that this approach yields reproducible and thus reliable results. This method was first applied to Dutch in the form of a perceptual analysis of pitch contours by 't Hart and Cohen (1964) and Cohen and 't Hart (1965): initially portions of 30 ms of speech were gated out and the perceived pitch was matched with an adjustable and easily measurable signal of a tone generator (analytic listening through pitch matching). In this way problems with the instrumental analysis of fundamental frequency or with the deceptive analysis of pitch impressions in running speech were avoided. Its main disadvantage was the time consuming procedure. Subsequent improvements of the apparatus led to the development of an 'Intonator' (Cooper, 1962; Willems, 1966), which made it possible to provide entire sentences with a synthetic pitch contour that could be varied at will (analysis-by-synthesis). A large number of experiments have led to a grammar for Dutch intonation, which can provide any Dutch utterance with an acceptable intonation contour, i.e. established as acceptable in listening experiments (synthesis-by-rule). We will discuss this approach in more detail, as many aspects of it will be applied in our own study.
2.4.1 Declination
Analysis of Dutch utterances by means of the technique of pitch matching already showed that overall pitch generally declines towards the end of an utterance. In resynthesizing the utterances by means of an Intonator, it appeared to be necessary from the point of view of naturalness to superimpose the relevant pitch movements on a baseline of gradually declining pitch according to a fixed time dependent formula: the declination line. See figure 2.9. This formula is an adaptation (i.e. the linear scale was converted to a logarithmic scale) by 't Hart (in: 't Hart, Nooteboom, Vogten and Willems, 1982) of an original formula by Maeda (1976). This declination line cannot be left out without serious perceptual consequences and should accordingly be regarded as a perceptually relevant pitch movement. Some evidence has been provided by Collier (1975b) that this downdrift is a function of physiological processes, such as the decrease of subglottal air pressure. Declination appeared not to be an exclusive phenomenon of Dutch intonation. Bolinger (1964) even argued that this could be a language universal. Mattingly (1966) stated that his programme for synthesizing pitch contours of British English sentences required a gradual fall in pitch which was dependent on the particular syllabic nucleus involved. He remarks however, that a fixed slope might be just as effective. Bolinger's observation is further substantiated in experiments by Vaissière (1971) for French and Maeda (1976) for American English. Maeda suggests that declination is due to tracheal pull, i.e. as a consequence of decreasing lung volume the larynx is progressively lowered, which causes a gradual lowering of pitch on account of the correlation between pitch and larynx height.

Figure 2.9: Relevant pitch movements in Dutch move up and down a low and a high declination line, as a function of time. Formula:

\[ D = \frac{-1}{0.13 + 0.09\,t} \qquad \text{for } t \le 4.82\ \text{s} \]
\[ D = \frac{-1}{0.117\,t} \qquad \text{for } t > 4.82\ \text{s} \]

D: declination in semitones per second. t: time in seconds. During pauses > 250 ms the declination should be interrupted. For a definition of semitones see chapter 3, section 3.1.3.
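As a minimal sketch, the declination formula of figure 2.9 can be written as a small Python function. The function name `declination_rate` is introduced here for illustration, and the coefficient of the second branch is read as 0.117, an assumption that is supported by the fact that the two branches then meet at t = 4.82 s. The interruption of declination during pauses longer than 250 ms is not modelled.

```python
def declination_rate(t: float) -> float:
    """Declination D in semitones per second for a time t in seconds.

    Two-branch formula of figure 2.9; the coefficient 0.117 in the
    second branch is an assumed reading of the printed formula.
    """
    if t < 0:
        raise ValueError("t must be non-negative")
    if t <= 4.82:
        return -1.0 / (0.13 + 0.09 * t)
    return -1.0 / (0.117 * t)


# Rates become less steep as t grows; both branches give nearly the
# same value around the switching point t = 4.82 s.
for t in (1.0, 2.0, 4.82, 4.83, 8.0):
    print(t, round(declination_rate(t), 2))
```

The rate is always negative (pitch falls) and its magnitude decreases monotonically with t, so larger values of t correspond to a shallower declination line.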
Ohala (1978) argues that declination is a function of active laryngeal control by the speaker of vocal cord tension and that it serves to signal clause and sentence boundaries. His view is based on experiments which showed the rate of pitch decrement to be inversely proportional to the length of an utterance, thus making it very unlikely for this downdrift in pitch to be a purely physiological effect. Pierrehumbert (1979) concludes from her experiments that listeners' judgments reflect normalization for expected declination: subjects made a greater correction for wide range stimuli than for narrow ones and the slope of expected declination was less for longer stimuli than for shorter ones. She suggests that apart from physiological processes also mental representations play a role in the assignment of syntactic boundaries. Most of the above mentioned experiments favour the argument that declination is at least partly purposefully controlled by a speaker and that declination is a perceptually relevant pitch movement.
2.4.2 Perceptually relevant pitch movements
As has been mentioned in the introduction relevant pitch movements are discrete events in the quasi-continuous pitch contours as a result of corresponding commands to the vocal cords on the part of a speaker. These discrete pitch movements, which are superimposed on a declination line, are perceived as relevant by a listener. They can be retrieved by listening experiments involving resynthesized speech, in which the original fundamental frequency curve is replaced by artificially generated pitch movements - represented by straight lines - which form a melodic copy of the detailed fundamental frequency curve and evoke the same pitch impression. See figure 2.10. Moreover it proved to be possible to generate acceptable Dutch intonation contours by standardizing the dimensions such as magnitude of the excursion, slope and duration, so that only a limited number of these movements is needed to account for practically all pitch movements in Dutch. Altogether five different rises and five falls, which move up and down between a high and a low declination line, appeared to be sufficient to describe Dutch intonation. Collier (1975b) has provided some evidence that the fundamental frequency is actively controlled by actions of the cricothyroid muscle. Ohala (1978) mentions a great number of electromyographic studies of laryngeal muscles in a variety of languages which show that they play a central part in effecting pitch changes. He states that evidence exists that the cricothyroid muscle evokes rises in pitch, whereas the sternohyoid and sternothyroid were most active during pitch lowering. These experiments give some physiological support to the notion of perceptually relevant pitch movements, i.e. pitch movements which are actively controlled by the speaker. The remaining involuntary perturbations of the fundamental frequency curve are caused by segmental factors such as the intrinsic pitch of a particular vowel or consonant and coarticulatory variations (Lehiste and Peterson, 1961; Slis and Cohen, 1969; Lehiste, 1976 and Di Cristo and Chafcouloff, 1978). These so-called microintonational phenomena may be left out of the description without perceptual consequences.

Figure 2.10: Example of a fundamental frequency curve with stylized pitch contour that evokes the same pitch impression (utterance: 'We gaan vanavond in ieder geval naar de schouwburg').
2.4.3 From blocks to pitch contours
Relevant pitch movements can be combined to so-called intonational 'blocks'. A combination of one or more blocks in its turn constitutes a pitch contour following the sequence: ((P) C) (P) E. Each contour consists of at least one (E)nd-block, which can be either a contour on its own or be the end of a contour. This obligatory End-block can optionally be preceded by one or more (C)ontinuation-blocks, which generally mark a syntactic boundary. Both the End and the Continuation-block may be preceded by a (P)refix-block. With blocks combined according to specific rules in the form of a grammar (cf. Collier, 1972 and 't Hart and Collier, 1975), 94% of all utterances in their corpus could be accounted for. See figure 2.11. The resulting pitch contour represents the original complete sentence melody and is thus perceptually equivalent to the original fundamental frequency curve.
Figure 2.11: Transitional probability scheme for the intonation contours of Dutch (from 't Hart and Collier, 1975: p.249).
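The block sequence can be illustrated with a regular expression. The sketch below is a simplification made for the sake of the example: it encodes a contour as a string of the letters P, C and E (an encoding assumed here), and it reads the optional Continuation part as repeatable, in line with the statement that the End-block may be preceded by one or more Continuation-blocks. It does not capture the transitional probabilities of figure 2.11 or the internal make-up of the blocks.

```python
import re

# ((P) C)* (P) E: any number of optionally prefixed Continuation-blocks,
# followed by one optionally prefixed, obligatory End-block.
CONTOUR = re.compile(r"(?:P?C)*P?E")


def is_valid_contour(blocks: str) -> bool:
    """Return True if the block string is a complete contour."""
    return CONTOUR.fullmatch(blocks) is not None


for candidate in ("E", "PE", "CPE", "PCPCE", "PC", "EE", "PPE"):
    print(candidate, is_valid_contour(candidate))
```

Strings such as 'E', 'PE' and 'PCPCE' are accepted; 'PC' is rejected because the obligatory End-block is missing, and 'PPE' because a block can carry only one Prefix-block.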
2.4.4 Intonation patterns
By means of experiments in which subjects had to sort out contours into categories, Collier and 't Hart (1970) and Collier (1975a, 1977) demonstrated that subjects combine contours belonging to the same intonational categories, in spite of minor differences. In this way they showed a relationship to exist between pitch contours and abstract intonation patterns. It will be evident that it becomes possible by means of the concept 'perceptually relevant pitch movement' to relate the capricious fundamental frequency curves in the concrete acoustic speechwave to abstract intonation patterns, which are part of the linguistic competence of a speaker-hearer. See figure 2.12.
Figure 2.12: Perceptually relevant pitch movements, which can be combined to pitch contours by means of intonation rules (grammar) and are superimposed on a declination line, provide an intermediate step to relate acoustic Fo curves with abstract intonation patterns. Levels shown in the diagram, from abstract to concrete: 1) a limited number of abstract intonation patterns; 2) standardized stylized pitch contours; 3) concatenation by sequence rules into intonational blocks; 4) artificial perceptually relevant stylized pitch movements + declination; 5) perceptually irrelevant micro intonation (contributing to naturalness); 6) fundamental frequency curves at the acoustic level.
2.4.5 Prospects for English intonation
From recent experiments on English intonation by de Pijper (1979, 1980, and forthcoming) it has become clear that the method just discussed can be fruitfully applied to English intonation as well: stylized and standardized English pitch contours were judged to be very acceptable by native speakers of English. Moreover de Pijper found a means to narrow the relatively large step from fundamental frequency curves to standardized stylized pitch contours by introducing 'close copy' stylizations, that is to say stylizations which can hardly be distinguished from the original fundamental frequency curves, although they consist of only a very limited number of straight lines. For further details the reader is referred to de Pijper.
2.5 A final note on the notational systems
In addition to the three main types of notational systems already mentioned, viz. (a) a graphical system (e.g. Kingdon, 1958; Cohen and 't Hart, 1965), (b) an accentual system (e.g. the use of tone marks by O'Connor and Arnold (1973)) and (c) a numerical system (e.g. Pike, 1945), attempts have been made to represent pitch movements by a musical notation (e.g. Fonagy and Magdics, 1963). Apart from the fact that one first of all has to acquire a specific musical knowledge in order to be able to interpret this system, Lehiste (1970) has demonstrated that differences in pitch height in speech do not conform to musical intervals such as a third, a fourth or a fifth. Because of the complexity and the often interpretative nature of most of these notational systems we prefer a purely descriptive system of straight graphical lines, as used in the Dutch intonation course by Collier and 't Hart (1981). An instance of this straight line approach is presented in figure 2.13.
Er waren verscheidene kinderen ziek (Several children were ill)
Figure 2.13: An example of the graphical notation system taken from the Dutch intonation course by Collier and 't Hart (1981, p.57). Declination is not drawn. Pitch accents are underlined.
2.6 Contrastive English-Dutch intonation studies
Contrastive English-Dutch intonation studies are rather scarce in the literature. The first comprehensive attempt was made in 1925 by Guittart. First of all he describes Dutch intonation according to a slightly adapted method by Palmer (1922) and distinguishes four main nuclei in Dutch: falling, rising, rising-falling-rising and slightly rising. These main contours are related to sentence types and can be modified by attitudinal aspects. Unfortunately, as he considers experimental methods - phonograph and sootdrum - too laborious and still too imprecise, he gathers his data by ear, so that his description appears to be rather unreliable (Cohen, 1968). For his subsequent comparison of Dutch and English intonation Guittart translated English examples of Palmer into Dutch, which were thereupon transcribed in Palmer's notational system. He wrongly concludes that there is essentially no difference between the pitch contours of English and Dutch when both are described according to Palmer's system, and, by a study of connected discourse, that Dutch is more melodious and uses larger intervals than English. On the other hand he already pleaded at that time for a conscious learning process of foreign intonation patterns by first of all making students aware of the intonation patterns of their
native language and he strongly opposed the view that intonation can only be learned by imitation (drill methods). In 1965 Delattre claimed that one of the characteristic features of Dutch intonation patterns in general is the pitch rise, whereas English seems to be characterized by downward pitch movements. A pedagogically oriented comparison of English and Dutch intonation, following Kingdon's system, by Breitenstein et al. (the Dutch Didactics Committee; 1966) does not succeed any better than Guittart in avoiding the pitfalls of the impressionistic 'Ohrenphonetik' and arrives at the conclusion that rising intonation patterns are far more common in English than in Dutch. Likewise in their course for Dutch learners Gussenhoven and Broeders (1976) conclude that "Dutch and English do not differ greatly with regard to intonation" (p.162). Both conclusions are questionable to say the least. Finally we would like to mention two experiments of an explorative nature, using resynthesized speech. Cohen and 't Hart (1972) demonstrated that languages have a characteristic set of intonation patterns which makes it possible to distinguish them solely on the ground of prosodic features. This observation was confirmed in an experiment by Elsendoorn (1979) in which English and Dutch interrogative sentences were compared. Subjects appeared to be able to correctly distinguish Dutch and English pitch movements in almost 70% of the cases.
2.7 Concluding remarks
In this second chapter we have discussed some noteworthy approaches to the study of pitch phenomena in speech: the configurational tune or tone group studies in the British tradition, the level approach which was mainly favoured by American linguists, two subsequent phonological approaches in this tradition and finally the search for perceptually relevant pitch movements by the 'Dutch school'. Evidently we prefer an experimentally oriented approach in our search for non-native deviations in English pitch movements produced by native speakers of Dutch. We will largely follow the procedure of the Dutch school and employ a combination of acoustic
measurements to trace potential non-native deviations, and perception tests to establish the perceptual relevance of the observed deviations contributing to a non-native pronunciation. We will limit ourselves to salient distinctions in the melodic realization of pitch movements and neglect for the time being possible relations with linguistic categories. In the next chapter we will start the description of our experimental work which initially only has the form of simple pilot experiments. This approach was adopted for the following reasons: firstly we could not base our study on any authoritative contrastive English-Dutch intonation study and secondly we wanted to ascertain the usefulness of particular instrumental techniques for our study.
Chapter 3
First explorations
3.0 Introduction
Acoustic measuring of the fundamental frequency is one of many methods to study pitch, an important but a limited one. It investigates the most accessible manifestations of pitch in speech, but, since the material available is very complex, cannot explicitly bring out those pitch phenomena which are relevant for a systematic analysis and description. Consequently studies of manifestations of pitch which are largely based on acoustic measurements alone or which are occasionally complemented by the ear of the experimenter, will often become sketchy and haphazard. Yet, despite these objections, we feel that a comparison of the acoustical properties of pitch movements produced by native speakers of Dutch and of English might give valuable clues, as to what could be systematic deviations between the two groups and should be included in a perceptual evaluation. We are not interested in idiosyncrasies of individual subjects, but in recurrent pitch movements which may be distinguished in a systematic comparison of corresponding pitch movements of utterances produced by a number of subjects. Our first production test must be viewed in that light: it was primarily intended to assess whether an instrumental analysis would allow us to trace gross systematic factors in the multitude of detailed Fo fluctuations that might be relevant for the perception of a non-native pronunciation. As an additional factor the adequacy of the technique employed (electroglottography) could be put to the test for the sake of future more elaborate experiments. In a subsequent pilot perception experiment we have tried to establish whether it was possible to discriminate between Dutch and English solely on the basis of perceptual differences in pitch contours. Only if this proved to be successful would it be meaningful
to explore the intonational components contributing to the perception of a non-native pronunciation by means of more systematic and comprehensive perception experiments. Apart from this discriminatory aspect this subsequent test was also intended to establish the fundamental appropriateness of the method applied for assessing a non-native pronunciation. This method was in a later phase to be replaced by more sophisticated (resynthesis) techniques. Evaluating the rather complicated notion of non-native pronunciation presented a particular problem, as this 'accent' is not only composed of deviations in the prosodic layer, but also of deviations at the segmental level, which is beyond the scope of our study. If one asked a panel of phonetically trained English native speakers to give their opinion on intonational deviations in unprocessed English sentences, produced by Dutch native speakers, their judgments would most likely be influenced by generally inevitable differences in quality and intrinsic durations of phonemes and semantic-syntactic considerations. One means of getting around this problem would be the use of for instance low-pass filtered speech, so that segmental deviations are largely obscured. Yet pilot experiments have shown that subjects have great difficulties in passing a judgment on the remaining delexicalized fundamental frequency curves. True, in such an experiment for measuring 'accent' in French intonation, Fonagy, Guznan and Berard (1976) claim that French listeners were able in about 65% of the presented instances to identify foreign speakers when speaking French, although this result turned out higher on account of an artefact in their experiment, viz. the use of both read-out and spontaneous speech in the same experiment.
But it appeared to be a practically hopeless task to trace those components in the fundamental frequency curves that actually played a role in listeners' judgments on the presence of a non-native pronunciation in French intonation. In our preliminary experiment we have tried to avoid these interferences by asking subjects to evaluate only English sentences - with variations in pitch contours - spoken by a native speaker of Dutch, who is regarded as near-native for English. If this approximating method proved to be successful we would be able to apply essentially similar but technically more advanced techniques in future experiments, viz. by replacing the original fundamental
frequency curves of English sentences by synthetic and thus controllable stylized pitch movements. As a consequence of the lack of substantial experimentally oriented approaches to English intonation we decided to use material of some widely used British tone group approaches, in order to obtain somewhat uniform material both for the instrumental analysis and for the subsequent discrimination test. Finally the reader's attention is drawn to the fact that for the sake of convenience we also use the term 'pitch' at the acoustic level, although we are aware that it is actually a perceptual notion.
3.1 Characteristics of the production of a non-native pronunciation of English intonation by Dutch speakers; first experiment
3.1.1. Introduction
In this pilot production experiment a limited number of English utterances read out by six native speakers of Dutch and three native speakers of English were registered and instrumentally analysed with respect to fundamental frequency and duration. The immediate object of the experiment was to trace gross systematic deviations from the English norm by the Dutch native speakers and, as a secondary object, to assess the appropriateness of the instruments used in view of subsequent extensive experiments. In this test we also included a few Dutch paraphrases of the English sentences with the intent of obtaining evidence of assumed native language interference on the part of the Dutch subjects.
3.1.2 Method
3.1.2.1 Test material
The English test material was partly borrowed from, partly constructed in accordance with classifications by Kingdon (1958) and Halliday (1970). All items are examples of tone II (Kingdon) or tone I (Halliday), which are mostly linked with declarative sentences in English:
1. Without head or pretonic: a fall on the nucleus.
2. With head or pretonic: a (virtual) rise on the first accented syllable followed by a fall on the nucleus.
The second pattern corresponds largely to the Dutch 'hat pattern' (Cohen and 't Hart, 1967) and seems to represent a very common pitch contour in many languages, for instance American English (Maeda, 1976) and French (Delattre, 1965). The Dutch test material, which was meant to investigate assumed native language interference, was as far as possible a close paraphrase of the English with respect to syntactic and syllabic structure. The test material is presented in table 3.1.
3.1.2.2 Informants
Informants were six male speakers of Dutch, who had all finished secondary school. Their average age was 23 years. Furthermore three male native speakers of English took part in the experiment. They were staff members of the English Institute of Utrecht University. All participated on a voluntary basis and were not paid.
Table 3.1; Test material used in the pilot production test; in the case of the items with head or pretonic, only the first half of each sentence was analysed.

1. Without head or pretonic
   English
      It's right
      It was a fire
      John saw it
      I'm far too tired
      I never argue
      It's a new one
      She used to be fond of us
   Dutch
      Het was een vuur
      Jan zag het
      Het is veel te laat

2. With head or pretonic
   English
      The 'small black cat / (leaps on a stool)
      In the 'centre of London / (there's a famous building)
      In the 'middle of the circle / (there's a clean pin)
      Once a thief / (always a thief)
      'Many hands / (make light work)
   Dutch
      De 'smalle "bleke" kat / (loopt op een stoel)
      'Vele handen / (maken licht werk)

xxx = nucleus of tone group in English items; accented syllable in Dutch.
' = pretonic or accented syllable.
3.1.2.3 Procedure
Informants were asked to read out the items mentioned in table 3.1 without obvious emotional involvement, in order to get 'neutral' pitch excursions. The items were recorded by means of a Sennheiser MD421 HL/8 microphone on one track of a Revox A-77 tape recorder at a speed of 19 cm per second. Simultaneously vocal cord activity was recorded on track two through a Frøkjær-Jensen EG 830 Electroglottograph. By means of electroglottography, changes in electrical impedance are registered which reflect vocal cord movements. Impedance changes are determined by feeding a high frequency signal between two electrodes which are attached to the throat of an informant. This signal is modulated by the vocal cord movements. This technique has the advantage of avoiding interferences of the speech channel and the recording room, and, when used in combination with an Fo meter, allows for more reliable Fo measurements than an analysis of speech material recorded by microphone would. The recordings were fed through a Frøkjær-Jensen FFM 650 Fo meter to a Honeywell 2206 Visicorder. UV-oscillograms of the audio output and the electroglottograph signals were obtained at a speed of 20 cm/sec. The durations of the segments and the pitch of the voiced syllables were measured. In cases of doubt syllable durations were checked by ear by means of segmentation (making a small segment audible with the aid of an electronic gate; cf. 't Hart and Cohen, 1964). Fo values which could not be obtained by oscillographic analysis were determined by means of analytic listening (comparison of a small gated out segment with the signal of an easily measurable synthetic vowel generator).
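As an illustration of how a frequency value can be read off an oscillogram of this kind, the following minimal sketch converts a length measured on paper into Fo. Only the paper speed of 20 cm/sec comes from the text; the function names and the measuring procedure (a ruler length per period, or a span covering several periods) are assumptions made for the example, not the procedure documented by the author.

```python
# Paper speed of the oscillograms as stated in the text: 20 cm/sec = 200 mm/sec.
PAPER_SPEED_MM_PER_S = 200.0


def f0_from_period_length(period_mm, paper_speed=PAPER_SPEED_MM_PER_S):
    """Fo in Hz for a single period measured as a length (mm) on paper.

    Hypothetical helper: duration = length / speed, and Fo = 1 / duration,
    which simplifies to speed / length.
    """
    if period_mm <= 0:
        raise ValueError("period length must be positive")
    return paper_speed / period_mm


def mean_f0_over_span(n_periods, span_mm, paper_speed=PAPER_SPEED_MM_PER_S):
    """Average Fo in Hz when n_periods whole periods cover span_mm on paper."""
    if n_periods <= 0 or span_mm <= 0:
        raise ValueError("period count and span must be positive")
    return n_periods * paper_speed / span_mm


if __name__ == "__main__":
    # A period that is 2 mm long on paper lasts 10 ms.
    print(f0_from_period_length(2.0))
    # Ten periods spread over 10 mm of paper.
    print(mean_f0_over_span(10, 10.0))
```

Measuring a span of several periods rather than a single one reduces the relative reading error, at the cost of averaging over any pitch change inside the span.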
3.1.3 Results
A normalized version of all realizations of each item is displayed in Appendix A. Pitch contours have been stylized in a subjective manner by omitting all minor Fo fluctuations. They have not been visually represented in full on account of their length. Fundamental frequency is plotted on the vertical axis and time on the horizontal axis. The pitch movements are represented on a relative logarithmic scale (expressed in semitones), as this scale corresponds better with the logarithmic function of pitch perception and as it facilitates comparison of different voice registers. The ratio between two frequencies which differ by one semitone can be expressed as $\sqrt[12]{2}$ (approximately 1.0595); 12 semitones are 1 octave. Figures have been lined up with respect to the beginning of the syllabic nucleus of the tonic.
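The semitone scale described above can be sketched in a few lines. The functions below are illustrative helpers (their names are not taken from the text): one expresses the distance between two frequencies in semitones, the other shifts a frequency by a given number of semitones.

```python
import math


def semitones(f_from_hz, f_to_hz):
    """Distance from f_from_hz to f_to_hz in semitones (positive = upward)."""
    if f_from_hz <= 0 or f_to_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 12.0 * math.log2(f_to_hz / f_from_hz)


def shift_by_semitones(f_hz, st):
    """Frequency reached when f_hz is moved by st semitones."""
    return f_hz * 2.0 ** (st / 12.0)


if __name__ == "__main__":
    # One octave up is 12 semitones, whatever the starting frequency.
    print(semitones(100.0, 200.0), semitones(220.0, 440.0))
    # One semitone corresponds to a frequency ratio of the twelfth root of 2.
    print(shift_by_semitones(100.0, 1.0) / 100.0)
```

Because the scale is relative, a fall of, say, 8 ST means the same ratio for a low and a high voice, which is what makes contours of different speakers comparable.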
3.1.4 Discussion
Due to uncontrollable variables such as rate of speech and physical characteristics of the vocal cords it was impossible to compare the waveforms in an absolute manner. Furthermore, on account of the inter-individual disparities of the Dutch and English informants it seemed hardly possible to draw any conclusions with regard to the position of the beginning of the main pitch movement in the accented syllable. Consequently the analysis of the contours has been restricted to a rough informal comparison of the rate of declination, the direction of the pitch movements and the relative magnitude of the pitch movements on the accented syllables.
3.1.4.1 English items without head or pretonic
The majority of the items produced by the English native speakers showed a fall on the nucleus which was often preceded by a rise on the same syllable. The unaccented syllables showed only minor fluctuations with the exception of 'she used to be fond of us', which had a substantial rise on 'used' in most cases. The following deviations could be observed in the English items produced by the Dutch informants:
1) The rise preceding the fall (precursive rise) was often missing. See figure 3.1.
2) Several items showed a small rise on the nucleus or almost no pitch change at all. The latter may be due to the experimental situation. This can also be seen in figure 3.1.
3) The magnitude of the pitch movements of most items read out by the Dutch informants was much smaller than that of those spoken by the English informants. The relative range of the falling pitch movement was 4 ± 3 semitones (ST) for the Dutch informants and 8 ± 4 ST for the English informants. The range of the preceding rise was 3.5 ± 2 ST for the Dutch informants and 5 ± 3.5 ST for the English informants.
[Figure 3.1: pitch in semitones (ST) on the vertical axis, time in ms on the horizontal axis.]
Figure 3.1; Stylized versions of an English item without head or pretonic read out by Dutch informants (left) and English (right). The dotted line represents a variant produced by Dutch informants which showed a rise in pitch.
3.1.4.2 English items with head or pretonic
Most items produced by the English informants showed a substantial fall on the tonic preceded by either a rise on the pretonic or a rise on the pretonic followed by a non-final fall. The successive tone groups of each item were mostly linked by an upward continuation movement. The pitch movements of the Dutch informants were more or less similar to those of the English informants, with the exception of the magnitude of the excursions, which was again much smaller. Table 3.2 presents a comparison in semitones.
3.1.4.3 Dutch items
The small number of Dutch items which were investigated seems to corroborate assumed influences of native language interference in the realization of the English items by native speakers of Dutch with respect to the direction of the pitch movements (rises instead of falls) and the magnitude of the pitch movements. The relative range of the pitch movements on the final accented syllables was 3 ± 3 ST, which seems to conform with the range of 4 ± 3 ST for their realizations of the fall on the English items.

Table 3.2; A comparison of the magnitude of the excursions of pitch movements (in semitones) on the English items with pretonic produced by the Dutch and the English informants.

                    pretonic                        tonic
                    rise         non-final fall     final fall     n
Dutch inform.       4.5 ± 2.0    2.0 ± 1.5          4.0 ± 3.0      30
English inform.     8.0 ± 2.0    4.5 ± 2.0          7.0 ± 3.0      15
3.1.4.4 Declination
Declination, a gradual downward movement of the contour, could be observed in all utterances both in English and in Dutch. The falling rate of each contour was approximated as a logarithmic function of time (see chapter 2, p.36) by taking the initial and the terminal Fo points of the low baseline superimposed on the contour. The averaged falling rates were as follows:
English items read out by English informants: 4.9 ST/s
English items read out by Dutch informants: 3.3 ST/s
Dutch items read out by Dutch informants: 3.9 ST/s
Although these figures are by no means conclusive they suggest that the falling rate is somewhat larger for the English items read out by the English informants in comparison with the same items read out by the Dutch informants.
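A falling rate of this kind can be sketched as follows, assuming it is obtained from just two baseline points (initial and terminal Fo) and the time between them. The function name and the example frequencies are hypothetical; the exact formula of chapter 2 is not reproduced here.

```python
import math


def declination_rate(f_start_hz, f_end_hz, duration_s):
    """Falling rate of a baseline in semitones per second.

    Assumes the baseline is a straight line on a semitone scale between the
    initial and terminal Fo points. A positive result means a falling
    baseline, matching the way the rates are reported in the text.
    """
    if f_start_hz <= 0 or f_end_hz <= 0:
        raise ValueError("frequencies must be positive")
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return 12.0 * math.log2(f_start_hz / f_end_hz) / duration_s


if __name__ == "__main__":
    # Hypothetical baseline: 120 Hz at the start, 100 Hz one second later.
    print(round(declination_rate(120.0, 100.0, 1.0), 2))
```

Expressing the rate in ST/s rather than Hz/s keeps it comparable across speakers with different voice registers.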
3.1.5 Conclusions
This preliminary production experiment demonstrated that it is feasible to observe specific gross characteristics of instrumentally analysed fundamental frequency registrations and thus to detect systematic differences by comparing the acoustic output of native speakers of Dutch, when speaking English, with that of native speakers of English. Furthermore the use of electroglottography appeared on the whole to yield reliable results. In summary four systematic deviations of native speakers of Dutch could be distinguished:
1) Smaller excursions of the pitch movements
2) Opposite direction of the pitch movements: rises instead of falls
3) The absence of a precursive rise in most cases
4) A smaller declination rate.
Some of these differences, notably the smaller magnitude and the opposite direction of the pitch movements, reflect assumed native language interference. The apparent differences in magnitude of the excursion seem to endorse the suggestion by Collier (1976) that English intonation contours operate at three levels rather than at two, with the second level corresponding to the high declination of Dutch intonation contours. See figure 3.2. In the following pilot perception experiment we will try to assess whether the instrumentally observed differences in pitch contours produced by native speakers of English and of Dutch are substantiated by corresponding differences in the perception of pitch contours of both languages.
[Figure 3.2: stylized contours for the example utterances '(It was a) fire' and 'In the middle of the circle ...'.]
Figure 3.2; The three levels proposed for English intonation contours.
3.2 Discriminability of Dutch and English pitch contours; second experiment
3.2.1 Introduction
The pilot perception test described in this section is an attempt to assess the ability of listeners to consciously discriminate between the pitch characteristics of their native language and those of a target language. Differences between the pitch contours of the English utterances produced by the native speakers of English and of Dutch were to be found mainly in the magnitude of the excursion and the direction of the pitch movements. Our aim was to find out whether listeners were able to discriminate English and Dutch, solely by virtue of differences in the pitch contours of both languages.
In spite of the fact that the position of the pitch movement in the accented syllables was left out of account in the former production test for reasons mentioned in section 3.1.3, it was incorporated in this test, as it has been shown to be an important perceptual factor in Dutch intonation. Our experiment involved lexically near-equivalent pairs of utterances in English and Dutch with superimposed pitch patterns of English and of Dutch. Subjects of both languages were asked to tell these patterns apart in a forced choice situation.
3.2.2. Method
3.2.2.1 Stimulus material
The basic material consisted of 12 English utterances selected from a tape which goes with the English intonation course by Halliday (1970), viz. two instances of each so-called primary tone (1 to 5) and one instance of each compound primary tone (1-3 and 5-3). A survey of the twelve utterances is presented in table 3.3 and includes Dutch paraphrases of the English instances. These paraphrases were read out by another native speaker of Dutch and recorded on one track of a tape, simultaneously with an electroglottograph signal on the other track. The above mentioned English utterances were recorded twice, read out by a native speaker of Dutch whose English was considered to be near-native, once with an English pitch contour similar to the contours of the instances selected from Halliday and once with a Dutch pitch contour similar to the paraphrases read out by the first native speaker of Dutch.

The adequacy of the imitation of the desired pitch contour was checked by means of visual and auditory feedback. The native speaker of Dutch was seated in front of a Frøkjær-Jensen two-channel curve display (CD 1300). The Fo curves of each of the model utterances were traced on the upper track of the screen, whereas on the lower track the speaker could trace his Fo imitation on the basis of electroglottography. In this way the Dutch speaker could compare his output with that of the model. At the same time the model utterances could be made audible. All trials of the audio signal were recorded simultaneously with the glottograph signals, the latter only being used for subsequent analysis. Afterwards the best copies were selected by eye and by ear to serve as stimulus material.

Fundamental frequency recordings as a function of time, together with oscillographic recordings of the microphone signal, were made of all the model utterances and the imitations to obtain frequency and duration values of the pitch movements in the tonics. As a reference point for the duration measurements the vowel onset was chosen. To establish the reliability of using one speaker, comparisons were made between two characteristics of the model utterances and the imitations by the Dutch native speaker, viz. the aforementioned position of the pitch movement and the magnitude of the excursions on the tonic. A paired t-test applied to the measurements showed no significant differences. The 24 speech utterances were then copied onto language master cards (cards with magnetic strips attached which allow random auditory comparison of items). For each language four randomly chosen items were repeated once to test the consistency of the subjects.

Table 3.3; Stimulus material used in the pilot perception experiment. Tonics are underlined.

No.   Tone   English utterance / Dutch paraphrase
1.    1      They're here said Eileen. / Ze zijn hier zei Ireen.
2.    1      It's entirely unnecessary. / Het is geheel overbodig.
3.    2      Are you sure it's the right address? / Is dit wel het juiste adres?
4.    2      Would you like the radio on? / Wil je graag de radio aan?
5.    3      They are all here. / Ze zijn allen hier.
6.    3      Be careful with that ladder. / Voorzichtig met die ladder.
7.    4      It's better than I expected. / Het is beter dan ik verwachtte.
8.    4      That little bit won't do you any harm. / Dat kleine beetje zal je geen kwaad doen.
9.    5      No wonder they don't grow. / Geen wonder dat ze niet groeien.
10.   5      It's not surprising they never make any money. / Het is niet verwonderlijk dat ze geen geld verdienen.
11.   1-3    It isn't really flat. / Het is niet echt plat.
12.   5-3    The salmon is very good tonight. / De zalm is erg goed vandaag.
3.2.2.2 Subjects
Two groups of subjects participated in the experiment. They were 9 male native speakers of English, staff members at the English Department of the University of Utrecht, and 9 male and female native speakers of Dutch, staff members and students at the Institute of Phonetics, Utrecht. The 9 native speakers of Dutch were phonetically trained. All subjects participated on a voluntary basis and were not paid.
3.2.2.3 Procedure
All subjects participated in the experiment individually. They were presented with the set of randomized utterances described in section 3.2.2.1 and were instructed to concentrate on the pitch contours only. With each stimulus subjects were asked to indicate on their answer sheets whether or not they thought it to be an instance of a pitch contour of their native language.
3.2.3 Results
The compilation of the data was mainly performed by punching the variables on cards and feeding them into a computer using SPSS (Nie, Hadlai Hull, Jenkins, Steinbrenner and Bent, 1975). By means of a breakdown procedure correct identification was determined in percentages of the maximum score.
3.2.3.1 Results of the English subjects
The mean identification score of the English subjects for items with an English pitch contour appeared to be 75.5%, for items with a Dutch pitch contour only 44.1%. The results of correct identification for each item and each tone are presented in figure 3.3. Application of the binomial test showed that the total number correct (173) was better than chance (two-tailed, p < .001, n = 288); the same applies to the number correct for items with an English pitch contour (p < .001; 109 correct, n = 144). The items with a Dutch intonation contour showed a random score (64 correct, n = 144, n.s.). The outcome of a Wilcoxon test applied to the percentages correct of English and Dutch contours showed that language discriminated significantly (T = 6.5, n = 12, p < .01). The result of a t-test applied to the 4 repeated scores of either language gives an indication that the English subjects scored consistently on the items with an English pitch contour (t = -1.34, df = 3, n.s.), but inconsistently on the items with a Dutch pitch contour (t = -5.93, df = 3, p < .01).
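The binomial tests reported above can be re-run with a short sketch. It assumes an exact two-sided test against a chance level of 0.5; whether the original SPSS run used the exact test or a normal approximation is not stated, and the function name is chosen for this example only. The counts are the ones given in the text.

```python
from math import comb


def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k correct out of n at chance level 0.5.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. With p = 0.5 every outcome i has probability C(n, i) / 2**n,
    so the comparison can be done on the integer coefficients.
    """
    if not 0 <= k <= n:
        raise ValueError("k must lie between 0 and n")
    observed = comb(n, k)
    extreme = sum(c for c in (comb(n, i) for i in range(n + 1)) if c <= observed)
    return extreme / 2 ** n


if __name__ == "__main__":
    cases = [
        ("all items", 173, 288),
        ("English contours", 109, 144),
        ("Dutch contours", 64, 144),
    ]
    for label, k, n in cases:
        print(f"{label}: {k}/{n} correct, p = {binomial_two_sided_p(k, n):.4g}")
```

Under these assumptions the first two counts fall below p = .001 and the third does not reach significance, in line with the pattern reported for the English subjects.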
[Figure 3.3: percentage correct on the vertical axis, with the chance level marked, plotted per item (top panel) and per tone 1, 2, 3, 4, 5, 1-3 and 5-3 (bottom panel); separate curves for English and Dutch pitch contours.]
Figure 3.3; Percentages correct of the English subjects for each item (top) and each tone (bottom).
3.2.3.2 Results of the Dutch subjects
The mean percentage correct score for the Dutch subjects on the items with an English pitch contour was 79.1%, for the items with a Dutch pitch contour 72.1%. Correct scores for each item and each tone are presented in figure 3.4. The binomial test was applied to the total number correct (218) (two-tailed, p < .001, n = 288), to the number correct of the items with an English pitch contour (p < .001; 114 correct, n = 144) and of the items with a Dutch pitch contour; in each case the score was significantly better than chance. The percentage correct scores on the Dutch contours were not different from the scores on the English contours: T = 41, n = 11, n.s. (Wilcoxon test). The Dutch subjects scored consistently on the 4 repeated items of either language; Dutch contours: t = -1.73, df = 3, n.s.; English contours: t = -0.49, df = 3, n.s.
[Figure 3.4: percentage correct on the vertical axis, with the chance level marked, plotted per item (top panel) and per tone (bottom panel); separate curves for English and Dutch pitch contours.]
Figure 3.4; Percentages correct of the Dutch subjects for each item (top) and each tone (bottom).
3.2.4 Conclusions
3.2.4.1 English subjects
The results of this test showed that phonetically untrained native speakers of English were only able to identify with reliability pitch contours of their mother tongue. The consistent results of the repeated scores with respect to native language pitch contours underline this observation. Several Dutch pitch contours appeared to be hardly distinguishable from native language contours because of two interfering factors: firstly, the gross pitch features of several contours of both languages were perceptually identical in their outline and secondly, according to information obtained from the English subjects, some supposedly Dutch contours were acceptable as English contours as well, depending on their contextual situation. This points to a wide range of possibilities for the realisation of each contour.
3.2.4.2 Dutch subjects
It stands to reason that the Dutch subjects performed somewhat better, as they were familiar with both languages. In approximately 75% of the cases they were able to correctly discriminate between pitch contours of Dutch and of English.
3.2.5 Objective versus subjective analysis
As this experiment only dealt with rather gross perceptual pitch differences between English and Dutch our following analysis will be restricted to a small number of seemingly important distinguishing factors, viz. position, direction and magnitude of the pitch movements. Owing to a lack of sufficient data on pretonics the analysis was restricted to the movements on the tonic or nucleus. Averaged data are presented in Appendix B.
3.2.5.1 Position of the pitch movement
The position of the beginning of the pitch movement in the tonic, expressed in ms with respect to vowel onset, was not significantly different in Dutch and English pitch contours (t = 1.43, df = 13, n.s.). This result was well in accordance with the subjective test scores. Correlations of percent correct and the positions of the pitch movements in the tonic gave the following Pearson correlation coefficients: see table 3.4.

Table 3.4; Correlation coefficients between position of the pitch movements with respect to vowel onset in the tonic and percentages correct.

        English subjects                    Dutch subjects
        Eng. inton.      Dutch inton.       Eng. inton.      Dutch inton.
        r = -.16 n.s.    r = -.08 n.s.      r = -.02 n.s.    r = .19 n.s.
This gives an indication that as far as the data of this experiment were concerned the position of the pitch movements was not a significant discriminator between Dutch and English pitch contours.
3.2.5.2 Direction of the pitch movement
To determine the influence of the direction of the pitch movements on the number of correct scores of both groups of subjects the stimulus material was divided into two groups: one in which there was no distinction in the direction of the pitch movements between the mother tongue and the second language, the other in which there was such a distinction. The scores related to the particular groups were compared by means of a t-test for two means. The result of the English subjects on the items with a Dutch pitch contour was t = -0.47, df = 12, n.s., on the items with an English contour t = 1.23, df = 12, n.s. The result of the Dutch subjects on the items with a Dutch pitch contour was t = -0.20, df = 12, n.s., on the items with an English pitch contour t = -1.34, df = 12, n.s. The results show that, at least for this limited number of data, differences in the direction of the pitch movements provided no apparent cue for correct discrimination.
3.2.5.3 Magnitude of the excursion
Magnitude is used here in the sense of a number of semitones between the beginning and end of a particular prominence-lending pitch movement, either rising or falling, in the tonic. A paired t-test was applied to the objective data of the English and Dutch pitch contours: t= -3.94, df= 13, p
[Figure 6.1: stylized contours of the pitch movements (1) to (9), listed below with their realizations by the English native speakers (EngN) and the Dutch native speakers (DuN).]

       Pitch movement    EngN                              DuN
(1)    Direction         fall                              rise
(2)    Excursion         ca. 4-26 ST                       ca. 4-12 ST
(3)    Continuation      (half rise) fall (half rise)      half fall
(4)    Reset             virtual rise to Mid               Low level
(5)    Overshoot         rise to Mid                       rise to High
(6)    Outset            beginning at Mid                  begin at Low
(7)    Inclination       gradual rise Mid-High             stat. Mid
(8)    Wh-attribute      stat. High + fall > 12 ST         rise to Mid
(9)    Precursor         half rise ca. Mid to High         often absent
Figure 6.1; A survey of the nine (combinations of) pitch movements used as a reference, which were produced by the English native speakers (solid lines ) and the corresponding deviations produced by the Dutch native speakers (dotted lines).
DEVIATIONS IN PITCH
whereas the segmental durations are identical for each carrier phrase. However, as speakers were carefully selected with respect to speech rate, these differences appeared to be rather small. Furthermore experiments by Pollack (1968), Klatt (1973) and 't Hart (1977) have demonstrated that the 'slope' of a pitch movement is a relatively unobtrusive factor, i.e. the threshold for perceiving changes in slope is rather high. For this reason we preferred this approach to the presumably much more negative influence of using a non-English segmental structure, as produced by the native speaker of Dutch. Each condition was implemented on two different utterances to reduce utterance specific effects on the evaluations due to syntactic or semantic influences. Finally, to test the consistency of the judgments and consequently the reliability of this experiment, each stimulus was repeated once. Thus this procedure should yield nine (pitch movements) × two (each stimulus repeated once) × three (conditions) × two (utterances) = 108 stimuli. Unfortunately, as both the selection of appropriate utterances and the analysis and resynthesis of the items by means of LPC in software proved to be rather time-consuming procedures, we were forced by lack of time to use the same reference phrases in three cases (conditions I and III). Moreover the variable 'excursion' appeared to be the same under conditions II and III (no further deviations of the Dutch speaker), so that one of these (identical) phrases was left out too. This resulted in 92 stimuli. Since the actual test was preceded and followed by 4 arbitrarily chosen stimuli to counter beginning and end effects, subjects were exposed to 100 stimuli. The stimuli were randomized and tape-recorded. Each item was presented twice and was preceded by a warning signal.
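The stimulus counts in this paragraph can be checked with a little arithmetic. The sketch below assumes that the planned total of 108 is the product of nine pitch movements, two presentations (each stimulus repeated once), three conditions and two utterances; the variable names are illustrative, and the figure of 16 dropped stimuli is merely the difference between the two totals given in the text.

```python
# Design factors; the factor of two for the repetition is an interpretation
# of "each stimulus was once repeated".
PITCH_MOVEMENTS = 9
PRESENTATIONS = 2
CONDITIONS = 3
UTTERANCES = 2

planned = PITCH_MOVEMENTS * PRESENTATIONS * CONDITIONS * UTTERANCES

# Totals stated in the text.
USED_IN_TEST = 92
FILLERS_BEFORE = 4
FILLERS_AFTER = 4

dropped = planned - USED_IN_TEST
presented = USED_IN_TEST + FILLERS_BEFORE + FILLERS_AFTER

if __name__ == "__main__":
    print("planned stimuli:", planned)
    print("dropped because of shared or identical phrases:", dropped)
    print("stimuli presented to each subject:", presented)
```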
6.1.2.2 Subjects
Two groups of subjects participated in the experiment. All were native speakers of British English. The first group were 39 students of various disciplines of the University of Sussex, Brighton.
Their age varied between 17 and 35 years and they were paid for their services. The second group were 16 office employees of the Medical Equipment Company Ltd. in Crawley. Their age varied between 18 and approximately 50 years. This group was not paid as they performed the test during working hours.
6.1.2.3 Presentation of the test
Subjects took the test individually through high quality headphones. The test was presented to the Brighton group in sound treated cubicles, to the Crawley group in a quiet conference room. All subjects first listened to a 10 minute introduction, which was on tape as well as on paper, to become familiar with the notion of intonation and to grow accustomed to the quality of the resynthesized test items. Furthermore they were asked to read the complete dialogue, which was to allow them to put the utterances in an appropriate setting. The text of each utterance was on paper too to preclude problems of intelligibility. Subjects were asked to give an acceptability rate by encircling the figures 1 to 5 on their answer sheet, 1 corresponding to least acceptable, 5 to most acceptable. The complete test took 30 minutes.
6.1.3 Results
6.1.3.1 Internal consistency
To gain some insight into the experimental behaviour of the subjects with respect to individual differences in motivation or an incapacity to judge pitch phenomena, the scores of all subjects on the repeated items were expressed as the number of standard deviations from the mean score (z-scores). For this approximation equidistant steps were assumed between the scale values 1 to 5.
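A consistency measure of this kind can be sketched as follows. The ratings are hypothetical, and the sketch assumes that the quantity being standardized is each subject's mean absolute difference between first and repeated ratings, with the population standard deviation as the unit; the text does not specify these details.

```python
from statistics import mean, pstdev


def mean_repeat_difference(first, second):
    """Mean absolute difference between first and repeated ratings (scale 1 to 5)."""
    if len(first) != len(second) or not first:
        raise ValueError("rating lists must be non-empty and of equal length")
    return mean(abs(a - b) for a, b in zip(first, second))


def z_scores(values):
    """Express each value as a number of standard deviations from the mean."""
    centre = mean(values)
    spread = pstdev(values)
    if spread == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - centre) / spread for v in values]


if __name__ == "__main__":
    # Hypothetical ratings: (first presentation, repeated presentation).
    ratings = {
        "subject_a": ([4, 2, 5, 3], [4, 3, 5, 3]),
        "subject_b": ([1, 5, 2, 4], [3, 2, 4, 1]),
        "subject_c": ([3, 3, 4, 2], [3, 4, 4, 2]),
    }
    diffs = {name: mean_repeat_difference(a, b) for name, (a, b) in ratings.items()}
    for (name, diff), z in zip(diffs.items(), z_scores(list(diffs.values()))):
        print(f"{name}: mean difference {diff:.2f} scale values, z = {z:+.2f}")
```

A subject whose mean difference lies far above the group mean would stand out as inconsistent; treating the scale steps as equidistant is what makes the differences and their mean meaningful.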
The results are summarized in table 6.1. From this table it can first of all be concluded that most subjects were very consistent in their judgments of the items: repeated scores of 33 (Brighton) and 14 (Crawley) subjects deviated less than one scale value. Secondly most subjects' score differences between the repeated items remained within one standard deviation of the total sample means. As the number of inconsistent subjects appeared to be extremely small, resulting in a negligible influence on the total scores, we decided not to exclude any of the subjects from either group.

Table 6.1; Consistency of the individual subjects expressed in the mean difference between repeated scores and the number of standard deviations from the total sample means (z-scores).
SUBJECTS
MEAN OF MEAN DIFFERENCE MEAN 1 scl.val. DIFF.
SD
NO OF SUBJECTS Z-SCORES