The Structure of Intonational Meaning: Evidence from English 0253158648

scan by internet archive


English Pages 256 [262] Year 1980



Table of contents :
I. General Introduction and Review of Past Work
1. Segmentation and the Taxonomy of Intonation
2. The Two Major Traditions of Analysis
a. ‘American’: Trager and Smith
b. ‘British’: Kingdon
3. Sentence Stress
4. Intonation
a. Levels vs. Configurations
b. The Intonational Lexicon
c. Tunes vs. Tones
d. Accent Analyses
5. Stress
a. Criticisms of Traditional Stress
b. The Criticisms Coopted
c. Stress as a Rhythmic Phenomenon
6. Summary and Implications
II. Evidence for the Rhythmic Nature of Prominence
1. Rhythmic Cues in the Accent Analyses
a. Explicit Use of Length Cues
b. Implicit Use of Rhythmic Cues
2. Experimental Evidence for the Rhythm Hypothesis
3. Difficulties with Sentence Stress in the Accent Analyses
III. The Phonology of Deaccenting
1. The Concept of Relations
2. The Relational Nature of Deaccenting
3. Pretonic Accent and Deaccenting
IV. The Grammar of Accent Placement
1. Syntactic vs. Semantic Approaches
2. ‘Normal Stress’ and Focus
a. A Characterization of Normal Stress
b. Syntax of Focus
c. ‘Contrastive Stress’
d. Summary
3. Default Accent
4. Degrees of Accentability
a. Compounds
b. Bolinger’s ‘Contrastive Stress’
c. Deaccenting
d. The Accentability of Nouns
5. Semantics of Deaccenting
6. Summary
V. Paralanguage and Gradience
1. The Problem
2. Contrast vs. Paralanguage: Three Approaches
a. Trager-Smith
b. Lieberman
c. Bolinger and Crystal
3. Gradience
4. The Investigator’s Task
VI. Around the Edge of Language?
1. Central vs. Peripheral
2. The Expression of Speaker’s Attitude
3. Intonation and Emotion
a. Three Experiments
b. Critique
4. Instrumental Phonetics and Intonation
5. A Phonological Analogy
VII. Intonation and Grammar
1. The Role of Intonation: Two Approaches
2. The Intonational Lexicon
3. Preliminaries to the Analysis of Fall-Rise
a. Two Approaches to the Problem
b. Taxonomy of Falling-Rising Contours
4. A Semantic Analysis of Fall-Rise
a. Fall-Rise in Single-Nucleus Sentences
b. Fall-Rise in Double-Nucleus Sentences
c. Fall-Rise and Scope of Negation
d. Conclusion
5. Intonation and Phrasing
a. Pitch Contours and Boundaries in American Work
b. Boundary Phenomena as Relational
c. Phrasing and the Grammatical-Affective Distinction
VIII. Stylized Tones and the Phonology of Intonation
1. The Calling Contour
2. Stylized Fall
3. Stylized Rises
a. Low-Rise
b. High-Rise
c. Some Implications of the Analysis of Stylized Rises
4. The Phonology of Intonation
a. Levels vs. Configurations: A Review of the Debate
b. Stylized Intonation and the Pitch-Level Analyses
c. Possible Objections to a Contour Analysis
IX. Intonation and Phonesthesia
X. Conclusion
1. General Introduction and Review of Past Work
2. Evidence for the Rhythmic Nature of Prominence
3. The Phonology of Deaccenting
4. The Grammar of Accent Placement
5. Paralanguage and Gradience
6. Around the Edge of Language?
7. Intonation and Grammar
8. Stylized Tones and the Phonology of Intonation
9. Intonation and Phonesthesia
Bloomington & London


Copyright © 1978, 1980 by D. Robert Ladd, Jr. All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying and recording, or by any information storage and retrieval system, without permission in writing from the publisher. The Association of American University Presses’ Resolution on Permissions constitutes the only exception to this prohibition. Manufactured in the United States of America.

Library of Congress Cataloging in Publication Data

Ladd, D Robert, 1947— The structure of intonational meaning. A revision of the author’s thesis, Cornell, 1978. Bibliography: p. Includes indexes. 1. English language—Intonation. 2. English language—Semantics. I. Title. PE1139.5.L3 1980 421.6 79-3093 ISBN 0-253-15864-8 12345 84 83 82 81 80

Dedicated to my parents, from whom I learned the fascination of words, and the worth of a word well chosen.






Contents

Preface
Acknowledgments

I. General Introduction and Review of Past Work
1. Segmentation and the Taxonomy of Intonation, 2. The Two Major Traditions of Analysis, 3. Sentence Stress, 4. Intonation, 5. Stress, 6. Summary and Implications

II. Evidence for the Rhythmic Nature of Prominence
1. Rhythmic Cues in the Accent Analyses, 2. Experimental Evidence for the Rhythm Hypothesis, 3. Difficulties with Sentence Stress in the Accent Analyses

III. The Phonology of Deaccenting
1. The Concept of Relations, 2. The Relational Nature of Deaccenting, 3. Pretonic Accent and Deaccenting

IV. The Grammar of Accent Placement
1. Syntactic vs. Semantic Approaches, 2. ‘Normal Stress’ and Focus, 3. Default Accent, 4. Degrees of Accentability, 5. Semantics of Deaccenting, 6. Summary

V. Paralanguage and Gradience
1. The Problem, 2. Contrast vs. Paralanguage: Three Approaches, 3. Gradience, 4. The Investigator’s Task

VI. Around the Edge of Language?
1. Central vs. Peripheral, 2. The Expression of Speaker’s Attitude, 3. Intonation and Emotion, 4. Instrumental Phonetics and Intonation, 5. A Phonological Analogy

VII. Intonation and Grammar
1. The Role of Intonation: Two Approaches, 2. The Intonational Lexicon, 3. Preliminaries to the Analysis of Fall-Rise, 4. A Semantic Analysis of Fall-Rise, 5. Intonation and Phrasing

VIII. Stylized Tones and the Phonology of Intonation
1. The Calling Contour, 2. Stylized Fall, 3. Stylized Rises, 4. The Phonology of Intonation

IX. Intonation and Phonesthesia

X. Conclusion

Notes
References
Name Index
Subject Index


PREFACE

In a way this is simply another book about English suprasegmentals, and the somewhat grandiose title was chosen in part because all the other obvious titles for treatments of English suprasegmentals have already been used. I came to study intonation because I was interested in its role in connecting sentences in discourse, and like many another linguist, I assumed that the problem of a notation to represent intonational form was a relatively “trivial” one, which could be disposed of quickly on the way to studying “interesting” questions of intonational function. But I soon realized that even establishing a simple taxonomy on which to base a notation was a formidable task, one on which there was little agreement. Like the blind sages studying the elephant, investigators of intonation had found snakes and ropes and tree trunks; increasingly I found myself interested in the question of why descriptions of intonation vary as much as they do. More importantly, then, the title focuses on what it is about intonation that has led generations of careful researchers to produce such different descriptions of the same beast. Linguists, in general, have simply assumed that intonational meaning is somehow different from other linguistic meaning, which has given them license to mix the ordinary and the extraordinary in their analyses in unpredictable proportions. I would not disagree with the premise that intonational meaning is different, but I think it is important to consider just how it differs and how it does not. Treating that question, rather than analyzing English intonation, is the most important aim of this work. While it is true that I have largely restricted the discussion to English, I believe that the characteristics of intonational meaning identified here are generalizable—mutatis mutandis—to other languages as well.

The book is thus not a complete analysis of English intonation. It only lays the foundation for such an analysis; much detail remains to be filled in. It might be better characterized as an introduction to the study of intonation. Even at that, some familiarity with past work and important issues is assumed throughout, but wherever possible I have summarized and synopsized the writers whose work I discuss. Perhaps the book is best described as an investigation of what has been said about intonation, and what it proves about the way intonation affects the meaning of what we say. This is, I hope, a matter of interest to any linguist, not just a specialist on intonation.

In my use of technical terminology, I have been as consistent and as conservative as possible. Because of the broad range of conflicting approaches to the general subject, however, the goals of consistency and conservatism are not always compatible, and a few preliminary remarks are in order on my use of the most general terms. I argue for a distinction between stress and accent akin to that made by Bolinger—‘stress’ at word level, ‘accent’ at phrase or sentence level—but in many cases where no ambiguity seemed to arise I have loosely used ‘stress’ to refer to phenomena of syllable prominence in general. (For example, I have made much use of the traditional American term ‘sentence stress’, even though by Bolinger’s definition—and mine—it is accent, not stress.) Where I specifically needed a general term to cover both stress and accent, I have used prominence. In the same way, I also argue for the traditional distinction between ‘stress’ and ‘pitch’, and have in many cases used intonation (or ‘intonation proper’) to distinguish pitch phenomena from prominence. Again, however, I have often loosely used ‘intonation’—as in the title itself—to refer to prosodic features in general. Where I needed an explicit cover-term to distinguish all phenomena of intonation, prominence, etc., from segmentals, I have used suprasegmentals, but I disown any theoretical baggage that may come along with that term.

As a notational convention I have adopted Bolinger’s use of squiggly lines of type, in which the ups and downs roughly indicate the melody of the voice, like

[example set in squiggly lines of type; not recoverable from the scan]

Since any notation system other than such a purely iconic one does presuppose an analysis, I have thought it best to avoid the well-known systems developed by Pike, Trager and Smith, and the British pedagogical tradition. The latter two are illustrated in Chapter 1, however, and in Chapter 7 I do make some limited use of the British ‘tonetic’ marks for the sake of typographical simplicity, once the analysis is established.

ACKNOWLEDGMENTS

This book is a modest revision of my 1978 Cornell University dissertation, “The Structure of Intonational Meaning.” As such it owes a debt to the criticism and encouragement I received from my teachers at Cornell: James S. Noblitt (my chairman), Linda Waugh, John Bowers, Wayles Browne, Joseph E. Grimes, Charles F. Hockett, Gerald B. Kelley, and Sally McConnell-Ginet. It is a pleasure to be able to offer them this book as tangible thanks for their years of support. Also among my teachers has been Dwight Bolinger, who has corresponded with me from the very beginning of my research; his influence will be apparent throughout the book. Others who through discussion and correspondence have helped me to develop my ideas include Anne Cutler, Duncan Gardiner, Mark Liberman, Louis Mangione, Ivan Sag, and Ralph Vanderslice. Naturally, none of the foregoing agrees with all or even most of what I say here.

Financial support for my years as a graduate student and the period when I was writing this book was provided by a National Science Foundation Graduate Fellowship, Veterans Educational Benefits under the G.I. Bill, various teaching positions in the Department of Modern Languages and Linguistics at Cornell, and most recently by a Fulbright Lectureship in Cluj, Romania. For much editorial help and patience I am grateful to both Ruta Noreika and the staff at Indiana University Press; thanks also to Bruce Downing, who read the manuscript for the Press and made many useful suggestions. And not least, I thank the many friends in Ithaca and elsewhere who have taken an interest in my work and well-being these last several years.











I. General Introduction and Review of Past Work

1. Segmentation and the Taxonomy of Intonation

At the foundation of most linguistic discussion lies the axiom that utterances can be segmented into recurring lexical and phonological elements and that contrasts between the elements are not gradient, but all-or-none. The parties to any argument about kill and cause to die, for example, at least agree that kill is an element of the surface lexicon of English, which contrasts on the one hand with keel and pill and on the other with murder and execute. The significance of this seemingly trivial point of agreement can be illustrated with a simple example. If we hear of someone sitting in a chair, we are likely to picture a big upholstered armchair, while the phrase sitting on a chair is more likely to make us think of a simple straight-backed chair. On the basis of this datum alone, a French linguist might briefly entertain the hypothesis that English has a lexical distinction comparable to that between French chaise and fauteuil. But it would be a simple matter to discover that chair is a lexical element, test its range of possible referents, and conclude that in the context it is the preposition which influences the native speaker’s intuitions about the situation being described.

Now, however, imagine the position of an extraterrestrial linguist with no knowledge of how to segment utterances, without even the notion that utterances might be segmentable. Confronted with the phonetic sequences [sɪtɪŋɪnətʃɛə] and [sɪtɪŋɑnətʃɛə] and the corresponding semantic intuitions, it might conceivably conclude any or none of the following:

(i) The two utterances are essentially the same linguistic unit, but with a small phonetic modulation at a particular point, corresponding to gradient differences in the situation described—i.e., with no segmentation, and with gradient rather than all-or-none phonological differences.

(ii) English has lexical elements [sɪt] ‘(locative)’, [ɪŋɪnə] ‘armchair’, [ɪŋɑnə] ‘straight-backed chair’, and [tʃɛə] ‘sit’—i.e., segmented, but incorrectly, into elements that do not recur.

(iii) English has lexical elements sit, -ing, in, on, a, chair—i.e., the ‘correct’ solution, segmented into recurring lexical and phonological elements with all-or-none contrasts.

It is so remotely unlikely that a human linguist would ever produce an analysis like (i) or (ii) that we do not even think of all-or-none segmentability as a methodological or theoretical principle, but as a basic design feature of human language. I have belabored the point here simply to dramatize the magnitude of the disagreements over the analysis of intonation: all-or-none segmentability is by no means universally accepted as a characteristic of suprasegmental structure. The intonation contours of utterances have been treated as unanalyzable gestalts: this is what Lieberman (1967) proposes, for example, when he subsumes all the linguistically systematized functions of English intonation under a single contrast between marked and unmarked breath-group. At the other extreme, contours have been cut up both horizontally and vertically, so to speak, into linear segments and all-or-none pitch phonemes: a recent instance is Leben (1976), whose autosegmental-style analysis treats contours as sequences of level tonal elements occurring at rhythmically well-defined points in the utterance, permitting (or requiring) him to distinguish, within one small corner of the English intonational system, the vocative chant from the newspaper vendor’s chant. Somewhere in between we might put Bolinger (various dates), who posits elements that are smaller than whole contours (e.g., Accents A, B, and C), but who also refers to gestalt characteristics (e.g., relative height of pitch accents) and strenuously objects to attaching morpheme-like meanings (e.g., vocative chant) to particular contours.

These analyses differ as fundamentally as the extraterrestrial analyses of English. The Martian who proposed analysis (iii) above could not discuss the meaning of in and on with the Venusian who proposed analysis (ii), since in the latter’s analysis there are no such segments; still less could he talk with the Jovian who proposed analysis (i), since it has no segments at all. In the same way, Leben’s analysis of the formal and functional characteristics of the vocative chant and the newspaper vendor’s chant cannot readily be evaluated in terms of Bolinger’s system, which has intonational segments of a very different sort, or Lieberman’s, which involves no



segments at all. Put simply, approaches to intonation diverge from one another in ways in which treatments of segmental phenomena do not.

This divergence cannot be attributed to differences between various schools or linguistic theories. That is, there is no particular ‘Bloomfieldian’ or ‘generative’ or ‘Praguian’ interpretation of suprasegmentals. In their work on segmental phonology and grammar, all these schools share the fundamental assumption of all-or-none segmentability, and any modification or suspension of this assumption when treating intonation thus transcends theoretical divisions. P. Lieberman (1967) and M. Liberman (1978) are both ‘generative’ treatments; but there are vast differences between them about which generative theory is simply silent. Similarly, Bloomfield (1933) discussed intonation contours as gestalts, comparable in many ways to Lieberman’s, while his followers elaborated the highly segmented analysis which is the forerunner of Liberman’s approach. This does not mean, of course, that different linguistic theories will not deal differently with suprasegmental phenomena, but it does suggest that we must agree on what the phenomena are before we deal with them.

In reviewing past work, then, my purpose is not simply to restate what has been done in the terms of those who did it, or to decide whose theoretical arguments demolished whose.¹ Rather it is a ‘pretheoretical’ attempt to get behind disagreements and discern common denominators, to express the views of various traditions of analysis in terms of fundamental agreements and well-defined differences. Hence the discussion in this chapter is organized around three widely shared assumptions:

the existence of sentence stress;
the existence of meaningful pitch contours;
the existence of degrees of syllable prominence.

However elementary these may seem, understanding them thoroughly may help us to produce a descriptive framework about which there can be general agreement and proceed from there to problems of phonological and grammatical analysis.

2. The Two Major Traditions of Analysis

It may be useful, nevertheless, to begin by illustrating the typical ‘American’ and ‘British’ analyses to which I refer extensively throughout the chapter and the whole book. This section presents brief synopses of Trager and Smith (1951) and Kingdon (1958) more or less without comment; the comparative review of the literature begins with the discussion of sentence stress in Section 3.

a. ‘American’: Trager and Smith

The system first proposed in Trager and Smith (1951) and modified in the 1957 version of that work was the most widely accepted analysis of intonation in the American post-Bloomfieldian period. Its principal goal was the extension of Bloomfieldian principles of phonemic analysis to suprasegmentals. It posits:

Four phonemes of stress, /´ ^ ` ˘/, with primary /´/ corresponding to sentence-stress or ‘nuclear stress’ in Chomsky and Halle’s system, and weak /˘/ corresponding to unstressed or 0. Stress is assumed to be manifested by loudness, each level being louder than the next lower level; stress and pitch are strictly separate elements of the system. One and only one primary stress occurs in each ‘phonemic clause’ (see below).

Four phonemes of pitch, /1 2 3 4/, with /1/ low and /4/ high. The distribution of /4/ is somewhat restricted, being used only for ‘emphasis’ or ‘contrast’ where /3/ would otherwise be used. A pitch phoneme ‘occurs’ at the beginning of an utterance, and the pitch continues at that level until another pitch phoneme occurs. Pitch phonemes are always marked at the beginning of an utterance, before the primary stress, and at the end before the terminal juncture, though they may be marked at other points as well.

Four phonemes of ‘juncture’, one internal /+/ and three terminal /# || |/. The ‘internal open juncture’ or ‘plus juncture’ (/+/)—which distinguishes, e.g., night rate from nitrate—was of course the subject of grand theoretical debates, including a debate over whether it was to be considered segmental or suprasegmental. The three terminal junctures are an integral part of the pitch contour system; any pitch contour ends with one of the three. These are roughly fall (/#/), rise (/||/), and level (/|/), though with endless complications and qualifications.

A pitch contour thus consists of at least two pitch phonemes and a terminal juncture, and the domain of a pitch contour is a ‘phonemic clause’. In longer utterances the audible pauses or breaks are generally marked by /|/, and each phonemic clause set off by such pauses contains one occurrence of primary stress. Examples follow:

[Examples (1) to (5), given in Trager-Smith notation with accompanying pitch tracings, are garbled in the scan. The example utterances are: (1) John; (2) Johnny; (3) Why are you going; (4) You did (‘emphatic’); (5) I didn’t, he said.]

b. ‘British’: Kingdon

The system presented in Kingdon (1958) is typical of the analyses developed by a long line of British scholars whose principal interest has been in developing an effective taxonomy and notation for teaching English intonation to foreigners. Among pitch contours, Kingdon identifies a number of ‘tones’,

two static: H (high level) and L (low level);

three kinetic:
I (rising), in two varieties IH (high-rising) and IL (low-rising);
II (falling) (also occurs high and low, but not meaningfully distinctive like IH and IL);
III (falling-rising), in two varieties IIIU (undivided) and IIID (divided, i.e., with the rise beginning at a secondary nucleus or stressed syllable);

and two complex, being modifications of II and III, respectively:
IV (rising-falling);
V (rising-falling-rising), which also comes in a divided version analogous to IIID.

Stress follows the IPA convention of fully stressed [ˈ], half-stressed [ˌ], and unstressed, including a notion of ‘emphatic’ stress [ˈˈ]. The nucleus or most prominent syllable of a tone occurs at a fully stressed syllable, i.e., the sentence stress in American descriptions. Audible boundaries are marked by long vertical lines.

While Kingdon, like most of the other writers in the British tradition, gives numbers to the contours he identifies, his most important contribution has been the development of ‘tonetic stress marks’, which are used to mark the tones in running text, and which have been used in one guise or another by most recent British writers. These marks are relatively iconic and easy to learn to read:

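[Examples (6) to (10), marked with tonetic stress marks and accompanying pitch tracings, are garbled in the scan. Recoverable items: (6) John ((high) fall); (7) Marie; (9) Melinda (fall-rise); (10) John (fall, emphatic).]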


In connected text, some half-stresses are marked, though the system for notation of half-stresses varies more from author to author; typical examples would be:

[Examples (11) and (12) are garbled in the scan. Recoverable fragments: (11) Why are you ...; (12) can it really be ...]

3. Sentence Stress

Our comparative review might best begin with sentence stress, if for no other reason than that it is a good illustration of the precept that we can make progress once we agree on what the phenomena are. There are obviously plenty of different ways of looking at this problem which are largely the result of theoretical differences: writers as diverse as Bresnan, Bolinger, Schmerling, Daneš, and Halliday have all studied the grammar and function of sentence stress from markedly different points of view and have accordingly come up with different observations and interpretations. But it is important to note that there is also a shared assumption here, which is prerequisite to the theoretical discord: the assumption that sentence stress exists. This apparently trivial consensus is important. Notwithstanding theoretical disagreements, we have learned a great deal about the role of sentence stress in signalling discourse connections, theme-rheme relations, and the like; but whatever understanding we have of the role of sentence stress depends on our agreement on its existence. It is this sort of shared assumption that we shall be looking for in what follows.

Concealed in the agreement on the existence of sentence stress is an ambivalence about its nature. In the typical American analysis, sentence stress is considered to be another ‘level of stress’, and stress and pitch are taken to be independent elements of the suprasegmental system. American analysts have accordingly been troubled by the fact that what is perceived as sentence stress often coincides with the greatest pitch prominence of the intonation contour. Trager and Smith’s original analysis (1951) made no mention of this connection, but Hockett (1958) proposed revising the system so that the difference between primary and secondary stress was seen as allophonic, conditioned by the occurrence of what Hockett called primary stress at the intonation center. Evidently uneasy with such fraternization between stress and pitch, though, Hockett took pains to define his terms so that the two concepts remained independent: ‘intonation center’ is defined as the beginning point of the next-to-last pitch level phoneme in the contour. But the unworkability of such a definition was quickly shown by Sledd (1956);² and Trager’s cautious recodification (1964, but presumably written in 1960 or 1961) of the original analysis retains the four stress phonemes and simply notes that one of the ‘pitch positions’ of the intonation contour occurs at the primary stress (see Trager 1964 for details).
Meanwhile, Stockwell (1960) and Chomsky, Halle, and Lukoff (1956) set the precedent for the adoption of the Trager-Smith suprasegmental analysis by generative grammarians. Chomsky and Halle (1968), like Trager and Smith, call sentence stress a separate level of stress (1-stress, or nuclear stress), though they say virtually nothing about pitch. M. Liberman (1978) gives the first really extensive treatment of English pitch phenomena in a generative context; he notes the connection between pitch changes and ‘strong’ syllables, but for him as for the rest of the Trager-Smith-Chomsky-Halle tradition, stress and pitch are independent phenomena.

In the British tradition, sentence stress is commonly called the ‘nucleus’.³ The nucleus is considered an intonational phenomenon which has nothing to do with stress at all. It simply occurs at one of the fully stressed syllables of the sentence—one of those syllables which in the Trager-Smith system would have primary or secondary stress. That is, the nuclearity of the syllable is in no sense felt to contribute to its degree of stress: it is considered stressed on independent grounds, and is additionally seen as the location of the nucleus. With some terminological variation, this notion of sentence stress is found in the British tradition as far back as Palmer (1922) and has been maintained right down to Crystal’s recent works. It is worth noting that the nucleus (under the name of ‘tonic’) shows up in Halliday’s writing with exactly the same relationship to stress, even though Halliday’s notion of stress is very different from that of most other writers (see below, Sec. 5a).

Not surprisingly, proponents of the American position have always objected that the British analysis “confuses stress and pitch” (the clearest statement in Smith 1955). Yet in their own terms, the British make just as sharp a division between stress and intonation as do the Americans; they simply draw the line in a different place. For the Americans, sentence stress is primarily a stress phenomenon which is often associated with a pitch change, while for the British it is an intonational phenomenon which occurs at a stressed syllable. The disagreement between the two traditions concerns the nature of sentence stress, not the separability of stress and intonation.

But even if the disagreement were put this way rather than in terms of “confusing stress and pitch,” traditional notions of ‘stress’ would be of no help in resolving the problem. Since both the British and the Trager-Smith school accepted the notion of levels of stress as systematically different degrees of loudness, the Americans could argue for their analysis simply on the basis that sentence stress is “louder” than other stressed syllables. On the other hand, as early as Coleman (1914), there were suggestions that at least some instances of perceived loudness (Coleman spoke of ‘emphasis’) are correlated not so much with physical intensity as with pitch change, and the British could argue that sentence stress is perceived to be “louder” simply because of the pitch change associated with the nucleus.

At this point, then, it might seem logical to turn to the work of a third, rather different, tradition, that of the phoneticians who have long been concerned with determining the acoustic and physiological correlates of stress (e.g., Stetson 1951, Twaddell 1953, Bolinger 1955, Fry 1955, 1958, Mol and Uhlenbeck 1956; for reviews of this literature see Lieberman 1967, Lehiste 1970, Léon and Martin 1970). Unfortunately, however, their findings only add to the confusion, for they seem to indicate that stress and pitch are indeed quite intertwined and that the debate between the British and American traditions is without any empirical basis. By the mid-fifties the consensus was emerging from phonetic research that the acoustic correlate of perceived stress is not physical intensity—‘loudness’—but a complex interaction of pitch obtrusion,⁴ syllable duration, intensity, and perhaps other factors as well, with pitch obtrusion apparently the most significant.

It was in an effort to integrate this growing body of experimental evidence into linguistic description of suprasegmentals that Bolinger (1958a) proposed his notion of accent. Bolinger defines ‘accent’ as syllable prominence signalled by pitch obtrusion or pitch change; he treats ‘stress’ as a lexical abstraction, a potential for accent. His analysis, however, is hardly a resolution of the differences between the British and American approaches. While it was aimed primarily at clarifying the traditional notion of stress, his theory also affects traditional intonation by dividing pitch phenomena into ‘accent’ and ‘intonation’. By incorporating into his notion of accent aspects of both traditional stress and traditional intonation, Bolinger calls into question the clear division between the two, which is the basis of the disagreement between the British and the Americans. Sentence stress, for Bolinger, is neither stress nor intonation, but accent. In this light we can appreciate the significance of the common ground between the typical British and American analyses, and the radical nature of Bolinger’s approach. We will return in the next chapter to evaluate the accent concept, but not until we have discussed more generally the place of stress and intonation in the British and American traditions and the accent analyses.

4. Intonation

a. Levels vs. Configurations

One of the better-known controversies in the study of intonation is whether pitch phenomena are to be analyzed linguistically in terms of ‘levels’ or ‘configurations’ (the terms are from Bolinger 1951). The American structuralist tradition of intonation analysis, beginning with Pike (1945) and Wells (1945), more or less canonized in Trager and Smith (1951), and revived in somewhat different form by Liberman (1978) and in recent autosegmental work (e.g., Leben 1976, Goldsmith 1976), divides the speaker's pitch range into four relative phonemic pitch levels (three in the autosegmental work) and describes contours as sequences of pitch-level phonemes. It is true that Trager and Smith also posit three ‘terminal junctures’, characterized by pitch movement (roughly rise, level, and fall), but these are, in effect, merely a by-product of the level analysis, a way of avoiding a proliferation of phonemic levels. (In this connection, we should note that Liberman, in order to be able to describe all pitch movements as sequences of levels, develops the notion of ‘boundary tone’ [Liberman’s ‘tone’ = Trager-Smith’s ‘pitch level’], an underlying phoneme manifested phonetically only by pitch movement at an intonational boundary.)

The British tradition of intonation description, by contrast, has always taken pitch contours to be unitary. The phoneticians who were the forerunners of the tradition, notably Sweet (1892) and Jones (1909), spoke in terms of ‘intonation curves’ (the title of Jones 1909), and one of the early linguistic analyses that followed Jones’s work, Armstrong and Ward (1926), set the precedent for considering whole-sentence ‘tunes’ to be functional units. (Armstrong and Ward posited two such tunes.) Since the mid-thirties, following Palmer (1922), it has become usual to divide the ‘tune’ into at least two parts: the part preceding the sentence stress, usually called the ‘head’, and the remainder, consisting of the ‘nucleus’ (the syllable with sentence stress) and optionally a ‘tail’ (any syllables after the sentence stress). Some modern treatments (notably O’Connor and Arnold 1961) recall the early emphasis of the British tradition by taking considerable note of the function of the different ‘tunes’ that result from various combinations of head and nucleus, while others do not. But in either case the ‘nuclear tones’ (the various pitch contours that, roughly speaking, begin with the nucleus and continue to the end of the sentence) are seen as fundamental elements of the intonational system. These tones are described as contours like falling, falling-rising, low-rising, etc. Though most such analyses make distinctions like low-rising vs. high-rising, the idea of phonemic level is not found.

The two views have coexisted for quite some time without serious debate. During the post-Bloomfieldian heyday of the 1950s, Smith’s review (1955) of Jassem (1952) was one of the few salvos fired eastward across the Atlantic; it is a succinct statement of the Trager-Smith view that the British descriptions commit the unscientific sins of confusing stress and pitch and of allowing meaning as a criterion in analysis. The British, for their part, have acknowledged the American treatments, but have remained aloof from the debate and gone on as before. They seem to have held Pike in considerable awe, however, and show a curious willingness to



believe without any evidence or explanation that the pitch-level scheme may be more suited to American English than to British.* Crystal, however, is not impressed, and regards the general theoretical arguments against the level analyses, especially Bolinger’s, as never having been answered (1969a:196-201). In any case, the specific idea that British and American English have totally different intonation systems demands close scrutiny. In fact, Pike’s notation has enjoyed a certain amount of use by writers on British intonation (e.g., Wode 1966, Pilch 1970), and systems much like the British analyses have been used for American English by, e.g., Jackendoff (1972) and Gunter (1972). In the absence of any clear evidence to the contrary, Bolinger’s conclusion that the two dialects differ little is surely to be preferred to the idea that the British talk in contours and the Americans in levels.”

b. The Intonational Lexicon

Various American investigators have suggested that the level view and the configuration view are not as incompatible as they might first appear. Sledd (1955) claimed that “the contour analyses . . . include the concept of levels” and that “Bolinger’s antithesis between levels and configurations is ultimately false.” He continued:

The necessity of levels appears whenever the contourist introduces the terms high and low into his vocabulary, as he regularly has done in the past and presumably must continue to do in the future. To some extent, a geometrical analogy is justified. If two points determine a line, the occurrence of two pitch phonemes determines a sustention, a rise, or a fall. [328-329]
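Sledd's geometrical analogy can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the function name `movement_between` is hypothetical, and pitch levels are encoded as integers with larger numbers meaning higher pitch. It shows only that an ordered pair of level phonemes fixes one of the three movements Sledd names.

```python
def movement_between(first_level: int, second_level: int) -> str:
    """Return the movement fixed by two successive pitch levels.

    Levels are integers; larger means higher (an assumed encoding).
    """
    if second_level > first_level:
        return "rise"
    if second_level < first_level:
        return "fall"
    return "sustention"


# Two points determine a line: each pair of levels yields one movement.
for pair in [(2, 3), (3, 1), (2, 2)]:
    print(pair, "->", movement_between(*pair))
```

The sketch says nothing about which movements are meaningful; it only relates two levels to a movement, which is the part of the question Sledd's analogy addresses.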

Sledd argues, in other words, that there is no issue. His compromise view says: Everyone agrees that there are meaningful contours at some level of analysis, and at another level of analysis these contours can be broken down and seen as sequences of discrete pitches. It seems to me rather that there are two issues: (1) What are the meaningful contours? and (2) What is their phonological nature? Separating these two issues is prerequisite to unravelling the confusions that abounded during the fifties, and the point is worth discussing at some length. Implicit in most analyses of intonation is some sort of intonational lexicon, by which I mean no more than an inventory of meaningful contours that are in contrast with one another.’ For example, Trager and Smith viewed contours (like, say, /* *1#/) as intonational morphemes—






theoretically, that is, as lexical elements like any segmental morpheme. Bolinger’s original accent paper (1958a)° contains a section (51-54) entitled “The Accents as Morphemes,” in which he discusses the general meaning or function of each of the accents. The British treatments set forth an inventory of contours (tunes, tones, tone groups, or whatever) and then discuss the nuances produced by each such contour in a variety of contexts. Pike’s treatment is the most explicit in positing an ‘intonational lexicon’: each contour contributes an ‘intonational meaning’ which is superimposed on the ‘lexical meaning’ of the segmental words with which it is used. (Pike’s view is discussed further in Chapter 6.)

But because the lexical inventory implicit in these analyses has not always been recognized for what it is, the debate over levels-vs.-configurations has often been conducted at cross-purposes. Trager and Smith and their followers never expressly confronted the question of lexical segmentation, but assumed they were doing primarily phonological analysis. As a result, they were stuck with the secondary implication that any sequence of pitch phonemes that occurs is by definition meaningful and contrastive. It was this feature of their system, not the notion of pitch levels per se, that drew the heaviest fire from their American critics.® Bolinger, Householder, and others repeatedly pointed out that there are many contours which the Trager-Smith analysis considers phonemically different, which nevertheless do not appear to contrast in the way they are actually used and responded to by native speakers. This is perhaps best put by Gunter (1972:199-200):

[The] representation deals in discrete elements of pitch and juncture. These elements are ‘phonemes’, with all the dogma and doctrine that the word implies. Thus the implication is present that each intonation is absolutely different from every other. For example, /41↓/ and /31↓/ are just as different from each other in signalling power as either is from, say, /33↑/ or /32↑/. But the behavior of these intonations in dialog is distinctly against this implication, for within [certain] sets all the intonations behave alike.

This fact should not be surprising, for all of the members of a given set closely resemble each other in that they share a gross shape: The members of [one] are grossly falling; those of [another] are grossly high-rising; those of [a third] are grossly falling-rising . . . .

Thus each set of intonations can be regarded as a contour with a recognizable shape, and each member of a set can be regarded as a variant of that contour. In a given dialog, moreover, all of the variants within a contour signal exactly the same relevance, as in the following:

Context: Who is in the house?
Response: 3 JOHN 1↓
(Relevance: ‘Answer to information question’)

This relevance remains intact with any variant of the falling contour, whether /41↓/, /31↓/, or /21↓/. To be sure, each of these variants may seem to have its own flavor in this dialog, but that flavor is emotional or expressive. . . . What is important about these falling variants is that they all have the same gross shape. All signal the same relevance here; they all answer the question.

Householder made the same point: that the Trager-Smith analysis fails to identify the meaningful contours before moving on to phonological analysis: Smith and Trager . . . are led to their elaborately complicated system largely by their choice of units, by some principle of establishing phonemicity which I do not yet fully understand, and by the well-known bugaboo, ‘once a phoneme always a phoneme’. [We should] postpone our choice of units until after we have established our grammatical contrasts (instead of assuming some kind of validity for the unitary nature of the marks used in phonetic transcription). . . . [Householder


Unlike the Trager-Smith system, the British analyses do establish a lexical inventory, and then concentrate on the grammatical and semantic characteristics of the meaningful contours they identify. Now, these contours are treated as phonologically unitary, but only by implication, for the British simply do not attempt a phonological analysis. Unlike Sledd, the British identify the meaningful contours at one (i.e., lexical) level of analysis, but do not attempt to break them down into sequences of discrete pitches at another (i.e., phonological) level. This is the point the Americans missed: they took the British ‘configurations’ as primarily phonological and argued against them on that basis. If they had understood the real emphasis of the British system, they might have seen that it answers the objections of critics like Gunter and Householder: it identifies the meaningful distinctions first. Accepting the British lexical inventory, they could then have gone on to phonological analysis.

This is essentially what M. Liberman does in his dissertation (1978): he integrates the British lexical taxonomy into an American-style pitch-level phonological analysis. Liberman first identifies certain functionally distinctive contours (e.g., ‘contradiction contour’, ‘surprise/redundancy tune’; see also Liberman and Sag 1974 and Sag and Liberman 1975), which he sees as ‘intonational words’ in an intonational lexicon. In establishing this lexical taxonomy he draws heavily on O’Connor and Arnold (1961) and Crystal (1969a). He then goes on to analyze these contours phonologically as sequences of ‘static tones’ (i.e., pitch phonemes); here he acknowledges his place in the Trager-Smith tradition. He also posits two distinctive features [High] and [Low], which define four phonemes: H (High) (= [+High −Low]); HM (High-Mid) (= [+High +Low]); LM (Low-Mid) (= [−High −Low]); and L (Low) (= [−High +Low]). These four pitches are deployed not like the traditional Trager-Smith pitches, i.e., with the highest used only for ‘overhigh’ pitch, but rather with all four playing a role in representing ordinary contours.

However, it should be noted that Liberman is able to avoid criticisms like those that were directed at Trager and Smith partly because he is not bound by the once-a-phoneme-always-a-phoneme principle. He defines his contours not strictly as sequences of whole phonemes, but as sequences of segments with features sometimes left unspecified. The specification of these features in an actual utterance results in a ‘modulation’ of the meaning. For example, the surprise/redundancy tune, as in

(13) The blackboard's painted orange

is defined phonologically as a sequence of [−High]



These segments, unspecified for [Low], can each be realized in two different ways, giving a total of six possible realizations of the contour. Lumping a number of different ‘phonemic’ sequences under the same ‘morphemic’ rubric this way would have been unthinkable in the days of Trager and Smith.

Here, in any case, is the point of the fantasy about the extraterrestrial linguists: our first task in analyzing intonation must be to identify the inventory of meaningful elements. Phonological and grammatical analysis must follow lexical segmentation. The real levels-vs.-configurations argument does not pit the British analyses against the American ones, but assesses the arguments for a phonological analysis into levels once the lexical analysis has been made. In Section 6 of this chapter, I will argue (like Liberman) for the acceptance of an essentially British lexical inventory; in Chapter 8 I will return to the phonological question and argue against a level analysis. The point here has been to separate the issues, to show that Sledd’s compromise view was not a solution, but only a statement of the problem.

c. Tunes vs. Tones

Earlier I noted that since the mid-thirties it has been usual in the British tradition to divide the ‘tune’ into at least two parts, and that analyses vary according to whether they consider the nuclear tones or the whole-sentence tunes to be more significant lexical elements. Assuming for the moment, then, that an integration (such as Liberman’s) of the British lexical inventory and the American phonological analysis is desirable (this is the view I will return to challenge in Chapter 8), there is still the question of which British lexical segments to take into account. Liberman, following O’Connor and Arnold’s lead, considers the tunes to be most significant, likening the head and nuclear tone to bound morphemes in words like interdict (88ff); his ‘intonational lexicon’ consists of tunes. In my chapters on intonation, I will take the opposite position, namely that we can profitably look at the meanings of tones and consider tunes to be compound. For example, Liberman and Sag (1974) make a point of considering their ‘contradiction contour’ holistic:

(14) [contradiction contour example; pitch diagram not reproduced]








I would suggest rather that it is a compound of a high-falling head and a low-rising nuclear tone.

However, this difference is much less serious than the disagreement about levels and configurations, for even those analyses (like O’Connor and Arnold) whose lexical emphasis is on the meaning and function of tunes nevertheless assume some sort of structural division between head and nucleus. Liberman, too, with his ‘bound morpheme’ analogy, implicitly acknowledges some internal structure in the tune. Indeed, the division between head and nucleus has been noted by investigators outside the British tradition: Hockett (1958) proposes the terms pendant and head, corresponding to British prehead + head (= pendant) and nucleus + tail (= head). In other words, the notion that intonation contours may be divided into a part preceding the sentence stress and a part including and following it is not only a British idea, but is compatible with the American pitch level analysis as well. In most of what follows, I will assume that the ‘anatomy of an intonation contour’ summarized in Figure 1 is well established.10

The main point of the ‘tune-tone controversy’, then, is not whether tunes are composed of smaller parts, but whether the smaller parts are semantically relevant. But even this is largely a matter of emphasis. Liberman’s contention that the most significant configurations are whole tunes does not deny the possibility that the nuclear tones also have some relevance, but merely claims that it is not especially productive to focus on





[Figure 1 diagram: aligned terms for the parts of an intonation contour as used by Kingdon (1958), Schubiger (1958), O’Connor and Arnold (1961), Crystal (1969a), Ladd (1980), Chao, Palmer, Hockett (1955), Hockett (1958), and Pike (1945). Terms shown: Prehead, Head, Main Head, Anacrusis, Body, Nucleus, Tail, Remainder, Pendant, Precontour, Intonation Head, Primary Contour.]

Figure 1. Anatomy of an Intonation Contour. Division between head and nucleus (shown by vertical double line) is assumed by all writers. As well, there is considerable usefulness in separating off the prehead (any unstressed syllables before the first major stressed syllable) and the tail (any unstressed syllables after the nucleus). However, there also seems to be a need for terms covering the range of Hockett’s or Pike’s terms or of Chao’s ‘head’ and ‘body’. When I have needed such cover terms, I have simply extended the use of ‘head’ and ‘nucleus’, but it might be appropriate to coin new terms.

the tones. Similarly, my concentration on the tones does not preclude the possibility that certain compound tunes (like the ‘contradiction contour’) have idiosyncratic uses. The two views are not mutually exclusive.

d. Accent Analyses

If the disagreements over levels vs. configurations and tunes vs. tones were the extent of the differences of opinion over the linguistic organization of pitch contours, we could conclude our review right here and move on to the question of stress. However, the accent analyses (Bolinger’s and Vanderslice and Ladefoged’s) present a very different picture, one in which the tune-tone controversy does not even emerge. Since we have no way of knowing if the tune-tone controversy is even the right question to ask, we must consider the answers that the accent analyses get by asking the question in a quite different way. Though Bolinger’s analysis was developed ten or fifteen years earlier, it will be simpler for exposition to begin by introducing the accent concept in Vanderslice and Ladefoged’s terms (1972, based on Vanderslice 1968).


Accent, in their view, is a binary feature [accent] manifested by pitch obtrusion, i.e., deviation from a relatively constant pitch line. The deviation may be either up or down (i.e., to a higher or lower fundamental frequency), though it is more commonly up. (Vanderslice and Ladefoged posit an added feature [Dip], the term borrowed from Malone 1926, to describe downward obtrusion.) Pitch movements other than those which define accents are ascribed to intonation; specifically, they posit two binary features [Cadence] and [Endglide] (roughly, falling and rising terminal, respectively), which characterize the pitch movement from the last accented syllable to the end of the sentence as either falling [+cadence −endglide], rising [−cadence +endglide], or falling-rising [+cadence +endglide]. Examples follow.

(15) What are you doing (do- is [+cadence −endglide])

(16) Did he fin[...] (fin- is [−cadence +endglide])

(17) [...] (think is [+cadence +endglide])

(18) What do you want (want is [+cadence −endglide +dip])

(19) What's the matter (mat- is [−cadence +endglide +dip])
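The feature combinations in examples (15)-(19) can be tabulated in a short sketch. The class name `Terminal` and the function names are assumptions made for illustration, not Vanderslice and Ladefoged's own formalism. The passage names only three of the four possible cadence/endglide combinations, so the fourth is left undefined here rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Terminal:
    """Feature bundle on the last accented syllable (names assumed here)."""
    cadence: bool
    endglide: bool
    dip: bool = False


def terminal_shape(t: Terminal) -> Optional[str]:
    """Map [cadence, endglide] to the three shapes named in the text."""
    if t.cadence and t.endglide:
        return "falling-rising"
    if t.cadence:
        return "falling"
    if t.endglide:
        return "rising"
    return None  # [-cadence -endglide] is not described in the passage


def obtrusion(t: Terminal) -> str:
    """[+dip] marks downward obtrusion; otherwise the obtrusion is upward."""
    return "down" if t.dip else "up"


# Feature values as given for the accented syllables in examples (15)-(19).
examples = {
    "do-": Terminal(cadence=True, endglide=False),
    "fin-": Terminal(cadence=False, endglide=True),
    "think": Terminal(cadence=True, endglide=True),
    "want": Terminal(cadence=True, endglide=False, dip=True),
    "mat-": Terminal(cadence=False, endglide=True, dip=True),
}

for syllable, features in examples.items():
    print(syllable, terminal_shape(features), obtrusion(features))
```

Keeping [dip] separate from the two terminal features mirrors the division in the text between accent (obtrusion) and intonation (movement after the last accent).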

Bolinger likewise defines accent in terms of pitch obtrusion, but unlike Vanderslice and Ladefoged he does not posit a single all-or-none accent. Rather, he describes three different accents, which differ in the type of pitch movement used to render the accented syllable prominent. Accent A is characterized by a marked drop in pitch during or immediately after the prominent syllable; Accent B is characterized by a marked rise in pitch, either (i) during or immediately after the prominent syllable, or (ii) from the preceding syllable to the prominent one. Accent C is characterized by a drop in pitch from the preceding syllable to the prominent one. Bolinger’s diagrams of the three accents are shown in Figure 2. Bolinger thus includes in accent some of what Vanderslice and Ladefoged assign to intonation: his Accent C corresponds to their [+accent −cadence +endglide +dip]; his A, at least at the end of a sentence, cor-






Accent A

A relative leveling off of the accentable syllable followed by a relatively abrupt drop, either within the accentable syllable (which is prolonged for the purpose) or in the immediately following syllable. In very rapid speech the drop may be postponed to the second following syllable, but rarely beyond this. . . . [One subtype] puts the accentable syllable at a lower pitch than the one immediately following, but requires that only that one weak syllable remain high—the syllable after it must come down rapidly. [N.B.: This subtype is equivalent to ‘scoop’; see Chapter 2, Section 1.] The least common denominator in all A’s is the abrupt fall rarely more than two syllables after the accentable syllable.










In (8) there is a low-rise nucleus on incurable, and the pitch in the head rises and falls without regard to word-stress in elephantiasis. But in (9) there is a fall-rise tone with its nucleus on elephantiasis, and the pitch prominence is located on the stressed syllable -ti-. The semantic difference becomes clear in these examples, too. In (8) we do have something like a holistic ‘contradiction’ or questioning of speaker A’s assumptions (‘Whaddya mean you're going to die? Elephantiasis isn’t incurable.’). On the other hand (9) narrowly focuses on elephantiasis, and says that it, at least, is not incurable. (Since the discourse seems only to provide a choice between elephantiasis and rabies, the implication is hard to escape that rabies is incurable.) Switching responses in these contexts, we get a change:

(10) A: I just [...] I'm going to die of elephantiasis.
     B: (fall-rise tone) Elephantiasis isn't incurable.

(11) A: I'm doomed--the doctor just [...] I either [...] elephantiasis or rabies.
     B: (contradiction contour) Elephantiasis isn't incurable.



In (10) the speaker is again focusing on elephantiasis, and the force of the reply is something like ‘I know that elephantiasis, for one, is not incurable, though I’m not saying you don’t actually have something else that is.’ (This response sounds odd to some informants, but given the proper speaker assumptions it would be perfectly normal: suppose B knows A has been having bizarre symptoms and has gone to the doctor to find out what is ailing him.) In (11), on the other hand, we could have a comedy routine, for there is a mismatch between segmentals and intonation. The intonation implies that the reply is a relevant contradiction, yet the segmentals contradict only half of A’s sad report, leaving us to wonder just what B thinks of the other half. From the TV show “All in the Family”:

(12) Gloria [who is pregnant and is discussing whether her husband is having an affair]: Why wouldn't he be running [...] Look at me--I'm fat and ugly.
     Mother: Aw, [...] you're [...] ug[...] [awkward [...] on Gloria's [...]]

The difference between the two intonations also emerges in a simple syntactic test. As Liberman and Sag note, the contradiction contour is unembeddable:

(13) *It's been demonstrated by medical [...] elephantiasis isn't incurable.

But the fall-rise tone can be embedded freely:

(14) A: They've figured out that I either have rabies or [...]
     B: Well, [...] by medical [...] elephantiasis [...]


If the fall-rise tone were simply contrast overlaid on the contradiction contour, then (14) should be impossible too. The behavior of the fall-rise and contradiction contour relative to the segmental syntax is thus incorrectly predicted by the Liberman-Sag taxonomy, whereas in terms of the nuclear-tone analysis it can be simply explained. The ‘beginning’ of the fall-rise contour (the high-pitched part) depends on the location of the nucleus, and can thus go anywhere the nucleus can go. The ‘beginning’ of the contradiction contour, on the other hand, is the beginning of the head; since the head is defined roughly as the pitch contour on that part of the sentence that precedes the nucleus, then almost by definition we will not expect to find it starting anywhere but at the beginning of the sentence.6

Finally, there are cases of actual ambiguity between the fall-rise and the contradiction contour, a sure clue that we are dealing with two linguistically separate categories. This is seen in an example we used in Chapter 2:

(15) John's not in Boston

This can be interpreted either as the contradiction contour (‘John’s not in BOSTON. What are you talking about; he’s right in the next room watching the tube.’) or as the fall-rise with nucleus on John (‘JOHN’s not in Boston; it was Henry’s turn to go this time.’). Since the nucleus of the fall-rise is on the monosyllable John rather than on a long polysyllabic word like elephantiasis, the distinction between the two contours may be neutralized, which helps explain Liberman and Sag’s confusion of the two. But the semantic distinction remains sharp despite the phonetic identity.

All the evidence just presented shows that the contradiction contour is to be kept separate from the fall-rise tone with nucleus early in the sentence. This very point has been treated before in the literature, and my position seems to be supported by Lee (1956b) and Schubiger (1958: 26n). The remainder of this chapter assumes this distinction, and makes no further mention of the contradiction contour.7

4. A Semantic Analysis of Fall-Rise

a. Fall-Rise in Single-Nucleus Sentences

The use of the fall-rise tone is exemplified in the following dialogues:8

(16) A: Did you feed the animals?
     B: I fed the ˇcat.

(17) A: What would you [...] of getting a dog?
     B: A ˇcat, maybe.

(18) A: You [...] a VW, [...]
     B: [...] got an ˇOpel.

(19) A: Do you want a glass of water?
     B: I'll have a ˇbeer.

(20) A: Harry's the biggest fool in the State of New York.
     B: In ˇIthaca, [...]

(21) A: My roommate and I are always arguing about food costs because she buys a lot of junk food.
     B: ˇThat's one hassle I don't have with [...]

In all of these sentences there is a narrow focus, but the fall-rise tone adds the information that the variable of the focus presupposition is a member of a set which is in the context. The meaning of fall-rise is thus something like focus within a given set. It picks something out of a set of possibilities and focuses on it, but it specifically notes the connection of the set of possibilities to the context. In the following paragraphs this analysis is applied to the examples just given. The first is a straightforward illustration of the meaning of fall-rise as we have defined it:

(16) A: Did you feed the animals?
     B: I fed the ˇcat.



Speaker B clearly implies that he didn’t feed the dog (or whatever): ‘Of the group that you asked about, yes, I did feed the following member of that group.’ Like the fall, the fall-rise signals a focus: speaker B presupposes ‘I fed something’. But a fall focus is simply new information (rhematic, unpredictable), whereas the fall-rise focus also signals a connection to the context. I fed the ˇcat thus means ‘I fed something [focus presupposition] from a set of things in the context [fall-rise nuance] and it was the cat [assertion].’
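The three-part gloss just given (focus presupposition, fall-rise nuance, assertion) can be laid out as a minimal sketch. The function name `fall_rise_reading`, the `{x}` sentence frame, and the two-member set of animals are all assumptions made for illustration; the text itself mentions only the cat and ‘the dog (or whatever)’.

```python
from typing import Dict, Iterable, List, Union


def fall_rise_reading(
    frame: str, focus: str, context_set: Iterable[str]
) -> Dict[str, Union[str, List[str]]]:
    """Spell out 'focus within a given set' for one utterance.

    frame       -- sentence frame with an {x} slot, e.g. "I fed {x}"
    focus       -- the item carrying the fall-rise nucleus
    context_set -- the set of possibilities supplied by the context
    """
    members = set(context_set)
    if focus not in members:
        # The fall-rise nuance requires the focus to belong to the given set.
        raise ValueError("focus must be a member of the contextual set")
    return {
        "presupposition": frame.format(x="something"),
        "set_nuance": "from the set: " + ", ".join(sorted(members)),
        "assertion": frame.format(x=focus),
        # Members the speaker has pointedly not asserted anything about.
        "left_open": sorted(members - {focus}),
    }


reading = fall_rise_reading("I fed {x}", "the cat", ["the cat", "the dog"])
print(reading["assertion"])
print(reading["left_open"])
```

The `left_open` entry is where the implication about the dog comes from: it is not asserted, only made salient by the reference to the set.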

The hierarchy implied in I fed the ˇcat is like this: [diagram not reproduced; one label reads ‘from A's utterance’]

(42) A: It wasn't so bad meeting [...] didn't think anyone would talk to you, but it seemed like Beveldown and Wandervogel were being pretty friendly.
     B: Well, ˇmost of them wouldn't talk to me. (most < all)

Finally, our other classic fall-rise examples:

(3) [...] drink because [...] [pitch diagrams for (3) and (4) not reproduced]

show that the effects of fall-rise on scope of negation are not merely a function of quantifiers or even of particular quantifiers, but can be obtained any time the meaning of the fall-rise, the negative, and the focused item fit together in an appropriate way. In (3) because he’s unhappy is focused on as one reason out of a set of possible reasons; the combination of the negative, the focus on one reason, and the reference to a set of other reasons causes us to interpret the negative as associated with the focused reason, and we infer that John drinks, but not because he’s unhappy. In (4), on the other hand, there is no reference to other reasons and we have no cause to interpret the scope of the negative as being outside its clause. That is, our analysis of fall-rise puts the association of negation and focus in a different perspective. The semantic effects discussed by Jackendoff are real enough, but they are secondary ‘pragmatic’ effects. The primary message of fall-rise is focus within a given set; the logical relation of the negative and the focus is not part of the ‘deep structure’, but only the result of such lines of inference as “all can’t be a subset so it must mean not all.”

d. Conclusion

The analysis of fall-rise presented here shows the potential value of the intonational lexicon hypothesis. By positing a single abstract general meaning of ‘focus within a given set’, we have explained a wide variety of semantic effects of the fall-rise tone, including narrowly grammatical uses (as in ˇAll the men didn’t go or John doesn’t drink because he’s unhappy), broadly attitudinal effects (the polite feel of I've got an ˇOpel), and nuances that lie somewhere in between (the different effects with comparatives and superlatives). This shows that there can be no sharp line drawn between grammatical and attitudinal uses in our analysis of intonational form; that is, we cannot say, like Lieberman (1967), that such-and-such a formal distinction signals grammatical meaning, while some other distinction is purely emotional.

Moreover, the fall-rise analysis exemplifies again the pitfalls of ‘syntactic determinism’ discussed at the end of Chapter 4. Jackendoff assumes that the logical relation of negation to quantifiers is specified in ‘deep structure’, and that deep structure configurations “trigger” one intonation contour or another. Because of this assumption, Jackendoff must construct an elaborate logico-syntactic device to deal with the fact that in many cases the distinction is irrelevant, or even worse, unclear: scope of negation is not always specified. What is specified is the message of focus within a given set, and it is inferences based on this meaning that give us clear scope differences in certain cases. This suggests that we should approach the complexity of semantics not by assuming that all sorts of relations are specified in deep structure and determine various surface phenomena, but by examining what is specified on the surface, in its own terms, and then seeing how speakers use that information to make inferences about deeper connections.

5. Intonation and Phrasing

The reader may have noticed the absence, in the discussion so far, of any reference to what might be called phrasing cues: the role of intonation in signalling the organization of utterances into phrases, sentences, parentheses, and the like. While the lexical analysis just illustrated does give detailed accounts of certain intonational effects, I believe it is futile to analyze the phrasing cues in the same way. Obviously, these cues are an important part of intonation’s grammatical role, but I believe they are systematized quite differently from the ‘lexical’ contours so far discussed. My purpose in this section is not to present a detailed treatment of phrasing (the subject certainly merits a book of its own) but simply to show how such a treatment must be integrated into the framework developed so far, and to provide evidence for considering the two intonational functions to be formally distinct.

a. Pitch Contours and Boundaries in American Work

Most American investigators of intonation have assumed that phrasing is partly a function of pitch, and in their analyses they do not separate the uses of pitch in phrasing from the specification of meaningful pitch contours. For example, Lieberman’s marked and unmarked breath group depend on the assumption that phrasing is one of the functions of contours: specifically, Lieberman claims that the rising pitch which distinguishes You’re coming? from You're coming represents a linguistic unit [+BG] (marked breath group), the placement of which also allows us to distinguish [I'll move on] [Saturday] from [I'll move] [on Saturday]. In the Trager-Smith analysis, ‘terminal junctures’, which are primarily seen as cues to phrasing, are defined in terms of pitch movement, tempo, and volume, and the specification of any pitch contour has to include a terminal juncture at the end. Thus while /||/ might be seen as a signal of a phrasing break in 2Are you *finished? || *he *asked*||, it is also considered to be an integral part of the intonation contour /? **||/ on the first half of the sentence. M. Liberman’s ‘boundary tones’ apparently play a similar role.

In the clearest statement known to me of the general point of view that pitch and phrasing are intimately related, Gardiner (1977) states that intonation has the primary function of segmenting the stream of speech into separate phrases, signalling to what extent the phrases are related to one another and what element within the phrase is the center of attention. [4]

The morphemes (not phonemes) of intonation are pitch levels not unlike the Trager-Smith levels (though Gardiner sees the levels not as purely relative but as organized into musical intervals). These morphemes have meanings like emphasis, finality, and non-finality; contours are seen as sequences of pitch morphemes:

There is no question that characteristic configurations of pitch do exist in each language. This paper tries to make a case for the individual pitches of the configuration as independently meaningful; in this view the intonation gestalt is a syntactic sequence with an internally logical



Gardiner adds (personal communication) that this view “does not explain the conventionalized types” such as the contradiction contour or the stylized fall (see Chapter 8), which he sees as “analogous to idiomatic phrases.”

b. Boundary Phenomena as Relational

The essence of the American view, then, is that the specification of the pitch at boundaries is part of the larger problem of specifying pitch contours in general. The point of this section is to take issue with this view, and to present evidence that the pitch contours of the intonational lexicon function independently of phrasing cues.

Before discussing the evidence, we may once more turn to Chinese tone for a helpful analogy. It is indisputable that Chinese uses pitch contours for something other than grammatical phrasing, yet there are cues to phrasing in Chinese as well. While their basis is not well understood—presumably pause timing, pitch relations between adjacent phrases, and modification of tone contours before pause all play a role—the principle that a language may have both significant pitch contours and separate phrasing cues is established beyond a doubt. This is the principle I propose to apply to the analysis of English intonation.

The most obvious evidence for this idea—noted by Bolinger (1961a, 1970)—is parenthesis. A parenthesis is set off from the sentence in which it is inserted both by pauses before and after, and—perhaps more importantly—by an overall lowering of pitch within the parenthesis:

(43) [Pitch diagram not recoverable from the scan. Legible fragments: “don't tell him”, “told”, “and”, “about”, “to get”.]

What is significant is that within the parenthesis, the function of pitch contours (in Bolinger’s terms, the pitch accents) remains unchanged. Both the function of the parenthesis and the intonational cues which identify it are independent of the tones or pitch contours involved.

Better evidence for the independence of phrasing from tone is seen in cases where a phrasing pause actually interrupts the tone, separating part of the tail from the rest. This is seen most readily in quotation-attributions:

(44) Are you com[ing] [...] (high-rise with nucleus on com-)

(45) Get out of here, he yelled (fall with nucleus on out)

(46) I don't think so, she said (fall-rise with nucleus on think)



Notice that in the last example the identification of the tone as fall-rise depends on the rise on said. What these data suggest is that phrasing cues involve the use of pause and pitch relations, but that pitch contour—tone—is a separate aspect of intonation. That is, the parenthesis is identified not by anything about the shapes of the pitch contours within it, but by the pauses preceding and following it, and by the pitch of the contours within it relative to the pitch of the matrix sentence. Similarly, the phrase boundary pause in the quotation-attribution sentences seems to interrupt the lexical contour, which suggests either that the pause is the only phrasing cue here, or at least that any pitch movement involved in identifying the boundary is systematized differently from pitch in lexical contours.

There is an obvious kinship between this proposal and the typical British view. While admittedly short on phonetic detail, the British in general see the placement of the nucleus, the placement of intonational boundaries, and the pitch contour as three independent aspects of the prosodic system, with three quite separate functions. The tone-unit boundaries are seen to coincide with syntactic boundaries, and to be signalled by various devices of timing. The idea of phrase boundaries interrupting tone contours, however, does not to my knowledge appear anywhere in the British literature; any stretch of pitch bounded by phrasing pauses by definition contains a tone-nucleus. This is a subject for further investigation.

In order to see the relation between phrasing and the lexical uses of intonation, let us return to consider briefly an important aspect of the stress-as-rhythm hypothesis. The most striking characteristic of Liberman and Prince’s conception of stress is that it does not involve phoneme-like or morpheme-like elements, but is exclusively a function of the hierarchical organization of the segmental sounds. The rhythmic structure is not seen as a sequence of stresses, but only as a representation of the relations between elements of the segmental string. Constituents are interpreted as focused or deaccented according to their position in a structure, not because they are accompanied by some morpheme of accent or phoneme of stress. Traditional accounts of stress fail because they reify notions like ‘secondary stress’ and are then unable to state the acoustic or articulatory correlates of such elements. Liberman and Prince’s treatment proposes, in effect, that we reify instead the rhythmic structure and take actually occurring variations of pitch, intensity, and timing not as meaningful or contrastive in themselves, but as cues for identifying a structure.

It seems to me that we must explicitly recognize the relational nature of phrasing as Liberman and Prince have proposed we do for stress. Notions such as ‘terminal juncture’ or ‘boundary tone’, like ‘secondary stress’, can be seen as attempts to treat as segments phenomena that are better handled relationally. The phrasing function of intonation is to be expressed not in terms of contrastive elements of pitch or juncture, but in terms of relational cues which permit us to infer a structure.¹⁷

The most obvious problem solved by this view is that of boundaries which find no realization as measurable pause or pitch movement, but which nevertheless seem to “be there” somehow. This problem was always particularly acute for the Trager-Smith analysts, whose rule that there could only be one primary stress per ‘phonemic clause’ led to the fairly arbitrary insertion of the juncture /|/ at syntactic boundaries, even in the absence of pause or pitch movement, simply to permit the assignment of two primary stresses in what would otherwise have been a single intonational stretch (cf. Householder 1957). Stating the acoustic correlates of the terminal junctures, especially /|/, often thus demanded a fair leap of faith, as even Trager-Smith followers were prepared to admit.¹⁸

Consider, though, the similarity of the problem of ‘terminal junctures’ to that of stress levels. Trager and Smith, bound to a theory of phonology which demanded that all audible, meaningful differences be expressed as segments, analyzed rhythmic relations as strings of ‘stress phonemes’; by explicitly allowing relations as elements of our phonological theory, we can provide a better account of the complexity of stress and rhythm. Rhythmic relations are inferred on the basis of the whole structure, and it is futile to look for acoustic correlates of the ‘stress level’ on a particular syllable.
In the same way, confronted with pitch relations between phrases and pitch movements at pauses, Trager and Smith reified the relations as segments—terminal junctures—which could be arranged into strings like other phonemes. But if we extend the relational concept to pitch phenomena, we will not posit boundary segments and then worry about their acoustic correlates, but will see pauses and pitch relations as cues to structure. Syntactic boundaries are present in any case, an inherent part of the structure inferred by the hearer; the perception of boundaries, like the perception of stress, is not a matter of hearing a particular acoustic cue or set of cues, but of understanding the structure of which they are a part.

Naturally, this view will require the development of a formalism for representing pitch relations, an undertaking which is beyond the scope of this book. Conceivably, such a formalism might involve something like three ‘pitch levels’: baseline, plus relatively higher and relatively lower; see Crystal (1975: Chapter 4) for related speculations. The point here has been simply to present empirical and theoretical justification for distinguishing the lexical and relational functions of intonation.

c. Phrasing and the Grammatical-Affective Distinction

By separating relational and lexical uses we may incidentally attain new insight into the often-made distinction between grammatical and affective uses of intonation (cf. Chapter 5, Section 2b). Since phrasing functions are—almost by definition—narrowly syntactic, and since (as we saw in Sections 3 and 4 above) the meaningful contours of intonation can have both grammatical and expressive effects depending on context, we may speculate that part of the basis for the grammatical-affective distinction has been an intuitive understanding of the relational-lexical difference, together with factors like the proximity to paralanguage, discussed in Chapters 5 and 6.

The grammatical-affective distinction may also be based in part on the intonation of other European languages, notably French, in which most of the grammatical uses seem to involve relative pitch, and where the utility of positing any contours at all is open to question. For example, Delattre (1966) treats the distinction between what he calls major and minor continuation as a distinction between two different contours, but it is just as possible to view it as purely relational, a function not of contour shapes, but of the relative height of phrase-final pitch peaks:

(47) [Pitch diagram of a French example not recoverable from the scan. Legible fragments: “oeufs”, “pren”, “jen”.]


The relative pitch of œufs and frais signals a more major boundary after frais (major continuation) than after œufs (minor continuation); contours play no role. In any case, it seems indisputable that the English intonation system is vastly richer than that of French (cf. Schubiger 1965:175), and it is likely that the reason French writers on intonation (e.g., Bally 1941, Delattre 1963, 1966, 1972, Martin ms.) tend to equate the grammatical function of intonation only with phrasing is that English-style intonational lexical segments, with their sometimes grammatical, sometimes affective nuances, play no significant role in French. Unfortunately, their native point of view carries over to their studies on English, and even in the work of careful writers like Delattre we find statements like the following: “The contrast between minor continuation and major continuation is much more marked and much more frequently observed in [French, German and Spanish] than it is in English; only very good English speakers observe it consistently” (Delattre 1963:198). Perhaps by recognizing the relational-lexical distinction, we will be more successful in comparing English intonation to that of other languages.



Stylized Tones and the Phonology of Intonation

1. The Calling Contour

At least as far back as Pike (1945), and from time to time since then, students of English intonation have observed a contour that is generally considered a special ‘calling’ or ‘vocative’ intonation.¹ This is best exemplified by the call that a parent uses to summon a child home:

(1) John-- [...]




Pike (1945:71f) describes this as a spoken chant, and says that “its meaning is of a call, often with warning by or to children.” It is ‘Type I’ of four ‘call contours’ discussed by Abe (1962). An exchange of articles in Le Maitre Phonétique (Fox 1969, Crystal 1969b, Fox 1970, Lewis 1970) takes its use as a special calling tone for granted, and concentrates on further questions of phonological and lexical analysis. Liberman (1978) and Leben (1976) have named it the ‘vocative chant’, and Liberman considers it to be one variety of what he calls the ‘warning/calling tune’. And Gibbon (1976a), the most complete discussion of the subject to date, treats this contour in a section (274-287) entitled simply ‘Calls’.

The characteristic formal feature of this contour appears to be the stepping down from one fairly steady level pitch to another, though there is somewhat less unanimity among investigators about formal characteristics than about the contour’s function as a calling intonation. Thus for Liberman and Leben, the low pitch that may precede the stepping-down pitches, as in

(2) Alexander [pitch layout not recoverable from the scan]







is an integral part of the contour, but they concede that it is optional. Fox, working in the British tradition, concentrates on the stepping-down part (i.e., he treats what precedes as the head), and calls this contour the ‘step-down tone’. Crystal (1969b), finally, objects to Fox’s analysis and claims that only the final level pitch is relevant. Throughout Section 1 we will consider the distinguishing mark of this contour to be the two stepping-down pitches; we will return to this question briefly in Section 2.²

The interval between these pitches is often, as Liberman and Gibbon both observe, about a minor third, but Liberman’s implication (84ff) that this interval has profound significance seems unwarranted. As I write this, a student out on the quadrangle is calling her dog

(3) Mor--
        gan--

with the two pitches only a major second apart. Crystal (1969b:36) and Gibbon (1976a:274) both note that the interval is by no means fixed. Finally, we may note that there is often considerable lengthening of the chanted syllables, but I do not believe this to be diagnostic, and this opinion seems to be shared by Pike and Crystal.³

What exactly constitutes the ‘chanting’ nature of this contour is thus not clear; Gibbon refers to an unpublished paper of his which subsumes the special acoustic qualities under the term ‘chroma’. But the general characteristics are plain enough, and the reader should have no trouble interpreting the examples in this chapter. The extent to which this is possible is, I think, evidence that what we are discussing is a real unit (morpheme, linguistic sign, intonation contour, or whatever) of English. The notation device already exemplified will be used throughout the chapter; it is intended only to indicate the steady level pitch, and not any prolongation of the syllable that may occur, nor anything about the relation of syllable breaks to the pitch drop.

It seems to be a common assumption among those who have investigated this intonation that there is some fundamental connection between its form and its function—between its steady level pitches and the fact that a call must be transmitted over a considerable distance. In this view, the purpose of not letting the pitch drop rapidly is to maintain the volume for calling. Abe (1962:520) makes this assumption explicit:

In calls, you assume that the person being called is a certain distance away from you (even if he or she is actually very close to you). . . . Distance between the person calling and the person being called is, no matter whether this distance is a real thing or an imagined one, a vital factor for prescribing a mid-suspended tone [i.e., the calling contour under discussion], without which it would be impossible for the speaker’s voice to carry far. [emphasis added]

Other investigators do not see so direct a link between form and function,
but all have assumed that distance between the interlocutors is in some way significant. Thus Pike quotes Nida as suggesting that the calling intonation is appropriate only if the addressee is out of sight (“. . . if Tommy is in sight, the pitch tends to fall to low, in his usage”). Pike himself, with characteristic thoroughness, feels the situation to be a bit more complicated (1945:187): “For my speech the application . . . is a bit different: If the hearer were in an unknown place, or distant so that he could not hear readily (even if he were in sight), I would be likely to arrest the fall of pitch at level three [i.e., use the calling intonation].”

The idea that distance or eye contact is significant is resurrected or rediscovered in more recent work. Fox, taking Pike as his authority, says (1969:13): “This tone is often used to signal to someone who is some distance away or out of sight.” Lewis, contending that Fox’s treatment covers only one part of a phenomenon of ‘remote speech’, makes similar remarks (1970:32): “Unlike conversation, which reflects the fact that the speakers are at comfortably close quarters, remote speech reflects the speaker’s feeling of less than normal proximity.” Liberman, citing Leben, who cites R. Oehrle, states that the ‘vocative chant’ is used “to call to people with whom the speaker is not in eye contact” (1978:19). Leben himself adds a footnote (1976:97n): “O. W. Robinson III notes that this intonation contour is also used for expressions of caution, like Watch it! [H M] Be careful! [L H M] and here it is all right for the addressee to be visible to the speaker.”

Obviously, the ‘distance hypothesis’, if we may refer to it that way, is powerfully attractive. Pike, Abe, and the British investigators hedge their statements with qualifications like “often,” “more likely,” “real or imagined distance,” “speaker’s feeling of distance,” etc., but all assume that distance is somehow the key to understanding this contour. Liberman and Leben are even more categorical in their descriptive statements about distance, but this puts them in the position of having to attribute a dual function to the contour—calling and warning—without making any attempt to explain why the two should be related. Their analysis amounts to saying that the calling intonation is used in cases of distance between speaker and hearer, except when it isn’t.







This latter view is surely unsatisfactory, and not at all in keeping with Liberman’s hypothesis of single abstract meanings for intonation contours. But I think that the warning/calling analysis is entirely avoidable, and that we can take Liberman’s hypothesis farther than he has himself. That is, I would argue that ‘calling at a distance’ and ‘warning’ are simply (in Liberman’s words) “applications to a particular usage” of a more general meaning of the intonation under discussion.

This is not a new idea: Gibbon’s analysis of the calling contour attempts to provide a single abstract rubric that will account for its entire range of uses. Specifically, Gibbon allows a very metaphorical interpretation of the notion of ‘distance’, and suggests that the function of this contour is to “secure uptake”—to establish definite contact between speaker and addressee where none has existed, or may not exist. Thus he suggests that greetings, for example, may be explained by either real or metaphorical distance (280-281):

The category of greetings may be understood partly in natural terms, since it is often the case that greetings are given from the middle distance; the category may, however, also be understood in a transferred sense: where a greeting is not simply a passing acknowledgement, it is either a prelude or a coda to a dialogue. In other words, it is part of a procedure for setting an appropriate scene for a dialogue. . . .

But his account becomes rather strained and artificial in certain cases, it seems to me, notably in his treatment of the use of this contour in ‘transactions’, e.g.,

(4) Thank--
        you--

at a supermarket check-out, of which he says simply:

A more obscure transference occurs in the case of [transactions], perhaps to be understood in terms of dialogue setting, as mentioned for the category of greetings. . . .

The metaphor of distance has been stretched past the breaking point. In the next section of this chapter I will argue that we can best understand the use of this contour—while at the same time once again illustrating the potential of the abstract-meaning idea—by abandoning the notion that ‘distance’, real or metaphorical, is the critical semantic element. I will show that it is not essentially a calling intonation, a warning intonation, or a metaphorical distance intonation, but rather a ‘stylized’ intonation, whose function is to signal an element of predictability or stereotype in the message. In subsequent sections I will show that if this intonation is analyzed in this way in the context of the nuclear-tone framework I have adopted, it can be related to other intonational phenomena to which it would otherwise appear unrelated; and I will discuss the relevance of this analysis to the perennial question of levels vs. configurations, showing how stylized intonation provides one more bit of evidence against pitch-level phonemes.

2. Stylized Fall

For reasons that will become clearer later, I will refer to what we have simply labelled the ‘calling contour’ as stylized fall. Let us begin by observing some of the sorts of circumstances in which this intonation is appropriate. The setting that immediately comes to mind is the one exemplified at the beginning of Section 1—a parent calling a child—but there are many others, such as calling a dog:

(5) [example not recoverable from the scan]

calling a group of friends at a picnic:

(6) Food's-- [...]   or   Come and get-- [...]

calling reminders:

(7) Don't [...]

calling greetings, etc.:

(8) G'night-- [...]




It is no accident that all these examples have a flavor of everyday domestic life about them. What is signalled by this intonation is the implication that the message is in some sense predictable, stylized, part of a stereotyped exchange or announcement. ‘Nothing you couldn’t have anticipated’, it says. Gibbon makes the same observation: “As far as the spoken content is concerned, all uses share decidedly formulaic or stereotyped lexico-syntactic items; what little is conveyed by these tends to be highly situation-dependent . . . and therefore low in information value” (279-280). (But as we saw in Section 1, he nevertheless takes distance, rather than stereotype, to be the most significant semantic element involved.)

We can see the ‘stylized’ nuance more clearly by comparing pairs of utterances. Thus the stylized fall is appropriate for warnings that are essentially reminders:

(9) Look out for the broken st--
                                ep--

(i.e., the step on the way down to the basement that’s been broken for months) but not for warnings in emergencies:⁴

(10) Look out for the crevasse [plain fall on crevasse]

(one mountain climber to another).

It is used to inform the hearer of events considered commonplace, everyday:

(11) Daddy forgot his brief-case--

but not of surprises, emergencies, big news:

(12) Daddy fell downstairs [plain fall on downstairs]

It is used, as we saw, for calling children home for dinner or for bedtime, but it would not be used to call to an acquaintance whom we are not expecting to see—at a football game, say, or across a city street. In this case we would get instead:

(13) (Hey) Harr[y] [plain fall]




Of course, all these examples are only intended to suggest possibilities; given the appropriate situations, one could readily match up intonations and segmentals in other ways. If the Hardy Boys were creeping up to the attic of a haunted house for the first time, Frank would warn Joe:

(14) Look out for the broken step. [plain fall on step]

On the other hand, a mother yeti sending her children off to abominable snowschool might remind them of the danger outside their lair this way:

(15) Look out for the creva--asse--.

In the same way, if we put the stylized fall on the sentence

(16) Daddy fell downsta-irs--

it gives the listener the distinct impression that Daddy is a hopeless klutz who does this sort of thing all the time.

These data seem to support the hypothesis that we are indeed dealing with a ‘stylized’ intonation. Moreover, they exemplify the value of the abstract-meaning hypothesis; to use Liberman’s words quoted in Chapter 7, stylized intonation does “pick out classes of situations related in some intuitively reasonable, but highly metaphorical way” (i.e., stereotyped, stylized, predictable), and though “the general ‘meaning’ seems hopelessly vague and difficult to pin down, . . . the application to a particular usage is vivid, effective, and often very exact” (e.g., Daddy is a klutz).

Yet even though the hypothesis seems well supported, it is probably worthwhile, in view of the widespread acceptance of the ‘distance hypothesis’, to reinforce the argument with some specific evidence against the notion that we are dealing with a calling contour.

First of all, we can easily show that this intonation is not a device to enhance audibility, to maintain volume for transmission as a call. Cries of distress are the most obvious evidence:

(17) [Pitch diagrams not recoverable from the scan: cries of distress, apparently with plain falling intonation.]

Surely a person in the position of uttering such a call is vitally interested in being heard, but

(18) [Pitch diagrams not recoverable from the scan: cries of distress with stylized fall. Legible fragments: “He”, “ape--”.]


would not bring results worth bringing. That is, these cases show clearly that distance or lack of eye contact do not favor stylized intonation. Not only does the distance hypothesis let us down here, but the ‘stylized’ hypothesis explains why the latter calls sound so comical: the speaker is in a volatile situation which if handled wrong could mean plunder or violation or death, and yet is calling for help with an intonation that implies that the circumstances surrounding the utterance are routine.

The second type of evidence against the calling contour analysis comes from repeated calls. Abe discusses this matter at some length, citing numerous examples from literature and broadcast drama in which the first one or two attempts to attract the attention of a child, servant, etc., are called with stylized intonation (Abe’s Type I or Type II), then the subsequent call(s) show a plain falling contour (Abe’s Type III) and raised volume. A typical sequence would be the following:

(19) [Pitch diagram largely lost in the scan: a name is called, followed by “[no response]”; called again “[louder]”, followed by “[no response]”; then called a third time “[still louder]”. Legible fragments of the name: “ny-”, “n ie”.]

Pike observes this phenomenon in his discussion (quoted above) of the matter of distance and eye-contact between speaker and hearer: “If . . . the hearer were in a place where he could understand me, and I knew he could hear, then, if I became insistent because he had not responded to earlier calls I would allow the pitch to fall to level four [i.e., normal falling intonation], but accompany it with extra-strong stress, normal quantity, and lack of a chanting type—in other words, the situation would in that case follow the regular rules of attention and emphasis, instead of utilizing a chant” (1945:187f, emphasis added).

This sequence (stylized call[s] followed by normal call) is seen not only with vocatives, but wherever stylized intonation is appropriate; often when an utterance is called with stylized intonation and the addressee does not understand, the speaker will repeat with normal intonation:

(20) A: [from a distance, [...] to the car from which B has just emerged] [...] left your [...]
     B: [who was [...]] What?
     A: [louder] [...] left [...]
     [Pitch diagram and part of the wording lost in the scan.]





Again, in terms of the distance hypothesis, there is no explanation for this shift, but the concept of ‘stylized’ makes clear what is going on: the speaker takes the first call or calls to be routine, ‘stylized’ speech events, but when the message does not go through, he shifts to the more informative intonation. Note the similarity of this explanation to Pike’s comments just quoted.⁵

Perhaps the most cogent evidence against the calling contour analysis comes from instances of stylized intonation used in face-to-face situations at normal volume, particularly with polite formulas—‘stylized’ again—like thank you or excuse me or good morning. (These are the greetings and transactions that gave Gibbon problems, as we saw in Section 1.) It is especially significant not only that these cannot be explained either as calls or as warnings, but that the ‘stylized’ analysis accounts for the range of appropriateness in these cases as well. Thus to a clerk or a bank teller we might say either

(21) Thank--
        you--
     or
     Thank you [plain fall]

But to someone who had just returned our lost wallet to us we would not say

(22) Thank--
        you--.

Similarly, we can squeeze past people in a crowd with either

(23) 'Xcuse--
        me--
     or
     'Xcuse me [plain fall]

but if we bump into someone in a supermarket and cause them to drop a dozen eggs all over the floor, it will not do to say

(24) 'Xcuse--
        me--.


The stylized intonations are appropriate for stereotyped or stylized situations: clerk and customer, or strangers passing in a crowd. If real thanks or real apologies are intended, we must use the intonation that says we mean it. But in either case the volume is that of normal conversation, not calling.

It seems clear, then, that the connection between stylized intonation and calling is incidental: calls can occur with and without stylized intonation, and more importantly, stylized intonation can occur at calling volumes and at normal conversational levels. Actually, ‘secondary’ might be a better way to describe the connection than ‘incidental’, for there probably is an association, statistically speaking, between stylized intonation and calling. But this can readily be explained if stylized intonation is understood in the way proposed here: at a range where hearing is likely to be difficult, it seldom makes sense to try to communicate anything more than brief shouts, or utterances whose content is largely predictable from the context. The latter, of course, are prime candidates for stylized intonation. Thus the correlation of the ‘calling contour’ with calling is not direct, but is mediated by the element of predictability or stereotype—the semantic common denominator conveyed by stylized intonation.

Given the relationship between stylized intonation and calling, we might speak of a linguistic category ‘stylized’ and a gradient or paralinguistic dimension ‘chant’; the so-called vocative chant involves both stylized intonation and chanting voice quality, rhythm, etc. Indeed, the most accurate statement may be that chanted calls are ‘more stylized’ than stylized intonation at normal volume. That is, among the all-or-none contrasts of the intonational lexicon, we find the distinction between ‘stylized’ and ‘plain’, which is signalled by stepping-down level pitches as opposed to steadily falling pitch; then once we enter the realm of the stylized, we can explain variations in the formal characteristics discussed in Section 1 (voice quality, prolongation of syllables, interval between pitches, etc.) as a function of greater or lesser degree of stylization.

Stylized Tones and the Phonology of Intonation


For example, Leben (1976:94) notices a difference between the pitch contours used in called vocatives and those used by newspaper vendors and train conductors. (Leben’s characterization of the difference is roughly that vocatives have a tendency to change pitch only on rhythmically strong syllables; I would add that the pitch levels in vocatives seem more clearly defined, while in many other chants there may be syllables of intermediate pitch.) Vocatives may be said to be more stylized, with the rhythm and melody more fixed, and often more chanting quality to the voice. Vocatives may also be said to be more stylized functionally: for a parent calling a child, the words matter less than for a newspaper barker shouting a headline or a train conductor announcing the next stop. As with other cases of gradience, form and meaning vary along similar scales. The more formalized melody of vocatives directly reflects their more stylized use. Thus the relationship between gradient and all-or-none here is exactly the sort discussed in Chapter 5. Falling contours can be stylized or not (level pitch sequence vs. steadily falling pitch), and if they are stylized, they can be stylized a little or a lot (normal conversational voice vs. vocative chant). Level pitch is the distinctive feature, as it were, of stylized intonation, but there are other acoustic characteristics with gradient effects. Once again, then, we see that the interplay between gradient and all-or-none is a fundamental aspect of the structure of intonation.

3. Stylized Rises

So far we have made the implicit assumption that the stylized intonation we have been discussing is related in some way to the ‘plain’ falling intonation, where the pitch drops steadily rather than being sustained. That is, in our discussion, we have not compared stylized intonation to high-rising intonations, or to the ‘contradiction contour’, but have arrived at our semantic analysis by examining ‘minimal pairs’ of utterances ending with stylized fall and plain fall. Because this assumption has remained implicit, though, we have not emphasized the point that the nuance ‘stylized, predictable, stereotyped’ is a modification of or addition to the basic intonational message conveyed by the plain fall. Stylized intonation does not turn statements into questions, warnings into requests, etc.: a warning is still a warning, a statement still a statement, a vocative still a vocative, and we interpret the implication ‘stylized’ in the light of the basic function of the falling tone in a particular context. To use Liberman’s words again, such interpretation is the “application to a particular usage” of the general meaning of the stylized contour. It is in this sense that ‘stylized’ is a modification of ‘plain’, and this is the reason we have labelled the so-called calling contour ‘stylized fall’. An even more basic assumption of our discussion so far, of course, is that ‘fall’ and ‘stylized fall’ are significant constituents of any ‘tunes’ of which they are a part. That is, we can scarcely speak of one contour as a modification of another if we do not consider the two contours to be units at some level of analysis. The relationship between plain and stylized may thus shed some light on the tune-tone controversy. Specifically, since our analysis of the structure of intonation takes fall as a nuclear tone and stylized fall as a special modification of that tone, then a reasonable prediction might be that other nuclear tones would have stylized variants as well. If we were to find such variants, we could take them as important evidence for the validity of the nuclear-tone approach. I believe that such variants do exist. This section of the chapter presents evidence that the high-rise and low-rise tones are stylized as single level tones. To keep within the context of this analysis, I will refer in the discussion that follows to ‘stylized high-rise’ and ‘stylized low-rise’, but it should be borne in mind that phonetically these terms mean something rather like ‘high level’ and ‘low level’.

a. Low-Rise

Low-rise can be used in both statements and questions for a variety of expressive effects. In questions it may connote curiosity, politeness, or anger, depending on the tone of voice and on the segmental content of the question; in statements it often conveys belligerence or defensiveness, or some special involvement of the speaker. Many of these uses of low-rise can be modified with the stylized low-rise in place of plain low-rise. In many cases the stylized connotation emerges as tiredness, resignation, or ‘I been there before’. Thus:

(25) I'm coming [plain low-rise]


in answer to a parent’s call could come out as insolent, while

(26) I'm coming--

puts up only token (= stylized?) resistance to parental authority, and conveys resignation to the inexorable approach of bedtime or dinnertime. Resignation or tiredness also shows up in the following example (reported to me by Janet Sternberg). A normal polite/curious question could have been put as follows:

(27) Are you going to be teaching this next semester too [plain low-rise]

The actually reported version was considerably less encouraging to a teacher’s self-assurance:

(28) Are you going to be teaching this next semester too-- [stylized low-rise]

Another example:

(29) [example text and pitch diagram not recoverable] (Liberman-Sag ‘contradiction contour’: belligerent, indignant)

(30) [example text not recoverable; ends in “not--”] (same, but with stylized low-rise: implication is ‘we've been through this before, let’s not argue about it again’)

This example, incidentally, provides some evidence of a somewhat different sort about the analyzability of tunes into head and nucleus. The discussion here is based on the hypothesis that plain and stylized low-rise are systematically related entities in English, and that the ‘contradiction contour’ is not holistic, as Liberman and Sag maintain, but consists of a high-falling head and a low-rise nucleus, which may be either plain or stylized. This example is at least consistent with that hypothesis. Oaths and epithets are frequently found with a low-rise, both plain and stylized:

(31) You damn … [rest of example and pitch diagram not recoverable]

The nuances here are very difficult to describe, but again, the stylized version may connote resignation. A person informed of the umpteenth bureaucratic delay in a pet project might respond

(33) … damn [pitch diagram not recoverable]


which we might expect to be followed by I’m going to go over and straighten those people out or a similar expression of determination. On the other hand, we might expect

(34) … damn [pitch diagram not recoverable]


in a similar situation to be followed only by a complaint. Yet the most insulting name-calling is likely to be done with stylized low-rise and hatred or anger in the tone of voice: here the implication is one of ‘ritual insult’, where the words hardly matter. A somewhat different set of possible interpretations for stylized low-rise is something like ‘this may be a superfluous background question’. Suppose A and B are touring a city of which A is a longtime resident and B is not. B says:

(35) What 're those towers [plain low-rise]

This is as it were a ‘real’ question: B does not know the answer, has no reason to assume that he should, and has every reason to expect that A does. But now suppose that B, too, used to live in the town they are visiting, and feels that he ought to remember what the towers are. Then we may get:

(36) What 're those towers-- [stylized low-rise]



We might say that the force of this question is not ‘I request you to inform me’ but rather ‘I request you to remind me’. The similarity to the use of stylized fall for reminders and plain fall for more informative statements is striking. This type of stylized low-rise question is often used in the middle of a dialogue to confirm crucial background information. Thus we might interrupt a conversation about a mutual acquaintance with:

(37) [stylized low-rise question; example text and pitch diagram not recoverable]

The implication is ‘I realize this is very relevant to what we're talking about, and I should know, but all of a sudden I’m not sure; could you please confirm’. Compare this to:

(38) Is she … [plain low-rise; rest of example and pitch diagram not recoverable]

interjected into a conversation at a similar point. Here the implication is more like ‘I’m surprised to infer from the drift of this conversation that she is Jewish; is that true?’

b. High-Rise

The high-rise tone can be stylized as well. In questions, we often find the same overtone of tiredness or resignation already illustrated with low-rise. Compare:

(39) Can … go now / Can … go now-- [plain vs. stylized high-rise; pitch diagrams not recoverable]

But the most interesting application of stylized high-rise is seen in lists. Plain high-rise is very commonly used for listing in English:

(40) I need … [list items and pitch diagram not recoverable]


Stylized high-rise, with the rise becoming steady level, is also common in lists, with the implication ‘etcetera’. That is, the items in the list are not individually informative, but rather are intended to suggest a loose grouping which the hearer can fill out for himself. Thus:

(41) A: Hey, … these … nothing in 'em?
B: … you know-- [most of the exchange and the pitch diagram are not recoverable]








This ‘etcetera’ use of stylized high-rise in lists shows up frequently in casual conversation:





(42) A: What a ridiculous day I've had. I spent the whole morning running around downtown.
B: [remainder of the exchange and pitch diagram not recoverable]

It doesn’t matter exactly what A had to do; the general idea is ‘time-consuming errands’. Notice that if the elements of the list are actually informative, then stylized high-rise is inappropriate:

(43) A: Can I pick you up anything at the store?
B: ?? Yeah, I need yogurt-- … and … [remaining list items and pitch diagram not recoverable]

Similarly inappropriate is:

(44) ?? We have the following … of school for today: … Falls-- [rest of list and pitch diagram not recoverable]


We might actually hear something like this, if the snow were so bad that virtually all the schools in the area were closed, implying, in other words, that the elements in the list really were less informative than they might normally be.

c. Some Implications of the Analysis of Stylized Rises

Before continuing, I should briefly mention Abe’s call contour Type II, which is a stepping-up sequence of level pitches:

(45) [stepping-up call contour; example text and pitch diagram not recoverable]

While this is undoubtedly a stylized rise, I have excluded it from detailed consideration in this section, and it is worth pointing out why. It seems to be restricted to calls and parental admonitions, often has more chanting voice quality, and has a fairly fixed pitch interval of about a major
sixth. The single-level stylized rises that I have discussed in detail, on the other hand, need not have a chanting voice quality and are used in normal conversation, which suggests that the step-up stylized rises are ‘more stylized’ (as discussed at the end of Section 2) than the single-level ones. This is in keeping with the idea that the diagnostic characteristic of stylized intonation is level pitch, and that other features (such as chanting voice quality, prolonged syllables, and fixed pitch intervals) are present to different extents reflecting different degrees of stylization. In this connection I may also answer the potential objection that what I have called ‘stylized rises’ in this analysis are actually only plain (i.e., nonstylized) ‘level’ tones. Crystal, for example, treats level tone on a par with other tones, and suggests that this tone signals boredom, sarcasm, etc. (see Crystal 1969a:215-217). But as we saw in Chapter 1, there is far less consensus on ‘level’ than on any of the other nuclear tones, which suggests that there is at least something peculiar about it. The semantic evidence presented in this section makes it fairly clear that the relation of plain to stylized does hold between rising contours and level ones, and suggests that what is peculiar about ‘level’ tone is that it is a modification of something else. That is, there are no ‘plain’ level tones, but only stylized rises.

The evidence presented in the first half of this chapter, then, points to the existence of a general phenomenon of ‘stylized intonation’, which is used to signal that an utterance is in some way part of a stereotyped situation or is otherwise more predictable or less informative than a corresponding utterance with plain intonation. Stylized variants are characterized by level pitches: stylized fall is a stepping-down sequence of two level pitches, and stylized rises, subject to the qualifications just noted, are a single level pitch. Various other acoustic qualities (more formalized melody and rhythm, and chanting tone of voice) are to be considered dimensions of gradience within the category ‘stylized’. Finally, it is important to recall that we were motivated to search for the stylized high-rise and low-rise on the basis of the relationship between plain and stylized fall. This search, as I noted, makes sense only in terms of an analysis of English intonation which takes nuclear tones to be significant structural entities. If we did not consider fall, high-rise, and low-rise to be comparable units (and a fortiori, if we did not consider them to be units at all), we would have no reason to expect or look for stylized variants of one on the basis of stylized variants of another. The fact that the search was fruitful argues strongly for the validity of the general analysis. In addition, it reminds us that the analysis of form and the analysis of function must go hand in hand. As long as investigators go on positing contours more or less at will (the warning/calling tune, the newspaper vendor's chant) and directing most of their efforts at treating the formal properties of such contours, then their generalizations, insofar as they deal with entities which are not really units of the language, are bound to be off the mark.
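The generalization summarized in this section (a fall is stylized as a stepping-down pair of level pitches, a rise of either register as a single level pitch) can be sketched informally in code. The sketch is only an illustration under stated assumptions: the class name `NuclearTone`, the fields `direction` and `stylized`, and the descriptive strings are hypothetical labels introduced here, not notation taken from the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class NuclearTone:
    """A nuclear tone treated as a unit (hypothetical representation)."""
    name: str               # e.g. "fall", "high-rise", "low-rise"
    direction: str          # "fall" or "rise": the configurational property
    stylized: bool = False  # the added nuance 'stylized, predictable'


def stylize(tone: NuclearTone) -> NuclearTone:
    """Add the modification 'stylized' without changing the basic tone."""
    return replace(tone, stylized=True)


def pitch_shape(tone: NuclearTone) -> str:
    """Describe the final pitch, referring only to direction and stylization."""
    if not tone.stylized:
        return f"steadily {'falling' if tone.direction == 'fall' else 'rising'} pitch"
    if tone.direction == "fall":
        return "two level pitches, stepping down"
    return "single level pitch"


FALL = NuclearTone("fall", "fall")
HIGH_RISE = NuclearTone("high-rise", "rise")
LOW_RISE = NuclearTone("low-rise", "rise")

for tone in (FALL, HIGH_RISE, LOW_RISE):
    print(f"{tone.name}: {pitch_shape(tone)} -> {pitch_shape(stylize(tone))}")
```

In this sketch `pitch_shape` never consults the register (high vs. low) of a rise, which mirrors the claim that high-rise and low-rise behave as comparable units under stylization.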

4. The Phonology of Intonation

a. Levels vs. Configurations: A Review of the Debate

While I feel that stylized intonation provides valuable evidence about the tune-tone controversy, it is of far more general theoretical significance that the relationship between stylized and plain sheds new light on the old levels-vs.-configurations debate. Let us begin by reviewing some of the issues involved. As we saw in Chapter 1, Bolinger’s original broadside against the level analyses (1951) was deflected by the compromise view proposed by Sledd (1955:328-329). Concluding that “Bolinger’s antithesis between levels and configurations is ultimately false,” Sledd argued:

To some extent, a geometrical analogy is justified. If two points determine a line, the occurrence of two pitch phonemes determines a sustention, a rise, or a fall. The real problem is the degree of precision which is necessary in the determination of these geometrical segments.

In other words, Sledd did not argue with Bolinger’s contention that the meaningful elements of intonation were contours, but claimed that this view was compatible with one which described the meaningful contours in terms of pitch-level phonemes. Bolinger, however, had already anticipated this compromise, and rejected it (1951:13-14):

If we must analyze the configuration, what shall be the particles into which we break it down? Four levels are not enough, and with five or six there would still be left-over contrasts. Of course, as the size of our element approaches zero, we get a kind of infinitesimal calculus by which even a perfectly continuous figure can be accounted for. It is, however, [accounted] for in the same [way] in which the evenly spaced stippling on a half-tone accounts for the design of the photograph – it is an artificial atomizing imposed from outside that does not represent any of the segments or joints of the given. [Emphasis added]

Bolinger never contested that it is possible to describe contours in terms of levels; it is to the “artificial atomizing” that he was objecting. It is, of course, perfectly true that pitch falling from 160 Hz to 80 Hz passes through as many arbitrary pitch levels as the analyst cares to posit. The question is whether or not those levels are structurally significant. Calling them phonemes implies that they are, and this is the view with which Bolinger took issue. An analogy to the systematization of Chinese tones may be helpful. Structuralist analyses of Chinese tones assumed that the tones were ‘phonemes’ (e.g., high-level, low-rising, etc.) but frequently specified them phonetically on a five-point scale which was first introduced by Y. R. Chao in 1930 and which is still often used for convenient transcription of kinetic tones in languages of the ‘Orientalist’ tone language group. Thus Chao (e.g., 1968) writes the tones of Mandarin this way:

First tone    ˥     55    (High level)
Second tone   ˧˥    35    (Mid-rising)
Third tone    ˨˩˦   214   (Falling-rising)
Fourth tone   ˥˩    51    (Falling)
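As an informal illustration of how the five-point digits work as a transcription device, the following sketch reduces each digit string in the table above to the sequence of pitch directions it records. The function name `contour_shape` and the direction labels are assumptions made for this illustration; only the digit strings come from the table.

```python
# Chao digit strings for the Mandarin tones, as given in the table above.
MANDARIN_TONES = {
    "First tone": "55",
    "Second tone": "35",
    "Third tone": "214",
    "Fourth tone": "51",
}


def contour_shape(chao_digits: str) -> str:
    """Reduce a Chao digit string to its sequence of pitch directions."""
    levels = [int(d) for d in chao_digits]
    moves = []
    for start, end in zip(levels, levels[1:]):
        if end > start:
            move = "rise"
        elif end < start:
            move = "fall"
        else:
            move = "level"
        # Record a move only when the direction changes.
        if not moves or moves[-1] != move:
            moves.append(move)
    return "-".join(moves)


for name, digits in MANDARIN_TONES.items():
    print(name, digits, contour_shape(digits))
```

The digits here serve purely as a notation from which the shape of each contour can be read off; nothing in the sketch treats the five levels as units of the language.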

Obviously, Chao does not intend to suggest that Chinese has five pitch-level phonemes; his geometrical analogy, unlike Sledd’s, is at the phonetic level. The numbers are a kind of supplement to the IPA alphabet for transcribing pitch, a means of converting audible contours into marks on paper. The IPA analogy is relevant. For example, the use of phonetic symbols [t̪ t ʈ] for dental, alveolar, and retroflex stops in transcribing a given language does not imply anything about the phonological structure of the language, nor does the use of [t] in a language with only one apical stop imply anything about its exact phonetic nature. The phonetic alphabet is merely a shorthand and has no structural significance whatever. Naturally, it takes no great insight to make that observation today, but in the early part of the twentieth century, when the IPA was in its heyday and controversy raged over narrow vs. broad transcription, only a linguist with a time machine could have realized that the issue was not really phonetic precision, but structural significance in a given language. Yet the insights that are now available to everyone in segmental phonology have not been applied to the study of intonation. Sledd’s question about the “degree of precision which is necessary in the determination of these geometrical segments” is strikingly like the arguments about narrow and broad transcription that went on before the phonemic principle was clearly understood. Sledd asks, in effect: How narrow a transcription do we have to have for English intonation? Bolinger, on the other hand, recognizes the phonemic principle, and sees that phonetic precision is not the main point; you can make your transcription as narrow as you like, he says, but if it doesn’t draw lines where the language itself draws them, you will miss important generalizations about the structure of the language. Bolinger is not concerned with whether pitch levels are a useful device for phonetic notation; he is arguing that phonologically they are irrelevant. Sledd’s geometrical analogy confuses the phonological and the phonetic, and misses the point. And the confusion has been perpetuated. S. R. Greenberg (1969:5), for example, taking Sledd as one of his authorities, writes: “. . . one cannot draw a curve without assuming points (potentially representing levels) along the way. Thus, any contour assumes the presence of a set of levels. . . .” The extent to which Greenberg muddies the distinction between phonological and phonetic can be seen from his judgment that “Ladefoged . . . also discards the strict dichotomization of levels versus configurations.” The passage from Ladefoged that Greenberg quotes in support of this interpretation will help us to unravel some of the phonological-phonetic confusion, and to put the argument in somewhat more current terminology.

In fact it seems clear that from the point of view of the higher level phonological rules, the complete contours contrast with one another; but the phonetic specification must be in terms of target pitches. . . . The relation between intonation contours and target pitches is in some ways (but not in all ways) analogous to that between phonemes and the bundles of distinctive features or simultaneous categories of which they are composed. [Ladefoged 1967:52]

Now, this is a very different view from the one to which Bolinger was reacting. Ladefoged clearly states that the phonetic specification must be in terms of target pitches; the phonemes are contours. This conception is in sharp contrast to that of Liberman, who sees himself (1978:87) as a modern successor to Trager and Smith. Liberman does not muddy the distinction between the phonological and the phonetic; indeed, he makes it clear that in his opinion the pitch levels do have a phonological relevance as well as a phonetic one. In Liberman’s system, as we have seen, two distinctive features [High] and [Low] characterize four significant levels: the phonemes are pitch levels. (“The underlying segments of tonal representation are static tones such as Low and High. Kinetic tones are always to be analyzed as sequences of static tones”: 1978:16.) Contours are thus sequences of phonemes, and not, as in Ladefoged’s analogy, bundles of distinctive features.

In short, the spread of the notion of distinctive feature since the early fifties permits Sledd’s and Bolinger’s positions to be stated more explicitly, and that, in effect, is what Liberman and Ladefoged have done. Ladefoged’s view might more reasonably be seen, not as computerized support for Sledd (as Greenberg seems to feel), but rather as the application of fifteen or twenty years of theoretical development to Bolinger’s original statement. Liberman’s judgment is exactly correct: his analysis belongs in the Trager-Smith tradition. But we are not out of the woods yet. Ladefoged’s explicit (albeit qualified) analogy

contours : target pitches
phonemes : distinctive features

brings up a question. Even if we decide with Bolinger that the ‘phonemes’ of intonation are contours, what are the distinctive features? Conceivably we might want to use Sledd’s geometrical analogy at that level, as Ladefoged suggests. Or perhaps we should take a cue from Vanderslice and Ladefoged (1972), whose ‘kinetic’ distinctive features [Endglide] (= rise) and [Cadence] (= fall) define holistic contours (see above, Chapter 1 Section 3c). This, and not Ladefoged (1967), is probably closest to the view expressed by Bolinger (1951). The three possible positions can be schematized as follows:

            phoneme    DF
Liberman    level      level
Ladefoged   contour    level
V & L       contour    contour

Once again let us turn to studies of Chinese tone for a look at how a similar problem is handled in a different tradition. In the generative literature of the last ten years we find the three types of solutions just mentioned, viz. analogs to Liberman, Ladefoged, and V & L. Wang (1967) proposes a set of distinctive features including ones like [Rise], [Fall], and [Convex], which refer to the shape or direction of the contour without reference to target pitches; this is thus analogous to the V & L position.

Woo (1969) criticizes Wang on a number of grounds, and insists that the distinctive features should refer to pitch levels. Woo, in fact, goes further than that; motivated largely by the search for some principled way of representing the association of tone and segmentals, she proposes breaking down the contours into sequences of static tones defined by distinctive features. This position approximates Liberman’s view on English intonation. But Walton’s critique (1976) of Woo handles tone-segment association in a way similar to autosegmental phonology, and thus freed of Woo’s motivation for breaking down tones into sequences of levels, Walton presents a great deal of evidence that tones must be regarded as phonological units. Yet he must still deal with the matter of phonetic representation, and here he agrees with Woo’s arguments against Wang’s use of kinetic features. He feels that it would be preferable to have features specifying only pitch points, because, he says, if one considers distinctive features to be commands to the vocal tract, beginning and ending pitches must be specified. His geometrical analogy is thus at the distinctive feature level, and his position is analogous to Ladefoged’s. But Walton goes on to concede (1976:234-235):

If it turned out that [in the tone sandhi rules of a language] the direction of tones were critical whereas the starting and ending points were not, then a theory of tonology that specified that all tones must be characterized as sequences of level tones would be misleading. It would be claiming that all tonal processes involve discrete pitch heights whereas it could be the case that some tonal processes involve only pitch direction (e.g., rise, fall) but not discrete pitch heights. In sum, I believe that it would be premature to constrain the theory such that all underlying tones must be represented as series of discrete, level pitches. . . . It would seem to me that we must also allow that underlying tones be characterized by unitary features such as Rise and Fall if there is morphophonemic justification for such representation and if the use of discrete, level pitch features only obscures such tonal morphophonemic processes.

The crux of Walton’s argument is essentially the same, then, as that of Bolinger’s a quarter of a century ago: a system of distinctive pitch levels, whether phonemes or distinctive features, must break down all contours into pitch sequences, and is in principle unable to express generalizations based on pitch direction. To the extent that there are generalizations based on direction, a pitch-level analysis is inadequate.

b. Stylized Intonation and the Pitch-Level Analyses

The relationship of stylized and plain intonation involves generalizations based on pitch direction, and thus supports Bolinger’s view that chopping up intonation contours into distinctive levels is an “artificial atomizing imposed from outside.” The data presented in the first three sections of the chapter point to the conclusion that there are certain stylized tones, characterized by relatively level final pitch, which are related to other ‘plain’ tones characterized by rising or falling final pitch. The semantic facts show that the stylized contours are in some way derivative, that is, that they correspond to plain intonation, with something added. Whether we talk of this ‘something’ as an added feature [stylized], as some phonological process, or in some other way, stylized contours must be regarded as modifications of the more basic ‘plain’ tones. It is this relationship that makes trouble for the pitch-level analyses. A representation of plain and stylized tones as sequences of distinctive levels only obscures the relationship, as can be seen in Figure 14.


[Figure 14: table of Trager-Smith and Liberman notations for the plain and stylized tones; entries not recoverable from the scan]

Figure 14. Plain and stylized tones as they might be expressed in the pitch-level analyses of Trager and Smith and Liberman.

In Trager-Smith notation, there is nothing in the way that /* 1#/ is related to /? 2#/ to suggest that /* 3|/ and /* °#/ or /? 2||/ and /? ?#/

exhibit the same relationship. Liberman’s system is different in detail, but the same general criticism applies. Not even his proposed distinctive features for the pitch phonemes can help; expressed in those terms, plain and stylized tones would appear as in Figure 15.

[Figure 15: feature matrices in terms of [High] and [Low] for plain and stylized Fall, High-Rise, and Low-Rise; entries not recoverable from the scan]

Figure 15. Distinctive feature representations of contours shown in Liberman’s system of pitch-level phonemes in Fig. 14.

A rule to relate plain and stylized as they are expressed by Liberman’s system would necessarily be tantamount to a listing in the lexicon of plain-stylized correspondences, for no phonological generalization can be made in his terms. Describing the relationship between plain and stylized in terms of the configurations involved, on the other hand, we can say that falling contours are stylized as a sequence of two relatively steady level pitches, while rising contours (either high or low) are stylized as a single level pitch. The configuration theory takes the pitch direction as one of the
defining characteristics of each contour. The relationship of plain to stylized high-rise and plain to stylized low-rise, for example, is expressed only in terms of the fact that both are rising configurations. We thus, as it were, posit a distinctive feature [Rise]; this is the only characteristic of both high-rise and low-rise which we need to refer to in order to describe the modification ‘stylized’. A theory which has no preconceptions about the nature of intonational units allows us to factor out characteristics of contours like [Rise] and [Stylized] where these appear to be relevant, and thus to reap the benefits of the notion distinctive feature. A theory arbitrarily constrained by the idea that pitch movements are really pitch sequences, on the other hand, is unable to express what is truly distinctive about the contours, and, in the specific case at hand, makes the relationship between plain and stylized contours appear arbitrary and accidental. There is a further observation to be made. The fact that stylized intonation is somehow derivative means that we should be cautious about studying chants to learn about intonation. Given the special function and phonological status of chants and calls, it would seem more reasonable to treat them as a category apart, or, at the very least, not to take them as critical evidence about the very heart of the system. Yet Liberman and Leben, under the influence of autosegmental phonology, have both gotten considerable heuristic mileage out of investigations of chanted contours. They assume that the exact intervals and steady tones seen in chants somehow provide valuable insight into the structure of normal spoken intonation, and that the task of the linguist is to describe how the actually observed fluctuations of pitch in ordinary speech can be accounted for by a theory associating the segmental string with an autonomous string of level tones. 
In some sense they want to treat normal speech as a modification of chant, rather than the other way around.

Leben is explicit about this heuristic use (1976:106): the recognition that the strong autosegmental hypothesis required a rule of Tone Spreading for the vocative chant led to the discovery of other English contours requiring this rule. This, of course, does not prove the validity of the strong autosegmental hypothesis, but it does demonstrate its utility and it illustrates a way in which concern for theoretical restrictiveness can lead to new discoveries.

Now it is unquestionably true that your hypotheses govern to a considerable extent the questions you ask, and that inquiry undirected by hypotheses is likely to lead nowhere. But it is also true that plenty of “new discoveries” have been made within theoretical frameworks that were hopelessly in error: the “discovery” of laws for planetary movement within the framework of geocentric astronomy, for instance, or, to use a much more contemporary example from geology, the “discovery” of the phenomenon of “polar wandering” before the development of the theory of plate tectonics and continental drift. Scientific knowledge must sometimes be set aside when its theoretical underpinnings are shown to be [remainder of sentence lost in scan]

It seems to me that because autosegmental theory and its current formalisms were developed in great part on the basis of ‘Africanist’ tone languages, where contours probably can best be seen as sequences of levels (see Goldsmith 1976), some of the questions it asks are simply inapplicable to languages, like Chinese or English, where contours are atomic rather than compound. Too great a concern for “theoretical restrictiveness” has led Liberman and Leben to attach undue significance to an eccentric subsystem of English intonation solely because it fits their questions better. Yet surely there is nothing in the general autosegmental concept that requires contours to be viewed as sequences of levels; rules of tone spreading could apply just as well to configurations as to levels. Formulators of the autosegmental theory, rather than being bound in by an accident of its birth, should work out ways of discussing the domains of kinetic tones (as Walton attempts to do) rather than forcing them into a mold that only prevents us from understanding how they work.

c. Possible Objections to a Contour Analysis

Among the questions which may still remain in the reader’s mind is the following: What is the status of the ‘target pitches’ that Ladefoged includes as part of the ‘phonetic specification’ of the significant contours? This, it seems to me, is not a problem, especially when we consider that Ladefoged was discussing synthetic speech in the passage quoted earlier. All sorts of phonetic detail must be specified in speech synthesis—formant transitions, for example—but no one would claim that formant transitions are to be treated as segments. Moreover, whether we are dealing with speech synthesis or natural speech production, it is also true that segmental distinctive features like [anterior] and [coronal] leave much unspecified, both acoustically and articulatorily, that is nevertheless part of the speech signal. For example, English [+anterior −coronal] stops are bilabial, while [+anterior −coronal] fricatives are labiodental, but the distinctive feature analysis makes the implicit claim that that difference is not structurally distinctive. In the same way, the fact that pitch contours (whether of intonation or of lexical tone) demonstrably do have starting and ending points does not mean that those starting and ending points are what a language uses to distinguish systematically between contours.

Another possible objection that might be raised is the one that Sledd mentioned in the passage quoted in Chapter 1 (1955:329): “The necessity of levels appears whenever the contourist introduces the terms high and low into his vocabulary, as he regularly has done in the past and presumably must continue to do in the future.” Indeed, I have posited a distinction between low-rise and high-rise: how can this be reconciled with an analysis that does not involve pitch-level phonemes? There are two points to be made in reply. First, given what is known about figure-ground relationships in general, it seems likely that low-rise and high-rise are distinguished most reliably on the basis of whether the syllable at the nucleus of the tone is stepped down to from the preceding syllable or stepped up to, i.e., whether one says


[Two pitch diagrams: the nuclear syllable stepped down to from the preceding syllable, or stepped up to from it]

(Recall that this is the basis of Bolinger’s B and C accents.) If the difference between step-up-to and step-down-to is perceptually more salient than high vs. low pitch level relative to the speaker’s voice range, then even the distinction between high-rise and low-rise may be as much a matter of configuration as of level.¹¹ In a sense, however, the perceptual facts are irrelevant. The second, more important reply to Sledd is that even if we look on high and low in this case as distinctive levels where the characteristic pitch movement (i.e., rise) takes place, this is still very different from taking pitch movement and analyzing it as a sequence of significant pitch levels. The critical difference between the configuration theory and the level theory is that contours, not their starting and ending points, are the basic units of analysis. Considering pitch level, as well as direction, among the possible distinctive features of contours in no way destroys this distinction.

Finally, a different sort of objection might be raised—a judgment of irrelevance. That is, one might claim that Liberman’s most important thesis is the idea of a lexicon of contours with abstract general meanings, and that British-style and American-style phonological representations are merely ‘notational variants’, alternate ways of writing down the contours that are the really significant elements of the system. However, there are at least two ways in which Liberman’s level analysis and the analysis presented in this book make different claims about the structure of intonational meaning, one obvious and important, the other less obvious and more speculative. The latter—having to do with the question of phonesthesia or ideophonic meaning—will be discussed separately in the next chapter. The former can be treated briefly here.

In Chapter 5 I presented evidence for the hypothesis—based on the work of Bolinger and Crystal—that all-or-none contrasts between tones are a different sort of distinction from gradient differences of pitch range. It is my contention (as it has been Bolinger’s since at least 1951) that any level analysis obscures this difference. Liberman’s discussion of the distinction between what I have treated as plain fall and stylized fall will do very well as an example (1978:104):

the surprise/redundancy tune [which ends in what I have called plain fall] has the skeleton [−High] [+High] [−High], with the feature [+Low] in the various positions serving to ‘modulate’ the effect—roughly, . . . +Low in medial position signifies restrained, and +Low in terminal position signifies definite, final.

That is, Liberman says that a fall HM L is more restrained than a fall H L, and a fall H L is more definite or final than a fall H LM. He continues:

We [. . .] the terminal [+High] produces a very different tune, which (given the terminal sequence H HM) lends an admonitory air to the utterance, and can also be used for calling people (the ‘vocative tune’).

Liberman’s discussion is not at odds with our account of an essential fact: stylized fall (H HM in his terms) is a “very different tune” from plain fall (the sequence [+High] [−High], unspecified for [Low]). This striking contrast is different, as Liberman says, from the distinctions among the variants of the plain fall (H L, H LM, HM L, HM LM), which in Liberman’s words “modulate” the effect of the contour’s meaning. If we take his description of the modulating effect of the feature [Low] at face value, then the ‘sequence’ H L should be the most definite, and HM LM the most restrained. I would be the first to agree with Liberman that the plain fall expresses differences of definiteness in somewhat the way he has described. It seems doubtful, though, that there are exactly four degrees of definiteness, as Liberman’s system implies. Far more likely is that there is a gradient scale from ‘most definite’ (steepest fall) to ‘most restrained’ (shallowest fall); any variant along the scale is still an instance of the same basic linguistic unit, namely a plain fall. On the other hand, the reason for the sharp difference that Liberman observes between the H HM ‘sequence’ (the stylized fall) and any of the ‘sequences’ representing the plain fall is the very fact that there is an all-or-none contrast between them. We could no more perceive a modulation or an intermediate form between plain and stylized than we could between vat and fat.

In short, linguistic systems force users to identify certain signals as discretely different from one another, and linguists’ analyses should reflect these discrete differences. But an analysis of intonation in terms of pitch levels forces us to distinguish points along a gradient as also being discretely different, for the theory provides no principled way of knowing when changing a certain feature in a sequence is going to produce a “modulation” and when it is going to produce a “very different tune.” No amount of tinkering with theoretical mechanisms will ever remedy this defect; the best that any pitch-level theory can do is ignore it. And I think that to continue to ignore the difference between the gradient and the all-or-none by forcing it into a foreordained system of distinctions is only to put off attaining an understanding of how intonation really functions in language.



Intonation and Phonesthesia

The reader may have noticed that during the course of the foregoing chapters I have treated intonation contours as both lexical and phonological elements. This ambivalence is no accident; it is, I believe, intimately related to the question of the connection between intonation and phonesthesia or ideophonic meaning. The phonesthetic nature of intonation has been mentioned by others (e.g., Bolinger 1947, also 1949), and is discussed at some length by Liberman (1978), who attempts to state rigorously what is involved in ideophonic meaning. Since topics like phonesthesia have long lain at the fringe of respectable linguistic inquiry, Liberman’s attempt is valuable, and his explicit claim that intonation and phonesthesia share some specific semantic characteristic is, as we shall see, an important insight. But I believe his description of the relation between the two is seriously in error, and in this chapter I present an alternative view.

At the heart of Liberman’s discussion of phonesthesia is the postulation of two different “modes of lexical structure,” ideophonic and morphemic. This is not a rigid dichotomy—“it is possible for a given word to have both ideophonic and morphemic analyses independently, and most lexical systems have this character to some extent”—but the distinction is basic to his characterization of the intonational lexicon, which in his view “has a fundamentally ideophonic structure” (97). The two modes exhibit a cluster of attributes as follows:

ideophonic: iconic, metaphorical, not clearly segmentable;
morphemic: arbitrary, referentially precise, generally clearly segmentable.

Liberman explicitly states (93-97, passim) that iconicity and arbitrariness are the important features of the two modes, and that the other characteristics follow naturally from them.

It seems to me that his whole discussion suffers from a misunderstanding—more specifically, a gross overextension—of the concept of iconicity. He refers to the iconicity of echoic or ideophonic words as “the metaphorical relationship of the sound of a word to a non-linguistic sound” (94), and defines iconic meaning as a “mode of meaning in which the signifié is a general metaphorical extension of some intrinsic property of the signifiant” (96). This appeal to metaphor is his way of explaining the obvious fact that some ideophones are less iconic than others:

[English has] scattered classes of examples which have ideophonic or partly ideophonic character, and which shade off into areas where meanings are iconically arbitrary. An example would be certain classes of words for noises, like clang, clank, clink, click, clop, cluck, clomp, clunk, etc. A restriction of the metaphor to shape and consistency, rather than sound, is seen in glop and glob, and modes of fastening give us clip, clasp, and clamp. The system is of course far from complete—climp and clont don’t exist at all, while Clint is neither a noise, a mode of fastening, nor yet a smaller or sharper counterpart to glint, but simply a name. The fact that ‘cl-’ is used for ‘noises with abrupt onset’, while ‘gl-’ is used for shapes, and for ‘attention-attracting emissions of light’ (glow, gleam, glisten, glint, etc.) is an example of apparently arbitrary restriction of ideophonic iconism. [96]

Liberman has failed to distinguish metaphor from what I will call conventionalization. Any iconic symbol naturally involves some conventionalization of the relation between one percept (the symbol) and another (the referent). “At the lowest size-level of most, or all, iconic systems,” says Hockett, “one finds a layer of arbitrariness. Thus a road-map means the territory it represents iconically down to a certain level, but there is no precise correlation between the width of the line representing a road or a river and the actual width of the road or river—these features are not represented to scale” (1958:577). This conventionalization is simply the extent to which the form of a symbol diverges from the form of the thing symbolized in order to make the symbol fit the exigencies of the symbolic medium. Human noises are not exactly like cows’ noises, and the closest we English speakers come is our conventional word moo. But this conventionalization is not, pace Liberman, the same as metaphor, and it is important to distinguish the two. No metaphor is involved in getting from the sound a cow makes to the English word moo, only conventionalization. But when a child uses this word moo to mean ‘cow’ rather than the noise the cow makes, then that is not part of the conventionalization, but a metaphorical extension of meaning. In the same way, the fact that click meaning ‘sudden insight’ or crash meaning ‘financial collapse’ are at bottom iconic does not mean that click is somehow an iconic representation of the form of a sudden insight, or crash an approximation of what it sounds like when an economic system fails. Iconically, click represents only the sound of, say, two parts of an assembly falling into place together, and crash the sound of a heavy object falling or colliding with something. It is metaphor that does the rest.

Naturally, such metaphorical extensions are themselves so often conventionalized that it may be difficult to separate literal from metaphorical, and I am obviously not proposing a way to draw the line between the two in specific cases. The point of the distinction, rather, is to escape the necessity of concluding, as Liberman does, that metaphor is the essence of iconicity. That conclusion is surely to be avoided, for it is beyond question that thoroughly arbitrary symbols, not just iconic ones, are subject to metaphorical extension as well. Cold can mean ‘unfriendly’. Bright can mean ‘intelligent’. Sharp can refer to knives, but also to cheeses and repartees. Liberman’s assumption that metaphorical extension of abstract meaning is a natural concomitant of non-arbitrary symbolism, and that “for obvious reasons,” the arbitrary signs in the “morphemic mode of lexical structure” are characterized rather by “referential precision” (97) is a most extraordinary view, one which cannot possibly withstand serious scrutiny.


Equally untenable is the related claim that all ideophonic meaning is iconic. While it is evident that there is some similarity of sound and meaning in words like moo, bang, knock, etc., I see no basis for the claim that the combination of sounds /gl-/ has anything but an arbitrary relationship with vague emissions of light. In the same way, the widespread assumption (which Liberman shares; cf. p. 96) that gestural meaning is iconic is also lacking in empirical support. It is difficult, for example, to imagine what sort of non-arbitrary connection might be found between up-and-down head movements and the meaning ‘yes’, or between sideways head movements and ‘no’, especially in light of the fact that in the Middle East the connections work the other way around.¹

And yet there does seem to be some basis for asserting that gesture, phonesthesia, and intonation somehow exhibit a more direct connection between form and meaning than other segmental morphemes. Liberman assumes that this is because their meaning is iconic, and I suspect that many would agree. But if the term ‘iconic’ is not to be reduced to meaninglessness, I think it must be reserved for those cases where, in Hockett’s words (1958:577), “there is some element of geometrical similarity between the [word] and its meaning.” I think that we can express what is special about phonesthesia with reference not to iconicity, but to its behavior in relation to the design feature of language that Hockett (e.g., 1958) has called ‘duality of patterning’ and which is also known especially among European linguists as ‘double articulation’. This is, roughly, the fact that the huge number of meaningful chunks of any language (pleremes) are arrangements of one or more of a relatively small number of meaningless chunks (cenemes).²

Since language has duality of patterning, words have an ambivalent nature: they are simultaneously strings of cenemes and configurations of sound. Iconicity exploits this ambivalence: in iconic words, the phonemes function not only as cenemes, but also as sounds. That is, bang is distinguished from ban, bag, pang, bung, etc. by the cenematic—message-differentiating—function of the phonemes of which it is composed. But independently of the purely linguistic structure of the phonology, there is also a more direct connection between the sound of the word bang and its meaning. From the point of view of the design features of language, the peculiarity of iconicity lies in the fact that there is a sound-meaning link which is independent of cenematic structure.

This property of bypassing the cenematic structure is the hallmark of phonesthetic or ideophonic meaning in general. Iconicity is only the simplest case—where the direct sound-meaning link is due to similarity in form between the sound of the symbol and the sound of the referent. But if we define phonesthesia as a sound-meaning link independent of cenematic structure, then there is no need to equate it with iconicity. Iconicity is a special case of phonesthesia, but not all phonesthesia is iconic. For example, there is no need to claim that the sound-meaning relationship in the English words gleam, glimmer, glisten, glow, etc. is in some mysterious metaphorical way ‘not arbitrary’, but only that the effect of this initial /gl-/ is an effect of the sound, independent of the cenematic structure.

Such a treatment accounts for the fact that phonesthetic effects have long frustrated the segment-minded analyst’s attempts to assign them to specific morphemes. It is a well-known problem of morphemic analysis that words like gleam appear to contain a ‘recurrent partial’ /gl-/, which makes a recognizable (if vague and abstract) contribution to their meaning, yet if we segment /gl-/ as a morpheme, we are left with forlorn would-be morphemes -int, -immer, -isten, etc., which do not recur. But if we view phonesthetic meaning in the terms I have suggested, we will not attempt to segment these words. Rather, we will say that the sound of /gl-/ at the beginning of a word has acquired an arbitrary meaning in English, a meaning independent of the function of /g/ and /l/ as cenemes, and thus without the segmentability characteristic of ordinary morphemes.



Such a definition of phonesthesia also explains why effects like those in gleam and glisten seem somehow secondary. The phonemes of which such words are composed function primarily as cenemes, and only secondarily as actual sounds. That is, the lexical structure of English would in some sense not be affected if the word bang were replaced by, say, foss, or if the gleam series were replaced by steke, forstle, warken, sugg, etc. The same meanings in the lexical structure of English could be attached to the new creations, and all that would be lost would be the phonesthetic meaning associated with the gl- initial of the originals. By saying “all that would be lost,” I do not mean to belittle the power of such ‘secondary’ associations. Obviously, they are exploited fully by poets and other sensitive users of the language. Moreover, there can be little doubt that the word-formation processes of a language are influenced to some extent by the phonesthetic intuitions of native speakers. (See, e.g., Bolinger 1940, 1950). The point is simply to characterize the sense in which they are secondary and the sense in which they truly involve sound-symbolism. The phonesthetic effects exist, as it were, by the grace of the actual sounds which manifest the cenematic structure of the language and the actual string of cenemes in a given word.

An analogy to written language will help make this clearer. A certain bagel store in Ithaca has a flashy sign advertising HOT BAGELS. The sign is painted so that a picture of a bagel is used for the O in HOT. Now this is the visual analog of the phonesthesia in, say, gleam. It is entirely irrelevant to the primary message of the sign HOT BAGELS that the actual shape of the letter (ceneme) O is like the shape of a bagel, but the sign painter exploited the similarity in shape to add a vague secondary meaning to the message of the sign. If the word for ‘hot’ were heisse, or the shape of the letter ‘O’ were 5, then the sign would still convey exactly the same primary message with the form HEISSE BAGELS or H5T BAGELS: but the secondary effect would be absent. Again, this is not to belittle the power of the ‘secondary’ association, for no doubt it sells a lot of bagels on a grey winter’s day. The point is simply that the primary function of the letter O is as a ceneme, a letter of the alphabet, and not as a shape similar to that of a bagel. In its cenematic function, O could theoretically take on any form; the secondary association in the sign is based on the shape it actually has.

This characterization of phonesthesia, incidentally, is independent of any element of universality that may be involved. There may well be universal associations, such as those that appear to exist between high front vowels and the notion of smallness, and indeed, there may be a whole network of human phonesthetic associations at whose existence we can scarcely yet guess. On the other hand, many such associations—e.g., the English /gl-/ case—may be idiosyncratic in individual languages. Our knowledge is simply inadequate to say for sure. But in either event, the structural peculiarity of phonesthesia is the bypassing of duality of patterning, the fact that sounds are associated with meanings solely by virtue of their sound and independently of their function as the realization of cenemes.

The connection of phonesthesia to gesture should by now be readily apparent: since gestures are not sequences of cenematic segments but perceptual gestalts, no cenematic structure is involved in their meaning. Again, the question of iconicity is irrelevant; gestural meaning is different from ordinary lexical meaning not because it is iconic—it may or may not be—but because it exhibits no duality of patterning. And if the arguments presented for a contour rather than a level analysis are valid, then in exactly the same way, intonation exhibits no duality of patterning either. The smallest meaningful elements—e.g., nuclear tones—are not sequences of cenematic segments; the sound-meaning link is direct. In the terms we have been using, the exceptional aspect of intonation, phonesthesia, and gesture which distinguishes them from normal modes of lexical meaning is not iconicity, metaphor, or universality, but the fact that they do not involve cenematic structure. Phonesthesia is a sort of halfway-house, because the sounds used phonesthetically still constitute phonemes at another level—as in the English /gl-/ example. With intonation and gesture, on the other hand, cenematic structure is not bypassed; it does not exist.

Obviously, because Liberman chooses to perpetuate the analysis of contours as sequences of pitch levels, i.e., to postulate a cenematic structure for intonation, he cannot express the similarity between phonesthesia and intonation in the terms I have used here. I have argued that the explanations he does offer are based on confusion about the nature of iconicity and metaphor. But notwithstanding the confusions, his intuition that intonational and ideophonic meaning share some fundamental characteristic must be reckoned a significant insight. I feel that the foregoing discussion, in conjunction with the arguments in Chapter 8 on levels vs. configurations, preserves Liberman’s basic insight, while discarding some undesirable implications which are largely a result of his assumption that intonation must be expressed in terms of static tones.



Conclusion

By now, I hope, it should be clear to the reader in what sense this work has been a discussion of intonational meaning and not simply an analysis of English intonation. If there is a single most important point to what I have written, it is the following: insofar as past investigators of intonation have failed to produce satisfactory analyses, it is because they have failed to consider in what ways intonational meaning is structured like segmental meaning and in what ways it is different. They have gone into their study with implicit preconceptions about the organization of intonation and have forced suprasegmental phenomena into inappropriate molds. The two most widespread—and yet contradictory—such preconceptions are that intonation, like the rest of language, is organized into all-or-none contrasting segments, but that functionally it is somehow around the edge; a third, consequent preconception is that anything that does not fit into the all-or-none categories is paralinguistic or ‘emotional’ (Chapters 5 and 6). I have shown how these preconceptions have made it possible for one investigator to analyze as all-or-none contrast what another ignores as emotional variation: since all is implicitly considered to be peripheral anyway, there is little basis for a decision about which parts are to be assigned to the explicitly peripheral—i.e., paralinguistic—domain.

I have argued that one of the principal ways in which suprasegmental phenomena differ from the rest of language is that they involve the systematic use of gradience. It should be emphasized that this is an aspect of meaning and not merely of form. In the segmental lexicon we assume that meaning is conveyed by contrasts between categories; in intonation, we must also assume that semantic continua are expressed by unsegmented formal continua within contrasting categories. I have shown (Chapter 5) how the explicit recognition of certain gradient dimensions relieves us of the need to see suprasegmental phenomena as either all-or-none or paralinguistic. Specifically, I have suggested that pitch range—both overall height relative to the speaker’s voice, and relative width or steepness of pitch movement—is an independently meaningful dimension of gradience, along which segments can vary without in some sense destroying their identity as segments; for example, steep fall and shallow fall are not two different contours, but are both instances of a single category fall, together with a difference of pitch range which makes a separate contribution to the meaning of the contour.

Note, however, that gradience, so defined, implies the existence of segments. I have argued (Chapter 7) that intonation proper includes a lexicon of contrasting contours; intonational meaning is thus lexical as well as gradient. Two claims are subsumed under this rubric. First, there is, as most past analysts have assumed, some part of the intonational system of English that is organized into all-or-none contrasting categories, like more familiar segmental phenomena. Second, and more controversial, the choice of intonational category is not specified by the grammar of the sentence of which it is a part, but makes an independently meaningful contribution to the interpretation of the whole sentence, a contribution not unlike that made by modal particles like doch and etwa in German. It is with regard to this second claim, perhaps, that I have strayed furthest from my announced ‘pretheoretical’ aim of merely providing a basis for further discussion of intonation. Naturally, no scientific work exists in a theoretical vacuum, and my opinions about what it is important to observe have been shaped to a considerable extent by the ‘anti-deterministic’ views of Bolinger and others. My conclusions in Chapters 4, 7, and 8 in particular reflect my view that as much as possible of what a speaker says should be seen as representing an independently meaningful choice.
I have shown in Chapter 8 that evidence from the use of certain ‘stylized’ tones argues against any phonological analysis in which contours are seen to consist of sequences of pitch level phonemes, and that the lexical segments of English intonation are to be considered phonologically unitary. This is a significant claim in two respects. First, it explicitly recognizes gradience as a separate parameter of intonational meaning; pitch level analyses are in principle unable to distinguish the more-or-less nature of certain pitch differences from the all-or-none nature of others. Second, and perhaps more important, it provides a possible explanation for the widespread preconception that intonation is somehow peripheral—specifically, that it is gesture-like—and for Liberman’s contention that intonational meaning is somehow like the meaning of ideophonic words: I have suggested (Chapter 9) that the essence of phonesthesia and gesture is that they do not involve duality of patterning, and that intonation contours, by being phonologically unitary, thus convey meaning phonesthetically as well.

Finally, I have devoted a good deal of space to exploring the implications of the rhythmic-relational view of stress developed by Liberman and Prince. I have shown how this concept enables us to interpret certain instrumental phonetic data which fit poorly into a ‘stress-level’ or an ‘accent’ analysis (Chapter 2), and how it enables us to understand the phenomenon of deaccenting and to unify a good deal of mysterious accent-placement data (Chapters 3 and 4). Further, I have suggested that a similar approach might prove fruitful in discussing the ‘phrasing’ function of intonation (Chapter 7). It should be pointed out that accent seems to be exclusively a relational phenomenon, and does not involve segments—except, of course, the segmental items being related—at all. By contrast, the relational aspects of ‘intonation proper’ involve (at least in English) relations between intonational segments; the role of intonation cannot be expressed in terms of either segments or structural relations alone.

Intonational meaning, in summary, is lexical, gradient, phonesthetic, and relational. This means that the first—essentially taxonomic—task of the analyst is to separate out lexical, gradient, and relational effects from among the observable semantic distinctions of intonation. The analyst will need to present not only an inventory of the lexical segments (meaningful contours) of intonation, but also a list of the dimensions of gradience along which the meaningful contours may vary (and the semantic effects of such variation), as well as a discussion of the role of relative height of nuclear pitch peaks and of relative pitch at boundaries. I have outlined the general form of such a taxonomy, basing myself largely on past work.
The most important features of the taxonomy, including the separation of accent (rhythmic structure) from intonation (pitch movement), the internal structure of intonational tunes, and the inventory of nuclear tones, are presented in Chapter 1. However, there are many phenomena I have not discussed, such as: the classification and use of different heads (see Crystal 1969a); the uses of the dimensions of gradience; the semantics of focus and the status of compound stress; the kinds of phenomena involved in intonational phrasing signals. Thus much work remains to be done at this taxonomic level, which is another reason I regard the book not as an analysis of English intonation, but only a discussion of preliminaries.

Once a taxonomy is better established, the phonologist will naturally be interested in questions like: what the distinctive features of intonation contours are, and how similar they are to tone in tone languages; how the domain of pitch contours is to be specified relative to the segmental string; what implications the simultaneous nature of intonation contours has for phonological theory. The phonetician, meanwhile, will want to investigate the acoustic cues by which we identify the different contours, looking, among other things, for evidence of categorial perception; much phonetic research also needs to be done on the nature of relational cues to structure. Finally, the grammarian will want to test different theories of syntax and semantics, not only in terms of segmental grammar, but also in terms of suprasegmental phenomena. As I noted above, it is here that I have let my theoretical viewpoint show through most strongly. It remains true, however, that my primary goal has not been that of testing models of language, but only of discussing how intonation must be included in whatever model we construct.


1. General Introduction and Review of Past Work

1. Excellent reviews of the literature are to be found in Crystal (1969a, Chs. 1-3)—the best synthesis of past work—and Gibbon (1976a, Ch. 3)—the most complete and up-to-date author-by-author synopsis available.

2. Sledd’s evidence was principally the ‘scooped’ intonations (see below, Ch. 2, Sec. 1), as in

i. Wonderful [pitch diagram: der is set higher than Won]

As Sledd pointed out, this would have to be written /²won³derful¹#/, which frustrates Hockett’s criterion for identifying the intonation center.

3. There can be no doubt that ‘nucleus’ and ‘sentence stress’ are simply different names for the same phenomenon (see, e.g., Crystal 1969a:210).

4. In the following discussion I will use the term ‘pitch obtrusion’ rather than ‘pitch prominence’ (Bolinger 1958a seems to use the two more or less interchangeably), in order to avoid any possible confusion with ‘prominence’, my impartial cover term for what is variously known as stress, accent, etc.

5. O'Connor and Arnold, however, use the term ‘tone group’ for what we are here calling ‘tune’.

6. For example:


The numerical systems are probably more suitable for showing American intonation, in which voice pitch seems to play an important part, than they are for marking British intonation, where the kinetic tones are more marked and more important. [Kingdon 1958:xxix]

No attempt will be made here to analyze the American intonation of English, which has already been examined and described in detail, notably in K. L. Pike’s The Intonation of American English. [Ibid., p. 264]

The only monographic treatment of English intonation (and probably of the intonation of any non-tone language) which probes the matter with methodical thoroughness and is based on conscientious investigation is that by K. L. Pike (Intonation of American English). He describes the intonation of American English in terms of four pitch phonemes. The great advantage of the book is that the writer is not contented with sweeping statements, but examines every detail. . . . The system at which he arrives is essentially different from that which is being described in the present study, and this may be due partly to the application of a different method of investigation (Pike starts from meanings, whereas I start from forms), and partly to intrinsic differences in intonation between American and Southern British English. [Jassem 1952:10-11]

7. Bolinger (1966:690) writes: “I doubt . . . that there is that much difference between the dialects. All that I have ever heard a British speaker say has sounded normal to me. He merely has the annoying habit of using certain intonations too much and others too little.” The following anecdote may help to explain our impression of substantial differences between the dialects. I was once in a conversation with an American anthropologist who had just returned from Nigeria, where for two years his only contact with native speakers of English had been with British missionaries. While he was catching up on presidential politics, someone said to him, “Did you know Carter was a nuclear engineer when he was in the Navy?” Somewhat surprised, the anthropologist replied:

ii. Was he [pitch diagram: falling-rising contour]

I made a mental note of this apparent British influence on his intonation. Thinking about it later, however, I realized that what an American would have said in the same context is:

iii. He was [pitch diagram: falling-rising contour]

That is, in both dialects such echo questions (conveying perhaps a kind of restrained surprise or even skepticism) have falling-rising intonation; the difference is that the British use question syntax, while the Americans do not. The British influence was on his syntax, not his intonation. My initial reaction, however, suggests that there is a tendency to attribute dialect differences to funny intonation when they should actually be ascribed to funny syntax or funny lexical choice.

8. I grant that the concept of ‘intonational lexicon’—the term is Liberman’s— is rather novel; the discussion here and in Ch. 7, Sec. 2 is intended to show that the idea is not as far-fetched as it might seem.

9. Bolinger, it is true, rejected the whole notion that continuously changing pitch contours were appropriately described in terms of levels; see Ch. 8.

10. As I was making final corrections to the manuscript, Janet Bing drew my attention to the fact that Pike’s terms precontour and primary contour cannot be equated with the British terms as straightforwardly as I have done in Fig. 1. As she points out, Pike’s taxonomy shares significant characteristics with Bolinger’s accent analysis, in that each (Bolinger-style) accented syllable could be associated with a complete precontour-plus-primary-contour.

[Pitch diagram: De Tocqueville studied at Princeton, divided into two successive sequences of precontour | primary contour]

The same is true, I might add, for American analyses in the Trager-Smith tradition, in which this sentence could be transcribed ²De ³Tocqueville² | ²studied at ³Princeton¹#, and thus in Hockett’s (1958) terms would consist of two macrosegments each composed of a pendant and a head. While Bing’s observation that Fig. 1 is oversimplified is certainly correct, the problem in this instance is not so much a matter of how the terms of one tradition match up to those of another, but how the terms of any tradition are to be applied to the data. Specifically, both the Pike and the Trager-Smith analyses have trouble in practice in deciding whether the little variations in pitch that accompany stressed syllables are simply ‘allophonic variation’ or another pitch level phoneme, though this decision determines whether a given stressed syllable is part of a precontour or can claim its own primary contour. The same practical difficulty—in a different theoretical disguise—plagues the British tradition in its decisions on the location of tone group boundaries, which determines whether a given syllable is seen as a nucleus or only part of a head. In shorter sentences, where the practical uncertainties are less severe, the correspondence between British and American is indeed as shown in Fig. 1:

[Pitch diagram: What ... doing, with the American precontour aligned with the British head, and the primary contour aligned with nucleus + tail]

The “common denominator” implied by the partial correspondence between the two traditions seems to be that all accented syllables have certain intonational characteristics in common; this is the basis for the American analyses and for the Kingdon/Schubiger use of head to refer only to the main accented syllable preceding the nucleus. However, one accented syllable of a sentence or tone group—usually the last—also has peculiar characteristics of its own, which is the basis of the British tradition. The reason for the indeterminacy in application of both traditions lies in the very indeterminacy of such notions as pitch phoneme, accented syllable and tone group.

11. Note that Vanderslice and Ladefoged’s attention to the intonation contours that follow the intonation center, while virtually ignoring what precedes it, implicitly involves the division between head and nucleus. Cf. Fig. 1.

12. Interestingly, however, Bolinger notes (1958a:55) that “after an A accent there seems to be an all-or-none difference between a level and a rise, but a gradient difference between a level and a fall”—i.e., the very distinction between fall-rise and fall.

13. Jackendoff (1972, Ch. 6) misses this point. While borrowing Bolinger’s terminology and appearing to borrow his concepts, he is actually using ‘A accent’ and ‘B accent’ to refer to units like the British tones, as can be seen from his definition: “the intonation on an emphatically stressed syllable plus the intonation following until the next emphatically stressed syllable or the end of the sentence, whichever comes first” (Jackendoff 1972:258, emphasis added).

14. This implies some reanalysis of full and reduced vowels, which he undertakes in Bolinger (1976).

15. Since Lieberman’s emphasis is on refuting the Trager-Smith analysis, with its claim that stress levels are manifested by different degrees of loudness and are perceptible independently of context, it is not always clear how his proposals compare to others’. For example, his 1965 experiment involved only a single short sentence (They have bought a new car) and his data are presented in very condensed form, so it is impossible to tell whether the Trager-Smith ‘secondary stresses’ were perceived in the processed utterances as stressed or unstressed, or inconsistently. Moreover, to judge from his use of the feature [+Ps] (‘prominence’) in other parts of his 1967 book, it appears that that feature does not correspond to the syllable prominence consistently perceived by the linguists in the 1965 experiment. Clarification would be helpful.

16. Nowhere in the article Stockwell cites (Bolinger 1958b) does Bolinger introduce the phrase ‘morphological stress’ as a technical term; the sentence from which Stockwell appears to have taken this is the following: “I use accent now on the syntactic side, and reserve stress for the morphological” (82).
In the context of the rest of his work, and indeed in the context of the passage (79-83) in which the sentence just quoted appears, it is clear that what Bolinger means by ‘morphological stress’ is not a level of stress at all but the “potential for prominence (accent)” discussed above. As we have seen, in Bolinger’s original scheme the only actual distinction of prominence is between accented and unaccented.

17. Hockett, in a seminar discussion of the origins of the phonemic principle (Spring 1977), compared the establishment of an important new idea to starting a car in winter—just as the car will sputter and die several times before catching, so the idea will surface and disappear several times independently before finally being put in a form which captures the attention of a whole field. Liberman and Prince’s article seems to indicate that the idea of stress-as-rhythm is finally running smoothly.

18. In this connection, Gunter observes wryly that class arguments over how to mark a certain type of falling-rising contour in the Trager-Smith system “are settled by professorial fiat, a kind of authority that must have settled many such arguments over the past fifteen years” (1972:212n).

19. The role of rhythm in stress has also been discussed by a few British linguists, notably Abercrombie and Halliday. The notion of ‘rhythmic foot’, which Halliday (1967) incorporates into his analysis, is taken over from traditional metrics into linguistic work by Abercrombie (1964); one of Abercrombie’s concerns is formalizing the elusive notion of ‘stress timing’ first suggested by Pike (1945) (see Halliday 1967:12). As we saw, Halliday’s distinction between salient and weak syllables is based on position in the foot, and would thus seem to be a purely rhythmic phenomenon. Yet an ambiguity appears in the analysis when he talks (e.g., in the passage quoted above in Section 5a) of acoustic correlates of stress like pitch movement and syllable duration. Do these physical properties define the salient syllables, which in turn define the foot boundaries, or does the rhythmic structure of the feet define the salient syllables, whose physical properties then follow from their salience? (This latter position would be roughly the equivalent of Liberman and Prince’s.) Halliday does not say, which is presumably the source of Crystal’s skepticism (1969a:202-203, 1969c:382) about the whole idea of rhythmic feet. To my knowledge, there has been no other British development of the notion of stress as a rhythmic phenomenon.

20. Level tone, as can be seen from Fig. 5, is the subject of much disagreement and uncertainty; in Ch. 8 I propose a new analysis which shows that the level tones do differ from the others on the chart in that they are part of a subsystem of ‘stylized intonation’. As for the taxonomy of heads and preheads, the best that can be said is that there is far less agreement—and for that matter far less discussion—than there is on the taxonomy of tones. No attempt has been made to treat heads and preheads here; the reader is referred to Crystal (1969a) for the most complete discussion to date. See also n. 10 above and Ch. 3, Sec. 3.


2. Evidence for the Rhythmic Nature of Prominence

1. This ceteris paribus condition might be seen unsympathetically as reducing Vanderslice and Ladefoged’s definition to near-circularity. At the very least, we should expect a little more precision about what other factors are taken to be cetera and why, especially since the rhythm hypothesis seems to offer a consistent account of how the phonetic nature of nearby syllables can affect the timing of a syllable and its perceived prominence.

2. The term ‘scoop’ in this sense goes back at least to Chao (1932), and has been used by Pittenger et al. (1960) and Vanderslice and Ladefoged (1972). Gunter (1972) calls these intonations ‘humped descent’. These were the contours which Sledd (1956) used as evidence of the inadequacy of Hockett’s definition of intonation center (see above, Ch. 1 n. 2), though Sledd did not use the term ‘scoop’. Householder (1957:239n) uses ‘scooped’ to refer to the Liberman-Sag ‘contradiction contour’, i.e., a sequence of high-falling head and low-rising nucleus; I do not know if this usage is widespread.

3. In British terms, this major pitch jump marks the end of the prehead and the beginning of the head.

4. Bolinger and Gerstman would not disagree with this point: crediting both Householder and J. D. O’Connor, they note (85-86) that the difference between lighthouse keeper and light housekeeper is functionally the same as the one seen in high-line voltage and high line-voltage, even though in the latter examples only added length, not actual silence, is involved. However, they take this as refining their understanding of the notion ‘disjuncture’, which they define not as ‘separation of syllables’ but as ‘separation of syllable centers’. They are then free to see disjuncture, so understood, as indicating constituent structure.

5. In this context we can see again the error of Stockwell’s identifying accent with primary stress, commented on above in Ch. 1, Sec. 5b.



6. This solution involving the feature [+emph] was suggested to me in a letter from Vanderslice.

3. The Phonology of Deaccenting

1. Compare Jakobson and Halle’s remarks (1971:33-38) on the ‘relative’ nature of prosodic features as compared to ‘inherent features’.

2. Thirty years ago, Nida (1948:269, n. 44) suggested that “Junctures are to be treated on the same level as order, which is the other formal feature of arrangement”—that is, junctures are not chunks, but features of the arrangement of chunks.

3. The ideas in this chapter and the next were developed before I had seen Liberman (1978) or Liberman & Prince (1977), and an earlier written version has appeared as Ladd (1979). This chapter especially has been rather extensively rewritten to link the exposition more explicitly to the relational concept of accent.

4. Schmerling admits as much and attempts to head off such criticism with the following apologia:

The study presented here is not a phonetic study; it is a study of what might be called the syntax of stress. That is, I am concerned here not with the phonetic nature of stress but with the question of “which stress goes where”: the abstract principles which appear to be involved in assigning relative prominence to the different items in an utterance. I am thus defining stress for the purposes of this study as subjective impression of prominence [emphasis hers]. . . . Because I am taking no stand on the physical nature of stress, no theoretical significance should be attached to the notation I use in the examples, which was chosen purely for convenience. [1976:3-4]

But it should be clear that any notation system carries with it a load of theoretical implications that cannot be so easily wished away. In particular, in the context of the treatment of deaccenting proposed here, it can be seen that the notion of ‘absence of stress’ is quite meaningless. More specifically, Schmerling’s separation of ‘topic-comment’ sentences from the rest of deaccenting, which we shall mention presently, is purely an artifact of her notation.

5. The cases of ‘prenuclear’ deaccenting discussed in the next several pages raise two related questions: (1) Why does the accent shift right instead of left in some cases of deaccenting? (2) In certain of the ‘normal’ or ‘out of the blue’ versions, why is the accent not on the rightmost content word, as predicted by almost every ‘normal stress’ rule ever proposed? The reader is asked to postpone these questions until the next chapter, where we discuss the whole question of how accent ends up where it does; only the phonological and relational nature of deaccenting is at issue here.

6. This is often claimed to be common in women’s speech, but is actually found in both men and women in such contexts, to signal hesitance, tentativeness, or uncertainty.

7. Vanderslice and Ladefoged’s concept of deaccenting—changing [+accent] to [−accent]—does not handle these cases any more convincingly, since both a pretonic accent and an A accent involve a ‘pitch obtrusion’ and must thus, by their definitions, be considered [+accent]. Vanderslice’s (1977) example Jean ate her soup at the smorgasbord does appear to treat the difference between accented (i.e., A-accented) Jean and pretonic-accented (i.e., deaccented) Jean simply by changing the value of the feature [Accent]:

i. Jean ate her soup at the smorgasbord [pitch diagram] (Jean is [+accent])

ii. Jean ate her soup at the smorgasbord [pitch diagram] (Jean is [−accent])

However, this example succeeds only because Jean begins the utterance with a strong syllable, and there is thus no pitch jump (see Ch. 2 Sec. 1). If we changed the name to Janine, we could still have the contrast between accented (A-accented) Janine and pretonic-accented (deaccented) Janine, but this time both would involve a pitch jump to the syllable -nine:

iii. Janine ate her soup at the smorgasbord [pitch diagram] (Janine is accented, [+accent])

iv. Janine ate her soup at the smorgasbord [pitch diagram] (Janine is pretonic-accented)


In the second case Janine is functionally deaccented just as is the second example of Jean, but because of the pitch obtrusion of the pretonic accent Vanderslice and Ladefoged must label it [+accent]. Cf. the comments on Bolinger’s explanation of rhythmic stress shift, above page 38.

4. The Grammar of Accent Placement

1. As Schmerling argues and as Bolinger himself has pointed out repeatedly, it is generally not possible to attach an explicit meaning to the term ‘contrastive stress’ anyway. This is discussed further in the next section.

2. Schmerling (1976:76) rejects Chomsky’s use of focus in this context largely because his attempt to formalize it involves what she sees as an inappropriate use of the notion of presupposition. In one of Chomsky’s formulations, if the focus of a sentence is replaced by a variable (e.g., an indefinite), the result is an expression of one of the presuppositions of the sentence. Thus in John writes poétry in the garden, the focus is poetry and the presupposition is John writes something in the garden or John writes X in the garden. Jackendoff (1972) makes some further attempts along the same lines to formalize the notion of focus (see Ch. 7, Sec. 3 below).

3. My analysis thus separates the accentual and intonational aspects of ‘contrastive stress’. Narrow focus is signalled solely by the location of the accent; various intonational characteristics such as greater volume and widened pitch range can also be used to signal what might be called ‘emphasis’. It is quite possible to have narrow focus without emphatic intonational cues, and equally possible to have emphatic intonational cues without narrow focus, as in

i. What in the world are you doing [pitch diagram]

But it is quite true that the two frequently occur together, especially if a narrow focus is intended on an item which would receive ‘normal stress’ anyway, e.g.,

ii. John painted the shed yesterday. [pitch diagram]

(Cf. also n. 5 just below). Something like the separation of accentual from intonational cues is suggested by Cutler and Isard (1978) in their discussion of contrast.

The fact that the two types of cues can be used in combination should not be permitted to obscure their separateness, though it has certainly tended to confuse the whole question of ‘contrastive stress’ even more than it is to begin with: more often than not contrastive stress is considered in theory to depend on the intonational cues but is defined in practice, as we have seen, on the basis of accentual cues, as the converse of normal stress.

4. The idea is hinted at in a few places. Bolinger (1958a:52) gives an example of a sentence with the accent on a preposition and notes that in the context any other accent position would be ‘contrastive’. Schmerling (1976:89) also mentions accent placement by default in connection with ‘topic-comment’ sentences; but as we have seen, she considers deaccenting to involve actual absence of stress, and does not develop the notion of default accent.

5. Several friendly critics have objected that it is possible to disambiguate these two readings with intonation. Thus:

iii. How many languages do you speak [pitch diagram, with a high level head]

would likely be interpreted as deaccenting languages, while

iv. How many languages do you speak [pitch diagram, with a high falling tone]

would tend to imply focus on speak. But while it is possible to disambiguate, it is not necessary; moreover, the correlations between certain intonations and certain interpretations are, I think, only probabilities. From the point of view of where the accent goes and not how it relates to the intonation—i.e., purely from the point of view of the rhythmic structure, not the intonation—the structure is ambiguous.

6. This brings up the interesting question of the interaction of word order with accent, which is discussed in some detail by Daneš (1967) and Schubiger

7. Halliday also presupposes such a distinction; see especially p. 206.



8. The greater accentability of content words than function words is part of any description of English sentence accent and I presume needs no justification here. As for the greater accentability of nouns than other content words, the reader is asked to suspend the amassing of counterexamples until subsection d. below, where some of the details of this idea are discussed.

9. Notice that this view of ‘contrastive stress’ (in Bolinger’s sense) actually argues against the distinction that Bolinger makes between contrastive accent and contrastive stress. Focus on part of a word is seen here merely as one end of a spectrum, not as something essentially different from focus on a word or constituent.

10. Instances of such accent shift in noun compounds are surprisingly common in everyday speech, as one quickly learns when listening for them. Note that the examples here have been chosen so that the ‘contrastive’ argument is not possible. In, say, We’ve got lots of fuel púmps, but no fuel filters, one could argue that pumps and filters take the accent away from the normally accented fuel because of the contrast. But in We’ve got lots of bóoks, but we haven't got any bookcáses, in order to explain the accent on cases as contrastive, one would have to say that it contrasts with zero—which is true of anything.

11. In this case there is another possible solution, namely, not to deaccent at all. (We've got lots of books, but we haven't got any bóokcases). Compounds seem to vary in their tolerance for having their internally determined accent pattern disrupted by deaccenting.

12. Note that in all these examples George must be interpreted as deaccented, as in the other cases of deaccented names discussed above.

13. Thanks to Eric Brook for stretching my interpretive component with regard to these sentences.

14. This is not to imply, of course, that this general view is uniquely Bolinger’s; it is espoused by a variety of (broadly) ‘functionalist’ writers. Bolinger, however, has been most active in applying this outlook to suprasegmental analysis.

5. Paralanguage and Gradience

1. For the intonation of yes-no questions to which Bolinger refers in this quote, see Fries (1964) and Lee (forthcoming).

2. I am taking for granted here that dimensions such as loudness, tempo, and various differences in ‘tone of voice’ are properly considered “less linguistic” or paralinguistic and are organized in more or less the way described by Crystal (1969a, Ch. 4; see also 1975).

3. But Nash and Mulac, concentrating on the pitch movement on thought rather than on the overall pitch contour of the utterance, call the difference between these two contours the contrast between Bolinger’s A and B accents. This shows, among other things, the difficulty of applying Bolinger’s taxonomy.

6. Around the Edge of Language?

1. From Wooden Ships, by David Crosby and Stephen Stills, copyright 1969 by Gold Hill Music and Guerrilla Music.



2. Bolinger (1964:20-21) makes comparable observations about the universality of certain paralinguistic cues.

3. To the extent that Osser was eliciting judgments of paralinguistic emotional signals and not of linguistically systematized intonation contours, this result is exactly what we would expect.

4. In fact, my colleague Louis Mangione informs me that compared to other European works of the same period, Hillier’s book is actually quite enlightened.

7. Intonation and Grammar

1. Pike is obviously concerned here with distinguishing intonation from lexical tone in tone languages, but the fact that that is his emphasis does not alter his fundamental distinction between intonational and segmental (‘lexical’) meaning.

2. Reacting to the foregoing section, Cutler writes (personal communication) that some as yet unpublished research by herself and others suggests that ‘closed class’ or function words are stored rather differently in memory from ‘open class’ or content words. If true, this is obviously an important finding, and to some extent deflates my assertion above that Cutler would “certainly” include closed class words in the lexicon. It seems to me nevertheless that the possibility of significant subdivisions within the lexicon does not affect my basic point, which is that intonational meaning should be assumed to be like lexical meaning until proved otherwise. We can, if research like Cutler’s warrants, change this to “intonational meaning should be assumed to be like one significant subdivision of lexical meaning—say, closed class words—until proved otherwise”; the point is essentially the same.

3. An earlier version of the fall-rise analysis presented here appears in Ladd


4. Jackendoff incorrectly borrows Bolinger’s terminology, calling fall ‘A accent’ and fall-rise ‘B accent’. We have already seen (Ch. 1, Sec. 3) that in Bolinger’s system fall and fall-rise are both A accents; B accent is very different. See Fig. 2, and also Ch. 1 n. 12.

5. Further evidence for this analysis is presented in Ch. 8.

6. The nucleus of the contradiction contour, of course, can move around freely:

[Three pitch diagrams of the contradiction contour on a sentence of the form I ... go drinking with ... yesterday, each with the nucleus on a different word and followed respectively by the continuations (. . . we were shooting pool.), (. . . it must have been Max you saw me with.), and (. . . it was last Thursday.)]



This seems to provide further evidence that the contradiction contour consists of a low-rise nucleus and a high-falling head; as we said in Ch. 1, however, this does not preclude interpreting the combination of the two as some sort of ‘holistic’ contour as Liberman and Sag suggest.

7. However, this is not to convey the impression that the taxonomy of falling-rising contours in English is clear and simple. Many writers distinguish fall-rise (with nucleus early in the sentence) not only from the contradiction contour, but also from fall-rise with nucleus late in the sentence. Lee (1956b), for example, distinguishes the falling-rising tune (roughly = contradiction contour), falling-rising sequence (nucleus early in the sentence), and falling-rising tone (nucleus late in the sentence). Halliday likewise separates his tones 13 and 53 (nucleus early in the sentence) from tone 4 (nucleus late in the sentence); Crystal’s notion of ‘nuclear tail’ (1969a: Ch. 5) deals with the same problem. However, the distinction between the falling-rising sequence and the falling-rising tone is much more elusive than that between the fall-rise and the contradiction contour, and Kingdon, for example, explicitly treats fall-rise early in the sentence (‘Divided Tone III’ or Tone IIIp) as a type of fall-rise (Tone III). (This treatment is implicit also in Palmer 1922.) Moreover, even analysts who insist on a distinction between the falling-rising sequence and the falling-rising tone show by their sometimes inconsistent transcriptions that this distinction is somehow less critical than others; these inconsistencies will be the subject of a separate paper. In any case, I do not feel that my failure to make finer distinctions among fall-rise tones represents a serious deficiency of my analysis, for the discussion in the next section seems equally applicable no matter where the nucleus occurs.

8. Here and throughout the rest of the chapter we will use the widely used British ‘tonetic’ notation first developed by Kingdon (1939). A ˇ placed directly before the nuclear syllable indicates a fall-rise; a ˋ placed directly before the nuclear syllable indicates a fall. Thus I fed the ˇcat is to be interpreted as

[pitch diagram: I fed the cat]

and In ˇIthaca, maybe is to be read

[pitch diagram: In Ithaca, maybe]

9. Actually, this is an oversimplification. With a fall-rise tone the speaker could also say In Massaˇchusetts, maybe (... but he doesn’t even live in New York). Here the set being referred to is something like ‘states of the U.S.’, represented in the context by New York. This recalls the situation remarked on above, where the set ‘animals’ is taken to be in the context on the basis of speaker A’s mention of dog.

10. Chafe (1976) implies the same kind of analysis of double-nucleus sentences as Jackendoff’s: he talks of ‘pairings’ of nouns, and implies that the pairing is what is being asserted.



11. Something like the interaction with quantifiers and superlatives seems to be the explanation for another specific fall-rise nuance pointed out by Cutler (1977):

iv. A: How do you like my new [...]? B: Not ˇbad. [pitch diagram]

The use of fall-rise in such contexts—with a wide variety of words expressing mild approval, like OK, all right, decent—sets up a hierarchy of possible evaluations and explicitly puts the evaluation given somewhere between the possible extremes. Hence the nuance of mediocrity, which is absent from not ˋbad, OˋK, all ˋright, ˋdecent. These latter make no reference to other possible evaluations, set up no hierarchy and do not downgrade the mild approval. Note in this connection the unlikelihood of using fall-rise with evaluative adjectives that are clearly at one end of the scale or the other:

v. A: How do you like my new [...] scheme? B: (??) Fanˇtastic. [pitch diagram]

12. Halliday, in a most distressing choice of terms, calls these three aspects tonicity (nucleus placement), tonality (boundary placement), and tone (pitch contour selection).

13. Cf. Gage (1958:128): “No source has ever given a satisfactory description of the phonetic nature of the terminals it assumes. The situation with regard to /|/ is particularly bad, since it is often defined no more than it would be by saying: ‘end of a phrase known not to end in either /||/ or /#/.’” This problem is not merely of historical interest, but seems likely to plague Liberman as well once he attempts to transcribe longer stretches of speech than are discussed in his dissertation. Since his tunes contain only a single pitch peak, two adjacent pitch peaks will therefore belong to separate tunes (cf. Ch. 3, Sec. 3 above), and will need to be separated by a boundary with optional boundary


8. Stylized Tones and the Phonology of Intonation

1. A slightly different version of this chapter appears as Ladd (1978). 2. The difference between Liberman’s and Leben’s view and Fox’s view is a specific instance of the tune-tone controversy. For Liberman and Leben, the whole configuration is a ‘tune’. In British terms, on the other hand, the low-pitched stretch that may precede the stepping-down pitches is the head, and the stepping-down pitches themselves constitute the nucleus (plus tail). Fox’s term ‘step-down tone’ expresses the structural parallel between the calling contour and the other nuclear tones. Crystal (1969b), taking only the final pitch into account, denies such a structural parallel, and claims that the true analog to the other nuclear tones is what he calls ‘level’ tone; Fox (1970), however, shows that Crystal's analysis makes incorrect predictions about nucleus placement. See Crystal (1969b) and Fox (1970) for details. 3. Pike (1945:71): “The same call can be given rapidly, but preserving the chanting character.” Crystal (1969b:34f) makes related observations.



4. This distinction goes unnoticed and unaccounted for in the ‘warning/calling’ analysis. 5. Dwight Bolinger (personal communication) offers a more general explanation for the shift in this example, namely that stylized intonation would seldom be used to answer questions (i.e., in this case the question What?). This makes sense in terms of the stylized analysis, since a speaker who asks a question signals that the answer is not predictable from the context. As Bolinger points out, stylized intonation in answer to a question has overtones of ‘I’ve told you a million times’:

i. A. Where's the phone book?
B. (... right where it belongs)


6. The extent to which gradience is involved here was suggested to me by both Dwight Bolinger and Duncan Gardiner (personal communications). Gardiner finds a “hierarchy of melodic forms”, from the least formalized (normal speech intonation) to the most formalized (accompanied singing), with the spiels of talking blues performers, street vendors, auctioneers, etc., on some sort of scale in between. Bolinger also suggests that there must be some connection between music and the steady level pitches of stylized intonation, and speculates that it may have to do with repetition. He makes the striking observation that one feature of poetry, music, and artistic utterance in general that makes it different from ordinary speech is that it is designed to be repeated. Hearing a song or poem for the twentieth time can be a profoundly moving experience; hearing the daily news for even the second or third time is a crashing bore. Bolinger suggests that the constant subtle ups and downs of pitch (which are levelled in both music and stylized intonation) are the “accompaniment of informative utterance.” While the functions of music and stylized intonation are obviously quite different, they both nevertheless convey a sense of familiarity or repetition. 7. Or perhaps more precisely, “level tone stepped up to” and “level tone stepped down to.” This, of course, is reminiscent of the basis on which Bolinger distinguishes B and C accent. 8. In the same way, it seems likely that the only difference between Liberman’s ‘warning/calling tune’ and ‘surprise/redundancy tune’ is that one ends in a plain fall and the other in a stylized fall but that both have the same type of head. 9. Woo (1969) distinguishes two broad traditions of tone language analysis, the ‘Orientalist’ and ‘Africanist’, the main difference being that the Africanists have had to deal with systems making considerable use of level tones, while the Orientalists have been confronted with languages having a great many contour tones. This has, understandably, colored their theoretical outlooks, and Woo’s distinction is very useful. As I suggest below, the autosegmental approach to pitch phenomena seems to have been much influenced by its development in the analysis of Africanist tone languages.



10. Writing the stylized tones in Trager-Smith notation is rather difficult, the difficulties arising principally from theoretical restrictions on the use of the juncture /|/, and from descriptive waffling on the acoustic nature of the juncture /#/ when it follows anything other than pitch level /1/. I asked a half-dozen members of the Cornell linguistics faculty, all well-trained in Trager-Smith notation, how they would transcribe the stylized contours, and got wildly differing responses; all three terminal junctures were suggested at least once. The Trager-Smith transcriptions here are based on discussions with C. F. Hockett. It is probably true that many linguists would prefer to use the /|/ terminal juncture for stylized contours rather than /#/; Hockett’s use of /|/ (as explained for example in Hockett 1958:37) is somewhat more restricted than in the original Trager-Smith system. But even if we wrote all the stylized tones with /|/, the relation to the plain tones would still not be very obvious. Liberman’s system is not much clearer. My choice of representation for plain and stylized fall is based on his discussion, pp. 98-104, and is surely the way he would handle the contrast. But the transcriptions of the rising tones must be extrapolated from various examples; possibly he would choose to represent them not as sequences of two ordinary tones, but as sequences of a tone and a ‘boundary tone’. In any case, my observations here would still be applicable. 11. Psychoacoustic experiments discussed by Howie (1976: see especially pp. 222 and 242-243) suggest that Mandarin second and third tones (roughly high-rise and low-rise, though they differ in other respects as well) are more readily confused in isolation than other pairs of tones, and that native speakers rely to some extent on the pitch of adjacent (especially following) syllables for information about the tone on a given syllable.

9. Intonation and Phonesthesia

1. The details vary, and in no case that I know of is a Middle Eastern gesture identical to the European one with the opposite meaning, but in any case up-and-down movements are used for ‘no’ and sideways ones for ‘yes’. 2. Hockett borrows from Hjelmslev the terms plereme and ceneme (based on the Greek roots for ‘full’ and ‘empty’, respectively) for use in his discussion of design features of symbolic systems, mainly because their applicability is not restricted to human language. For language, the pleremes are morphemes and the cenemes phonemes.

REFERENCES

This is not intended to be a complete bibliography on intonation, but contains only those works referred to in the text. Except as otherwise noted below, page references in the text are to anthology versions and not to the original source. For more complete bibliographies, the reader is referred to the following sources:

Crystal (1969a) for the most comprehensive listing on English, and Crystal (1975), which updates Crystal (1969a) and adds coverage of paralanguage and acquisition of prosody.
Léon and Martin (1970) for broad coverage of studies on tone and intonation in a variety of languages.
Lieberman (1967) and Lehiste (1970) for coverage of acoustic phonetic and physiological studies.
Pike (1945) for the most comprehensive listings of pre-20th-century work.

Abe, Isamu. 1962. Call-Contours. In Proceedings of the Fourth Int'l. Congress of Phonetic Sciences, Helsinki, pp. 519-523. The Hague: Mouton.
Abercrombie, David. 1964. Syllable Quantity and Enclitics in English. In Abercrombie et al., pp. 216-222.
Abercrombie, David, D. B. Fry, P. A. D. McCarthy, N. C. Scott, and J. L. M. Trim, eds. 1964. In Honour of Daniel Jones. London: Longmans.
Akmajian, Adrian, and Ray Jackendoff. 1970. Coreferentiality and Stress. Linguistic Inquiry 1: 124-126.

Armstrong, Lilias E., and Ida C. Ward. 1926. A Handbook of English Intonation. Leipzig and Berlin: Teubner.
Bally, Charles. 1941. Intonation et Syntaxe. Cahiers F. de Saussure 1: 33-42.
Berman, Arlene, and Michael Szamosi. 1972. Observations on Sentential Stress. Language 48: 304-325.
Bierwisch, Manfred. 1968. Two Critical Problems in Accent Rules. Journal of Linguistics 4: 173-178.
Bloomfield, Leonard. 1933. Language. New York: Holt.
Bolinger, Dwight. 1940. Word Affinities. American Speech 15: 62-73. Reprinted in Bolinger 1965a, pp. 191-202.
Bolinger, Dwight. 1947. Comments on Pike’s Intonation of American English. Studies in Linguistics 5: 69-78.
Bolinger, Dwight. 1949. Intonation and Analysis. Word 5: 248-254.
Bolinger, Dwight. 1950. Rime, Assonance, and Morpheme Analysis. Word 6: 117-136.
Bolinger, Dwight. 1951. Intonation: Levels versus Configurations. Word 7: 199-210. Reprinted in Bolinger 1965a, pp. 3-16.
Bolinger, Dwight. 1955. Intersections of Stress and Intonation. Word 11: 195-203.
Bolinger, Dwight. 1957a. On Certain Functions of Accents A and B. Litera 4: 80-89. Reprinted in Bolinger 1965a, pp. 57-66.
Bolinger, Dwight. 1957b. Maneuvering for Accent and Position. College Composition and Communication 8: 234-238. Reprinted in Bolinger 1965a, pp. 309-315.
Bolinger, Dwight. 1958a. A Theory of Pitch Accent in English. Word 14: 109-149. Reprinted in Bolinger 1965a, pp. 17-55.
Bolinger, Dwight. 1958b. Stress and Information. American Speech 33: 3-20. Reprinted in Bolinger 1965a, pp. 67-83.
Bolinger, Dwight. 1961a. Generality, Gradience, and the All-or-None. The Hague: Mouton.
Bolinger, Dwight. 1961b. Contrastive Accent and Contrastive Stress. Language 37: 83-96. Reprinted in Bolinger 1965a, pp. 101-117.
Bolinger, Dwight. 1961c. Ambiguities in Pitch Accent. Word 17: 309-317. Reprinted in Bolinger 1965a, pp. 119-127.
Bolinger, Dwight. 1964. Around the Edge of Language: Intonation. Harvard Educational Review 34: 282-296. Reprinted (slightly abridged) in Bolinger 1972a, pp. 19-29.
Bolinger, Dwight. 1965a. Forms of English: Accent, Morpheme, Order (I. Abe & T. Kanekiyo, eds.). Cambridge, Mass.: Harvard Univ. Press.
Bolinger, Dwight. 1965b. Pitch Accent and Sentence Rhythm. In Bolinger 1965a, pp. 139-180.
Bolinger, Dwight. 1966. Review of Faure 1962. Language 42: 670-690.
Bolinger, Dwight. 1970. Relative Height. In Léon et al., pp. 109-127. Reprinted in Bolinger 1972a, pp. 137-153.
Bolinger, Dwight, ed. 1972a. Intonation. Harmondsworth, England: Penguin.
Bolinger, Dwight. 1972b. Accent Is Predictable (If You’re a Mind-Reader). Language 48: 633-644.
Bolinger, Dwight. 1975. Aspects of Language (2d ed.). New York: Harcourt Brace Jovanovich.
Bolinger, Dwight. 1976. Length, Vowel, Juncture. Bilingual Review 3: 43-61.
Bolinger, Dwight. 1977. Meaning and Form. London: Longmans.
Bolinger, Dwight. 1978a. Intonation Across Languages. In Greenberg et al., eds., Universals of Human Language, vol. 2 (Phonology). Stanford Univ. Press.
Bolinger, Dwight. 1978b. Free Will and Determinism in Language: Or, Who Does the Choosing, the Grammar or the Speaker? In M. Suñer, ed., Contemporary Studies in Romance Linguistics. Washington: Georgetown Univ. Press, pp. 1-17.
Bolinger, Dwight, and Louis J. Gerstman. 1957. Disjuncture as a Cue to Constructs. Word 13: 246-255. Reprinted in Bolinger 1965a, pp. 85-93.
Bresnan, Joan. 1971. Sentence Stress and Syntactic Transformations. Language 47: 257-281.
Bresnan, Joan. 1972. Stress and Syntax: A Reply. Language 48: 326-342.
Chafe, Wallace L. 1970. Meaning and the Structure of Language. Chicago: Univ. of Chicago Press.
Chafe, Wallace L. 1973. Language and Memory. Language 49: 261-281.
Chafe, Wallace L. 1974. Language and Consciousness. Language 50: 111-133.
Chafe, Wallace L. 1976. Givenness, Contrastiveness, Definiteness, Subjects, Topics, and Points of View. In C. N. Li, ed., Subject and Topic, pp. 25-55. New York: Academic Press.
Chao, Y. R. 1930. A System of Tone Letters. Maître Phonétique 45: 24-27.
Chao, Y. R. 1932. A Preliminary Study of English Intonation (with American Variants) and its Chinese Equivalents (T‘sai Yüan Pei Anniversary Volume, Supplementary Volume I of Bulletin of the Institute of History and Philology of the Academia Sinica), Peiping.




Chao, Y. R. 1968. A Grammar of Spoken Chinese. Berkeley and Los Angeles: Univ. of California Press.
Chomsky, Noam. 1971. Deep Structure, Surface Structure, and Semantic Interpretation. In D. Steinberg and L. Jakobovits, eds., Semantics: An Interdisciplinary Reader in Philosophy, Linguistics and Psychology, pp. 183-216. Cambridge: Cambridge Univ. Press.
Chomsky, Noam, and Morris Halle. 1968. The Sound Pattern of English. New York: Harper & Row.
Chomsky, Noam, Morris Halle, and Fred Lukoff. 1956. On Accent and Juncture in English. In M. Halle et al., eds., For Roman Jakobson, pp. 65-80. The Hague: Mouton.
Coleman, H. O. 1914. Intonation and Emphasis. Miscellanea Phonetica 1: 6-26.
Coker, C. H., N. Umeda, and C. P. Browman. 1973. Automatic Synthesis from Ordinary English Text. In J. L. Flanagan and L. R. Rabiner, eds., Speech Synthesis, pp. 400-411. Stroudsburg, Pa.: Dowden, Hutchinson, & Ross, Inc.
Cross, D. V., and H. L. Lane. 1964. An Analysis of the Relation between Identification and Discrimination Functions for Speech and Non-Speech Continua. Unpublished Report #05613-3-P, Behavior Analysis Laboratory, Univ. of Michigan.
Crystal, David. 1969a. Prosodic Systems and Intonation in English. Cambridge: Cambridge Univ. Press. (Parts of Ch. 5 reprinted in Bolinger 1972a, but page references herein are to original.)
Crystal, David. 1969b. A Forgotten English Tone: An Alternative Analysis. Maître Phonétique 84: 34-37.
Crystal, David. 1969c. Review of Halliday 1967a. Language 45: 378-393.
Crystal, David. 1975. The English Tone of Voice. London: Arnold.
Cutler, Anne. 1977. The Context-Dependence of ‘Intonational Meanings’. In Papers from the 13th Regional Meeting, Chicago Linguistic Society, pp. 104-115.
Cutler, Anne, and Stephen D. Isard. 1978. The Production of Prosody. In B. Butterworth, ed., Language Production. New York: Academic Press.
Daneš, František. 1960. Sentence Intonation from a Functional Point of View. Word 16: 34-54.
Daneš, František. 1967. Order of Elements and Sentence Intonation. In To Honour Roman Jakobson, pp. 499-512. The Hague: Mouton. Reprinted in Bolinger 1972a, pp. 216-232.
Delattre, Pierre. 1963. Comparing the Prosodic Features of English, German, Spanish, and French. International Review of Applied Linguistics 1: 193-210.
Delattre, Pierre. 1966. Les Dix Intonations de Base du Français. French Review 40: 1-14.
Delattre, Pierre. 1972. The Distinctive Function of Intonation. In Bolinger 1972a, pp. 159-174.

Downing, Bruce. 1970. Syntactic Structure and Phonological Phrasing in English. Dissertation, University of Texas.
Faure, Georges. 1962. Recherches sur les caractères et le rôle des éléments musicaux dans la prononciation anglaise. Paris: Didier.
Fox, Anthony. 1969. A Forgotten English Tone. Maître Phonétique 84: 13-14.
Fox, Anthony. 1970. The Forgotten Tone: A Reply. Maître Phonétique 85: 29-31.
Fries, Charles C. 1964. On the Intonation of ‘Yes/No’ Questions in English. In Abercrombie et al., pp. 242-254.
Fry, D. B. 1955. Duration and Intensity as Physical Correlates of Linguistic Stress. Journal of the Acoustical Society of America 27: 765-769.
Fry, D. B. 1958. Experiments in the Perception of Stress. Language and Speech 1: 126-152.
Gage, William. 1958. Grammatical Structures in American English Intonation. Dissertation, Cornell University.
Garcia, Erica. 1975. The Role of Theory in Linguistic Analysis: The Spanish Pronoun System. Amsterdam: North-Holland.
Gardiner, Duncan. 1977. Two Assumptions in the Study of Intonation. Paper read at LSA Annual Meeting, Chicago.

Gårding, Eva, and Louis J. Gerstman. 1960. The Effects of Changes in Location of an Intonation Peak on Sentence Stress. Studia Linguistica 14: 57-59.
Gary, Norman. 1976. A Discourse Analysis of Certain Root Transformations in English. Formerly distributed by Indiana University Linguistics Club, Bloomington.
Gibbon, Dafydd. 1976a. Perspectives of Intonation Analysis (Forum Linguisticum, vol. 9). Bern: Lang.
Gibbon, Dafydd. 1976b. Performatory Categories in Contrastive Intonation Analysis. In D. Chitoran, ed., 2d Intl. Conference of English Contrastive Projects, Bucharest, pp. 145-156. Bucharest: University Press and Arlington, Va.: Center for Applied Linguistics.
Glenn, Marilyn. 1977. The Pragmatic Function of Intonation. Paper read at LSA Annual Meeting, Chicago.
Goldsmith, John. 1976. Autosegmental Phonology. Dissertation, MIT. Distributed by Indiana University Linguistics Club, Bloomington.
Greenberg, S. Robert. 1969. An Experimental Study of Certain Intonation Contrasts in American English. Los Angeles: UCLA Working Papers in Phonetics, no. 13.

Gunter, Richard. 1966. On the Placement of Accent in Dialogue: A Feature of Context Grammar. Journal of Linguistics 2: 159-179. Reprinted in Gunter 1974a.
Gunter, Richard. 1972. Intonation and Relevance. In Bolinger 1972a, pp. 194-215. Reprinted in Gunter 1974a. (Page references are to Bolinger 1972a version.)
Gunter, Richard. 1974a. Sentences in Dialog. Columbia, S.C.: Hornbeam Press.
Gunter, Richard. 1974b. Context Grammar and Relevance. In Gunter 1974a.

Gunter, Richard. 1976. Review of Lieberman 1967. Language in Society 5: 390-401.
Hadding-Koch, Kerstin, and Michael Studdert-Kennedy. 1964. An Experimental Study of Some Intonation Contours. Phonetica 11: 175-185. Reprinted in Bolinger 1972a, pp. 348-358.
Halle, Morris, and S. J. Keyser. 1971. English Stress: Its Form, Its Growth, Its Role in Verse. New York: Harper & Row.
Halliday, M. A. K. 1967a. Intonation and Grammar in British English. The Hague: Mouton.
Halliday, M. A. K. 1967b. Notes on Transitivity and Theme in English (Part II). Journal of Linguistics 3: 199-244.
Hillier, Sir Walter. 1910. The Chinese Language, and How to Learn It. London: Kegan Paul.
Hirst, D. J. 1974. Intonation and Context. Linguistics 141: 5-16.
Hirst, D. J., and M. Ginésy. 1974. An Approach to the Integration of Intonation into the Syntactic Description of English. Linguistics 121: 45-55.
Hockett, Charles F. 1955. A Manual of Phonology (Indiana University Publications in Anthropology and Linguistics, no. 11), Bloomington.
Hockett, Charles F. 1958. A Course in Modern Linguistics. New York: Macmillan.
Hockett, Charles F. 1977. Some Historical Notes on Autonomous Phonology. Paper read at Symposium on Autonomous Phonology, LSA Annual Meeting, Chicago.
Householder, Fred. 1957. Accent, Juncture, Intonation, and My Grandfather’s Reader. Word 13: 234-245.
Howie, J. M. 1976. Acoustical Studies of Mandarin Vowels and Tones. Cambridge: Cambridge Univ. Press.
Hultzén, Lee. 1956. ‘The Poet Burns’ Again. American Speech 31: 195-201.
Hultzén, Lee. 1959. Information Points in Intonation. Phonetica 4: 107-120.
Jackendoff, Ray S. 1972. Semantic Interpretation in Generative Grammar. Cambridge, Mass.: MIT Press.
Jakobson, Roman. 1971. Shifters, Verbal Categories, and the Russian Verb. In Selected Writings II. The Hague: Mouton.
Jakobson, Roman, and Morris Halle. 1971. Fundamentals of Language, 2d ed. The Hague: Mouton.
Jassem, Wiktor. 1952. Intonation of Conversational English (Educated Southern British) (Travaux de la Société des Sciences et des Lettres de Wroclaw, Seria A, no. 45), Wroclaw.
Jones, Daniel. 1909. Intonation Curves. Leipzig and Berlin: Teubner.
Joos, Martin. 1962. The Five Clocks. New York: Harcourt Brace & World.
Kac, Michael B. 1978. Corepresentation of Grammatical Structure. Minneapolis: Univ. of Minnesota Press.

Kaplan, Eleanor. 1969. The Role of Intonation in the Acquisition of Language. Dissertation, Cornell University.
Kingdon, Roger. 1939. Tonetic Stress Markers for English. Maître Phonétique 54: 60-64.
Kingdon, Roger. 1958. The Groundwork of English Intonation. London: Longmans.
Ladd, D. Robert, Jr. 1977. The Function of the A-Rise Accent in English. Formerly distributed by Indiana University Linguistics Club, Bloomington.
Ladd, D. Robert, Jr. 1978. Stylized Intonation. Language 54: 517-540.
Ladd, D. Robert, Jr. 1979. Light and Shadow: A Study of the Syntax and Semantics of Sentence Accent in English. In L. Waugh and F. van Coetsem, eds., Contributions to Grammatical Analysis: Semantics and Syntax. Leiden: E. J. Brill.
Ladefoged, Peter. 1967. Linguistic Phonetics. Los Angeles: UCLA Working Papers in Phonetics, no. 6.
Lakoff, George. 1972. The Global Nature of the Nuclear Stress Rule. Language 48: 285-303.
Lakoff, Robin. 1972. Language in Context. Language 48: 907-927.

Lakoff, Robin. 1975. Language and Woman's Place. New York: Harper & Row.
Lane, H. L. 1965. The Motor Theory of Speech Perception: A Critical Review. Psychological Review 72: 275-309.
Lashley, Karl. 1951. The Problem of Serial Order in Behavior. In L. A. Jeffress, ed., Cerebral Mechanisms in Behavior. New York: John Wiley & Sons.
Leben, William. 1976. The Tones in English Intonation. Linguistic Analysis 2: 69-107.
Lee, W. R. 1956a. English Intonation: A New Approach. Lingua 5: 345-371.
Lee, W. R. 1956b. Fall-Rise Intonations in English. English Studies 37: 62-72.
Lee, W. R. forthcoming. A Point about the Rise-Endings and Fall-Endings of Yes/No Questions. To appear in Waugh and van Schooneveld.
Lehiste, Ilse. 1970. Suprasegmentals. Cambridge, Mass.: MIT Press.
Lenneberg, Eric. 1967. Biological Foundations of Language. New York: John Wiley & Sons.
Léon, Pierre, Georges Faure, and André Rigault, eds. 1970. Prosodic Feature Analysis (Studia Phonetica, Vol. 3), Montréal: Didier.



Léon, Pierre, and Philippe Martin. 1970. Prolégomènes à l’étude des structures intonatives (Studia Phonetica, Vol. 2), Montréal: Didier. (Portions appear in translation in Bolinger 1972a, pp. 30-47.)
Lewis, J. W. 1970. The Tonal System of Remote Speech. Maître Phonétique 85: 31-36.
Li, Charles, and Sandra Thompson. 1978. The Acquisition of Tone. In V. Fromkin, ed., Tone: A Linguistic Survey. New York: Academic Press, pp. 271-284.
Liberman, A. M., K. S. Harris, H. S. Hoffman, and B. C. Griffith. 1957. The Discrimination of Speech Sounds within and across Phoneme Boundaries. Journal of Experimental Psychology 54: 358-368.
Liberman, Mark. 1978. The Intonational System of English. Dissertation, MIT. Distributed by Indiana University Linguistics Club, Bloomington.
Liberman, Mark, and Alan Prince. 1977. On Stress and Linguistic Rhythm. Linguistic Inquiry 8: 249-336.

Liberman, Mark, and Ivan Sag. 1974. Prosodic Form and Discourse Function. In Papers from the 10th Regional Meeting, Chicago Linguistic Society, pp. 416-427.

Lieberman, Philip. 1965. On the Acoustic Basis of the Perception of Intonation by Linguists. Word 21: 40-54.
Lieberman, Philip. 1967. Intonation, Perception, and Language. Cambridge, Mass.: MIT Press. (Paperback edition 1975.)
Lieberman, Philip. 1976. Review of Crystal 1969a. Language 52: 508-511.
Lieberman, Philip, and Sheldon Michaels. 1962. Some Aspects of Fundamental Frequency and Envelope Amplitude as Related to the Emotional Content of Speech. Journal of the Acoustical Society of America 34: 922-927. Reprinted in Bolinger 1972a, pp. 235-249.
Malone, Kemp. 1926. Pitch Patterns in English. Studies in Philology 23: 371-


Martin, James G. 1972. Rhythmic (Hierarchical) versus Serial Structure in Speech and other Behavior. Psychological Review 79: 487-509.
Martin, Philippe. ms. Syntax and Intonation: An Integrated Theory. Toronto Semiotic Circle Prepublication.
Mol, H., and E. Uhlenbeck. 1956. The Linguistic Relevance of Intensity in Stress. Lingua 5: 205-213.
Nash, Rose, and Anthony Mulac. forthcoming. The Intonation of Verifiability. To appear in Waugh and van Schooneveld.
Newman, Stanley. 1946. On the Stress System of English. Word 2: 171-187.
Nida, Eugene A. 1948. The Identification of Morphemes. Language 24: 414-441. Reprinted in M. Joos, ed., Readings in Linguistics, I. Chicago: Univ. of Chicago Press, 1957.
O’Connor, J. D., and G. F. Arnold. 1961. Intonation of Colloquial English. London: Longmans.
Osgood, C. E., G. J. Suci, and P. H. Tannenbaum. 1957. The Measurement of Meaning. Urbana: Univ. of Illinois Press.
Osser, Henry A. 1964. A ‘Distinctive Features’ Analysis of the Vocal Communication of Emotion. Dissertation, Cornell University.
Palmer, Harold. 1922. English Intonation, with Systematic Exercises. Cambridge: Heffer.
Peck, Charles. 1969. An Acoustic Investigation of American English Intonation. Ann Arbor: Univ. of Michigan Phonetics Lab.
Pike, Kenneth L. 1945. The Intonation of American English. Ann Arbor: Univ. of Michigan Press. (Portions of Section 3 reprinted in Bolinger 1972a, pp. 53-82, but page references herein are to the original version.)
Pilch, H. 1970. The Elementary Intonation Contour of English: A Phonemic Analysis. Phonetica 22: 82-111.

Pittenger, R. E., C. F. Hockett, and J. J. Danehy. 1960. The First Five Minutes. Ithaca, N.Y.: Martineau.


Rando, Emily. 1972. Questions and Answers in English. Dissertation,
Rando, Emily. forthcoming. Intonation in Discourse. To appear in Waugh and van Schooneveld.
Reid, Wallis. 1976. The Quantitative Validation of a Grammatical System: The passé simple and the imparfait. In Papers from the 7th Annual Meeting, Northeast Linguistic Society.
Sag, Ivan, and Mark Liberman. 1975. The Intonational Disambiguation of Indirect Speech Acts. In Papers from the 11th Regional Meeting, Chicago Linguistic Society, pp. 487-497.

Schane, Sanford. 1977. The Rhythmic Nature of English Word Stress. Paper read at LSA Annual Meeting, Chicago.
Schmerling, Susan F. 1974. A Re-examination of Normal Stress. Language 50: 66-73.
Schmerling, Susan F. 1976. Aspects of English Sentence Stress. Austin: Univ. of Texas Press.
Schubiger, Maria. 1956. Again: Fall-Rise Intonations in English. English Studies 37: 157-160.
Schubiger, Maria. 1958. English Intonation: Its Form and Function. Tübingen: Max Niemeyer.
Schubiger, Maria. 1964. The Interplay and Cooperation of Word Order and Intonation in English. In Abercrombie et al., pp. 255-265.
Schubiger, Maria. 1965. English Intonation and German Modal Particles: A Comparative Study. Phonetica 12: 65-84. Reprinted in Bolinger 1972a, pp. 175-193.
Schubiger, Maria. forthcoming. English Intonation and German Modal Particles: A Contrastive Study II. To appear in Waugh and van Schooneveld.
Sharp, Alan E. 1958. Falling-Rising Intonation Patterns in English. Phonetica 2: 127-152.
Shibatani, Masayoshi. 1977. Grammatical Relations and Surface Cases. Language 53: 789-809.
Sledd, James. 1955. Review of Trager and Smith 1951. Language 31: 312-335.
Sledd, James. 1956. Superfixes and Intonation Patterns. Litera 3: 35-41.
Smith, Henry Lee. 1955. Review of Jassem 1952. Language 31: 189-193.

Stetson, R. H. 1951. Motor Phonetics. Amsterdam: North-Holland.
Stockwell, Robert. 1960. The Place of Intonation in a Generative Grammar of English. Language 36: 360-367.
Stockwell, Robert. 1972. The Role of Intonation: Reconsiderations and Other Considerations. In Bolinger 1972a, pp. 87-109.
Sweet, Henry. 1892. New English Grammar, Part I. Oxford: Clarendon Press.

Trager, George L. 1964. The Intonation System of American English. In Abercrombie et al. Reprinted in Bolinger 1972a, pp. 83-86.

169, 186, 219n4, 219n8

Weak stress. See Unstressed
Weak syllable, 18, 20, 22, 50, 68, 129. See also Rhythmic structure
WH-question, 63, 85, 111
Women’s speech, 212n6
Word order, 92, 214n6
Word stress, 20-21, 33, 86-87, 150. See also Stress vs. accent; Stress shift
Writing systems, intonation in, 120


Yes-no questions, 105, 124, 139, 208n7

0-stress. See Unstressed










(continued from front flap) Finally, Ladd proposes a model that integrates into a coherent whole the broad areas of consensus and agreement in past work. It is a model in which ‘intonation’ and ‘stress’ are clearly distinguished: ‘intonation’ consists phonologically of ‘kinetic’ tones much like those of the typical British analysis, while ‘stress’ patterns are based on the hierarchical rhythmic organization of utterances.

“... should be read by every linguist who has any special interest in intonation. As a general overview of the field, it may in fact also be the best available book-length study for the nonspecialist.”
Bruce T. Downing, University of Minnesota

D. ROBERT LADD, JR., taught English and linguistics at Babes-Bolyai University, Cluj, Romania, on a Fulbright teaching award, and is on the staff of the Department of Modern Languages and Linguistics at Cornell University. He has written articles published in Language and elsewhere.

Indiana University Press
Printed in U.S.A.

Modern Linguistics: The Results of Chomsky's Revolution
By Neil Smith and Deirdre Wilson
“The emphasis of the book is, understandably, on transformational grammar, but such topics as phonetics and phonemics, syntax, semantics and meaning, linguistic variation and change . . . and universals of language are also included. . . . the book is very useful for either individual or classroom use.” (Choice)
“[Smith and Wilson's] book is, I think, the best general introduction that has yet appeared to a discipline that is of its nature very complex.” (Anthony Burgess, The Observer)
ISBN 0-253-19457-1. 336 pages, index, diags., figs.

The Sound Shape of Language
By Roman Jakobson and Linda R. Waugh
Assisted by Martha Taylor

What are the “ultimate constituents of language,” the smallest units of sound we are able to discriminate and to which we attach significance? In this volume, Roman Jakobson and Linda R. Waugh consider the speech sound as artifact endowed with multiple functions. Individual chapters treat speech sounds and their tasks, the quest for ultimate constituents, the network of distinctive features, and the spell of speech sounds.
“. . . an integrated, comprehensive synthesis of Jakobson’s view of phonology with particular reference to distinctive feature theory. Sound symbolism and poetics are also treated in full detail. . . . Excellent bibliography and index. An important work.” (Choice)
352 pages, bibl., index

ISBN 0-253-16417-6

Current Approaches to Phonological Theory
Edited by Daniel A. Dinnsen
Since the advent of generative phonology, linguists have turned their attention to elaborating or constraining the “standard theory.” The present volume engages in dialogue the leading proponents of some of the most stimulating current approaches to phonological theory and makes possible a serious and systematic comparison of their views.
“. . . a state-of-the-ferment . . . volume, it is important to the specialist and the historian.” (Library Journal)
352 pages, bibl.

ISBN 0-253-31596-4

Indiana University Press * Bloomington & London ISBN 0-253-15864-8