Intonational Morphology (Prosody, Phonology and Phonetics) 9811522634, 9789811522635

This book discusses the morphological properties of intonation, building on past research to support the long-recognized


English Pages 256 [248] Year 2020


Table of contents :
Acknowledgments
List of Abbreviations
Contents
Chapter 1: Introduction
References
Chapter 2: The Forms and Functions of Intonation
2.1 The Functions of Suprasegmentals
2.2 A Definition of Intonation Based on Its Functions
2.3 The Forms of Suprasegmentals
2.4 Concluding Remarks
References
Chapter 3: Intonational Meaning
3.1 The Nature of Intonational Meaning
3.1.1 Context-Dependent Versus Context-Independent Meaning
3.1.2 Compositional Versus Holistic Meaning
3.1.3 Phonological Similarity and Homophony
3.1.4 Gradient Versus Categorically Distinct Forms and Meanings
3.1.5 The Linguist’s Theory of Intonational Meaning
3.1.6 Testing the Linguist’s Theory of Intonational Meaning
3.2 Intonation and Discourse Particles
3.2.1 Intonation and Segmental Particles Are Two Forms of the Same Thing
3.2.2 The Similar Debates About Particle and Intonational Meanings
3.3 Concluding Remarks
References
Chapter 4: Evidence of the Morphological Nature of Intonation
4.1 Tonal Grammatical Particles and Their Segmental Counterparts
4.2 Tonal Discourse Particles and Their Segmental Counterparts
4.3 Concluding Remarks
References
Chapter 5: Evidence via Cantonese
5.1 The Cantonese Language
5.1.1 Why Cantonese Is Ideal for This Kind of Research
5.1.2 Intonation in Cantonese
5.1.3 Cantonese Sentence-final Particles
5.2 The Design of the Research
5.2.1 The Participants
5.2.2 The Corpus and the Dialogues
5.2.3 Data Collection
5.2.4 Data Analysis
5.3 Defining Sentence-final Particles
5.3.1 The Natural Semantic Metalanguage Theory
5.3.2 Defining Sentence-final Particles with the Natural Semantic Metalanguage
5.4 Concluding Remarks
References
Chapter 6: The Results of the Research
6.1 Two Evidential Particles: lo1 and aa1maa3
6.1.1 The Particle lo1
6.1.2 The Particle aa1maa3
6.1.3 Summary and Analysis
6.2 Two Question Particles
6.2.1 The Particle me1
6.2.2 The Particle aa4
6.2.3 Summary and Analysis
6.3 Two “Only” Particles: ze1 and zaa3
6.3.1 The Particles ze1 and zaa3
6.3.2 Summary and Analysis
6.4 Concluding Remarks
References
Chapter 7: The Syntax of Intonation
7.1 Background Information
7.1.1 Intonation and Syntax
7.1.2 Cartographic Syntax
7.2 Tonal Morphemes That Function as Grammatical Particles
7.3 Tonal Morphemes That Function as Discourse Particles
7.3.1 The Syntax of Polar Interrogative Particles
7.3.2 The Syntax of Discourse Particles
7.4 Prosodic Structure
7.5 Concluding Remarks
References
Chapter 8: Conclusions and Implications
References
Appendix
X-Bar Theory
The Split-CP Hypothesis and the Cartographic Approach
References

Prosody, Phonology and Phonetics

John C. Wakefield

Intonational Morphology

Series Editors
Daniel J. Hirst, CNRS Laboratoire Parole et Langage, Aix-en-Provence, France
Hongwei Ding, School of Foreign Languages, Shanghai Jiao Tong University, Shanghai, China
Qiuwu Ma, School of Foreign Languages, Tongji University, Shanghai, China

The series will publish studies in the general area of Speech Prosody with a particular (but non-exclusive) focus on the importance of phonetics and phonology in this field. The topic of speech prosody is today a far larger area of research than is often realised. The number of papers on the topic presented at large international conferences such as Interspeech and ICPhS is considerable and regularly increasing. The proposed book series would be the natural place to publish extended versions of papers presented at the Speech Prosody Conferences, in particular the papers presented in Special Sessions at the conference. This could potentially involve the publication of 3 or 4 volumes every two years ensuring a stable future for the book series. If such publications are produced fairly rapidly, they will in turn provide a strong incentive for the organisation of other special sessions at future Speech Prosody conferences. More information about this series at http://www.springer.com/series/11951

John C. Wakefield
Department of English Language and Literature
Hong Kong Baptist University
Kowloon Tong, Hong Kong

Department of Asian Studies
Palacký University
Olomouc, Czech Republic

ISSN 2197-8700
ISSN 2197-8719 (electronic)
Prosody, Phonology and Phonetics
ISBN 978-981-15-2263-5
ISBN 978-981-15-2265-9 (eBook)
https://doi.org/10.1007/978-981-15-2265-9

© Springer Nature Singapore Pte Ltd. 2020

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

I dedicate this book to the memory of my father, Jerry J. Wakefield. Everything good in me comes from him. The rest is my own doing.

Acknowledgments

I could never have completed this book without the help of many people at different stages of the project. There are too many to name, but I will mention a few. Robert Bauer was instrumental in getting me into my PhD program at HKPolyU and has always been a great mentor for my research related to Cantonese. Tang Sze-Wing supervised my PhD and has taught me a great deal over the years and still motivates me to work hard to this day. Cliff Goddard, who has always been very approachable and encouraging, helped me with some of my definitions of discourse meanings in Chap. 6. Daniel Hirst motivated me tremendously as a PhD student by expressing interest in my research. He has given me crucial comments and advice about the contents of the book and graciously provided the script that was used for all of the PRAAT figures. Liliane Haegeman, who is another friendly and approachable linguist, has been an inspiration to me and has taught me a great deal. Enoch Aboh has been very encouraging about my work and has given me very helpful comments. Lian-Hee Wee has been the source of many fruitful discussions and was kind enough to read and comment on a draft of the book. Noam Chomsky, who I unfortunately have never met, is responsible for my initial interest in linguistics; I hope he considers this book a worthwhile contribution to the field. I consider all of these people to be mentors and hope they are pleased with the final product of my work. Of course, I am solely responsible for any and all errors that remain. I thank all those who, over fine drink and food and on many occasions, have demonstrated that it is possible to be an academic and maintain a sense of humor: František Kratochvíl, Jason Polley, Mark Wallbanks, and Nick Williams. I thank the ambilingual participants, who generously volunteered their time to provide the translations that make up the data in Chap. 6. 
And finally, I would never have amounted to much at all without the support of my wife and daughter, Mona and Angie—I thank them most of all. This research was partially supported by a Hong Kong Baptist University Faculty of Arts research grant: Discovering Cantonese Semantic Primes and Defining Sentence-final Particles [FRG2/12–13/060]. The Faculty of Arts also provided use of the HKBU Phonology Laboratory. During my time working as a Key Researcher at Palacký University, Olomouc, this work was also supported by the European Regional Development Fund Sinophone Borderlands—Interaction at the Edges CZ.02.1.01/0.0/0.0/16_019/0000791.

List of Abbreviations

1s      First person singular
2s      Second person singular
3s      Third person singular
-pl     Plural marker
Adv-M   Adverbial marker
ASP     Aspect marker
C0      Head of complementizer phrase
CL      Classifier
CM      Comparative marker
COP     Copula
CP      Complementizer phrase
D       Discourse element/Determiner
DM      Delimitative marker
DP      Determiner phrase
EST     Extended Standard Theory
EXP     Experiential marker
F0      Fundamental frequency
FinP    Finite phrase
FocP    Focus phrase
GEN     Genitive marker
L1      First language
L2      Second language
MP      Modal particle
NDP     Null discourse particle
NEG     Negation
NSM     Natural Semantic Metalanguage
P       Proposition
PERF    Perfective marker
PFX     Noun class prefix
PROG    Progressive marker
PRT     Particle
REL     Relative clause marker
SAI     Subject-auxiliary inversion
SM      Subject marker
SFP     Sentence-final particle
SPEC    Specifier
T0      Head of tense phrase
TNS     Tense
TopP    Topic phrase
TP      Tense phrase
VP      Verb phrase
X       Variable to represent element put into focus by “only”
X0      Head of phrase category X
X′      Constituent of category X between the word and phrase levels
XP      Phrase of category X


Chapter 1

Introduction

This book is an updated and extended version of my Ph.D. thesis (Wakefield 2010), and portions of it have been published in various forms elsewhere (Wakefield 2012, 2014, 2016, in press). The book’s title was inspired by Ladd’s (2008) book Intonational Phonology. It is widely accepted that intonation is phonological, though there is still much debate about its phonological features; in contrast, there is less consensus about what Ladd (2008: 41) refers to as the “Linguist’s Theory of Intonational Meaning, … [t]he central idea of [which] is that the elements of intonation have morpheme-like meaning” (emphasis in italics his). Among those who adopt this view, few take as strong a stance about the morphemic nature of intonation as is proposed here. In the chapters that follow, I argue that intonation’s phonological components represent morphemes that exist in the lexicons of speakers’ minds, and that these morphemes occupy syntactic positions within the structure of the sentence. This idea is not new; in essence, I am following in the footsteps of Hirst (1977, 1983, 1993), who has made attempts to align intonation with the theory of generative syntax. To some linguists, these will sound like strong, unwarranted views; at the very least, they will wonder what kind of evidence there is to support the hypothesis that intonation is no different from the rest of language, other than its form. The purpose of this book-length treatment of the subject is to clarify precisely what this hypothesis entails and to present evidence in support of it.

Intonation is arguably the most controversial feature of language. It has suprasegmental forms and abstract meanings, making it extremely challenging to analyze and describe. There is much disagreement about its forms and functions, and further complicating the matter, the meanings and uses of many of the technical terms that describe intonation are not consistent throughout the literature. Yet another complicating factor is the widely held assumption that only some suprasegmentals are part of the linguistic structure; the rest have been variously described as paralinguistic (Couper-Kuhlen 1986; Ladd 2008), nonlinguistic (Fox 2000), or a form of animal communication (Gussenhoven 2004). These two types of suprasegmentals, i.e., linguistic versus paralinguistic, not only occur simultaneously along with the linear stream of an utterance’s segments but also simultaneously with each other. Referring to the two types of suprasegmentals, Ladd (2008: 6) said, “it is a matter of considerable controversy which aspects are which, or whether such a distinction is even possible.”

Even though it remains unclear how to physically and perceptually isolate linguistic from paralinguistic suprasegmentals, it is assumed throughout this book that the two types are qualitatively different in nature. Linguistic suprasegmentals exhibit an arbitrariness of the sign that is characteristic of other grammatical components within language (Hirst 1983; Couper-Kuhlen 1986; Fox 2000), while paralinguistic suprasegmentals are used to express emotions and attitudes such as fear, anger, impatience, or boredom and are assumed to be fundamentally the same cross-linguistically, though there are of course differences in use and production that stem from sociocultural and individual differences. This book does not take on the monumental task of teasing apart the physical and/or the perceptual differences between the forms of linguistic and paralinguistic suprasegmentals, something for which other linguists are more qualified than myself. Rather, the goal here is to propose a way of conceptualizing the differences between the two, based on the assumption that the two are qualitatively distinct. To this end, I will propose that the term intonation be redefined based on what I propose its functions to be.

This book has two goals. The first is to propose how intonation should be conceptualized and recategorized based on the hypothesis that it is morphemic. This is dealt with in Chaps. 2 and 3. Chapter 2 describes the forms and functions of intonation and then offers a definition of it based on the functions I propose it has. Chapter 3 discusses the meanings of discourse intonation and contrasts them with the meanings expressed by segmental discourse particles.

The second goal of the book is to present evidence and arguments in support of the hypothesis, both from the literature and from my own research. Chapter 4 reviews the morphemic nature of tones that have been reported in the literature, which includes tonal morphemes that have grammatical functions, as well as those that have discourse meanings. Further evidence of discourse tonal morphemes is presented in Chaps. 5 and 6, where my own research is reviewed, offering empirical evidence to indicate that specifically shaped pitch contours in English have definable, context-independent meaning. Chapter 7 proposes how intonation might be represented in the syntactic structure, which is, as far as I know, the most comprehensive proposal presented to date on the syntax of discourse intonation. Finally, Chap. 8 offers some concluding remarks and discusses the implications of analyzing intonation in this way.

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_1

References

Couper-Kuhlen, E. (1986). An introduction to English prosody. London: Edward Arnold.
Fox, A. (2000). Prosodic features and prosodic structure: The phonology of suprasegmentals. Oxford: Oxford University Press.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.
Hirst, D. (1977). Intonative features: A syntactic approach to English intonation. The Hague: Mouton.
Hirst, D. (1983). Interpreting intonation: A modular approach. Journal of Semantics, 2(2), 171–182.
Hirst, D. (1993). Detaching intonational phrases from syntactic structure. Linguistic Inquiry, 24(4), 781–788.
Ladd, D. R. (2008). Intonational phonology (2nd ed.). Cambridge: Cambridge University Press.
Wakefield, J. C. (2010). The English equivalents of Cantonese sentence-final particles: A contrastive analysis. Unpublished doctoral thesis, The Hong Kong Polytechnic University, Hong Kong.
Wakefield, J. C. (2012). A floating tone discourse morpheme: The English equivalent of Cantonese lo1. Lingua, 122(14), 1739–1762.
Wakefield, J. C. (2014). The forms and meanings of English rising declaratives: Insights from Cantonese. Journal of Chinese Linguistics, 42(1), 109–149.
Wakefield, J. C. (2016). Sentence-final particles and intonation: Two forms of the same thing. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016 (pp. 888–892). Boston: Boston University. https://doi.org/10.21437/SpeechProsody.2016.
Wakefield, J. C. (in press). It’s not as bad as you think: An English tone for downplaying. In W. Gu (Ed.), Studies on tonal aspects of languages. Hong Kong: Journal of Chinese Linguistics Monograph.

Chapter 2

The Forms and Functions of Intonation

The term intonation is restricted here to refer only to prosodic features that are morphemic. The term prosody is used more broadly to refer to two types of suprasegmentals: linguistic and nonlinguistic. Linguistic forms of prosody are further divided into those that are morphemic and those that are not. This categorization of prosodic features results in three categories: nonlinguistic prosody, which expresses emotions; linguistic prosody that is nonmorphemic; and linguistic prosody that is morphemic, which is what I define as intonation. This categorization of prosody is unconventional and somewhat controversial, but it facilitates the goal of this book, which is to describe and present evidence in favor of the hypothesis that all meaningful prosody is morphemic.

Hirst (1983: 93) said that “[i]ntonation, what Bolinger has called the ‘greasy part of language’, is notoriously difficult to describe.” ’t Hart et al. (1990: 2) also recognized the slippery nature of intonation, saying that “it is a fairly elusive subject matter [because it has] features [that] are more difficult to observe, transcribe and analyse than are their segmental counterparts.” Intonation is difficult to study for at least the following four reasons: (1) the term intonation may refer to more (or fewer) suprasegmental features and functions when used by different linguists (Johns-Lewis 1985); (2) it is not yet, and perhaps never will be, possible to mechanically record intonation the way that the native speaker’s ear hears it: something that a machine records as a rise in pitch, for example, is not necessarily heard by listeners as a rise, and therefore, even though clearly seen on paper, may not be linguistically meaningful (Roach 2009); (3) there is no one-to-one correspondence between form and function (’t Hart et al. 1990; Botinis et al. 2001; Chun 2002); and (4) the various subtypes of suprasegmentals are used simultaneously in speech, one atop another, making it difficult to isolate one type and its associated forms and functions from those of another.

Given intonation’s complex nature, it is not surprising that different linguists have analyzed and described its forms and functions differently. This book hypothesizes yet another way of analyzing intonation, plus a recategorization of its forms and functions based on that analysis. My proposal is based on the theoretical assumption that intonation is morphemic, and therefore is only justified to the extent that empirical evidence can be found to support this claim. Before reviewing some of this evidence, however, it will be helpful to first explain in sufficient detail what the hypothesis is. I will do this by first describing what I consider to be the functions of all suprasegmentals, dividing them into the three categories stated above. I will then propose a definition of intonation based primarily on its functions. After that, I will discuss the forms of the suprasegmentals that are used within each of the three categories of prosody.
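
The three-way categorization introduced at the start of this chapter can be restated as a small decision procedure. The following Python sketch is only an illustration under stated assumptions: the names ProsodyCategory and classify are hypothetical, and the two boolean inputs stand for theoretical judgments (linguistic or not, morphemic or not) that the text treats as open empirical questions. It is not a tool for classifying actual speech data.

```python
from enum import Enum


class ProsodyCategory(Enum):
    """The three categories of prosody assumed in this chapter (labels are illustrative)."""
    NONLINGUISTIC = "nonlinguistic prosody (expresses emotions)"
    LINGUISTIC_NONMORPHEMIC = "linguistic prosody that is nonmorphemic"
    INTONATION = "linguistic prosody that is morphemic (intonation)"


def classify(is_linguistic: bool, is_morphemic: bool) -> ProsodyCategory:
    """Map two binary judgments onto the three-way categorization.

    Assumption of this sketch: the morphemic/nonmorphemic split applies only
    within linguistic prosody, so a "nonlinguistic but morphemic" feature has
    no place in the scheme and is rejected.
    """
    if not is_linguistic:
        if is_morphemic:
            raise ValueError("the scheme has no category for nonlinguistic morphemic prosody")
        return ProsodyCategory.NONLINGUISTIC
    if is_morphemic:
        return ProsodyCategory.INTONATION
    return ProsodyCategory.LINGUISTIC_NONMORPHEMIC
```

For example, classify(is_linguistic=True, is_morphemic=True) returns ProsodyCategory.INTONATION, which mirrors the restricted use of the term intonation adopted here.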

2.1 The Functions of Suprasegmentals

Botinis et al. (2001: 267) said that “[t]he main functions of intonation are centered round the notions of prominence, grouping and discourse, which are related to various grammatical components as well as linguistic levels,” (emphasis in italics theirs). Referring to intonation, Gussenhoven (2004: 50) said, “people use it to express their feelings; it encodes the information structure of the sentence; [and] it appears sensitive to syntactic categories like ‘argument’ and ‘predicate’.” Many authors, regardless of which linguistic theory they adhere to, seem to agree that intonation is a central part of the grammar, working to mark phrasal, clausal, or theme-rheme boundaries, as well as speech act types, such as question versus statement (Trager 1972; Pierrehumbert and Hirschberg 1990; Crystal 1997a, b; Halliday and Greaves 2008). Except for the marking of speech acts, I will explain below why I do not adopt all of these assumptions about intonation.

Chun (2002) divided the functions of intonation into three categories: (1) grammatical functions, (2) discourse functions, and (3) attitudinal and affective functions. She pointed out, however, that “there are no firmly established or universally agreed upon principles for classifying the functions of intonation” (p. 56), which means that any choices made for delimiting and describing its functions will be somewhat controversial. Nevertheless, it is essential for the purposes of this book that I settle on a particular definition of intonation, which in turn requires that I describe and classify its forms and functions. The categories of intonational functions that are assumed here include only the first two of the three that Chun included; the third one, which is the expression of attitudinal and affective meanings (i.e., human emotions), is assumed to be nonlinguistic and therefore not expressed by intonation.
Crystal (1997a, b) recognized only two key functions of intonation by roughly combining Chun’s (2002) second and third functions into a single function. Crystal (1997a: 173) said that in addition to signaling grammatical structure, intonation functions “to express a wide range of attitudinal meanings—excitement, boredom, surprise, friendliness, reserve, and many hundreds more,” and elsewhere he added to this list some “personal attitude[s]: sarcasm, puzzlement, [and] anger” (Crystal 1997b: 202). What Crystal referred to as “attitudinal meanings” included both affective meanings and discourse meanings. In contrast, I use the terms attitudinal and affective meanings to refer only to nonlinguistic human emotions and distinguish them from the linguistic meanings expressed by intonation, which are assumed to

2.1  The Functions of Suprasegmentals

7

be part of the lexicon, expressing things such as focus, speaker stance, epistemic and evidential modality, and other discourse-related notions. An important factor in determining and describing the functions of intonation is to decide how the term intonation is used in relation to the term prosody; are they the same thing, two separate systems, or is intonation a subset of prosody? The answer to these questions will determine whether they have the same, overlapping, or separate functions. Crystal (1997b), for example, separated intonation from prosody based on form. He said intonation is “the distinctive use of patterns of pitch” (p. 202), while prosody is “variations in pitch, loudness, tempo and rhythm” (p. 313). At the same time, however, he did not seem to separate the functions of these two systems; the meanings he attributed to intonation included the human emotions excitement, boredom, and anger, even though these are expressed by the forms he classified as prosody. Crystal (1997b: 202) listed friendliness, surprise, and anger together under a single function of intonation (i.e., “the communication of personal attitude”) and said that these “can all be signaled by contrasts in pitch, along with other prosodic and paralinguistic features.” This seems to imply that the suprasegmentals that are used to express the discourse notion entailed in surprise and those used to express an emotion such as anger are both expressed by a combination of intonation and prosody. In the present book, in contrast, prosody is defined in a way that clearly separates those forms and meanings that are linguistic from those that are paralinguistic, and intonation is contained within the subset of prosody that is linguist. This “clear separation” of prosody into two types is largely theoretical; describing and demarcating linguist versus paralinguistic forms and functions of prosody is not made easy simply because we recognize that they should be kept separate. 
In reference to these two types of suprasegmentals, Ladd (2008: 6) said, “it is a matter of considerable controversy which aspects are which, or whether such a distinction is even possible.” Nevertheless, even if we can only conclude for now that this distinction is theoretically possible, proposing theories about the precise nature of prosody and intonation is useful and important for determining how future research might proceed. After Fox (2000: 269) pointed out the frequent attention that scholars have drawn to “the difficulties and uncertainties surrounding [intonation’s] analysis, its systematic description, and its incorporation into linguistic models and theories,” he concluded that “the problems of intonation are more of a theoretical than a practical kind, and relate to its nature and role rather than to its phonetic properties.” Based on the classifications I adopt here, Crystal’s (1997a, b) list of examples for attitudinal and affective meanings can be divided into three groups: those that express only attitudinal and affective meanings (e.g., boredom, excitement, and anger); those that are a combination of discourse and attitudinal/affective meanings (e.g., surprise, puzzlement, and sarcasm); and the remainder, which are probably too broad to classify (e.g., friendliness and reserve). The first group now lies outside this book’s definition of intonationally-expressed meanings, and instead belongs to nonlinguistic meanings that are expressed through the use of paralinguistic prosody. The labels given to the meanings in the second and third groups should ideally be


2  The Forms and Functions of Intonation

replaced by simplified labels that can be clearly identified either as discourse meanings or as attitudinal/affective meanings.1 Hirst (1977) pointed out that few if any linguists doubt that intonational features contribute information to sentences. The only questions are what kind of information, and whether or not this information is systematic. Is intonational information comprised of discrete features that are acquired by learners along with the other syntactic, semantic, and phonological features of language? If so, we would expect there to be surface differences in the intonational systems of languages. There is plenty of evidence that this is in fact the case. Hirst (1977: 3) went on to say that if, on the contrary, “we consider intonation as merely a direct, physical manifestation of the speaker’s emotions and feelings, [then] we should normally expect different languages to use the resources of intonation in very similar, if not exactly the same, ways.” Suprasegmentals can be used in both of these ways, i.e., to express discrete linguistic features and to express emotions. The task at hand then is to separate and classify these two types of suprasegmentals. Some linguists include the expression of emotions as a property of intonation (e.g., Chun 2002; Gussenhoven 2004), but here these are instead classified as a property of nonlinguistic prosody. Many authors have distinguished suprasegmentals that are linguistic from those that are nonlinguistic. The latter have been referred to as a form of animal communication that functions to express emotions and nonlinguistic pragmatic meanings. 
Couper-Kuhlen (1986: 174) said, “we must distinguish an unmonitored, purely physiologically determined externalization of emotional state, presumably universal across linguistic communities, from a ‘cognitively’ monitored expression of attitude, conventionalized and communicative in purpose.” Fox (2000: 270) likewise distinguished “non-linguistic” suprasegmentals that relate to emotions and attitudes from “the pitch features associated with [linguistic functions and intonation patterns that are] by no means always ‘natural’ and universal, but differ from language to language, and hence reflect an arbitrariness characteristic of linguistic, rather than nonlinguistic, phenomena.”2 Gussenhoven (2004: 50) distinguished between two categories of intonation, saying that “intonation is both a form of animal communication … and part of the linguistic structure.” He said that human language has the arbitrariness of the sign and that some aspects of suprasegmentals are clearly nonarbitrary because, across languages,

[w]hen we are excited, our pitch goes up, and when we are depressed we tend to have a low pitch with few excursions … When we wish to emphasize a word, we may raise our pitch, in addition to raising our voice in the sense of speaking more loudly. When we want to signal – for real, or more probably in jest – that we need the speaker’s protection or deserve his mercy, we instinctively raise our pitch, to create a “small” voice. (Gussenhoven 2004: 51)

1  Attentive readers may note that I later use the terms surprise and doubt to describe some of the discourse meanings within my own research in Chap. 6. This does not conflict with what I say here about such terms, however, because I use those terms only in reference to clearer, fuller definitions; I do not use them to refer directly to the intonational forms themselves.

2  Note that Couper-Kuhlen (1986) and Fox (2000) both use the term “attitude,” but the former uses it to describe a linguistic meaning, while the latter uses it to describe nonlinguistic meaning, illustrating yet another example of inconsistency in the use of terminology.

We can see from this that Gussenhoven (2004), like Crystal (1997b), distinguished between two forms of suprasegmentals. The forms that Crystal referred to as prosody were considered by Gussenhoven to be a form of animal communication, and the forms Crystal referred to as intonation were considered by Gussenhoven to be part of the linguistic structure. However, Gussenhoven placed both of these types of suprasegmentals under the category “intonation,” as did Pike (1945: 24), who said that “various types of intonation, such as the general pitch of the voice as a whole in contrast to the different pitches occurring within a single sentence, must be studied separately in so far as is possible.” What I call paralinguistic prosody is what Pike referred to as pitch of the voice as a whole (i.e., its range and key), and what I call linguistic prosody is what Pike called sentence-internal pitch manipulation. Other authors have likewise treated intonation as both linguistic and paralinguistic, in line with Gussenhoven’s (2004) and Pike’s (1945) classification (e.g., Couper-Kuhlen 1986; Brazil 1997; Fox 2000). In contrast, I divide suprasegmental features of speech (i.e., prosody) into two categories: (1) linguistic suprasegmentals and (2) paralinguistic suprasegmentals. Category 1 is assumed to be part of the linguistic structure, and category 2 a form of animal communication. Category 1 is further divided into morphemic and nonmorphemic suprasegmentals, with the former being what I define as intonation (see Table 2.1). This definition of intonation is narrower than that of Pike (1945) and Gussenhoven (2004) because it excludes those forms that are not part of the linguistic structure. In fact, it is narrower than most linguists’ definition of intonation because it further excludes those forms of prosody that have been said to delimit phrasal structures; my reasons for this exclusion are given below.
Table 2.1  Suprasegmental categories based on function

Prosody (all suprasegmental features of speech)

| Linguistic suprasegmentals | Linguistic suprasegmentals | Paralinguistic suprasegmentals |
| Intonation | Nonmorphemic linguistic prosody | Non-linguistic prosody (a form of animal communication) |
| Functions: 1. To express grammatical functions; 2. To express discourse meanings | Functions: To produce lexical tone; to mark the prosodic structure | Function: To express attitudinal, affective meanings |

Some linguists make a clear distinction between intonation and prosody, others partially combine the two, and yet others use the two terms interchangeably. Ladd’s (2014, Chap. 3) detailed description of the origin and historical uses of the term
prosody indicates that most authors consider intonation to be a subcomponent of prosody. Johns-Lewis (1985: xix) asked, “Is there a dividing line between intonation and prosody? The answer, as with so many terms, is that it depends on who is using the terms.” Like most authors, she classified intonation as a subset of prosody, and I adopt that practice here, additionally restricting it to being a subset of linguistic prosody. This results in three subsets of prosody with some overlapping forms (see Sect. 2.3), but theoretically these different types of prosody are qualitatively different in nature because they have no overlapping functions, as indicated in Table 2.1. Some of the classifications in Table  2.1 are nonconventional, so they warrant clarification. One classification that should not be controversial is making intonation a subset of prosody. Linguists must choose whether to classify these as the same thing, to classify them as two separate things, or to classify intonation as a subset of prosody. I have chosen the latter for two reasons: (1) prosody has multiple functions, so it makes practical sense to distinguish it from intonation, which can then be used as a term to refer to a subgroup of prosody’s functions and (2) classifying intonation as a subset of prosody, rather than as something distinct, makes sense because intonation is made up of prosodic features and therefore should be considered as belonging to prosody. Inherent features are associated with vowels and consonants and can be defined without reference to the sequence of sounds in an utterance. Prosodic features, in contrast, can only be defined in reference to acoustic changes or contrasts within the utterance, or in reference to a person’s voice range (Ladd 2008: 189–92). 
What is unusual about Table 2.1 in this respect is that I classify all prosodic features of language as prosody, including tonal morphemes and lexical tones, which, as far as I know, have never before been referred to as forms of prosody and/or intonation. Gussenhoven and Jacobs (2014: 148–9) classified tones into three types based on their functions: lexical tones, which distinguish syllabic morphemes which otherwise share the same segments; grammatical tones, which are themselves morphemes (i.e., what I refer to as tonal morphemes); and intonation tones, which “function to signal discourse meaning or phrasing.” They said that tonal morphemes, “unlike intonational morphemes, have meanings that fit into the morphological and syntactic paradigms of the language, instead of expressing discoursal meanings” (p. 157). This differs from what I am arguing here, which is that discourse meanings are in the lexicon and that intonational tones therefore belong in the morphological and syntactic paradigm of language to the same extent as tonal morphemes. This means, for example, that a rising question tone in English is analyzed no differently from the rising tone that marks progressive aspect on verbs in Inland Ewe, a Kwa language from West Africa (Aboh and Essegbey 2010). These two tones belong to different categories, with one being a discourse particle and the other an aspect particle, but both are equally represented in the lexicon and in the syntax. As such, they are both analyzed here as tonal morphemes. The first function listed for intonation in Table 2.1 is also controversial. The expression of grammatical functions refers to what are traditionally called tonal morphemes, like the example from Inland Ewe just mentioned. (It must be made clear that the expression of grammatical functions does not refer to the use of intonation to mark phrasal structure, which I tentatively argue is not a function of intonation.) The production of lexical tone is shown as a function of nonmorphemic linguistic prosody because lexical tone is linguistic, has prosodic features, and does not comprise morphemes.3 The other function listed under nonmorphemic linguistic prosody is the marking of prosodic structure. Many linguists consider the marking of phrasal, clausal, and prosodic boundaries to be a function of intonation. I instead adopt the view of Ladd (2008), who said,

intonation has no privileged status in signaling prosodic structure ... intonational features of pitch and relative prominence are distributed in utterances in ways allowed by the prosodic structure. In some cases this means that conspicuous phonetic breaks occur at major constituent boundaries, but this is neither the essence of the boundary nor the only factor governing the distribution of the intonational features. (Ladd 2008: 10–11)

It must be pointed out that prosodic structure does not fully match syntactic structure; it is merely related to it. In her study on the interactions between syntax and prosody in Connemara Irish, Elfner (2012) argued that the prosodic form of a sentence is not only a result of syntactic structure; it is also influenced by linearization and prosodic well-formedness. Bennett and Elfner (2019: 153) took it as a “fact that prosodic structure is derived from syntax, but need not be identical to it.” Explaining why this is the case, Gussenhoven and Jacobs (2014: 203) said that “[s]ince morphosyntactic constituents of a given rank may vary hugely in length, a one-to-one correspondence between phonological and morphosyntactic constituents would put unreasonable demands on speakers.” It makes sense that prosodic structure should not clash with morphosyntactic structure, as that would affect communication, but the fact that they are not the same makes it reasonable to assume that the morphosyntactic structure is not determined (i.e., marked) by the prosodic structure but is instead marked by lexical and grammatical features. I tentatively assume that prosodic structure is strictly phonetic in nature, residing outside the lexicon and syntax; it is merely a by-product of the grammatical structure of language (along with linearization and prosodic well-formedness). Based on this assumption that the suprasegmentals used for prosodic phrasing are nonmorphemic, the marking of prosodic structure is not listed as a function of intonation in Table 2.1. Saying that the marking of prosodic structure is not a function of intonation is no trivial matter, and it has consequences for the theory of intonation that I am proposing here. The implications and some of the issues involved are discussed in Sect. 7.3. Suprasegmentals that mark prosodic structures are distinguished from those that have discourse-related meanings.
In the case of topicalization, for example, an associated tone can be seen as marking the topicalized phrase in the same way that topic particles do in languages like Chinese and Japanese. Consider this observation from Gussenhoven and Jacobs (2014):

A phrasing function of tone occurs in an English sentence like Once we’re in China, we can practice our Chinese, where the last syllable of China is likely to have a high tone indicating the boundary. When tones function to signal discoursal meaning or phrasing they are intonational tones. (Gussenhoven and Jacobs 2014: 149)

3  Categorizing lexical tone under the term linguistic prosody makes sense for these reasons, but it does not mean I endorse the practice of regularly referring to lexical tone as prosody. That would be impractical; thinking of it and referring to it as distinct from prosody facilitates discussing how lexical tones interact with what is traditionally thought of as prosody.

This tone on the second syllable of “China” can reasonably be analysed as a tonal morpheme functioning as a topic marker. Jumping ahead to the ideas that will be discussed in Chap. 7, the prepositional phrase “once we’re in China” is a topicalized phrase that is assumed to have moved to what is called the specifier position of a Topic phrase at the left periphery of the sentence. This Topic phrase is headed by a feature [+Top] (cf. Aboh 2010). The tone on the second syllable of “China” can be analyzed as the phonological realization of this [+Top] feature, making it a tonal morpheme that is no different from the segmental topic markers in other languages (e.g., Japanese wa; Korean nun; Mandarin ne). The only difference is that instead of being segmental, it is a floating tone that is associated with one or more of the segments at the end of the topicalized phrase. Sometimes no tone is used, and the presence of the [+Top] feature is made known only by the fact that the topicalized phrase has raised. This is not surprising since topic markers in Chinese are also optional. Another example comes from Aboh (2016), who contrasted a Hungarian sentence with one from Gungbe, both of which included both a focused and a topicalized phrase. He said the only difference between their ways of marking these phrases is that Hungarian uses intonation where Gungbe uses the segmental topic marker yà and a focus marker wὲ. This shows a clear contrastive comparison between segmental and intonational topic and focus markers, indicating that the intonational forms associated with topicalized and focused elements are morphemic. This differs from those tones associated with prosodic structure that do not appear to have any segmental counterparts.

2.2  A Definition of Intonation Based on Its Functions

Based on the classifications shown in Table 2.1, a definition of intonation is proposed as follows:

(1) Intonation: A suprasegmental form that has semantic content or a grammatical function

The definition in (1) incorporates the classic definition of a morpheme: “A morpheme – the minimal linguistic unit – is thus an arbitrary union of a sound and a meaning (or grammatical function) that cannot be further analyzed” (Fromkin et al. 2013: 38). Defining intonation as morphemic results in the normal practice of excluding lexical tone, which is classified here as a form of nonmorphemic linguistic prosody. However, this definition of intonation now includes tonal morphemes, which have not traditionally been considered forms of intonation. They have instead been referred to as suprasegmental morphemes that function to mark things such as grammatical aspect, definiteness, grammatical case, and so on. These have always been seen uncontroversially as morphemes that reside in speakers’ lexicons and that


are therefore part of core syntax and semantics, but they have, as far as I know, always been described and defined without reference to intonation beyond how they may interact with intonation phonologically. The definition in (1) also defines the intonation used to express discourse meanings as tonal morphemes. The idea that discourse tones are morphemic in the same sense as grammatical tonal particles will be discussed in Chap. 3, and further evidence will be presented in Chaps. 4 and 6. The definition in (1) will now be compared with some definitions found in the literature in order to illustrate the differences. Starting with a relatively short and simple definition, Tench (1996: 2) said that “intonation is the linguistic use of pitch in utterances.” There are two key differences between his definition and that of (1): first, Tench too narrowly defines intonation’s form as pitch alone (see Sect. 2.3) and second, in Tench’s own words, it “specifies that intonation is concerned with utterances” (p. 3). While specifying that intonation relates to utterances works to exclude lexical tone, as it should, it is not specific enough: it does not clarify what functions and meanings are included or excluded. Cruttenden (1997: 7) said that “intonation involves the occurrence of recurring pitch patterns, each of which is used with a set of relatively consistent meanings, either on single words or on groups of words of varying length.” Again this definition takes the practical step of simplifying the form of intonation to pitch alone, but a more critical difference between this definition and the definition in (1) is that intonational meanings are said by Cruttenden to be only “relatively consistent.” This does not align with the strong claim entailed in (1), which is that intonational forms have semantic content or a grammatical function; their core meanings are therefore assumed to be constant from one occurrence to the next.
Gussenhoven (2004: 12) said “intonation is treated as the use of phonological tone for non-lexical purposes, or – to put it positively – for the expression of phrasal structure and discourse meaning.” The term phonological tone implies the possibility of a combination of phonetic features working in tandem to form a tone, rather than pitch alone. The definition in (1) therefore could have used the term phonological tone in place of “suprasegmental form.” Saying that intonation is used for nonlexical purposes works to exclude lexical tone, as it should, but it also excludes tonal morphemes, thus excluding everything that I define as intonation. Another difference is that Gussenhoven’s definition includes the expression of phrasal structure as a function of intonation, while I tentatively exclude this for the reasons explained above. Ladd (2008: 6) defined intonation as “the use of suprasegmental phonetic features to convey ‘postlexical’ or sentence-level pragmatic meanings in a linguistically structured way.” Saying that intonation is “postlexical” excludes lexical tone, but it again also excludes tonal morphemes, and saying that intonation expresses “sentence-level pragmatic meanings” excludes all word-level or phrase-level grammatical tones. While intonation has traditionally only included sentence-level tones, there is no reason that such tones cannot be categorized together with word-level or phrase-level grammatical tones. Discourse tones have frequently been compared with segmental discourse particles, so categorizing all tonal morphemes together is analogous to categorizing all segmental particles together. The differences between a tonal discourse particle and a tonal grammatical particle are their meaning, their


scope, and their grammatical function. Categorizing them together under the term intonation is comparable with categorizing all of their segmental counterparts, both grammatical and discoursal, under the term particles for the purpose of describing and categorizing linguistic features. In both cases, we recognize that the collection of grammatical elements included under an umbrella term can and should be subcategorized. Another difference between the definition in (1) and Ladd’s (2008) definition is that the former considers intonational forms to have meaning beyond merely pragmatic meaning (see Chap. 3).
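The definition in (1), read together with Table 2.1, amounts to a short decision procedure, which the following Python sketch tries to make explicit. It is only an illustration under simplifying assumptions: the names `Form`, `Category`, and `classify`, and the reduction of each form to three yes/no properties, are hypothetical conveniences introduced here and are not part of the proposal itself. The example forms are ones mentioned in this chapter.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The categories of Table 2.1, plus a label for segmental (non-prosodic) forms."""
    INTONATION = "intonation"
    NONMORPHEMIC_LINGUISTIC_PROSODY = "nonmorphemic linguistic prosody"
    PARALINGUISTIC_PROSODY = "paralinguistic (nonlinguistic) prosody"
    NOT_PROSODY = "not prosody (segmental)"


@dataclass(frozen=True)
class Form:
    """A form reduced to three yes/no properties (an assumed simplification)."""
    label: str
    suprasegmental: bool  # defined relative to the utterance or voice range
    linguistic: bool      # part of the linguistic structure
    morphemic: bool       # has semantic content or a grammatical function


def classify(form: Form) -> Category:
    """Apply definition (1) and the divisions of Table 2.1 in order."""
    if not form.suprasegmental:
        return Category.NOT_PROSODY
    if not form.linguistic:
        return Category.PARALINGUISTIC_PROSODY
    if form.morphemic:
        return Category.INTONATION
    return Category.NONMORPHEMIC_LINGUISTIC_PROSODY


# Example forms drawn from the chapter; the property values reflect
# the chapter's own classification, not an independent analysis.
EXAMPLES = [
    Form("English rising question tone", True, True, True),
    Form("Inland Ewe progressive-aspect rising tone", True, True, True),
    Form("lexical tone", True, True, False),
    Form("marking of prosodic structure", True, True, False),
    Form("wide range and high key expressing anger", True, False, False),
    Form("Japanese topic particle wa", False, True, True),
]

if __name__ == "__main__":
    for example in EXAMPLES:
        print(f"{example.label}: {classify(example).value}")
```

On this sketch, the English rising question tone and the Inland Ewe progressive tone fall into the same category, which is the point of (1). Lexical tone and the marking of prosodic structure come out as linguistic but nonmorphemic, and a wide range with high key expressing anger comes out as paralinguistic.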

2.3  The Forms of Suprasegmentals

The paralinguistic expression of emotional attitudes is a form of animal communication which uses prosodic forms that are basically universal across languages, and these affective meanings are not lexicalized in any language. Calling them a form of animal communication is not meant to imply that the nonlinguistic meanings they express are not unique to humans. The paralinguistic suprasegmentals that accompany our speech are used to express uniquely human emotions, and their forms are a unique product of the human vocal tract. They are not part of the syntax or lexicon of human language, so it is reasonable to conclude that they are a form of animal communication that is used by the animal species known as Homo sapiens. Their forms and meanings are naturally rather consistent across the species, and many of their qualities may or may not be shared with the forms of communication used by other animal species. Saying that the forms of paralinguistic prosody are basically universal does not mean one would expect it to be used in the same way cross-linguistically and cross-culturally. From language to language, there will be some variations in both the forms and the meanings of paralinguistic prosody. Variations of form will occur because prosodic forms overlap and interact with the unique phonological features of a given language, including its systems of intonation and linguistic prosody. And differences of both form and meaning will result from cultural differences and individual speaker differences that influence the expression and interpretation of human emotions.
Some evidence for the universality of “paralinguistic features” comes from Maekawa (2004: 8), who concluded that “the perception of [paralinguistic information] as voice-quality is language-independent, or universal, like perception of emotion,4 while the perception of [paralinguistic information] as manifested by the manipulation of the features of phrase-phonology is language-dependent.” He came to this conclusion based on his experiment with various forms of suprasegmental information, all of which he called “paralinguistic information,” but which I refer to as prosody. His term “voice quality” referred to what I call paralinguistic prosody.

4  It should be noted that the universality of the perception of emotion through facial expressions has been challenged, e.g., Russell (1994).


And what he referred to as “manipulation of the features of phrase phonology” is basically what I am calling intonation. Maekawa (2004) conducted a study that included meanings such as admiration, suspicion, disappointment, and indifference. The participants of his study were instructed to record a single sentence several times, each time superimposing a form of “paralinguistic information” (i.e., prosody) that was supposed to have one of the meanings listed above. To help his participants, he clarified the various meanings by paraphrasing them. For example, Maekawa (2004: 1) told his participants that the prosody of admiration should convey the message “That’s great. I love it” and that the prosody of suspicion should convey the message “I doubt it and I don’t believe it.” By taking terms that were too vague and broad in meaning and rewriting them as speaker-oriented emotional attitudes (“That’s great. I love it.”) or as discourse-related speaker-oriented beliefs (“I doubt it and I don’t believe it”), Maekawa at least partially addressed a problem that was demonstrated by Crystal (1969), which is that native-English speakers lack the ability to consistently produce and recognize intonational forms that are labeled with terms that linguists have commonly used for describing intonational meanings (e.g., “bored,” “puzzled,” etc.). Maekawa’s study dealt with Japanese, but the same problem no doubt applies, which is why he used the strategy that he did. Another group of participants in Maekawa’s (2004) study were asked to match each recording to one of the meanings on the list. Some of the participants who conducted this task were native-Japanese speakers, some were L2-Japanese speakers at various levels, and some had no knowledge of Japanese at all. Maekawa’s results showed a connection between (1) the form of prosody and (2) whether or not that form of prosody required knowledge of Japanese.
If participants with no knowledge of Japanese were successful at making a particular match between form and meaning, then that form of prosody was concluded to be universal, rather than specific to Japanese. Although Maekawa’s terminology was different, his saying that “voice quality” (i.e., paralinguistic prosody) is universal and that “phrase-phonology” (i.e., intonation) is language dependent agrees with Table 2.1. It is not surprising that Maekawa (2004) used the term “voice qualities” to refer to paralinguistic prosody. Johns-Lewis (1985: xx) concluded that voice qualities are not forms of intonation, and Crystal (1969: 102) agreed, saying that quality of voice is hard to define, but that one relatively useful definition is a negative one, which says that voice quality is any aspect of tone other than one of the three main ones that form intonation: pitch, loudness, and length.5 Most forms of quality of voice (e.g., breathy, whispery, husky, etc.) can probably be classified as paralinguistic suprasegmentals. Cruttenden (1997: 124) said that “[a]mong those emotions reported as having wide [range] and high [key] are joy, anger, fear, and surprise; among those reported as having narrow [range] and low [key] are boredom and sorrow.” These are the kinds of meanings expressed by the attitudinal and affective function of suprasegmentals.

5  Pitch, loudness, and length refer to mental representations; their respective physical correlates, which can be directly measured by instruments, are fundamental frequency (F0) measured in Hertz, intensity measured in decibels, and duration measured in seconds. See Chap. 5 for the implications that these mental vs. physical representations have on intonational research.


They are expressed through “variations in pitch,” (i.e., high vs. low key and wide vs. narrow range), which, along with “loudness, tempo and rhythm,” make up the forms that Crystal (1997b: 313) labeled “prosody.” Here, in contrast, I categorize these forms, along with qualities of voice, as a subset of prosody, i.e., as “nonlinguistic prosody.” Table 2.2 categorizes the forms and functions of prosody adopted here.

Table 2.2  Suprasegmental categories based on form and function

Prosody (all suprasegmental features of speech, except lexical tone)

| Linguistic prosody (part of the linguistic structure) | Linguistic prosody (part of the linguistic structure) | Paralinguistic suprasegmentals |
| Intonation | Nonmorphemic linguistic prosody | Nonlinguistic prosody (a form of animal communication) |
| Functions: 1. To express grammatical functions; 2. To express discourse meanings | Functions: To produce lexical tone; to mark prosodic boundaries | Functions: To express attitudinal, affective meanings |
| Forms: Manipulation of pitch, loudness, length | Forms: Manipulation of pitch, length, tempo, and rhythm | Forms: Pitch range/key, loudness, tempo and rhythm, and qualities of voice (a) |

(a) Nonlinguistic qualities of voice, such as male vs. female, or sick vs. healthy, do not express emotions the way that paralinguistic qualities of voice do and are therefore not included under the label “qualities of voice” that is used here.

What are the suprasegmental forms that make up intonation? Crystal (1969: 195) observed that “scholars have been anxious to restrict the formal definition of intonation to pitch movement alone.” One example comes from Wells (2006: 1), who said that the study of intonation involves studying “how the pitch of the voice rises and falls, and how speakers use this pitch variation to convey linguistic and pragmatic meaning.” Many linguists give preeminent status to pitch over the other, relatively less important features that make up intonation. Whether other features are included in the definition of intonation or not, it is generally agreed that pitch is the most influential of the three main components that convey intonational meaning, with length being the second, and loudness being the third (Johns-Lewis 1985; Cruttenden 1997; Hirst et al. 2000; Chun 2002). Many scholars have recognized and/or followed the practice of describing intonation in terms of pitch alone (’t Hart et al. 1990; Brazil 1997; Crystal 1997b; Botinis et al. 2001; Chun 2002; Wells 2006). This is done only for practical reasons. Linguists all realize that there is more to intonation than just pitch. Brazil (1997: 3)
referred to his “adherence to the traditional practice of speaking of intonation in terms of pitch variation” as an “over-simplification” and went on to say, “[i]t seems inherently improbable that a human being can make systematic variations on one physical parameter without its affecting others.” Crystal (1969: 195) pointed out that there is a research-based problem with restricting the formal definition of intonation to pitch alone: “when the question of intonational meanings is raised, then criteria other than pitch are readily referred to as being part of the basis of a semantic effect.” In other words, a contradiction arises by defining intonation as a single physical parameter, but then analyzing intonational meaning in relation to multiple physical parameters. Table 2.2 therefore indicates that the forms of intonation minimally include the three parameters of pitch, loudness, and length. In the bottom middle cell of Table 2.2, the reference to “manipulation of pitch” includes tone contours, as well as level tones, such as a High, Mid, or Low tone. I adopt Liberman’s (1975) stance of analyzing intonation as pitch contours that interact with accent tones and boundary tones, rather than as a series of High and Low tones that include accent tones and boundary tones as is done in Pierrehumbert and Steele’s (1989) Tones and Break Indices (ToBI) system, which can be compared to Brazil’s (1997) description of tones as a series of binary choices at any given moment between rise and fall. Liberman (1975) used L’s, M’s, and H’s for describing his pitch contours, but that was presumably for convenience of notation. I agree with Roach (2009), who said,

There is something special about a contour, like fall-rise or rise-fall. It is the contour itself. It’s the global shape of it, not the fact that it starts low and then goes to a high point and then goes to a low point. (Roach 2009: 2:24ff)

It is not necessary to take the stance that I tentatively take here regarding the phonological properties of tone contours. That is a separate issue from whether or not a tone contour is a morpheme (or a series of morphemes). For an excellent, detailed discussion of the nature of tones and tone contours, see Wee (2019). Although the bottom left cell of Table 2.2 only includes the three features widely accepted as being the properties that form intonation, I acknowledge that other properties are also likely to be involved with the makeup of intonational phonology. The purpose of this book is not to describe in detail the phonological features involved with intonation, but rather to propose a theory and categorization of intonation. If other phonological features are demonstrated to be part of intonation, then those features can and should be added into Table 2.2. The forms and functions that are the focus of this book are shown in the two boxes with bold borders. It should be clear by now that these classifications are not typical, but if the arguments I am making here about the nature of intonation are correct, then they are justified. Another important theoretical issue is the nature of tones and the tone-bearing unit (TBU). According to my hypothesis, tones can be divided into lexical tones and tonal morphemes. Lexical tones are realized on the syllables that they are associated with, while tonal morphemes are realized on one or more syllables that represent separate lexical items from the tones themselves. In tone languages, those lexical items will include syllables that use lexical tones, thus raising the obvious question as to whether or not a TBU can host both types of tones at the same time. Chao (1968: 39) argued that the two types of tones can coexist, comparing lexical tones and what he called “expressive” intonation patterns to “small ripples riding on large waves,” respectively. There is no question that tone languages, such as Mandarin, use meaningful intonation on sentences made up primarily of morphemes whose phonology includes lexical tones. For a detailed discussion of this theoretical issue, see Wee (2019, Chaps. 2, 3, and 6).

2.4  Concluding Remarks

If Table 2.2 gives the impression that I believe suprasegmentals should be easier to isolate and study than linguists have thus far claimed, this is not the case, because the fourth complicating factor mentioned at the beginning of this chapter says that subtypes of prosody are used simultaneously, one atop the other. As explained by Ladd (2008: 6), “paralinguistic features interact with intonational features [and] paralinguistic aspects of utterances are often exceedingly difficult to distinguish from properly intonational ones.” In addition to this, changes of pitch range/key and qualities of voice can represent emotional attitudes that are very compatible with particular discourse meanings and may therefore regularly be expressed along with those discourse meanings. Not only does this make it difficult to isolate their individual forms, but it can cause linguists to combine their linguistic and paralinguistic “meanings” into a single definition. While the classifications of Table 2.2 do not make the study of intonation any easier, assuming them to be correct will nevertheless have an influence on how one approaches the subject, making a key goal of intonation research the separation of linguistic and paralinguistic forms and meanings, even while at the same time recognizing that this is of course no simple matter. The type of research reported in Chaps. 4 and 6 arguably moves us closer to our goal of distinguishing the two types of suprasegmentals. This research offers empirical evidence that, in at least some cases, it is indeed possible to isolate intonational forms and meanings from paralinguistic prosody. But before that research is presented, Chap. 3 discusses intonational meanings and relates them to the meanings expressed by segmental particles.

References

Aboh, E. O. (2010). Information structuring begins with the numeration. IBERIA, 2(1), 12–42.
Aboh, E. (2016). Information structure: A cartographic perspective. In C. Féry & S. Ishihara (Eds.), The Oxford handbook of information structure (pp. 147–164). Oxford: Oxford University Press.
Aboh, E. O., & Essegbey, J. (2010). The phonology syntax interface. In E. O. Aboh & J. Essegbey (Eds.), Topics in Kwa syntax (pp. 1–10). New York: Springer.
Bennett, R., & Elfner, E. (2019). The syntax-prosody interface. Annual Review of Linguistics, 5, 151–171.
Botinis, A., Granström, B., & Möbius, B. (2001). Developments and paradigms in intonation research. Speech Communication, 33(4), 263–296.
Brazil, D. (1997). The communicative value of intonation in English. Cambridge: Cambridge University Press.
Chun, D. M. (2002). Discourse intonation in L2: From theory and research to practice. Amsterdam: J. Benjamins.
Couper-Kuhlen, E. (1986). An introduction to English prosody. London: Edward Arnold.
Cruttenden, A. (1997). Intonation (2nd ed.). Cambridge: Cambridge University Press.
Crystal, D. (1969). Prosodic systems and intonation in English. London: Cambridge University Press.
Crystal, D. (1997a). A dictionary of linguistics and phonetics (4th ed.). Oxford: Blackwell.
Crystal, D. (1997b). The Cambridge encyclopedia of language (2nd ed.). Cambridge: Cambridge University Press.
Fox, A. (2000). Prosodic features and prosodic structure: The phonology of suprasegmentals. Oxford: Oxford University Press.
Fromkin, V., Rodman, R., & Hyams, N. (2013). An introduction to language (10th ed.). Boston, MA: Cengage Learning.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.
Gussenhoven, C., & Jacobs, H. (2014). Understanding phonology (4th ed.). London: Routledge.
Halliday, M. A. K., & Greaves, W. S. (2008). Intonation in the grammar of English. London: Equinox.
Hirst, D. (1977). Intonative features: A syntactic approach to English intonation. The Hague: Mouton.
Hirst, D. (1983). Structures and categories in prosodic representations. In A. Cutler & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 93–156). Berlin: Springer.
Hirst, D., Di Cristo, A., & Espesser, R. (2000). Levels of representation and levels of analysis for the description of intonation systems. In M. Horne (Ed.), Prosody: Theory and experiment: Studies presented to Gösta Bruce (pp. 51–88). Dordrecht: Kluwer Academic Publishers.
Johns-Lewis, C. (1985). Introduction. In C. E. Johns-Lewis (Ed.), Intonation in discourse (pp. ix–xxxiv). London: Croom Helm.
Ladd, D. R. (2008). Intonational phonology (2nd ed.). Cambridge: Cambridge University Press.
Ladd, D. R. (2014). Simultaneous structure in phonology. Oxford: Oxford University Press.
Maekawa, K. (2004). Production and perception of “paralinguistic” information. Presented at the Speech Prosody, Nara, Japan.
Pierrehumbert, J., & Hirschberg, J. (1990). The meaning of intonational contours in the interpretation of discourse. In P. R. Cohen, J. Morgan, & M. E. Pollack (Eds.), Intentions in communication (pp. 271–311). Cambridge, MA: The MIT Press.
Pierrehumbert, J., & Steele, S. A. (1989). Categories of tonal alignment in English. Phonetica, 46(4), 181–196.
Pike, K. L. (1945). The intonation of American English. Ann Arbor: University of Michigan Press.
Roach, P. (2009, July). Advantages and disadvantages of the ToBI system: A lecture by Peter Roach. Retrieved from http://www.youtube.com/watch?v=AL-uMriM4ns
Russell, J. A. (1994). Is there universal recognition of emotion from facial expression? A review of the cross-cultural studies. Psychological Bulletin, 115(1), 102–141.
’t Hart, J., Collier, R., & Cohen, A. (1990). A perceptual study of intonation: An experimental-phonetic approach to speech melody. Cambridge: Cambridge University Press.
Tench, P. (1996). The intonation systems of English. New York: Cassell.
Trager, G. L. (1972). The intonation system of American English. In D. Bolinger (Ed.), Intonation: Selected readings (pp. 83–86). Middlesex: Penguin Books.
Wee, L.-H. (2019). Phonological tone. Cambridge: Cambridge University Press.
Wells, J. C. (2006). English intonation: An introduction. Cambridge: Cambridge University Press.

Chapter 3

Intonational Meaning

In Chap. 2, intonation was defined as tonal morphemes. This definition is justified only if there is sufficient supporting evidence. The purpose of this chapter and Chaps. 4–6 is to present this evidence. It is widely accepted and not very controversial to claim that tones which function to mark grammatical features (e.g., definiteness, grammatical case, etc.) are morphemes, so nothing will be said about those types of tonal morphemes in this chapter, but Chap. 4 will cite several examples of grammatical tones. The discussion in this chapter is limited to discourse intonation. Linguists generally agree that discourse intonation is meaningful, but as is the case with all aspects of intonation, linguists’ views on this issue vary widely. In the following section, I will review some of the key views expressed about intonational meaning in the literature, and where appropriate, will relate those views to what is proposed about intonational meaning in this book. In Sect. 3.2, these views on intonational meaning are then contrasted with linguists’ views on the meanings of discourse particles. Views about the relationship between discourse particles and intonation will also be discussed. Based on all of this, I will ultimately argue that discourse particles and discourse intonation are two forms of the same thing. Chapter 4 will then present examples from the literature that provide evidence of this relationship between segmental and tonal particles, both grammatical and discoursal.

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_3

3.1  The Nature of Intonational Meaning

Some linguists attribute vague pragmatic meanings to intonational forms, which are often assumed to be meaningful only in relation to the immediate context within which they appear. In other words, unlike true morphemes, it is often assumed that a particular tone does not have a core meaning that it expresses independent of the sentence and the discourse context. Brazil (1997) advised caution when claiming that a particular intonational form has a particular meaning:

[There is a] need for extreme tentativeness in providing phonetic descriptions of the meaningful choices that make up the intonation system. … [T]he only research procedure available is to make tentative phonetic observations and try to associate them with generalisable meaning categories. (Brazil 1997: 3–4)

Brazil’s conclusions are understandable, not only because intonational forms are so difficult to isolate from other prosodic forms, but also because intonational meanings are so difficult to isolate from the emotions expressed by nonlinguistic prosody and from the discourse pragmatic meanings expressed by other linguistic elements in the sentence and in the immediate discourse. Nevertheless, I do not agree with his conclusions. There are other research procedures available, and there is now a substantial and growing amount of evidence in support of the hypothesis that specific intonational forms express discrete linguistic meanings, rather than merely “generalisable meaning categories.” It is still difficult to isolate their forms and meanings, but there are now cases where this has arguably been successfully done. Brazil (1997) conceptualized intonational meaning as being different from the meaning expressed by words and morphemes:

A description which handles intonation properly must, indeed, seek to answer questions of quite a different kind from those which sentence grammars are set up to answer. The latter can be said to start from the assumption that parties to a verbal interaction have available for their use a system for classifying experience which, in theory if not in fact, they share with the whole speech community and which, for most practical purposes, remains constant through time. But to make sense of intonation we need to think of speakers as classifying experience along lines that are valid for themselves and their interactants, and in the here-and-now of the utterance. (Brazil 1997: xii)

These assumptions are not very compatible with the idea that intonation is morphemic. Morphemes are part of speakers’ lexicons and are shared among the members of a single speech community and are essentially constant through time, at least within a given generation. Brazil (1997) saw intonation as being meaningful only in relation to its given context and only at the time it is uttered. Nevertheless, he did argue for two rather abstract and broad discourse meanings that could consistently be attributed to general tone shapes. He said rises and fall-rises have the discourse function of “referring,” while falls and rise-falls have the discourse function of “proclaiming” (p. 83).1 Cruttenden (1997) referred to two types of intonational meanings, which he called local meanings and abstract meanings. It is clear that this was not a contrast of concrete versus abstract because he said that “nuclear tones have fairly abstract basic meanings which as a result of conditioning factors turn up with a variety of local meanings” (p. 106, emphasis in italics mine). The contrast of Cruttenden’s local versus abstract meanings is based on the idea of context-dependent versus context-independent meaning. In his discussion on the conditioning factors that affect meaning, he said, “[t]he meanings of any one nuclear tone clearly vary at least slightly in different contexts; in some cases such variation is considerable rather than slight” (p. 104). This contrasts with what he referred to as abstract meaning, which, according to Cruttenden, is the result of linguists’ attempts at assigning context-independent meaning to tones. He said the result is that the shapes of tones become fewer and less specific, resulting in a contrast between rises and falls, plus perhaps fall-rises. The resulting meanings are typically abstract discourse meanings, such as Brazil’s (1997) “referring” or “proclaiming,” which Cruttenden cited. Cruttenden (1997) compared Brazil’s ideas with those of Gussenhoven (1983), who discussed the location and shape of nuclear tones. Gussenhoven argued that nuclear tones have context-independent meanings, illustrating his point with sentences in which the entire proposition (p) is put into focus. When a nuclear falling tone is used, Gussenhoven (1983: 384) said its meaning can be paraphrased as “I want you to know that from now on I consider [p] to be part of our [discourse] Background.” He paraphrased the meaning of a fall-rise nuclear tone as “I want you to take note of the fact that [p] is part of our Background,” and paraphrased a rising nuclear tone as meaning, “I will leave it up to you to determine whether we should establish [p] as being part of the background.” Brazil’s and Gussenhoven’s ideas are comparable in the sense that they both propose consistent, context-independent discoursal meanings for pitch falls, rises, and, in the case of Gussenhoven, fall-rises. I think Gussenhoven (1983) had the right approach; he used more specific definitions for the tones and wrote them out as speaker-oriented paraphrases. This is comparable with the approach I use in my own research for defining discourse particles and intonation (see Chaps. 5 and 6). It makes sense that definitions of tones should be speaker-oriented if intonation is assumed to express meanings such as epistemic modality, speaker stance, speaker belief, and so on.

1  The original source of this idea comes from Jassem (1952: 70), who said, “Falling nuclear tones have a proclamatory value. Rising nuclear tones have an evocative value.” Brazil (1975) cites Jassem as the original source of this idea.

3.1.1  Context-Dependent Versus Context-Independent Meaning

Morphemes by definition have context-independent meaning. This does not mean that pragmatic and deictic meanings are not also involved, but there must be some core meaning that is expressed each time the same morpheme is used. The pronoun “they,” for example, is deictic because it requires an antecedent in order to be interpreted. It also requires the use of pragmatics for the hearer to match the pronoun to the correct antecedent. At the same time, however, the pronoun “they” always expresses the semantic features [+plural] and [+3rd person]. I will argue in Chap. 5 that discourse particles and tones can and should be similarly analyzed; they have deictic meanings that are context-dependent and require pragmatic interpretation, but they also have constant core meanings that are context-independent. The idea of context-independent meaning is therefore critical to my hypothesis, so it is worth looking at what linguists have said regarding intonational meanings in this regard.
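The distinction drawn here for the pronoun “they” can be illustrated with a short sketch. It is offered only as an illustration, not as part of the analysis: the names `Morpheme`, `core_features`, and `resolve_referent` are hypothetical, and the simple rule of choosing the most recently mentioned plural antecedent is an assumed stand-in for pragmatic interpretation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Morpheme:
    """A form paired with a core meaning that does not change with context."""
    form: str
    core_features: frozenset  # context-independent semantic features


# Hypothetical entry for the pronoun "they": always [+plural] and [+3rd person].
THEY = Morpheme("they", frozenset({"+plural", "+3rd person"}))


def resolve_referent(morpheme, discourse):
    """Toy stand-in for pragmatic interpretation (an assumption of this sketch).

    `discourse` is a list of (antecedent, is_plural) pairs in order of mention.
    The most recently mentioned plural antecedent is returned, or None when
    no suitable antecedent is available.
    """
    if "+plural" not in morpheme.core_features:
        return None
    for antecedent, is_plural in reversed(discourse):
        if is_plural:
            return antecedent
    return None


# Two different contexts: the referent changes, the core features do not.
context_a = [("the students", True), ("the teacher", False)]
context_b = [("my neighbors", True)]

print(resolve_referent(THEY, context_a))  # the students
print(resolve_referent(THEY, context_b))  # my neighbors
print(THEY.core_features == frozenset({"+plural", "+3rd person"}))  # True
```

The referent returned depends on the discourse passed in, while `core_features` is fixed when the entry is created, which mirrors the division between context-dependent and context-independent meaning described above.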

O’Connor and Arnold (1973) concluded that the meaning of intonation is context-dependent:

[T]hat part of the speaker’s meaning which is assumed to be carried by the structure of the sentence—words and word order—and that part attributed to intonation are welded together to form the total meaning of the utterance at a particular time and in a particular context. (O’Connor and Arnold 1973: 36)

O’Connor and Arnold described ten tone groups, each of which they named on the basis of its pitch contour shape. They defined a tone group as “a group of tunes which, though intonationally not identical, all have one or more pitch features in common and convey the same attitude on the part of the speaker” (p. 287, emphasis in italics mine). They said that all tone groups can be used on more than one clause type, and no clause type uses only one tone group. Although they said that the same tone group always implies the same attitudinal meaning, they described the same tone on different clause types (or speech acts) as having quite different meanings. For example, the High Drop used on statements “give[s] the impression of involvement in the situation, of participation, and of a lightness and airiness,” but when used on Wh-questions sounds “brisk, businesslike, considerate, not unfriendly” (p. 54). There are different possible reasons as to why O’Connor and Arnold attributed different meanings to a tone group when used with different speech acts. They could have been referring to different tones that are homophonous or to different tones that are not homophonous but phonologically similar. They could also have been combining meanings from the sentence and the discourse with the meanings of the tones. And finally, and I think most likely, they were both referring to multiple tones and including meanings from the context. O’Connor and Arnold’s (1973) analysis contrasts with the proposal made here, which is that intonational morphemes comprise specifically shaped pitch contours and have core meanings that remain consistent from one context to the next.
Despite O’Connor and Arnold’s claims being much weaker in terms of concluding that there are matches between form and meaning, Liberman (1979: 94) referred to their list of tone groups as “the nearest thing available to an adequate intonational lexicon.” He discussed their tone group called “The Take-Off,” further reducing it to a specific tone shape within that tone group. Following his discussion of this tone shape, Liberman (1979: 96) said he had “demonstrate[d] that there is some real linguistic entity here, whose properties are a fit object of study.” He referred to it as the “surprise/redundancy contour” and called it “a sort of intonational word, a unit of meaning” (p. 97, emphasis in italics his). He was essentially claiming that this tone has context-independent meaning. Bolinger (1986: 246–247) argued in favor of context-independent meaning in relation to the so-called “contradiction contour,” which has the shape rise-fall-rise. He said it can be used to express contradiction on contradictory statements, but that this meaning is not conveyed when the same contour shape is used on questions, commands, or noncontradictory statements. Based on that, he then concluded the following:

If changing the syntax does not change the intonation, and if intonation is meaningful, then all such statements, questions, and commands must have a common intonational meaning. It cannot, for obvious reasons, be “contradiction,” but it must be of such a nature that contradiction can be inferred from it. (Bolinger 1986: 247, emphasis in italics his)

He accounted for this by arguing that the rise-fall-rise contour is compositional in form and meaning. The meanings of the individual components are constant from context to context, but combining those meanings on syntactically different sentences in different contexts results in what appears to be different meanings. He said the rise-fall-rise contour consists of the following five components:

1. Initial rise cueing the hearer to the concern, interest, etc. of the speaker.
2. High pitch on an unaccented syllable, marking the concern, interest, etc. as nonselective, that is, as applying to the utterance as a whole.
3. Immediately succeeding stepped or gradual fall, showing the tension to be under control, and therefore intended.
4. Accented syllable at low pitch, deemphasizing the referent of the word and contributing to the restraint of the downmotion.
5. Terminal rise, leaving the utterance “open” to further comment or to continuation with a larger utterance. The gradient extent of the terminal rise augments the effect of the initial rise.

(Bolinger 1986: 248)

Bolinger (1986: 140) referred to specific pitch shapes as “profiles” and considered these to be “the minimum morphological units of intonation.” The meanings of the rise-fall-rise contour’s components as described by Bolinger are vague, and combining these five vague meanings together makes it difficult to determine whether or not all the types of sentences cited by Bolinger actually contain this combination of meanings when said with this contour. I believe that Bolinger’s method of describing intonational meaning makes it difficult to determine whether his descriptions are accurate, but it is notable that he concluded intonation to be morphemic and to have context-independent meaning. Research that provides empirical evidence in support of intonation having context-independent meaning will be reviewed in Chaps. 4 and 6.

3.1.2  Compositional Versus Holistic Meaning

Similar to Bolinger’s (1986) analysis, Pierrehumbert and Hirschberg (1990) assumed the rise-fall-rise contour to be made up of three parts. Based on their method of description, this tone comprises an accented Low tone plus a High tone (∗L+H), a Low tone (L), and a High boundary tone (H%). Using their Tones and Break Indices (ToBI) notation, these three components combine to look like this: ∗L+H_L_H%. It is worth repeating here that I tentatively analyze specifically shaped contours to be single morphemes and do not assume that they are made up of component parts. In Chap. 6, I discuss a rise-fall-rise contour that I argue has a specific meaning that is linked to its holistic phonological form. Nevertheless, there are similarities between my analysis and that of Pierrehumbert and Hirschberg.
I assume that this contour begins on an accented syllable, which then must be pronounced Low, since this pitch contour starts out low. And the pitch contour, which ends high, results in the use of a high boundary tone at the end of the intonational phrase. In this sense, I see it as interacting with an accent tone and a boundary tone, but not as including them. Related to this, see Wee (2019, Chap. 3), who discusses how pitch contours may be made up of a combination of register and pitch values, resulting in a holistic contour. Whether or not tone contours are single morphemes with a single meaning, or instead comprise compositional parts and meanings, is of course an empirical matter, but it is difficult to prove one way or the other. More than one tonal morpheme can co-occur within a sentence, and in fact, Focus tones and discourse tones regularly do co-occur. And there is no theoretical reason that prevents a meaningful tone contour from being like a word that is made up of more than one morpheme. There is, however, at least one good argument against the ToBI system, which is that it does not distinguish, for example, between a mid-rise and a high-rise. That is not an argument against tones being compositional, though; it just means that if they are compositional, a notation system that distinguishes between tones of different heights is needed.
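The notational point about mid-rises and high-rises can be illustrated with a small sketch. It is an illustration under stated assumptions only: the five-point pitch scale, the example targets (2, 3) and (2, 5), and both labeling functions are hypothetical and are not taken from ToBI or from any of the works cited. The first function is a simplified stand-in for any notation that has only two relative levels; the second uses three levels tied to absolute height.

```python
def two_level_labels(contour):
    """Label each pitch target as L or H relative to the contour's own minimum.

    Assumption of this sketch: a target at the contour's lowest point is L and
    anything above it is H, so only the direction of movement is recorded.
    """
    lowest = min(contour)
    return tuple("L" if target == lowest else "H" for target in contour)


def three_level_labels(contour):
    """Label each pitch target as L, M, or H by absolute height.

    Assumption of this sketch: on a five-point scale, 1-2 count as Low,
    3 as Mid, and 4-5 as High.
    """
    def label(target):
        if target <= 2:
            return "L"
        if target == 3:
            return "M"
        return "H"

    return tuple(label(target) for target in contour)


# Hypothetical pitch targets on a five-point scale (1 = lowest, 5 = highest).
mid_rise = (2, 3)
high_rise = (2, 5)

# With two relative levels, the two rises receive the same label sequence.
print(two_level_labels(mid_rise), two_level_labels(high_rise))

# With three absolute levels, the two rises are kept apart.
print(three_level_labels(mid_rise), three_level_labels(high_rise))
```

Under these assumptions both rises come out as L followed by H in the two-level scheme, whereas the three-level scheme yields L M for the mid-rise and L H for the high-rise. The sketch says nothing about whether contours are compositional; it only shows why a notation needs more than two heights to keep such contours distinct.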

3.1.3  Phonological Similarity and Homophony

Whether arguing for or against the idea that intonation has context-independent meaning, linguists should not link this argument to the mistaken idea that similarly shaped forms necessarily ought to have similar meanings. This is analogous to assuming that the English words “foal” and “full” should be expected to have similar meanings based on the fact that they have similar phonological forms (at least in my Midwest American variety of English). Even two tones with exactly the same phonological shape and features do not need to share any portion of their meanings, just as homophonous words need not share any portion of their meanings with each other (e.g., “read” and “reed”). Related to this, Chap. 4 reviews Hirst’s (1983) reasons for arguing that English’s emphatic and contrastive intonation are both floating tones (what I refer to as tonal morphemes). He said these two tones are homophonous, but that this should be no more problematic to deal with than segmental homophony. In addition, a high-falling “obvious” tone that is discussed in Chap. 6 is also possibly homophonous with contrastive and emphatic intonation,2 and if so, this is a case of three-way homophony, which is something that occurs regularly in language (e.g., “wear,” “where,” and “ware”; “sight,” “site,” and “cite” in my variety of English). Which meaning is intended is determined by the context. In short, there is no validity to an argument that cites two different examples of a tone contour in different contexts, demonstrates that there is no shared meaning between the two examples, and then concludes that tones therefore do not have context-independent meaning.

2  Intuitively, I have the impression that the “obvious” tone starts from a higher point, but a detailed contrastive phonological analysis of these three tones (i.e., contrastive, emphatic, and obvious) would need to be conducted in order to make any strong claims.

Another example of what has been described by some in the literature as homophony is a question tone used on rising declaratives and the rising tone of uptalk, which is a rising, nonquestioning discourse tone.3 Warren (2016) defined uptalk as follows:

[Uptalk is] a marked rising intonation pattern found at the ends of intonation units realised on declarative utterances, and which serves primarily to check comprehension or to seek feedback. (Warren 2016: 2)

Despite many authors having said that the uptalk tone has the same form as a question tone (see Warren 2016: 26–31), these tones do not appear to be homophonous. First of all, I demonstrate in Chap. 6 that there are at least two forms of rising question tones with distinct (i.e., not gradient) meanings, and there are likely more than just these two, so it is unclear whether linguists are even referring to the same question tone, or even if the tokens of question tones examined within a single study are all the same tone. Nevertheless, studies that claim to have found consistent differences or similarities between question tones and the uptalk tone are still meaningful. Fletcher and Harrington (2001, cited in Warren 2016: 30) found a phonetic distinction in Australian English between uptalk rises and question rises, with higher pitch accents used on questions in comparison with those used on uptalk. Ritchart and Arvaniti (2014) likewise concluded that tones associated with uptalk differ from question tones in Southern Californian English. Regardless, “even if there is no phonetic difference between uptalk rises and question rises, ambiguity is unlikely, as contexts will clarify the intended meaning” (Warren 2016: 30). There is no good reason I am aware of for assuming that the phenomena of homophony and phonological similarity exist among segmental morphemes but not among tonal morphemes.

3  This rising discourse tone has received much negative press, and the term “uptalk” itself can have negative connotations. Warren (2016) considered using other terms but settled on this one because it is well known and because other names given to this tone also have problems. Webb (2008, cited in Warren 2016: 5), for example, called it a high-rising tone (HRT), and used the labels “declarative HRT” and “question HRT” to distinguish the declarative uptalk tone from question tones. Warren said HRT is problematic because “hormone replacement therapy” is the most common thing to show up on internet searches. I have a problem with Webb’s labels for a different reason, which is that I consider rising-tone questions to be declaratives (cf. Bartels 1999; Gunlogson 2003; see also Sect. 7.2.2).

3.1.4  Gradient Versus Categorically Distinct Forms and Meanings

Warren (2016: 2) observed that “the shape of uptalk is variable and quite possibly differs from one variety of English to another.” That it varies among English varieties is what one would expect, but based on what I am claiming in this book about intonation, if a single speech community consistently uses more than one form of uptalk (putting aside allotones that result, for example, from interacting and assimilating with focus and/or boundary tones), then this should be analyzed as more than one tone, with each having its own unique, context-independent meaning. To make it clear, I am proposing that intonational morphemes are gradient neither in form nor in meaning. In contrast, Ladd (2008: 126), after citing other authors’ claims that the uptalk tone and the question tone are phonetically distinct, concluded that “the differences are subtle, and arguably gradient.” What is important is the mental representation of these tones, which relates to their phonological rather than their phonetic properties, and I assume that the differences in these two tones’ phonological properties are distinct unless they are homophones. Evidence against gradience being a property of intonation comes from two perception studies by Gussenhoven (1984) and Pierrehumbert and Steele (1989). Both studies found that speakers make a categorical distinction (i.e., not a gradient distinction) between a rise-fall-rise tone and a fall-rise tone. This goes against Fox’s (2000) claim that intonation has a property different from those expressed by other meaningful features in language, which is that intonation’s meaning and its form are gradient and nondiscrete. He said that “[s]ince for many scholars linguistic distinctions are in principle discrete, and do not permit such gradience, the status of intonation as a genuinely ‘linguistic’ phenomenon is thereby placed in doubt” (p. 270).
It is possible that his conclusion that intonational meaning can be gradient results from the common practice of combining and/or confusing intonational meanings with the emotions expressed by nonlinguistic prosody, which presumably can be increased or decreased in a gradient fashion.

3.1.5  The Linguist’s Theory of Intonational Meaning

Ladd (2008: 41) referred to what he called the “Linguist’s Theory of Intonational Meaning, [t]he central idea of [which] is that the elements of intonation have morpheme-like meaning” (emphasis in italics his). It should be obvious to readers by now that I am proposing a strong version of this theory. A significant number of linguists have argued in favor of this theory, even though their overall descriptions of intonation differ from mine and from one another’s. Ladd (2008: 41) cites a number of them, including Liberman (1975), Ladd (1978), Gussenhoven (1984), Bolinger (1986), Pierrehumbert and Hirschberg (1990), and I will add Hirst (1983) to this nonexhaustive list. In recent years, a growing number of linguists appear to have joined this camp, but there are still many linguists who disagree.


Bartels (1999: 4) questioned whether “we [can] make a plausible case for associating a given tone at some level of abstraction with the same interpretational feature across all occurrences, independent of lexical content and situational context”; in other words, is there a case for intonation having context-independent meaning? My answer to this is the same as that of Ladd (1978: 144), who reviewed the debate in the literature about whether intonation has context-free meaning and then concluded that no writers up to that point had “ever really considered what seems to me to be the simplest hypothesis: that intonational meaning is like segmental meaning” (emphasis in italics his). He said the inability to give context-free definitions to intonational forms can be compared with the inability to do the same for Japanese sentence particles. This is an excellent and insightful comparison that is discussed in detail in Sect. 3.2.

3.1.6  Testing the Linguist’s Theory of Intonational Meaning

Liu et al. (2013) explicitly stated that they were testing the Linguist’s Theory of Intonational Meaning, and they concluded that the results of their study supported it. For their study they constructed a number of two-line dialogues in order to elicit utterances from five English-speaking and eight Mandarin-speaking participants. The elicited utterances were either answers to questions, or echo questions in response to statements. They manipulated the position of a focused word within the utterance, making it either sentence-medial or sentence-final, and they also manipulated what they referred to as sentence modality (question vs. statement), using clauses that were syntactically declarative throughout. They collected data in both English and Mandarin in order to test the intonational forms’ interactions with positions of pitch accent in English and different lexical tones in Mandarin. After examining the F0 contours of the data, they discovered a consistency in the forms of both focus and sentence modality within each language, and also that both had allomorphic forms resulting from assimilations with pitch accent and lexical tone. Elicitation studies like that of Liu et al. (2013) can be quite powerful. Naturally occurring data, by comparison, are extremely messy and difficult to interpret and analyze. Linguists can collect tokens of homophonous (or merely similar) tones together and analyze them all to look for a common meaning, but the tokens may not all represent the same tonal morpheme. And tones that do represent the same morpheme may look very different because they are allotones, which Liu et al. demonstrated to exist. Naturally occurring data also make it difficult to control for the context, resulting in the common practice of attributing meaning from the context to the tone itself, giving the impression that the tone has no context-independent meaning of its own.
Persson (2018: 82), for example, examined a pitch contour in French that he described as “a salient pitch prominence occurring on the secondary accented syllable, followed by a low tonal target (low or falling pitch) associated with the primary accented syllable.” After collecting and examining 230 tokens of this form of intonation from naturally occurring data, he concluded that

the investigated contour does not seem to perform its pragmatic work independently of the verbal material accompanying it and the sequential organisation of the talk. … [This paper] takes the first step towards an account that treats the pragmatic meaning of the intonation pattern [salient initial accent + low primary accent] as a combined product of prosodic form, verbal material, and action-sequential context. (Persson 2018: 96)

It is very possible that the intonation pattern that Persson referred to as salient initial accent + low primary accent describes more than a single phonological tone in French. At the same time, it is also possible that this tone’s (or these tones’) core meaning(s) can be defined independently of the discourse context, but that Persson did not successfully tease the meaning(s) apart from the context. If the hypothesis of intonation proposed in this book is correct, then the most likely possibility is that this pitch pattern represents more than a single tone and these tones have not yet been accurately defined. This is not a criticism of Persson’s methods, nor of his very understandable results. These kinds of conclusions are expected when analyzing naturally occurring data and searching for multiple occurrences of a particular form, rather than starting with a particular meaning and then eliciting the data. Constructing examples from the linguist’s own mind can also be useful and informative, but this can raise questions of reliability if it is not verified with other native speakers. Elicitation studies such as that of Liu et al. (2013), on the contrary, can produce strong forms of evidence for a match between form and meaning, insofar as the study is well designed. In addition to Liu et al.’s study, which tested for multiparticipant consistency in form (including allotones) when expressing predetermined intonational meanings, recall Maekawa’s (2004) study reported in Sect. 2.3, where participants were asked to express specific discourse meanings, and then the resulting audio was played to other participants to see if they could match given forms to given meanings. Studies like these show that specific meanings are expressed with specific forms of intonation, not only within individuals but within speech communities. 
It is not easy to start with a meaning and then validly and reliably elicit its form from multiple speakers of a given language, but it is arguably more valid and reliable than starting with an observed intonational form, and then trying to discover its meaning by analyzing naturally occurring data. The latter method runs the risk of analyzing homophonous (or similarly shaped) contours as being tokens of the same tonal morpheme, which then makes it impossible to match form with meaning. Of course, all research methods are worth trying, and all studies on intonation, regardless of methodology, can work in tandem to bring us closer to understanding this most difficult feature of language. My own research reported in Chaps. 5 and 6 began with discourse meanings and then elicited the production of their forms. My methodology was based on the hypothesis that the types of meanings expressed by discourse particles are the same as those expressed by intonation, and that in some cases, there can exist cross-linguistic particle/intonation pairs that are (near) semantic equivalents. This method is only valid to the extent that the hypothesis upon which it is based is correct. With this in mind, the next section reviews some of the relevant literature on discourse particles. Interestingly, this literature repeatedly compares the meanings of discourse particles to the meanings of intonation and even includes a similar debate about context-dependent versus context-independent meaning.


3.2  Intonation and Discourse Particles

Ladd (2008) said the following about the widely accepted view that segmental discourse particles and intonation are related:

[I]t has long been observed that many languages use segmental morphemes to convey the kinds of meanings that in other languages can often be signaled intonationally … It may be that the functional similarity between such particles and intonation as defined here should outweigh the clear phonetic and syntactic differences⁴ [and that if] such a comparison is valid, it is clearly important not to define intonation solely in terms of phonetic suprasegmentals. (Ladd 2008: 5)

It is not difficult to find examples of linguists comparing particles to intonation. Schubiger (1965), for example, linked German modal particles to specific forms of English intonation. Referring to Mandarin, which has approximately ten sentence-final particles (SFPs), Chao (1932: 115) said that “the speech element in Chinese which may be equated to English intonation is the use of grammatical particles.” A number of linguists have said the same about Cantonese SFPs, of which there are more than 30. Kwok (1984: 8), for example, said that “[a]s a system [SFPs] share many characteristics with intonation,” and Matthews and Yip (2011: 389) said that the “functions [of SFPs] are often conveyed by intonation patterns [in English].” Wakefield (2012, 2014, in press) linked six Cantonese SFPs to specifically shaped pitch contours in English. Arndt (1960: 327) concluded from his analysis of Russian and German modal particles that “they have some functional resemblance, not to traditional ‘parts of speech,’ but to phonemes of intonation [and] may be thought of as, in a sense, ‘suprasegmental morphemes’.” Arndt admitted there is an obvious problem with labeling modal particles “suprasegmental,” because they “occur in the linear sequence of morphemes making up an utterance” (ibid). One of the major aims of this book is to demonstrate that this comparison between segmental particles and intonation is indeed valid, and that a reanalysis of intonation, as Ladd suggested, is therefore justified. Contrary to Arndt’s (1960) and Ladd’s (2008) suggestions, however, the definition of intonation proposed here does not reclassify discourse/modal particles as belonging to intonation. Instead, it defines intonation as a class of suprasegmental morphemes, that is, as tonal particles (both grammatical and discoursal). I refer to them as tonal morphemes/particles, but another term used in the literature is floating tones.
4  Contrary to this view, Chap. 7 offers a proposal for the syntactic properties of intonation based on the hypothesis that there are no syntactic differences between segmental particles and intonation (i.e., tonal morphemes).

Goldsmith (1976: 57) defined a floating tone as “a segment specified only for tone which, at some point during the derivation, merges with some vowel, thus passing on its features to that vowel.” Defining a floating tone as “a segment” is problematic, however, since it is suprasegmental. It is also problematic to associate it with a single vowel, since some discourse tones are clearly associated with more than one syllable, based on the assumption, which I tentatively adopt, that pitch contours are not compositional and can be analyzed as a single morpheme.⁵ Ladd’s (2008) quote at the beginning of this section (probably unintentionally) implies that individual languages use either intonation or segmental particles, but not both. It is unlikely that any language uses only one of these strategies. Probably all languages use a combination of the two forms available for expressing connotative meaning. Languages that primarily use discourse tonal particles also have discourse segmental particles, such as English’s sentence-final huh?, Canadian English’s eh?, and French’s hein?, and languages that use a large number of segmental particles also use discourse tones. Cantonese, for example, has a rising question tone. Based on this close link between SFPs and intonation, Yau (1980: 51) argued that “there is a mutual compensation between [SFPs] and intonation patterns and that the more a language relies on the use of [SFPs] in expressing sentential connotations, the less significant will be the role played by intonation patterns, and vice versa.” There has been little research to test Yau’s claim, but one study that arguably supported it was Hirst et al. (2013), who found that gender differences in read speech, with regard to the amplitude of pitch rises and pitch falls, were smaller in Cantonese than in Mandarin, and smaller in Mandarin than in English and French. This shows an inverse correlation between the number of SFPs in the languages of this study and the degree to which they use intonation as a sociolinguistic marker of gender. This is not a large number of languages, and contrasting sociolinguistic factors among the speech communities could be involved, so more research needs to be done before any strong claims can be made.
Nevertheless, this study provides some preliminary evidence in favor of the assumption made regularly throughout the literature on Cantonese SFPs, which is that the existence of so many SFPs in Cantonese is a consequence of its restricted use of intonation. Hong Kong Cantonese is a tonal language with six lexical tones (Bauer and Wakefield 2019: 18ff),⁶ and this restricts its use of intonation. Cheung (1986) explained that

Not only is Cantonese a tone language, but it has one of the richest tonal systems in the world. And not only is the number of contrastive tones in Cantonese one of the greatest, but the tonal system exploits both pitch height and pitch orientation at the same time, [The result is a variety of SFPs that] fulfill more or less the same function as intonation (pp. 250–251).

Cheung (1986: 251) concluded that it is “beyond doubt” that lexical tones, SFPs, and intonation are all interrelated. This is because lexical tones and intonation both share the same form (i.e., pitch patterns), while SFPs and intonation share the same semantic content (i.e., they both express speaker stance, epistemic modality, etc.). Again, the implication is that Cantonese cannot freely use intonation to express connotative meanings because its lexical tone system prevents it from doing so, and it therefore has compensated for this by developing a system of SFPs that express these same types of meanings. Tang (2010: 61), referring to SFPs as mood particles, said that “with respect to form, mood can be realized as mood particles or as intonation” (translation that of the author). There is clearly a consensus among Cantonese linguists that SFPs and intonation are closely related, and some appear to agree that they are in fact two forms of the same thing.

5  Of course, this does not disallow the possibility of two or more contours overlapping or occurring consecutively.

6  Cantonese can also be said to have nine or ten tones, depending on how they are analyzed (Bauer and Wakefield 2019: 22).

3.2.1  Intonation and Segmental Particles Are Two Forms of the Same Thing

Yip (2002) made a claim that argues strongly in favor of the idea that discourse tones are morphemic, but it only related to the discourse tones found in lexical tone languages:

The most commonly described intonational mechanism in tone languages is the addition of a phrase-level tone. As in Cantonese, these can be thought of as a type of particle that lacks segments, consisting solely of tone. (Yip 2002: 273)

Directly related to this, Tang (2006) argued that the rising question tone in Cantonese is an SFP that, except for its phonological form, is grammatically no different from the Cantonese question SFP me1. Leung (1992/2005: 80–83) likewise included six pitch contours among his list of Cantonese SFPs. I am not aware of any theoretical reason for assuming that tone languages can use intonational forms to function as discourse particles, but that nontone languages cannot. In other words, if tonal discourse particles are assumed to exist in a language like Cantonese, there is no reason to assume they cannot also exist in a language like English. Saying that Cantonese discourse tones are SFPs is similar to Da Mota and Herment’s (2016) direct comparison of Canadian English’s sentence-final eh to a rising contour (see Sect. 4.2). I adopt a strong version of the hypothesis that segmental discourse particles and intonation are two forms of exactly the same thing (Wakefield 2016). If this hypothesis is correct, then segmental particles can be exploited as a means of learning something about intonation. This is not necessarily easy to do, though. There are challenges to making valid and reliable links between a given segmental particle and a given form of intonation. What those challenges are and how I addressed them in my own research are discussed in Chap. 5. The meanings of SFPs are just as abstract, and just as linked to the discourse context, as those of intonation. As a result, the debates about SFPs’ functions and meanings share resemblances with the debates about intonational meanings. The fact that SFPs are segmental, however, allows the linguist to easily recognize when the same form is being used, both across discourse contexts and across speakers.⁷

7  Polysemy of course remains a complicating factor, but that issue can also be more manageably analyzed due to the fact that the particles’ forms are easy to recognize and compare from one occurrence to the next. The polysemy of Cantonese SFPs will be addressed in Chaps. 5 and 6.

3.2.2  The Similar Debates About Particle and Intonational Meanings

It should come as no surprise that the abstract nature of SFPs has made it difficult for linguists to come to a clear consensus regarding their functions and meanings. Interestingly, many things said about their meanings are similar to what has been said about the meanings of intonation. Bauer and Benedict (1997), for example, said that Cantonese relies on SFPs to:

perform different kinds of speech-acts, such as requesting, reminding, refusing, advising, asserting, persuading, questioning, etc., and to express the speaker’s emotional attitudes of surprise, outrage, passion, blaming, doubt, dissatisfaction, patience, impatience, conceit, hesitation, reluctance, etc., toward situations and his/her interlocutor’s utterances. (Bauer and Benedict 1997: 291)

I agree that SFPs are used to perform various speech acts but argue that they do not express emotional attitudes, which are assumed to be expressed by nonlinguistic prosody. I believe the reason many authors have concluded that SFPs express emotions is that, like intonational meanings, the meanings of certain SFPs are compatible with certain emotions. They are therefore frequently used in tandem with the forms of paralinguistic prosody that express those emotions. Just as happens when trying to determine the meaning of a form of intonation, the emotions expressed by simultaneously occurring paralinguistic prosody are often assumed to be expressed by the SFPs themselves. There have been many attempts to describe and define Cantonese SFPs. Echoing the debate on intonational meaning, some authors have concluded that SFPs have no context-independent meaning (e.g., Kwok 1984; Luke 1990; Baker and Ho 2006), while others have argued that they do (e.g., Fung 2000; Sybesma and Li 2007; Wakefield 2010, 2012, 2014). Fung (2000) explained why many linguists have failed to accurately define SFPs and have instead concluded that their meanings can change drastically from one context to the next. She said that some “researchers are easily tempted to include as part of some specific [S]FP all sorts of meanings that are conveyed by other linguistic or paralinguistic elements” (p. 6). She offered the example of Leung (1992/2005, cited in Fung 2000: 6), who proposed that the SFP laa1 encodes possibility, but only when it is used with modal adverbs such as waak6ze2 (“perhaps”), daai6koi3 (“probably”), or daai6joek3 (“probably”). Fung correctly pointed out that the “possibility” meaning comes from the modal adverbs rather than from the SFP laa1.
Another example of incorporating meanings from the context into the meaning of an SFP comes from Kwok’s (1984) descriptions of the two related SFPs lo1 and aa1maa3, both of which express a meaning related to “obviousness.” Consider these four paraphrases of their meanings shown in italics in the English translations:

(1)  Loeng5 dim2 bun3 lo1.
     two CL half LO
     “Two thirty, of course. Don’t you know?”

(2)  Duk6 dak1 m4-gau3 lo1.
     study Adv-M NEG-enough LO
     “You haven’t studied enough, that’s why.”

(3)  Bun2dei6 ge3 zau6 peng4-di1, jan1wai6 keoi5 hei1 aa1maa3.
     local GEN then cheaper-CM because 3s thin AAMAA
     “The local kind is cheaper because it’s thin, don’t you know?”

(4)  Go2 jat6 mou5 gam3 je6 aa1maa3.
     that day NEG so late AAMAA
     “It wasn’t as late that evening, that’s why.”

(Kwok 1984: 59, 61–2, italics in the translations that of the author)

Sentences (1) to (4) come from four different contexts. In the contexts of (1) and (3), the SFP-suffixed sentences are interpreted as obvious information, and in the contexts of (2) and (4) as obvious reasons.⁸ Kwok (1984: 58) said that lo1 “seems to give the reason for something” and used (2) as one example, the context of which is a father who attached lo1 to a sentence when telling his son the reason he performed poorly on a test. We should not conclude that lo1 functions to “give the reason for something” based merely on the fact that some lo1-suffixed sentences state reasons. There are many lo1-suffixed sentences that do not give reasons for things, and all the lo1-suffixed sentences that do state reasons would still be construed as reasons if lo1 were removed. This indicates that it is not lo1, but rather the reason-giving context and the proposition (p) itself, that express the meaning: “p is the reason.” When either lo1 or aa1maa3 was attached to obvious information, Kwok paraphrased each as “don’t you know” (i.e., (1) and (3)). When either was attached to an obvious reason, it was paraphrased as “that’s why” (i.e., (2) and (4)). If both lo1 and aa1maa3 can be paraphrased in either of these two ways depending on the context, the implication is either that (1) neither SFP has any intrinsic meaning of its own, in which case it is difficult to see what if any meaning it contributes to the sentence, or that (2) both are polysemous, giving us the two particles lo1₁ and lo1₂ and the two particles aa1maa3₁ and aa1maa3₂, and we can conclude that lo1₁ has the same (or a very similar) meaning to aa1maa3₁, while lo1₂ has the same (or a very similar) meaning to aa1maa3₂. Contrary to this, I argue that both particles have their own intrinsic meanings independent of the contexts within which they are used. In Chap. 6, the meanings of these two related particles are teased apart and a unique definition is proposed for each.
8  Saying that an SFP is “suffixed” to or “attached” to a sentence is metaphorical. They are bound morphemes in the sense that they cannot be used in isolation, but they are not bound morphemes in the sense that they attach to a root morpheme to form a word. Syntactically, an SFP is assumed to be a morpheme that heads its own functional phrase (see Chap. 7).

Kwok (1984: 7–8) compared what she assumed were context-dependent meanings of SFPs to those of intonation, saying that “just as in English the same tune carried by different structures have different meanings, the meaning of a particle may vary according to the type of structure to which it is affixed.” Similarly, Luke (1990: 3) said that one distinctive feature of SFPs that was identified in prior studies was that “they have no semantic content.” Baker and Ho (2006: 246) likewise concluded that “[p]articles are words which for the most part have no meaning in themselves.” Ball (1888/1971: 112) said that “the Final Particles so freely used in Chinese have in most cases no exact meaning as separate words.” Schubiger (1965: 66) notably made the same claim regarding German modal particles, saying “[t]he precise meaning of the particle can in many cases be gathered only from the contents and context of the sentence.” It is easy to understand why many linguists have concluded that discourse particles have no intrinsic meanings of their own. Their meanings are intricately connected to the sentence as a whole, as well as to the discourse. I suggest that they should be seen as deictic words, referring to antecedents in the sentence and in the discourse. Referring to the epistemic SFP lo1, Luke (1990: 191) said, “it would be a futile exercise to try and define an intrinsic or original meaning of [lo1], or even a small number of basic meanings.” He concluded that lo1 is only meaningful in reference to the particular contexts in which it appears. I actually agree that discourse particles are only meaningful “in reference to” the context, as Luke says, because the discourse-related deictic elements that I argue are included in their meanings depend on the context for their reference. As explained above, the logic is the same as saying that a pronoun is only meaningful in reference to its antecedent in the discourse, even though it has unchanging, core semantic features. I align with those authors who assume that SFPs have context-independent definitions that are consistent and unvarying. I propose that their meanings only appear to change because the antecedents of the deictic portions of their meanings differ from context to context.
At the same time, because an SFP has scope over the whole sentence and links it to the discourse, it appears to change the meaning of the sentence to which it is attached. This is because the sentence’s proposition (or a portion of it) is a variable within the SFP’s definition (this will be explained in detail in Chap. 5 and illustrated in Chap. 6). Ball (1888/1971: 112) eloquently described his observation of this, saying “[i]t is curious, and most interesting to notice how small and insignificant a word at the end of a sentence will change the meaning of the whole sentence, like the rudder at the stern of the ship governing the motions of the whole vessel.” All of this echoes claims found in the literature about intonational meanings. Chun (2002) said, “[t]here is no one-to-one correspondence between form and function; rather, intonation must be viewed and interpreted from the context in which it occurs, i.e., is spoken” (p. xvii). I propose that intonational meanings have the same characteristics as those of discourse particles, but the fact that intonational forms are suprasegmental makes it extremely difficult to clearly isolate one form from another, so that their core meanings can be separated from those meanings that come from the context. Therefore, I believe one of the most reliable ways to make a strong claim about the form and meaning of a given intonational form is to link it directly to a segmental particle. This is of course limited by the fact that only a minority of discourse particles have segmental counterparts. Nevertheless, the more of these links that are made, the more supporting evidence we will have for claiming that intonation is morphemic and has context-independent meaning.
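The logic of a fixed definition with context-filled variables can be sketched informally in code. The following is only an illustrative analogy, not a definition proposed in this book: the particle PRT, its template wording, and the names Particle and interpret are all hypothetical placeholders. It shows how a single, unchanging template can yield different overall readings simply because the proposition and the discourse antecedent differ from one context to the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    """A hypothetical discourse particle with one fixed core meaning.

    The template never changes; {p} stands for the host sentence's
    proposition and {antecedent} for whatever the deictic part of the
    meaning points to in the discourse.
    """
    form: str
    template: str

    def interpret(self, p: str, antecedent: str) -> str:
        # Only the variables are filled in; the core meaning is untouched.
        return self.template.format(p=p, antecedent=antecedent)


# Placeholder wording only (an assumption), not a definition of any real SFP.
PRT = Particle(form="PRT", template="[{p}] is linked to [{antecedent}]")

# Same particle, two contexts: the readings differ, the template does not.
reading_a = PRT.interpret("it is two thirty", "the question about the time")
reading_b = PRT.interpret("you did not study enough", "the poor test result")
print(reading_a)
print(reading_b)
```

On this analogy, an analyst who looked only at the two outputs might conclude that the particle means something different each time, whereas the variation comes entirely from the values supplied by the context.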


3.3  Concluding Remarks

It is widely recognized that intonation and segmental particles are related. I take the strong stance of assuming that they are in fact two forms of the same thing and that discourse intonation should be analyzed as morphemic in the same sense and to the same degree as discourse particles. If this is assumed, then by logical extension, discourse intonation is related to grammatical tonal particles in that they both comprise tonal morphemes. They differ in only two respects: (1) grammatical tones tend to be level (High, Mid, or Low), while discourse tones are often made up of pitch contours and (2) discourse tones have scope over the entire sentence, with meanings that link the proposition to the discourse; they are therefore positioned higher within the syntactic structure than grammatical particles (see Chap. 7). Categorizing grammatical and discourse tones together under the label intonation does not imply that I think they form a single grammatical category, just as linguists understand that segmental discourse particles and grammatical particles are not of the same category even though they are both referred to as particles. Bailey (2010: 25) said that “[g]enerally, a loose definition [of particle] is something like ‘an invariant element with grammatical function that does not belong to one of the major grammatical categories’.” In other words, when different grammatical categories share particular properties, it can make sense to categorize them together for the purpose of linguistic descriptions. Here, I am classifying all grammatical elements that share the property of being tonal morphemes together under the label intonation. The benefit of classifying discourse intonation together with grammatical tonal morphemes is that it emphasizes my hypothesis that discourse intonation comprises tonal morphemes.
And following logically from that, this classification makes a clear theoretical distinction between intonation on the one hand, and all other suprasegmental forms on the other. Ladd (2008) suggested that intonation should be redefined if the link between it and segmental particles can be verified. I suggest that the evidence found in the literature, much of which is reviewed in this and the next three chapters, is now sufficient to consider this link to be verified, and that a redefinition of intonation along the lines of (1) in Chap. 2 is therefore justified.

References

Arndt, W. (1960). “Modal particles” in Russian and German. Word, 16, 323–336.
Bailey, L. R. (2010). Sentential word order and the syntax of question particles. Newcastle Working Papers in Linguistics, 16, 23–43.
Baker, H., & Ho, P. (2006). Teach yourself Cantonese. London: Teach Yourself Books.
Ball, J. D. (1971). Cantonese made easy: A book of simple sentences in the Cantonese dialect, with free and literal translations, and directions for the rendering of English grammatical forms in Chinese (2nd ed.). Taipei: Ch’eng Wen. (Original work published in 1888).
Bartels, C. (1999). The intonation of English statements and questions: A compositional interpretation. New York: Garland.
Bauer, R. S., & Benedict, P. K. (1997). Modern Cantonese phonology. Berlin: Mouton de Gruyter.


Bauer, R. S., & Wakefield, J. C. (2019). The Cantonese language. In J. C. Wakefield (Ed.), Cantonese as a second language: Issues, experiences and suggestions for teaching and learning (pp. 8–43). Oxon: Routledge.
Bolinger, D. (1986). Intonation and its parts: Melody in Spoken English. London: Edward Arnold.
Brazil, D. (1975). Discourse intonation, vol. 1. Birmingham University: Department of English.
Brazil, D. (1997). The communicative value of intonation in English. Cambridge: Cambridge University Press.
Chao, Y. R. (1932). A preliminary study of English intonation (with American variants) and its Chinese equivalents. Bulletin of the Institute of History and Philology of the Academia Sinica (The Ts’ai Yuan P’ei Anniversary volume: Supplementary volume I), 104–156.
Cheung, K.-H. (1986). The phonology of present day Cantonese. Unpublished doctoral dissertation, University College, London.
Chun, D. M. (2002). Discourse intonation in L2: From theory and research to practice. Amsterdam: J. Benjamins.
Cruttenden, A. (1997). Intonation (2nd ed.). Cambridge: Cambridge University Press.
Da Mota, C. R., & Herment, S. (2016). The pragmatic functions of the final particle eh and of high rising terminals in Canadian English: Quite similar, eh! In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016 (pp. 878–882). Boston: Boston University.
Fox, A. (2000). Prosodic features and prosodic structure: The phonology of suprasegmentals. Oxford: Oxford University Press.
Fung, R. S.-Y. (2000). Final particles in standard Cantonese: Semantic extension and pragmatic inference. Unpublished doctoral dissertation, Ohio State University, Columbus, OH.
Goldsmith, J. A. (1976). Autosegmental phonology. Doctoral dissertation, Massachusetts Institute of Technology. Retrieved from http://dspace.mit.edu/handle/1721.1/16388
Gunlogson, C. (2003). True to form: Rising and falling declaratives as questions in English. New York: Routledge.
Gussenhoven, C. (1983). Focus, mode and the nucleus. Journal of Linguistics, 19(2), 377–417.
Gussenhoven, C. (1984). On the grammar and semantics of sentence accents. Dordrecht: Foris.
Hirst, D. (1983). Interpreting intonation: A modular approach. Journal of Semantics, 2(2), 171–182.
Hirst, D., Wakefield, J., & Li, Y. H. T. (2013). Does lexical tone restrict the paralinguistic use of pitch? Comparing melody metrics for English, French, Mandarin and Cantonese. Presented at the international conference on phonetics of the languages in China (ICPLC-2013), Hong Kong.
Jassem, W. (1952). Intonation of conversational English (educated Southern British). Wroclaw: Wroclawskie Towarzystwo Naukow.
Kwok, H. (1984). Sentence particles in Cantonese. Hong Kong: Centre of Asian Studies, University of Hong Kong.
Ladd, D. R. (1978). The structure of intonational meaning: Evidence from English. Bloomington: Indiana University Press.
Ladd, D. R. (2008). Intonational phonology (2nd ed.). Cambridge: Cambridge University Press.
Leung, C. (2005). 當代香港粵語語助詞的研究 [A study of the utterance particles in Cantonese as Spoken in Hong Kong]. Hong Kong: Language Information Sciences Research Centre, City University of Hong Kong. (Original work published in 1992).
Liberman, M. Y. (1975). The intonational system of English. Doctoral thesis, Massachusetts Institute of Technology, Cambridge, MA.
Liberman, M. (1979). The intonational system of English. New York: Garland.
Liu, F., Xu, Y., Prom-on, S., & Yu, A. C. L. (2013). Morpheme-like prosodic functions: Evidence from acoustic analysis and computational modeling. Journal of Speech Sciences, 3(1), 85–140.
Luke, K. K. (1990). Utterance particles in Cantonese conversation. Amsterdam: John Benjamins.
Maekawa, K. (2004). Production and perception of “paralinguistic” information. Presented at the Speech Prosody, Nara, Japan.
Matthews, S., & Yip, V. (2011). Cantonese: A comprehensive grammar (2nd ed.). London: Routledge.

O’Connor, J. D., & Arnold, G. F. (1973). Intonation of Colloquial English: A practical handbook (2nd ed.). London: Longman.
Persson, R. (2018). On some functions of salient initial accents in French talk-in-interaction: Intonational meaning and the interplay of prosodic, verbal and sequential properties of talk. Journal of the International Phonetic Association, 48(1), 77–102.
Pierrehumbert, J., & Hirschberg, J. (1990). The meaning of intonational contours in the interpretation of discourse. In P. R. Cohen, J. Morgan, & M. E. Pollack (Eds.), Intentions in communication (pp. 271–311). Cambridge, MA: The MIT Press.
Pierrehumbert, J., & Steele, S. A. (1989). Categories of tonal alignment in English. Phonetica, 46(4), 181–196.
Ritchart, A., & Arvaniti, A. (2014). The form and use of uptalk in Southern Californian English. Proceedings of Speech Prosody, 7, 20–23. Dublin.
Schubiger, M. (1965). English intonation and German modal particles: A comparative study. Phonetica, 12(2), 65–84.
Sybesma, R., & Li, B. (2007). The dissection and structural mapping of Cantonese sentence final particles. Lingua, 117(10), 1739–1783.
Tang, S.-W. (2006). 粵語疑問句「先」的句法特點 [Syntactic properties of sin in Cantonese interrogatives]. 《中國語文》 [Zhongguo Yuwen], 312(3), 225–232.
Tang, S.-W. (2010). 漢語句類和語氣的句法分析 [A syntactic analysis of clause types and mood in Chinese]. 漢語學報 [Hanyu Xuebao (Chinese Linguistics)], 29(1), 59–63.
Wakefield, J. C. (2010). The English equivalents of Cantonese sentence-final particles: A contrastive analysis. Unpublished doctoral thesis, The Hong Kong Polytechnic University, Hong Kong.
Wakefield, J. C. (2012). A floating tone discourse morpheme: The English equivalent of Cantonese lo1. Lingua, 122(14), 1739–1762.
Wakefield, J. C. (2014). The forms and meanings of English rising declaratives: Insights from Cantonese. Journal of Chinese Linguistics, 42(1), 109–149.
Wakefield, J. C. (2016). Sentence-final particles and intonation: Two forms of the same thing. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016 (pp. 888–892). Boston: Boston University. https://doi.org/10.21437/SpeechProsody.2016.
Wakefield, J. C. (in press). It’s not as bad as you think: An English tone for downplaying. In W. Gu (Ed.), Studies on tonal aspects of languages. Hong Kong: Journal of Chinese Linguistics Monograph.
Warren, P. (2016). Uptalk: The phenomenon of rising intonation. Cambridge: Cambridge University Press.
Wee, L.-H. (2019). Phonological tone. Cambridge: Cambridge University Press.
Yau, S. (1980). Sentential connotations in Cantonese. Fangyan, 1, 35–52.
Yip, M. (2002). Tone. Cambridge: Cambridge University Press.

Chapter 4

Evidence of the Morphological Nature of Intonation

This chapter reviews some of the evidence from the literature offering support to the claim that intonation comprises morphemes. Intonation can be in the form of level tones, rises, falls, or more complex contours including combinations of rises and falls, but I will refer to these collectively as tones or tonal morphemes.1 Many of the tones reviewed below are compared with segmental morphemes, providing evidence in support of the argument made in Chap. 3 that tonal morphemes and their segmental counterparts are two forms of the same thing. If a tone is shown to have virtually the same function or meaning as a segmental particle, then this can be taken as a form of empirical evidence to indicate that the tone in this tone/particle pair is a morpheme and, in many cases, will be of the same grammatical category. The degree to which such a claim is justified is positively correlated with the rigor and validity of the methods used to test the similarity between the tone and the particle. I am not implying that the tonal counterpart to a segmental particle will necessarily have exactly the same meaning—in fact in many cases it won’t, especially for discourse meanings. Many words seem to translate readily from one language to the next, but a morpheme or lexeme from language A (Morpheme-LA), which translates best as a particular lexeme into language B (Morpheme-LB), is almost never an exact equivalent. Linguists would rarely claim that Morpheme-LA is perfectly equivalent to Morpheme-LB, though some might make this claim for certain pronouns, grammatical cases, and so on. Therefore, arguing that the existence of a segmental counterpart is evidence that a tone is a morpheme does not entail the idea that the intonational form has exactly the same semantics as its segmental counterpart. So long as a morpheme’s meaning or function is consistently translatable into another language, then its translation can also be analyzed as a morpheme (i.e., not merely pragmatic in nature). 
When the English word “chair” is translated into other languages, for example, it will not have the same semantic representation in the minds of the speakers of those other languages as it does in the minds of English speakers. But whatever word in those other languages best matches “chair” will be a morpheme. It does not seem possible that “chair” could be translated into another language as some type of pragmatic meaning. It seems reasonable to hypothesize that the same logic applies to intonational meanings. If a segmental morpheme consistently corresponds to a specifically-shaped tone in comparable cross-linguistic contexts, then this should be seen as evidence of the tone’s morphemic status.

The following section reviews the evidence for the morphemic nature of grammatical tones, and then Sect. 4.2 does the same for discourse tones. The only distinction I make between a grammatical and a discourse tone is whether or not it relates the contents of the proposition to the discourse. It is assumed that all tones belong to functional, closed class categories of words, rather than to open lexical classes.

1  Of course, the term tone as used here excludes lexical tone.

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_4

4.1  Tonal Grammatical Particles and Their Segmental Counterparts

A large number of grammatical tonal morphemes have been described in the literature; it would be impractical to attempt an exhaustive review of them all, but several examples are given below. Saying that these tones are morphemes is uncontroversial, but classifying them as intonation is unusual. My justification for such a classification was explained in Sect. 3.3. Gussenhoven (2004: 35) observed that “morphemes that consist only of tone [are] not uncommon in African languages.” Kikuyu, for example, has a modal morpheme that consists of a floating Low tone (ibid, p. 107). Hirst (1983) cited several examples of tonal morphemes in African languages:

Welmers (1959) cites the case of Jukun (Takun dialect, Eastern Nigeria) where “the replacement of any tone by high is a morpheme signaling the ‘hortative’ construction.” In Babete (West Cameroon), Hyman and Tadadjeu (1976) describe an “associative marker” which is realized as a “tone-raising on the prefix of the second noun.” A well documented example of such “floating tones” is to be found in Bambara (dialect of Bamako, Mali) where a floating low tone is the only phonetic manifestation of the definite determiner (cf. Bird, 1966; Bird, Hutchinson and Kante 1977). (Hirst 1983: 174)

Linguists typically compare the functions and meanings of grammatical tones to those expressed by segmental particles. The tonal morphemes cited by Hirst (1983) presumably all have segmental counterparts in their own language’s historical pasts, and they can additionally be compared semantically to the segmental morphemes in other languages that function as hortative markers, associative markers, and definite determiners, respectively. And the same is true of all the tones described below. Kalabari uses tones to express many morphological and syntactic categories (Harry and Hyman 2014, cited by Gussenhoven and Jacobs 2017: 158). The following illustrates the use of a rising LH tone to change a verb from transitive to intransitive:

(1)  Transitive               Intransitive
     dìmà “change”            dìmá “be changed”
     sá!kí “begin”            sàkí “begin”
     kíkímà “cover”           kìkìmá “be covered”
     pákìrí “answer”          pàkìrí “be answered”
     gbóló!má “mix up”        gbòlòmá “mix up”
     kán “demolish”           kàán “be demolished”
     (Gussenhoven and Jacobs 2017: 158)

Referring to the Kwa languages of West Africa, Aboh and Essegbey (2010: 3) gave the following examples of “grammatical morphemes [that] … are expressed with tones: Akan uses tone to distinguish between the habitual and the stative (e.g., dá ‘sleeps’ versus dà ‘in a lying posture’)”; Gungbe expresses progressive aspect with a low tone on the verb; Inland Ewe expresses progressive aspect with a rising tone on the verb; and Yoruba uses a low tone to express negation. In her book length treatment of tones in White Hmong, a language spoken by an ethnic group originating in Southwestern China and Southeast Asia, Ratliff (1992/2010, especially Chap. 3) reported a number of tones that have grammatical functions. For example, White Hmong uses “tone to signal membership in a word class” (p. 93). Two other examples are the first five numerals being marked with a high-level tone, and dual and plural pronouns being marked with a high-level tone and a high-falling tone, respectively. van Oostendorp (2005) reported the tonally marked distinction of gender in the Maasbracht dialect of Dutch. The examples in (2a) show that when a neuter tone is high, the feminine version of that word uses a falling tone. When the neuter tone is already falling, as shown in (2b), then the feminine tone is the same as the neuter tone. In all cases, a schwa is used for the masculine form. (2)

     neuter    feminine    masculine
a.   wíís      wíìs        wíìzə       “wise”
     dóúf      dóùf        dóùvə       “deaf”
     láám      láàm        láàmə       “lame”
b.   ká`lm     ká`lm       ká`lmə      “calm”
     kléèn     kléèn       kléènə      “small”
     (van Oostendorp 2005: 108)
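The tonal pattern in (2) can be restated as a small rule. The following is only a minimal sketch: the function name `gender_tones` and the labels "H" (high) and "HL" (falling) are hypothetical abstractions introduced for illustration, and the sketch models tone and the presence of the schwa only, not the segmental changes visible in the masculine forms.

```python
# Illustrative abstraction of the pattern in (2); names and tone labels
# are assumptions for this sketch, not van Oostendorp's own notation.
FALLING = "HL"
HIGH = "H"


def gender_tones(neuter_tone):
    """Map a neuter tone to (tone, has_schwa) for each gender.

    A high neuter tone corresponds to a falling feminine tone; a neuter
    tone that is already falling stays the same in the feminine. The
    masculine forms in (2) are falling and end in a schwa.
    """
    if neuter_tone not in (HIGH, FALLING):
        raise ValueError("expected 'H' or 'HL'")
    return {
        "neuter": (neuter_tone, False),
        "feminine": (FALLING, False),
        "masculine": (FALLING, True),
    }


# Group (2a): the neuter is high, so neuter and feminine differ in tone.
print(gender_tones(HIGH))
# Group (2b): the neuter is already falling, so the two forms coincide.
print(gender_tones(FALLING))
```

On this restatement, the neuter and feminine forms are distinguished by tone alone in group (2a) and are identical in group (2b), which is what the data show.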

Palancar and Léonard’s (2016b) edited volume on tonal morphemes is, as far as I know, the most comprehensive treatment of such tones to date. The contributors discuss a number of languages, with special focus given to the Oto-Manguean languages of Mexico. Palancar and Léonard (2016a) refer to what I call grammatical tones as inflectional tones or relational tones. I think the term inflectional tone can be misleading if it is used as a cover term for all grammatical tones because it implies that the morpheme is an affix. This is not necessarily the case because a tonal morpheme can be a floating tone whose associated morpheme occupies a different syntactic slot from the word upon which it is realized phonologically. There is too much in Palancar and Léonard’s (2016b) book to review here, but it is worth noting that it illustrates how complex and difficult the task is of describing the forms and functions of grammatical tones (see Konoshenko (2017) for a detailed discussion of the issues involved with special reference to the studies in Palancar and Léonard (2016b)).

Something that all these authors agree on is that “tonal morphology can do anything that non-tonal morphology can do” (Hyman 2016: 15). Hyman went beyond that, however, and argued that “tonal morphology can do things that non-tonal morphology cannot do” (ibid). There is no question that tonal morphology can do things phonologically that are not possible for segmental morphology because they can be floating tones that are phonologically associated with morphemes occupying different syntactic slots from their own. However, I believe this is all that tones do differently; it counters the hypothesis of this book to claim that tonal morphology can defy syntactic rules as Hyman seems to imply.

One example Hyman (2016) gave of a tone being able to do something beyond what is possible for a segmental morpheme relates to the tonal morphemes that mark tense in Kikuria, a Bantu language from Tanzania. A High tone is assigned to one of the first four morae of a verb stem. Depending on whether the tone is on the first, second, third, or fourth mora, it marks past, past progressive, future, and inceptive tense, respectively. Inceptive tense (i.e., “Someone is about to do something”) is an interesting case because many verbs do not have enough morae to realize this tone on the fourth mora. In such cases, the tone is realized on the object of the verb instead, as illustrated here:

(3)

a.  μ4  to-ra- [ karaaŋg-á   éɣétɔ́ɔ́kɛ   “we are about to fry a banana”
b.  μ4  to-ra- [ sukur-a     éɣétɔ́ɔ́kɛ   “we are about to rub a banana”
c.  μ4  to-ra- [ βun-a       eɣétɔ́ɔ́kɛ   “we are about to break a banana”
d.  μ4  to-ra- [ ry-a        eɣetɔ́ɔ́kɛ   “we are about to eat a banana”
    (Hyman 2016: 33)

The symbol μ4 indicates that the tone appears on the fourth mora. In (3a), the verb karaaŋg-a (“fry”) has four morae, so the tone appears on the final one, indicated with a tone diacritic and underlining. In (3b), the verb sukur-a (“rub”) only has three morae, so the tone appears on the first mora of the noun object éɣétɔ́ɔ́kɛ (“banana”). The verbs in (3c) and (3d) have only two morae and one mora, respectively, and the inceptive tense tone therefore shifts further toward the right on the noun’s morae. Hyman says this appears to violate a basic principle of canonical morphology stated by Corbett (2007, cited in Hyman 2016: 34), which is that “[m]orphs should stay on their own word!” (emphasis his). One could argue, however, that a floating tone can be realized phonologically on a word other than the one it is syntactically associated with or, in this case, that tense morphemes are at the verb phrase (VP) level, or even above the VP inside the tense phrase, and are realized on the fourth mora within the verb phrase. Phonologically, this is clearly something that cannot be done with a segmental particle, but it is unlikely to be a violation of morphosyntax.
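The mora counting described for (3) can be sketched procedurally. What follows is a rough illustration under stated assumptions rather than Hyman's analysis: the function `locate_high_tone`, the table `TENSE_TARGET_MORA`, and the division of each word into morae are all hypothetical and were introduced only for this example. The sketch locates the mora on which the tense-marking High tone first lands and does not model any other tonal process.

```python
# Illustrative sketch only; the mora segmentations below are assumptions
# made for this example, not an analysis taken from Hyman (2016).

# Target mora (1-based), counted from the left edge of the verb stem.
TENSE_TARGET_MORA = {
    "past": 1,
    "past progressive": 2,
    "future": 3,
    "inceptive": 4,
}


def locate_high_tone(tense, verb_morae, object_morae):
    """Return (host, index): the word hosting the High tone and the
    0-based index of the mora that carries it within that word."""
    target = TENSE_TARGET_MORA[tense]
    if target <= len(verb_morae):
        return ("verb", target - 1)
    # The verb stem is too short, so counting continues into the object.
    overflow = target - len(verb_morae)
    if overflow <= len(object_morae):
        return ("object", overflow - 1)
    raise ValueError("verb phrase has fewer morae than the tone requires")


BANANA = ["e", "ɣe", "tɔ", "ɔ", "kɛ"]  # toneless morae of the object noun
VERBS = {
    "fry": ["ka", "ra", "a", "ŋga"],   # four morae, as in (3a)
    "rub": ["su", "ku", "ra"],         # three morae, as in (3b)
    "break": ["βu", "na"],             # two morae, as in (3c)
    "eat": ["rya"],                    # one mora, as in (3d)
}

for gloss, morae in VERBS.items():
    print(gloss, locate_high_tone("inceptive", morae, BANANA))
```

Treating the verb and its object as one counting domain is the design choice that mirrors the VP-level account suggested above: the tone is placed by a single count over the verb phrase, so no special rule is needed for short verbs.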

A more difficult example to explain comes from Chimwiini, a Bantu language from Somalia, which distinguishes second and third person by a final High tone versus a penultimate High tone realized on the noun object (Kisseberth 2009, cited in Hyman 2016: 34). Distinguishing a second- versus third-person subject pronoun through the position of a tone on the noun object is interesting on its own, but it becomes even more interesting when we see that this still holds when the noun is embedded between two verb-modifying phrases within the VP. Presumably each tonal morpheme either occupies its own specific syntactic slot or is like an affix that attaches to another morpheme, but in this case, it is not easy to see precisely where these two tones are located in the structure, or why they are realized phonologically on the object. Putting aside the difficulty in describing the syntactic properties of these two tones, we know they function to mark a subject pronoun as being second or third person, and they are therefore clearly tonal morphemes that have segmental counterparts. Cheng and Kula (2006) discussed Bemba’s use of a relative marker in restrictive relative clauses that can optionally take the form of a Low tone, which they argued was a tonal morpheme that is functionally equivalent to its segmental counterpart. They gave the following example shown in (4). The noun phrase subject in (4a) does not include a modifying relative clause. In (4b), the noun phrase subject includes a relative clause that is marked with the preprefix á-. In (4c), the modifying relative clause is marked with a Low tone morpheme in place of á-, changing the subject marker from bá- to bà-.2 (4)

a.  ba-kafúndisha bá-léé-lolesha panse.
    2PFX-teacher 2SM-TNS-look 16outside
    “The teacher is looking outside.”
b.  ba-kafúndisha á-bá-léé-lolesha pansé ni ba-Mutale.
    2PFX-teacher 2REL-2SM-TNS-look 16outside COP 2PFX-Mutale
    “The teacher who is looking outside is Mr Mutale.”
c.  ba-kafúndishá bà-léé-lolesha pansé ni ba-Mutale.
    2PFX-teacher 2REL.2SM-TNS-look 16outside COP 2PFX-Mutale
    “The teacher who is looking outside is Mr Mutale.”
    (Cheng and Kula 2006: 34; note: numbers refer to agreement classes)

In addition to its own segmental counterpart, this floating tone can be compared with the segmental relative clause markers of other languages.

Another example of what appears to be segmental-to-tonal evolution comes from Svenonius and Kennedy (2006), who proposed that there is what they referred to as a null degree operator in the complementizer phrase of certain Northern Norwegian degree questions. Their analysis was based on the contrast between the following two sentences in Icelandic (5a) and Northern Norwegian (5b):

(5)
a.  Hvað ertu gammall?
    what are.you old
    “How old are you?”
b.  Er du gammel?
    are you old
    “Are you old?”
    “How old are you?”
    (Svenonius and Kennedy 2006: 134)

2  Some readers may notice from looking at (4c) that the final vowel of the head noun changes to a high tone when the relative clause is marked with a low tone. Cheng and Kula (2006) argued that this high tone on the noun’s final vowel is not what marks the relative clause. See Cheng and Kula (2006: 34 and 40ff) for details.

Svenonius and Kennedy argued that the Icelandic degree operator Hvað (“what”), which questions the degree of the adjective “old,” has a counterpart that exists at the front of the Northern Norwegian sentence in (5b). The difference, they claimed, is that this operator is phonologically null in Northern Norwegian. They argued that this operator originates inside the AdjP and moves into the complementizer phrase, just as its counterpart Hvað is assumed to do in Icelandic. The sentence in (5b) is interpreted in one of two ways depending on the intonation. When the prosodic peak is associated with the predicative adjective old, then this sentence is interpreted as a yes/no question, but when the peak is associated with a word further to the left, then it is interpreted as a degree question. They did not provide any detailed phonological analysis, but what they described implies that the degree operator is realized phonologically as a floating tone since it is disambiguated with intonation. In other words, the degree operator is not null, as they suggested, but rather is realized as a tonal morpheme. The final example I will give of a grammatical tone is a sentence-final tone that changes declaratives into interrogatives. These are not considered to be discourse tones because they type the clause without adding any semantic content. This differs from question tones, which are assumed to have discourse meanings and which are therefore discussed in the following section (Sect. 7.2.2 offers justification for distinguishing questions from interrogatives). Two languages that use an interrogative tone are Gungbe and French (Aboh and Pfau 2010). Regarding the former, note the contrast between the tones on the verbs in (6a) and (6b): (6)

a.  Sέtò kò wá.
    Seto already come
    “Seto arrived already.”
b.  Sέtò kò wȃ?
    Seto already come.INTER
    “Has Seto arrived yet?”
    (Aboh and Pfau 2010: 92)

In (6b), a sentence-final low tone combines with the high tone of the verb wá, changing it to wȃ and types the clause as an interrogative. This sentence-final interrogative tone is comparable to the sentence-final interrogative particles used in other languages (e.g., Mandarin ma, Polish czy, Turkish mi, Bengali ki, etc.). Dryer (2013) listed the forms of polar interrogatives in 955 languages, 585 of which use a segmental particle and 173 of which use what he called “interrogative intonation.” He said he restricted his list to “unbiased questions” as opposed to “leading questions,” so to the extent that he did, they are examples of grammatical particles, either tonal or segmental. Any that are question particles rather than interrogative particles would belong in the next section, and it is likely that some of those he listed actually are question particles, because the literature has frequently used the terms question and interrogative interchangeably (see Sect. 7.2.2) and Dryer’s data came primarily from secondary sources. Claiming that the tones reported in this section are morphemes is not very controversial, but there is much less agreement about the status of the tones reported in the following section.
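To put the counts cited from Dryer (2013) in proportion, the shares can be worked out directly. This is only an arithmetic sketch: the function name `strategy_shares` and the label "other strategies" are hypothetical, and the last category is simply whatever remains once the two counts given in the text are subtracted from the total.

```python
# Counts as cited in the text for Dryer (2013).
TOTAL_LANGUAGES = 955
COUNTS = {
    "segmental particle": 585,
    "interrogative intonation": 173,
}


def strategy_shares(total, counts):
    """Return each strategy's share of the sample as a fraction."""
    shares = {name: n / total for name, n in counts.items()}
    # Hypothetical catch-all: every language not covered by the two
    # counts above, whatever strategy it uses.
    remainder = total - sum(counts.values())
    shares["other strategies"] = remainder / total
    return shares


for name, share in strategy_shares(TOTAL_LANGUAGES, COUNTS).items():
    print(f"{name}: {share:.1%}")
```

On these figures, roughly 61% of the sampled languages use a segmental particle and roughly 18% use interrogative intonation, with the remaining 197 languages (about 21%) falling outside these two categories.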

4.2  Tonal Discourse Particles and Their Segmental Counterparts

This section reviews the literature on meaningful discourse tones, which typically have semantic scope over the entire sentence and which most linguists have not analyzed as tonal morphemes on a par with those reported in the previous section. Compared with the tonal morphemes described in the previous section, many discourse tones do not have obvious segmental counterparts. The grammatical functions of grammar tones are considered to exist in most if not all languages. They mark things such as tense, aspect, gender, person, case, definiteness, relative clauses, polar interrogatives, and so on. Because of this, comparisons to segmental particles are considered valid, even when those particles belong to unrelated languages. In contrast, the meanings of many discourse tones appear to be language specific. This, combined with the fact that both their forms and meanings are harder to describe than those of grammatical tones, makes it more difficult to prove that they are tonal morphemes.

I will begin with the less controversial cases. These involve tones that express universal meanings. The tones that express emphatic and contrastive meaning are among the easiest discourse-related forms of intonation to identify and define, allowing linguists to contrast the forms used to express these meanings cross-linguistically. Hirst (1983) proposed that English’s emphatic and contrastive intonation are tonal morphemes, or what he called floating tones, and he supported this analysis with the examples shown in (7) from Bambara, which uses a contrastive particle de that follows the element the speaker wishes to contrast. His logic was the same as what I am proposing here: if a tone has a segmental counterpart in another language, we can conclude that the tone is a morpheme.

(7)
a.  Muso fila bé John fe.       “John has two wives.”
b.  Muso fila de bé John fe.    “John has TWO wives.”
c.  Muso fila bé John de fe.    “JOHN has two wives.”
    (Hirst 1983: 179–80, no glosses provided)

Oshima (2005) attempted to provide a uniform semantic analysis of what he referred to as contrastive focus particles in Japanese (wa) and Korean (nun) on the one hand, and the contrastive contour of English on the other. His tentative conclusion was that the semantics of contrastive contours, as proposed by Büring (2003, cited in Oshima 2005: 371), cannot be fully applied to contrastive focus particles. This conclusion does not counter the hypothesis proposed here because, as explained above, the segmental counterpart of a tone need not be an exact equivalent in order to be evidence of its morphemic status; the fact that it consistently translates as the same form is sufficient evidence on its own. Having said that, it is possible that Oshima did not choose the closest counterparts among these languages; it appears that he was comparing contrastive focus in English with topic-focus in Japanese and Korean.

English’s emphatic intonation can also be compared with segmental particles in other languages. For example, Mandarin marks emphasis with the emphatic marker shi (Shi 1994), and Colombian Spanish does so with the emphatic marker es (Curnow and Travis 2003), both of which precede the element that they mark. Even if we do not compare emphatic intonation with segmental particles, we can see that the tonal expression of emphasis is language specific and that it therefore behaves like a morpheme; it is not merely an exaggerated form of already existing word stress patterns. An example demonstrating that its tonal form is language specific was given by Carlos Gussenhoven (2004):

[i]n the Zagreb variety of Serbocroat, F0 peaks in initially stressed words occurred in the second syllable when they were spoken in a sentence with neutral intonation, but in the first syllable when the focus is on the word in question .… This pattern is the opposite of that found for Hamburg German. (Gussenhoven 2004: 93)

Moving on to tones with more complex discourse meanings, Da Mota and Herment (2016) conducted a qualitative contrastive analysis of Canadian English’s sentence-final eh and sentence-final rising tones, which they knew had been increasing in use in North America, including Canada. Based on their analysis, they concluded that the functions of eh and a sentence-final rise are so similar that they are usually interchangeable. Interestingly, they had originally planned to analyze a corpus of data based on recordings of natural conversations that they collected from speakers aged 20–22 years, but virtually no tokens of eh appeared in their data. They therefore had to complement this corpus with another one consisting of humorous shows. They concluded that their combined data showed evidence of the interchangeability of these two grammatical items. They further concluded that the fact that they have similar functions “would explain why eh tends not to appear in the speech of young people, who make great use of [high-rising terminals]” (pp. 878–879).

In other words, they believe young Canadians are replacing eh with sentence-final rises. This would only be possible if the tone and the segmental particle of this pair were functionally equivalent, which is evidence that they are grammatically the same.

Sadat-Tehrani (2017: 21) reported an intonation pattern in Persian that “adds a contradictory or alternative comment to the previous discourse.” The way that this tone is realized indicates that it is a tonal morpheme. As Sadat-Tehrani describes it,

This nuclear pitch accent is on a final verbal element in the utterance, and crucially, this final element is not associated with any pitch accent in the normal declarative reading of the same utterance, and it is only in this structure that it becomes nuclear-accented. Thus, the nuclear pitch accent behaves here as an intonational morpheme, but one that is bound to its location in the utterance. (Sadat-Tehrani 2017: 1, emphasis in italics mine)

Armstrong (2015), who looked at three question tone contours in Puerto Rican Spanish, explicitly argued that discourse intonation is comparable with sentence-final particles (SFPs) and that studying SFPs can offer insights into the nature of intonational meanings. She rightly pointed out that studies on the intonation of questions normally define the distinction between different tones based on overly simplistic dichotomies such as information-seeking versus confirmation-seeking or neutral versus biased. This is actually only a distinction between a neutral, unbiased question tone (e.g., example (6) or a French equivalent) on the one hand, and any biased question tone on the other. This lumps all biased question tones together as if they are the same (see Chap. 6 for a discussion of two distinct English rising question tones). The result is that few linguists have recognized a distinction among different biased question contours with unique shapes and meanings. Recall that in Sect. 3.1.3, it was pointed out that authors have often compared the form of the rising uptalk tone with a question tone, implying that there is only one form of a question tone in English with one meaning.

Prieto and Roseano (2016) used discourse completion tasks to collect utterances that were controlled for the expression of specific epistemic biases. The two groups of participants were native speakers of two Romance languages: Central Catalan and Northern Friulian. In addition to lexical words, which are a resource in all languages, Central Catalan primarily uses intonation to express speaker stance, while Northern Friulian uses a mix of intonation and modal particles. This study therefore not only produced tone/particle contrasts between two languages’ expression of certain epistemic meanings but also tested the production of these meanings within a single language that uses both forms.
They concluded their study showed “that intonation closely parallels the function of pragmatic markers in their encoding of speaker commitment operators” (p. 891). Further evidence of intonational meaning being like segmental meaning was evidenced by the fact that “both languages place restrictions on how epistemic interjections and intonation are paired” (ibid). The precise meanings of the tones that they investigated will probably require further investigation, but this restriction on how tone and segmental meanings are paired is evidence that tonal meanings and segmental meanings equally contribute specific semantic content to the sentence. When these meanings are in conflict, they cannot co-occur, and when they are in agreement, or work to reinforce each other, they can co-occur. Regarding the latter case, Prieto and Roseano (2016: 891) observed that Catalan “makes extensive use of lexical epistemic markers in combination with intonation patterns.”

A study conducted by Zhang (2014) exploited Cantonese’s use of both SFPs and intonation. Based on her careful phonetic analysis, she demonstrated that Cantonese sentence-final intonation exists in complementary distribution with SFPs, providing empirical evidence that strongly indicates the two are of the same category.

There are at least two pioneering studies that can be seen as precursors to my own research on intonation. Both studies used translations to match specific English tones to specific discourse particles. The first study was done by Chao (1932), who contrasted English intonation with Mandarin particles, based on the assumption that “the speech element in Chinese which may be equated to English intonation is the use of grammatical particles” (p. 115). For his study, he translated written lines from stage plays that included intonational markings and translated written sentences from the literature on English intonation. Chao (1932) did not say he was specifically looking for any one-to-one correspondences between Chinese particles and forms of English intonation. The closest that he came to indicating a one-to-one correspondence was seen in the fact that five of his six translations of “statements with a reassuring implication” were all rendered as an optional falling tone on the sentence-final particle a. He showed English statements of this type as having an intonational phrase that starts high and ends with a rise. One example using his notation is “That’s ̅ all right ,” which was translated into Mandarin as Bu yaojin ( ) a (NEG important PRT).
These translations are not a strong enough piece of evidence to claim that this form of English intonation has the same meaning as this particle in Mandarin (or this particle plus a falling tone). His definition for the meaning of the English intonation was quite vague (i.e., “statements with a reassuring implication”), and he did not translate this meaning consistently into Chinese as the same form—the tone on the particle is shown as optional, and one of his six translations did not use this particle. Nevertheless, Chao’s (1932) seminal study used an insightful approach that I think can work toward resolving the empirical question of whether or not intonational functions and meanings can be directly matched to the meanings of particles. The second pioneering study comes from Schubiger (1965), who equated the German modal particles doch and eben to forms of English intonation. Like Chao (1932), she used written translations, some of which were her own and some of which came from professional translations of German and English works of literature. Using written language to study intonation no doubt reduces the validity of the research. Austin (1975: 74) pointed out the inherent problem with using such sources, saying that intonation is “not reproducible readily in written language .... Punctuation, italics, and word order may help, but they are rather crude.” Schubiger offered a definition for a connotative meaning and proposed that it defines two different German modal particles (doch and eben) and six different pitch contour shapes in English. Her definition was as follows: “rejoinders with the connotation ‘by the way you talk (or act) one would think you didn’t know’” (p. 68). This is a


speaker-oriented definition, which I think is the correct approach, but the fact that her definition was proposed to describe so many different forms means either that it is not precise enough or that she was attributing its meaning to forms that do not actually have that meaning.

Pennington and Ellis (2000: 376) listed what they called “English equivalents of some Cantonese utterance particles.” As with Chao (1932), Pennington and Ellis’s use of the word “equivalents” implied that they believed Chinese discourse particles to be equivalent to English intonation. Going even further, they argued that specific forms of intonation are equivalent to specific particles. Unfortunately, however, their list of so-called equivalents was only included as an aside in their paper and was not accompanied by any explanation as to how they determined the equivalence between any of the tone/particle pairs on their list.

4.3  Concluding Remarks

This chapter reviewed how linguists have shown that tones are functionally equivalent to segmental particles. Tones were divided into two types: grammatical tones that do not express discourse meanings and discourse tones that link a sentence to the discourse. Demonstrating grammatical tones to be like segmental particles is uncontroversial and relatively easy because many of their functions are universal and comparatively easy to describe. In this sense, the discourse tones that express focus, contrast, emphasis, and so on are similar to grammatical tones because they are universal and comparatively easy to describe, which makes them easy to contrast with segmental particles that express those same meanings. In contrast, many of the tones that express more abstract discourse meanings are much more difficult to link directly to segmental particles. This is not only because they are harder to define but also because many do not appear to have easily recognizable equivalents in other languages. Nevertheless, I suggest that the research to date has provided strong evidence in favor of the hypothesis that discourse tones are morphemes to the same extent that grammatical tones are morphemes.

My own research in this area is a follow-up to that of Chao (1932) and Schubiger (1965). I borrow their idea of using translations, but use a more rigorous method. Like Pennington and Ellis (2000), I aimed to match specific Cantonese SFPs with specific forms of English intonation, and again, I use a more rigorous method than they did.³ As far as I know, my research provides the strongest evidence to date that a particular discourse tone has the context-independent discourse meaning ascribed to it. Chapter 5 explains my methodology, and the results are then presented in Chap. 6.

3  Though in this case, I can only presume that my method was more rigorous, because Pennington and Ellis (2000) did not say what their method was.


References

Aboh, E. O., & Essegbey, J. (2010). The phonology syntax interface. In E. O. Aboh & J. Essegbey (Eds.), Topics in Kwa syntax (pp. 1–10). New York: Springer.
Aboh, E. O., & Pfau, R. (2010). What’s a wh-word got to do with it? In P. Benincà & N. Munaro (Eds.), Mapping the left periphery (pp. 91–124). Oxford: Oxford University Press.
Armstrong, M. E. (2015). Accounting for intonational form and function in Puerto Rican Spanish polar questions. Probus: International Journal of Romance Languages, 29(1), 1–40.
Austin, J. L. (1975). In J. O. Urmson & M. Sbisa (Eds.), How to do things with words: The William James lectures delivered at Harvard University in 1955 (2nd ed.). Oxford: Clarendon Press.
Chao, Y. R. (1932). A preliminary study of English intonation (with American variants) and its Chinese equivalents. Bulletin of the Institute of History and Philology of the Academia Sinica (The Ts’ai Yuan P’ei Anniversary volume: Supplementary volume I), 104–156.
Cheng, L., & Kula, N. C. (2006). Syntactic and phonological phrasing in Bemba relatives. ZAS Papers in Linguistics, 43, 31–54.
Curnow, T. J., & Travis, C. E. (2003). The emphatic Es construction of Colombian Spanish. Proceedings of the 2003 conference of the Australian linguistic society. Presented at the 2003 conference of the Australian Linguistic Society, University of Newcastle.
Da Mota, C. R., & Herment, S. (2016). The pragmatic functions of the final particle eh and of high rising terminals in Canadian English: Quite similar, eh! In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016 (pp. 878–882). Boston: Boston University.
Dryer, M. S. (2013). Polar questions. In M. S. Dryer & M. Haspelmath (Eds.), The World Atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.
Gussenhoven, C., & Jacobs, H. (2017). Understanding phonology (4th ed.). Routledge.
Hirst, D. (1983). Interpreting intonation: A modular approach. Journal of Semantics, 2(2), 171–182.
Hyman, L. M. (2016). Morphological tonal assignments in conflict: Who wins? In E. L. Palancar & J. L. Léonard (Eds.), Tone and inflection: New facts and new perspectives (pp. 15–38). Berlin: De Gruyter Mouton.
Konoshenko, M. B. (2017). Tone in grammar: What we already know and what we still don’t. Voprosy Jazykoznanija, 4, 101–114.
Oshima, D. (2005). Morphological vs. phonological contrastive topic marking. Proceedings from the Annual Meeting of the Chicago Linguistic Society, 41(1), 371–384.
Palancar, E. L., & Léonard, J. L. (2016a). Tone and inflection: An introduction. In E. L. Palancar & J. L. Léonard (Eds.), Tone and inflection: New facts and new perspectives (pp. 1–11). Berlin: De Gruyter Mouton.
Palancar, E. L., & Léonard, J. L. (Eds.). (2016b). Tone and inflection: New facts and new perspectives. Berlin: De Gruyter Mouton.
Pennington, M. C., & Ellis, N. C. (2000). Cantonese speakers’ memory for English sentences with prosodic cues. The Modern Language Journal, 84(3), 372–389.
Prieto, P., & Roseano, P. (2016). The encoding of epistemic operations in two Romance languages: Intonation and pragmatic markers. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016. Boston: Boston University.
Ratliff, M. (2010). Meaningful tone: A study of tonal morphology in compounds, form classes, and expressive phrases in White Hmong. DeKalb: Northern Illinois University Press. (Original work published in 1992).
Sadat-Tehrani, N. (2017). Expressing concession by means of nuclear pitch accent. California Linguistic Notes, 41(1), 1–25.
Schubiger, M. (1965). English intonation and German modal particles: A comparative study. Phonetica, 12(2), 65–84.
Shi, D. (1994). The nature of Chinese emphatic sentences. Journal of East Asian Linguistics, 3(1), 81–100.
Svenonius, P., & Kennedy, C. (2006). Northern Norwegian degree questions and the syntax of measurement. In M. Frascarelli (Ed.), Phases of interpretation (pp. 133–161). Berlin: Mouton de Gruyter.
van Oostendorp, M. (2005). Expressing inflection tonally. Catalan Journal of Linguistics, 4, 107–126.
Zhang, L. (2014). Segmentless sentence-final particles in Cantonese: An experimental study. Studies in Chinese Linguistics, 35(2), 47–60.

Chapter 5

Evidence via Cantonese

Chapter 3 explained that Hong Kong Cantonese has a large number of sentence-final particles (SFPs) that express meanings comparable to those expressed by discourse intonation in English. It was further argued that intonation and segmental particles are two forms of the same thing. Chapter 4 then reviewed a number of studies that compared discourse particles in one language to forms of discourse intonation in the same or another language. This chapter will now describe the methodology I used to match some Cantonese SFPs to specific forms of English intonation. Before doing so, some basic information about the Cantonese language is provided, explaining why it is ideal for conducting this kind of research.

5.1  The Cantonese Language

Among the Chinese languages, Cantonese (or Yue Chinese) is the second most widely spoken variety after Mandarin.¹ Cantonese and Mandarin are mutually unintelligible, the difference between them being comparable to the difference between French and Italian (Bauer and Benedict 1997). Cantonese and other Chinese varieties are often referred to as dialects of Chinese for sociopolitical reasons, but it is worth keeping in mind that this is comparable to saying that French and Italian are dialects of Romance. The majority of Cantonese speakers are in southeast China, but many others have migrated around the world. In total, there are about 73 million Cantonese speakers according to Ethnologue (Eberhard et al. 2019). The research reported here relates to the variety of Cantonese spoken in Hong Kong, where, at the time of the Hong Kong 2016 By-census (Hong Kong Census and Statistics Department 2017: Tables A104, A107, A111), the ethnic Chinese population numbered 6.75 million, approximately 90% of whom are Cantonese speakers. This is less than one tenth of the total number of people who speak some variety of Cantonese throughout the world, but Hong Kong Cantonese is normally considered the standard variety, and the varieties spoken in Guangzhou, Macao, and their surrounding areas are very similar. Many speakers in overseas Chinese communities around the world also speak Hong Kong Cantonese or a similar variety. The meanings given to the Hong Kong Cantonese SFPs in Chap. 6 may or may not apply to other varieties of Cantonese.

1  See Bauer and Wakefield (2019) for a detailed introduction to the Cantonese language.

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_5
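The proportion stated in Sect. 5.1 ("less than one tenth") can be checked with a few lines of arithmetic. The sketch below uses only the rounded figures cited in the text; the variable names are illustrative and not part of the original study.

```python
# Rounded figures as cited in the text; names are illustrative only.
HK_ETHNIC_CHINESE = 6.75e6   # ethnic Chinese population, Hong Kong 2016 By-census
CANTONESE_SHARE = 0.90       # "approximately 90%" are Cantonese speakers
WORLD_SPEAKERS = 73e6        # Ethnologue estimate of all Cantonese speakers

# Estimated number of Cantonese speakers in Hong Kong
hk_speakers = HK_ETHNIC_CHINESE * CANTONESE_SHARE

# Their share of all Cantonese speakers worldwide
fraction_of_world = hk_speakers / WORLD_SPEAKERS

print(f"Hong Kong Cantonese speakers: about {hk_speakers / 1e6:.1f} million")
print(f"Share of all Cantonese speakers: about {fraction_of_world:.1%}")
```

With these rounded inputs the share comes out a little above 8%, which is consistent with the statement that Hong Kong speakers make up less than one tenth of the total.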

5.1.1  Why Cantonese Is Ideal for This Kind of Research

If discourse intonation and discourse particles really are two forms of the same thing, then it is reasonable to predict that at least some segmental discourse particles have intonational counterparts with the same (or very similar) meanings. If so, one could further predict that these segmental particles could be exploited to discover the forms and meanings of those intonational counterparts. While it seems unlikely that all discourse particles and all forms of intonation have semantically equivalent counterparts in other languages, it seems equally unlikely that none of them do. Therefore, it is worthwhile to make attempts to discover tone/particle matches. Any matches that can be found will provide further empirical evidence in support of the hypothesis that tones and particles belong to the same functional category. And the more valid the matches are seen to be, the greater the support for this hypothesis.

Cantonese is especially well suited for this kind of research for two reasons, the first of which is that it lies at or near the extreme end of the spectrum of languages that use discourse particles more than intonation for expressing connotative meaning (see Sect. 3.2). Yau (1980) said that Cantonese and English represent the two extremes of the SFP-intonation continuum, and Luke (1990) and Leung (1992/2005) supported this claim, saying that, as far as they knew, Cantonese has more SFPs than any other language studied thus far. If true, then English lies at one end of the continuum along with all the languages that have very few SFPs,² while Cantonese lies at the other end with perhaps the largest number of SFPs; the number cited varies in the literature, but all authors seem to agree there are more than thirty.

The second reason Cantonese is especially suited for this kind of research relates to Hong Kong’s 150-year history as a British colony up until 1997.
This has resulted in a large degree of exposure to, and use of, both English and Cantonese in the daily lives of Hong Kong speakers, particularly those who have attended English-medium international schools, a small number of whom have acquired each language to a native level, meaning they are native bilinguals (or ambilinguals) with two first languages (L1s). The existence of such speakers allowed me to find participants suitable for this research (see Sect. 5.2.1), enabling me to elicit translations of the connotative meanings expressed by Cantonese SFPs into English, which expresses those same (or very similar) meanings in the form of intonation. In short, Hong Kong Cantonese is an extreme example of a language that can be exploited as a window through which to study English intonation, and thereby test the hypothesis that intonation comprises morphemes.

2  As explained in Sect. 3.2, English has sentence-final elements that are comparable with Cantonese SFPs. For example, there is the sentence-final question particle “huh?” and the well-documented Canadian particle “eh.” We therefore should not say that English has no sentence-final particles.

5.1.2  Intonation in Cantonese

So as not to mislead readers, it must be pointed out that although the lexical-tone system of Cantonese severely limits its use of intonation, it is not entirely devoid of it. Many authors have described aspects of Cantonese prosody and intonation. As is the case in all languages, intonation functions in Cantonese to express discourse-related meanings, but compared with English, this function is severely restricted (see Sect. 3.2). As far as I know, the literature on Cantonese discourse intonation talks about only two forms of discourse intonation: focus stress, which is marked by vowel lengthening (Bauer et al. 2004), and a rising question tone (Kwok 1984; Cheung 1986; Wu 1989; Ma et al. 2006; Fox et al. 2008). The fact that Cantonese has discourse intonation is obviously a potential complicating factor, but I do not believe it had any effect on the data, especially considering Zhang’s (2014) study, which demonstrated that Cantonese SFPs and intonation exist in complementary distribution.

Some authors have discussed suprasegmentals in Cantonese that they referred to as forms of intonation, but these do not fall under the definition of intonation used here. One example of this is the declination effect that has been observed in Cantonese declarative statements (Bauer and Benedict 1997; Bauer et al. 2004; Fox et al. 2008). A decline in pitch results from the relatively smaller amount of air in the lungs at the end of an utterance than at the beginning. This is a nonlinguistic physiological effect that is a universal phenomenon observed in numerous languages; it is therefore not considered relevant here and is assumed not to have had any effect on the data. Cheung (1986) used the term intonation to describe the raising or lowering of the pitch key, or the narrowing or widening of the pitch range, in Cantonese.
I consider these manipulations of pitch to be paralinguistic expressions of emotions; presumably, they have very similar forms and functions across languages (see Sect. 2.3). These forms of prosody are often used by Cantonese speakers to express the sorts of emotive meanings that such pitch changes carry. The affective meanings expressed by these types of prosody are not expressed through SFPs or intonation and therefore are not relevant to the aims of the research. However, it is worth noting that the emotive meanings expressed by changes in pitch key and range, as well as those expressed through voice qualities, have been attributed to SFPs, just as they have been attributed to discourse intonation in English. But it is much easier to tease these apart from SFPs than from intonation because their suprasegmental forms are more easily isolated from the segmental forms of SFPs; this relative ease of separability from nonlinguistic prosody is one of the key advantages that SFPs offer over intonation. Nevertheless, it must be acknowledged that some emotion-expressing suprasegmentals could potentially have been translated from Cantonese to English. This was presumably controlled for in two ways. First, each particle was translated multiple times in different contexts, and second, my analysis focuses on the translations of pitch contours, and not on changes in pitch range or key.

The most widely discussed form of Cantonese intonation is the rising tone used to form declarative questions. Wu (1989) demonstrated that rising declaratives can occur in sentences that end with a syllable having any of the six lexical tones of Hong Kong Cantonese. For example, in rising declaratives that end with a syllable having a high-rising lexical tone (i.e., tone 2), the syllable rises to a higher ending point than normal, and in rising declaratives that end with a syllable having the high-level tone 1, the final syllable is produced with a higher than normal pitch level. Both Wu (1989) and Fox et al. (2008) observed that the pitch level across the entire sentence also appears to be higher in rising declarative questions than in nonrising declarative statements. The fact that Cantonese has rising declaratives is relevant. Based on the arguments put forth here about discourse intonation, it means that Cantonese, in addition to having a rime-lengthening emphatic/contrastive morpheme, also has at least one discourse morpheme in the form of a rising tone. Significantly, the Cantonese rising question tone (or possibly tones) expresses a meaning related to the question-forming SFPs of this study: me1 and aa4. Ideally, it would be good to discover the English equivalent(s) of the Cantonese rising declarative morpheme(s) and to compare it to the English equivalents of me1 and aa4. However, there are at least two complicating factors involved with this.
One is the fact that rising declaratives are not marked in the Cantonese corpus I used, so it is not possible to conduct computer searches for samples of them. Another, more serious, complicating factor is that rising declarative morphemes have abstract suprasegmental forms in both languages, so we cannot automatically assume that Cantonese has only a single rising declarative form. These complicating factors can be worked around, but this is left for future research. It should be noted that the existence of a rising question tone in Cantonese did not interfere with the study. None of the SFPs included in this research are compatible with a rising question tone in Cantonese, and therefore it is certain that no rising question tones were present in any of the extracts from the corpus that were used for the translations.

5.1.3  Cantonese Sentence-final Particles

The sentences in (1) through (6) illustrate the use of SFPs in Cantonese. Sentence (1) does not include an SFP, while sentences (2) through (6) each include a different SFP. Other than the SFPs, the sentences are exactly the same, which means that the pairing of any two of these linguistic examples forms a minimal pair at the sentence level; any differences in meaning can be attributed entirely to the SFPs themselves. This simple illustration demonstrates the tremendous advantage that segmental particles offer over intonation for investigating discourse meanings.

(1)  Keoi5  hai2  Waan1zai2  faan1   gung1.
     3s     at    Wanchai    return  work
     “S/he works in Wanchai.”

(2)  Keoi5  hai2  Waan1zai2  faan1   gung1  wo3.
     3s     at    Wanchai    return  work   SFP
     “Oh, so s/he works in Wanchai.”
     (This SFP indicates noteworthiness or sudden realization of the proposition.)

(3)  Keoi5  hai2  Waan1zai2  faan1   gung1  me1?
     3s     at    Wanchai    return  work   SFP
     “S/he works in Wanchai?!”
     (This SFP indicates surprise, doubt, or disbelief in the proposition.)

(4)  Keoi5  hai2  Waan1zai2  faan1   gung1  gwaa3.
     3s     at    Wanchai    return  work   SFP
     “I guess/think maybe s/he works in Wanchai.”
     (This SFP indicates a lack of commitment to the proposition.)

(5)  Keoi5  hai2  Waan1zai2  faan1   gung1  zaa3.
     3s     at    Wanchai    return  work   SFP
     “S/he only works in Wanchai.”
     (This SFP adds the meaning “only,” placing its focus on the entire verb phrase, the verb, or the object of the preposition. Which among these is intended by the speaker must be determined pragmatically.)

(6)  Keoi5  hai2  Waan1zai2  faan1   gung1  wo5.
     3s     at    Wanchai    return  work   SFP
     “Someone said S/he works in Wanchai.”
     (This SFP marks the clause as reported speech that was not uttered by the speaker.)
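The claim that any two of sentences (1) through (6) form a minimal pair at the sentence level can be checked mechanically. The following is a minimal sketch, assuming whitespace-separated romanized tokens with final punctuation removed; the function names are hypothetical and are not drawn from the study itself.

```python
# Romanized examples (1)-(6) from the text, with final punctuation removed.
SENTENCES = {
    1: "Keoi5 hai2 Waan1zai2 faan1 gung1",
    2: "Keoi5 hai2 Waan1zai2 faan1 gung1 wo3",
    3: "Keoi5 hai2 Waan1zai2 faan1 gung1 me1",
    4: "Keoi5 hai2 Waan1zai2 faan1 gung1 gwaa3",
    5: "Keoi5 hai2 Waan1zai2 faan1 gung1 zaa3",
    6: "Keoi5 hai2 Waan1zai2 faan1 gung1 wo5",
}

# The particle-less sentence (1) serves as the shared core.
BASE = SENTENCES[1].split()


def split_particle(sentence):
    """Return (core_tokens, particle); particle is None if no SFP follows the core."""
    tokens = sentence.split()
    if tokens[:-1] == BASE:
        return tokens[:-1], tokens[-1]
    return tokens, None


def final_particle(sentence):
    """Return the sentence-final particle, or None for the particle-less sentence."""
    return split_particle(sentence)[1]


def is_minimal_pair(a, b):
    """True if two sentences share the same core and differ only in their SFP slot."""
    core_a, sfp_a = split_particle(a)
    core_b, sfp_b = split_particle(b)
    return core_a == core_b and sfp_a != sfp_b


for number, sentence in SENTENCES.items():
    print(number, final_particle(sentence))
```

This check operates only on the string forms of the examples. It says nothing about whether two sentences are interchangeable within the same discourse context, which is a separate question.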

The fact that (1) is grammatical demonstrates that SFPs are an optional component of a sentence, but a sentence with no SFP will in many cases sound abrupt or unnatural (sometimes even to the point of sounding unacceptable), and SFPs are of course necessary for conveying the connotative meanings that they express. The fact that (2) through (6) are all grammatical appears to indicate that SFPs are often interchangeable with one another, but this is misleading. Most SFPs cannot be used to initiate a conversation because they must be preceded by a specific type of context. Most SFPs are therefore not interchangeable within the same discourse context, and this is the case for the SFPs in (2) through (6). Consider sentences (3) and (4), for example. The sentence in (3) that uses the question SFP me1 must be used in a context where the contents of its proposition have just been revealed to the speaker, either verbally or pragmatically, and the speaker up to this point has believed a negative version of the proposition, which in this case is that the person being referred to does not work in Wanchai. Sentence (4), in contrast, must follow a request for information related to the proposition, which in this case relates to where the person works, and the speaker does not have a strong commitment about it, i.e., the speaker believes the person may work in Wanchai but is not sure. This demonstrates not only that SFPs’ meanings differ from one another but also that they


include speakers’ stances and are linked to the surrounding discourse context in specific ways. This means that (2) through (6) are minimal pairs at the sentence level, but not at the discourse level. Accurate definitions of SFPs must account for these facts. As already stated, we should not expect every SFP to have an English intonational equivalent. Consider the English translations of sentences (2) to (6), which are repeated below as (2′) to (6′). Based on my intuition, discussions with native speakers, and in some cases, the results of the research, I have concluded that some of the SFPs in these sentences translate into something other than (or in addition to) intonation. The SFPs in (2′) and (3′) have what is arguably a one-to-one correspondence with a form of English intonation, and the SFP in (5′) translates into English as the word “only” plus focus stress, but the other SFPs do not appear to have any English intonational equivalents:

(2′)  “S/he works in Wanchai.” (the SFP wo3)
      (Preliminary research results indicate that English intonation can be used to indicate noteworthiness, or sudden realization, with a lengthened rise-fall tone on the noun “Wanchai.”)

(3′)  “S/he works in Wanchai?!” (the SFP me1)
      (There is strong evidence that a high-rising tone can be used to indicate surprise, doubt, or disbelief.)

(4′)  “I guess/think maybe s/he works in Wanchai.” (the SFP gwaa3)
      (Other than using the adverb “maybe” and placing the proposition inside an embedded clause as the complement of “guess” or “think,” there does not seem to be any form of English intonation that expresses the speaker’s low commitment to the proposition in the way that gwaa3 does.)

(5′)  “S/he only works in Wanchai.” (the SFP zaa3)
      (The word “only” is used along with stress on the verb “works,” or on “Wanchai,” depending on the speaker’s intended focus.)

(6′)  “(Someone said) S/he works in Wanchai.” (the SFP wo5)
      (Other than placing the proposition inside an embedded clause as the complement of “said,” there does not seem to be any form of intonation that can express wo5’s meaning, which a colleague and I argued is used to distance the speaker from the proposition (p), and which we defined as: “someone said p; I did not say p” (Wakefield and Lee 2018).)

To the extent that (2′) through (6′) are described accurately, it shows that some but not all SFPs have English intonational counterparts. Some SFPs translate partially or entirely as intonation, while others appear to have no English intonational equivalent. A question that remains is whether or not an SFP (or portion of an SFP) that translates into some form of English intonation is perfectly equivalent to that intonational form with regard to its function and meaning, or if the functions and meanings of the two are only very similar. As argued in Chap. 4, the answer to this question does not affect the validity of this research; so long as a morpheme in one language consistently translates as a specific form into another language, this is evidence that this specific form is a morpheme in that other language. In the case of discourse meanings, it is possible that there are some exact equivalents cross-linguistically. Yau (1980: 51) wondered if perhaps there are “common connotative concepts that will be handled either by means of [SFPs] or by intonation pattern variations in all languages.” Most words have culture-specific meanings, but many SFPs and forms of intonation seem to have common functions and meanings (i.e., what Yau called “common connotative concepts”) that are used to facilitate communication in most, if not all, languages. Speakers use them to situate propositions and ideas into the discourse in various ways and to express their stances and beliefs about them. I believe that the SFP lo1, for example, which expresses listener-oriented epistemic modality, is a possible candidate for expressing a meaning that is expressed in all languages (see Sect. 6.1.1).

I am not the first to propose that some SFPs may have English intonational equivalents. Chan (2001), for example, showed two questions that formed a minimal pair, with one using the interrogative SFP maa3 and the other using the question particle me1. She said that one possible way to convey the difference between the two in English would be for the me1-suffixed sentence to be rendered as an “echo question,” which is the term she used for a rising declarative (p. 59). Another example came from Yip and Matthews (2001), who said that the SFP zek1 can be rendered in English as coy intonation. Describing the SFP zek1’s English equivalent as “coy intonation” does not make its form very clear, even to native English speakers, but this claim is again based on the assumption that an SFP has an English intonational counterpart. The best example comes from Baker and Ho (2006: 40), who said that the way to express the meaning of the question SFP me1 in English is to “raise your voice almost to a squeak.” Although informal, this is a good description that is easily understood by readers. They did not provide a precise meaning of me1, and like Chan (2001), they did not contrast me1 with the closely related question SFP aa4 (see Sect.
6.2.2), but again they adopted the assumption that I used as a working hypothesis to carry out this research, which is that some SFPs have English intonational equivalents.

It is worth repeating Hirst’s (1983: 93) statement that “[i]ntonation, what Bolinger has called the ‘greasy part of language,’ is notoriously difficult to describe.” It is because of these difficulties that SFPs are a welcome tool with which to study intonation, enabling us, I argue, to isolate and thereby discover tone contours that have specific linguistic meanings, removing some of the grease, so to speak, and allowing us to start with meanings and then match them to intonational forms. SFPs provide one alternative to the complicated procedure of simultaneously searching for both the form and the meaning of intonational contours. Exploiting SFPs (or other discourse particles) to study the discourse-related forms of intonation offers a relatively new research methodology that can significantly help to overcome the four complicating factors stated at the beginning of Chap. 2. Bartels asked,

    Can we make a plausible case for associating a given tone at some level of abstraction with the same interpretational feature across all occurrences, independent of lexical content and situational context, despite the fact that tunes in different contexts appear to yield highly variable effects? (Bartels 1999: 4)

Similar to the debate on intonational meaning, Cantonese linguists have also debated whether or not the meaning of a given SFP is the same “across all occurrences,


independent of lexical content and situational context.” This question is harder to address when studying a form of intonation, because, unlike the easily identifiable form of a segmental SFP, it is difficult to identify when the same form of intonation is being used on different sentences in different contexts. If we are to deal with the difficulty that Brazil (1997) observed of matching an intonational form to a given meaning, we must first address Bartels’s (1999) question of whether or not the meaning of a particular form is in fact the same from one context to the next. The way I have dealt with this relates to Ladd’s (1978) argument that the difficulty in defining intonational forms can be compared with the difficulty in defining Japanese discourse particles, and his analogy applies equally to defining Cantonese SFPs. I believe that the method I adopt for defining SFPs provides strong evidence that they have context-independent meanings (see Sect. 5.3).

Once an SFP is defined, it can then be used to discover whether English has any intonational forms that express that same meaning. This is done by matching a given SFP to a given pitch contour. This relates directly to Brazil’s problem of matching form to meaning, something that requires overcoming the four complicating factors stated in Chap. 2, repeated here for convenience in simplified form:

(1) different linguists may define intonation differently;
(2) machines cannot record intonation the way native speakers hear it;
(3) there is no one-to-one correspondence between form and function;
(4) subtypes of suprasegmentals are used simultaneously, one atop the other.

Problem (1) was addressed by stating my working definition of “intonation” in Chap. 2.
Problem (2) was addressed by starting with an intonational meaning (i.e., a pitch contour that is discovered via an SFP is assumed to have the same meaning as the SFP) and then listening to multiple tokens of its form on different sentences from different speakers. In this way, linguistic intuition can reliably take precedence over what is seen on paper. Problem (3) was overcome by producing the same intonational form in different contexts and by different speakers. This resulted in being able to confidently say that these different instances of the same contour shape are in fact the same tone with the same meaning. For example, in the case of lo1-equivalent intonation, which expresses “obviousness,” its form in English is similar to, if not the same as, both emphatic and contrastive intonation, but it has a distinct meaning which is clearly defined and easily understood (see Sect. 6.1.1). This enables us to recognize and separate its occurrence from occurrences of emphatic and contrastive intonation, except in those contexts where more than one of these meanings is acceptable. There may be some cases where these intonational forms are ambiguous, but I intuitively have the impression that lo1-equivalent intonation is slightly higher in pitch than emphatic and contrastive intonation, though further research will need to be done to confirm this. It is worth pointing out that any fine distinctions of form can only be made once the meanings of the individual forms have been identified; otherwise, we cannot say for sure that they are two different tones and cannot easily recognize when one or the other is being used. Having a clear definition allows linguists to design affective elicitation experiments involving native speakers.

In relation to problem (4), Kwok (1984: 98) perhaps inadvertently illustrated the advantage of studying segmental discourse particles when she said that “the expression of ‘emotions’ and ‘attitudes’ does not depend on the mere presence of the particles, but on the intonation superimposed on them in real contexts.” This statement implied that the emotions expressed by nonlinguistic suprasegmentals can be distinguished from the discourse meanings expressed by the SFPs themselves. Making this distinction is much easier to do in Cantonese than in English because problem (4) is largely nonexistent in Cantonese. Problem (4) is therefore addressed by this research since it begins with an intonational meaning, which is the meaning of the SFP. By analyzing multiple examples of this same intonational meaning in different contexts from different translators, its form can more readily be isolated from any potential paralinguistic suprasegmentals that co-occur.

5.2  The Design of the Research

Lee and Law (2001) explained that SFPs represent a lexicalized form of a variety of knowledge states, and therefore provide a good window through which to observe epistemic notions. I exploited this linguistic fact to study the forms of these epistemic notions in English, which primarily expresses them with intonation. Starting with segmental particles allowed me to easily identify sentences within a corpus that have a specific connotative meaning. If this meaning was found to consistently translate into English as a specific form, then this was taken as evidence that this SFP’s meaning and function also apply to its English equivalent. In this sense, an SFP in Cantonese can be seen as a lens through which one can discover the form and meaning of a discourse tonal morpheme in English, so long as this SFP has been accurately defined, has an English intonational equivalent, and has been reliably translated.

[Figure: Segmental SFP → Meaning and function → Translation process → Pitch contour in English]

We are still far from having an agreed-upon list of meaningful forms of intonation. Because of this, segmental discourse particles are a welcome tool that can be used to help us study at least some forms of intonation. Based on our current knowledge of intonation and on current technology, it seems very unlikely that computer software could be designed to accurately and consistently find all occurrences
of a particular pitch contour with a particular meaning. For example, there are different rising tones with different meanings, and it is difficult to consistently find and distinguish them manually, let alone by machine. Another problem is that tones change their forms through assimilation with other tones, such as focus stress. In contrast, all occurrences of a segmental particle can easily be searched for in an annotated corpus. Constructed examples of dialogues that include sentences with any chosen discourse particle can also be written and recorded in order to elicit acceptability judgments or oral translations into another language. Both of these methods were used for this research. Once the English-equivalent form and meaning of a given SFP is identified, described, and defined, native English-speaker intuition can then be used to judge the accuracy of the proposed meaning and can be used to recognize and distinguish that form of intonation from other similar forms that a machine might fail to distinguish. Linguists must rely at least partially on their ears and on linguistic intuition for their data analyses because these are as of yet far more sophisticated instruments than the technology available for recording and analyzing acoustic data (see Sect. 5.2.4). As far as I know, this research design is the most rigorous to date for translating modal particles from one language into intonational forms in another language. I would be pleased if linguists adopted this methodology, either as is or modified, to search for other particle/intonation equivalents in other pairs of languages. Prior to this research, no attempts had been made to systematically investigate whether there are any one-to-one correspondences between any Cantonese SFPs and specific forms of English intonation.3 In Sect. 
4.2, a number of studies were cited that compared discourse particles with intonation between different language pairs, three of which used methodologies that were partially related to mine. Prieto and Roseano’s (2016) study is related in the sense that it began with a specific discourse meaning and then designed discourse completion tasks to elicit this meaning from subjects in two different languages, one of which used intonation to express this meaning. Chao (1932) and Schubiger (1965) both used translations, but their data were based on their own intuitions. Relying on the linguist’s own intuition is not a problem per se, and this was presumably the method used by Chan (2001) and Baker and Ho (2006) regarding the English counterpart to me1 and by Yip and Matthews (2001) regarding the English counterpart to zek1. In fact this method is probably sufficient for most of the grammatical particles and their tonal counterparts discussed in Sect. 4.1. However, to the extent possible, more rigorous and objective methods should ideally be used to verify any such claims regarding the comparability of a discourse particle in one language to a tone in another. Collecting data from multiple participants who are ambilinguals should be more reliable than collecting data from one or two linguists who are advanced bilinguals.

3  I have cited several studies that mention SFP/intonation equivalents (i.e., Pennington and Ellis 2000; Chan 2001; Yip and Matthews 2001; Baker and Ho 2006), but none of them described the use of any research methodology for coming up with their conclusions.

There are five ways in which the methodology I adopted improves on the pioneering studies of Chao (1932) and Schubiger (1965). First, the translations went from the direction of segmental particles to suprasegmental intonation instead of vice versa. Second, the intuition of ambilinguals was exploited. Valdés and Figueroa (1994: 11) defined an ambilingual as “two native speakers in one individual.” Third, oral translations of recorded speech were used as opposed to translations of written sentences. Fourth, each intonational form was analyzed using both an intuitive analysis of the intonation, and an examination of each utterance’s F0 contour on paper using Praat, a free software for phonetic analysis. Fifth, simply-worded definitions of the discourse particles were developed, facilitating the use of native English-speaker judgment about whether the intonational forms discovered through the translations appear to have the same meaning as the particles.

5.2.1  The Participants

Using native-speaker intuition as a source of data has obvious limitations, but it is nevertheless widely used and accepted in linguistic research for determining linguistic meaning and acceptability. The tokens of SFP-suffixed sentences used for the translations in this research were assumed to be grammatical because they were taken from an audio corpus of naturally-occurring conversations or from constructed dialogues for which native speakers were consulted. I was therefore not seeking participants’ judgments of acceptability; I was seeking their judgments of meaning and was exploiting their presumed ability to translate the linguistic representation of this meaning from its segmental form in Cantonese to its pitch contour form in English. It was concluded that native English-speaking participants were required because intonation is one of the most difficult things for L2 learners to master (Chun 2002). If we assume that SFPs are the Cantonese equivalent of English discourse intonation, it follows that the intuition of native-Cantonese speakers would also be required in order to fully grasp the intuitive meanings of the SFPs. Therefore, the best kind of participants for this research are ambilingual speakers of L1 Cantonese and L1 English. The research reported here involves several studies. The minimum number of participants in each study was four, all of whom were of Hong Kong Chinese origin. They had English-medium educations and regularly spoke English with some friends, but spoke Cantonese at home, with some other friends, and with relatives and members of the surrounding community in Hong Kong. Some had lived in English-speaking communities for a period of time. The participants’ status as ambilinguals was not based on their backgrounds; that was only used as a screening process. Their status as ambilinguals was based initially on English-medium conversations with me, a native-English speaker, and on Cantonese-medium conversations with another native-Cantonese speaker. This was then followed up with playing audio recordings
of them speaking English and Cantonese to native speakers of English and Cantonese, respectively, who then judged whether or not they considered them to be native speakers. They needed to be considered a native speaker by both groups before they were classified as ambilingual and asked to translate. Asking other native speakers to judge their status as native speakers is obviously impressionistic, but Butler and Hakuta (2004: 125) pointed out that there is no clear method for determining the norm for a native speaker. Therefore, other than judgments by other native speakers, it is not clear how to determine whether one or both of a bilingual’s languages meet the native-speaker norm. Valdés and Figueroa (1994: 30–6) said it is generally assumed that there is no need to test the linguistic competence of native speakers so as to prove they are native speakers. The test of being fully native, according to them, is based on whether or not a speaker sounds native to other native speakers. Other than this, there are no recognized tests, as far as I know, which are designed to determine whether a person is a native speaker of a given language. It is possible that an advanced L2 speaker of either Cantonese or English could accomplish the task required, but it seems more likely that an ambilingual is better suited for it. Because of this, I wanted to ensure that they were actually ambilinguals, or at least close to being so. Therefore, their self-identity as ambilinguals and their backgrounds were not considered sufficient, and native speakers of both English and Cantonese were consulted to see if they agreed. Guthrie’s (1983) study involved a participant that he referred to as a native Cantonese–English bilingual, and I adopt his practical description of a native bilingual, or what I call an ambilingual, which is that “both [his or] her Cantonese and English were native-like” (p.  40). 
Here, the term native-like is not used to mean similar to a native speaker, or nearly native, but rather is used to mean that the speaker’s language sounds as if it is that of a native speaker for each language involved. Even if the participants of this study were not fully native speakers of both languages (however one defines “native”), it does not damage the validity of the results. It is more likely that ambilinguals would succeed at the required task, but the critical test of validity was not the linguistic status of the participants; it was whether or not the participants produced consistent translations of an SFP as the same intonational form. If they did this among their own translations and if this form then matched the form that also appeared in the other participants’ translations, this was taken as a form of empirical evidence to indicate that there is a shared meaning between the SFP and the form of English intonation.
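The consistency criterion described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each mimic translation has already been given a contour label by ear; the label names, the function names, and the 0.8 agreement threshold are all hypothetical and are not part of the research design reported here.

```python
from collections import Counter

def majority_label(labels):
    """Return the most frequent contour label and the share of tokens bearing it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def shared_form(translations, threshold=0.8):
    """Return the contour label shared by all participants, or None.

    translations maps a participant ID to the list of contour labels
    assigned (by ear) to that participant's translations of one SFP.
    The threshold is an illustrative assumption, not a value from the study.
    """
    per_participant = {p: majority_label(ls) for p, ls in translations.items()}
    # Each participant must be consistent among his or her own translations.
    if any(share < threshold for _, share in per_participant.values()):
        return None
    # The participants' majority forms must also match one another.
    forms = {label for label, _ in per_participant.values()}
    return forms.pop() if len(forms) == 1 else None

# Hypothetical labels for four participants, five tokens each.
example = {
    "P1": ["contour_A"] * 5,
    "P2": ["contour_A"] * 4 + ["contour_B"],
    "P3": ["contour_A"] * 5,
    "P4": ["contour_A"] * 5,
}
print(shared_form(example))
```

A return value of None in this sketch would simply mean that the tokens do not converge on one form under the chosen threshold; it says nothing about why.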

5.2.2  The Corpus and the Dialogues

When choosing which type of linguistic data to use for an SFP-related study, arguments can be made for using naturally-occurring data, constructed data, or a combination of the two as was done for this research. The choice will depend on the study’s aims. For this research, a Cantonese corpus of naturally-occurring speech was used, which is the best source for collecting tokens of SFPs since they are primarily used
in casual colloquial speech (Luke and Nancarrow 1997). Unlike English intonation, SFPs have written forms, so it would have been possible to have the participants translate and vocalize English sentences from written Cantonese. Such sentences could have come either from a corpus or from constructed dialogues. However, the participants were required to orally “mimic” the target sentences’ “tones of voice,” and this would have been extremely difficult to do if the source had been in written form. It would have made their task considerably more subjective, increasing the likelihood that the translations would differ among the participants, because each participant would have almost certainly imagined the written dialogue being said in a different way. Therefore, the data consist entirely of audio-to-oral translations. In addition to using naturally-occurring speech from the audio corpus, constructed dialogues were used for follow-up data collection. The constructed dialogues included the question SFPs me1 and aa4, and the “downplaying” SFP ze1. These dialogues were constructed and recorded with the help of native Cantonese speakers. In the case of the question particles, the use of minimal pairs allowed their English-equivalent forms, both of which are rising tones, to be differentiated from each other (see Sect. 6.2.3). Constructed dialogues were created for the SFP ze1 in order to single out and elicit one specific meaning of this polysemous particle (Sect. 6.3.1). Native Cantonese speakers acted out the constructed dialogues in as natural a manner as possible. These were recorded and then later played for the ambilingual participants, who translated the target sentences into English. For the initial data collection, audio files of dialogues were extracted from a searchable audio corpus called the Hong Kong Cantonese Corpus (HKCanCor— http://compling.hss.ntu.edu.sg/hkcancor/), which was created by K.K.  Luke and O.T. Nancarrow. 
The corpus consists of 180,000 words of naturally-occurring oral Cantonese. It consists of radio talk shows (42 conversations) and spontaneous speech (51 conversations) in ordinary settings among family members, friends, and colleagues. The corpus is word-segmented and annotated, and the words are all marked for parts of speech. The corpus can be searched for all occurrences of any particular word, which allowed me to locate and extract dialogues from within the corpus that each contained a token of a particular SFP. Five or six such dialogues were extracted for each SFP. In addition to the target utterance, each extraction of audio included preceding and sometimes subsequent utterances from the surrounding discourse. Despite this, in some cases, there was still not enough dialogue to enable the participants to fully understand the discourse context within which the given SFP was used. For those cases, the larger surrounding contexts were explained to the participants.
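The kind of search described above, locating every utterance that contains a given SFP and extracting it together with some surrounding discourse, can be sketched as follows. This is only an illustration: the `Utterance` structure, the part-of-speech tags, the toy romanized utterances, and the window sizes are assumptions and do not reflect the actual file format or tag set of HKCanCor.

```python
from dataclasses import dataclass

@dataclass
class Utterance:
    speaker: str
    tokens: list  # list of (word, part-of-speech tag) pairs; tags are hypothetical

def find_sfp_contexts(utterances, sfp, before=2, after=1):
    """Return (context, target_index) for every utterance containing the SFP.

    context is the slice of the dialogue around the target utterance, and
    target_index is the position of the target utterance within that slice.
    """
    hits = []
    for i, utt in enumerate(utterances):
        if any(word == sfp for word, _ in utt.tokens):
            start = max(0, i - before)
            end = min(len(utterances), i + after + 1)
            hits.append((utterances[start:end], i - start))
    return hits

# Toy word-segmented dialogue (invented for illustration only).
conversation = [
    Utterance("A", [("nei5", "PRON"), ("heoi3", "VERB"), ("aa4", "SFP")]),
    Utterance("B", [("hai6", "VERB"), ("lo1", "SFP")]),
    Utterance("A", [("hou2", "ADJ")]),
    Utterance("B", [("zan1", "ADV"), ("me1", "SFP")]),
]

for context, target in find_sfp_contexts(conversation, "lo1"):
    print(len(context), "utterances extracted; target is number", target + 1)
```

Where the extracted window is too short for the discourse context to be clear, widening `before` and `after` corresponds to the fuller explanation of context that was given to the participants.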

5.2.3  Data Collection

Referring to the translatability of SFPs into English, Ball (1888/1971: 112) said, “[i]t will be seen that [SFPs] are very difficult, or impossible even of translation into English where accent and emphasis alone do their work to a great extent.” Yip and
Matthews (2001: 156) similarly stated that many SFPs “are untranslatable, the ideas being expressed in English by intonation patterns and tone of voice rather than words.” These statements are only true if intonation is not counted as part of the English sentence’s translation. For this research, intonation was assumed to be as much a part of the translation as any other grammatical component, and it was further assumed that the intonation can be isolated and analyzed both phonologically and semantically from all the other grammatical components of the sentence. The data consist of Cantonese-to-English oral translations from the ambilingual participants. The translations were of sentences that included tokens of the targeted SFPs, and these sentences all came from the audio dialogues described in the previous section. The ambilingual participants were given the following instructions to read and were allowed to ask questions for clarification:

Pretend you are the person who says the phrase that you are going to translate. Imagine that all of the people conversing are perfectly bilingual, just like you, and that they will therefore completely understand your English translation. With this in mind, mimic the speaker, including attitude, tone-of-voice, intonation, mood, etc. Imagine your English version of the phrase being inserted in place of the Cantonese phrase in such a way that the conversation would continue along exactly as it does on the audio.

Although this experiment was an attempt to tap into the participants’ subconscious linguistic intuition, there was no reason to assume that allowing them to consciously think about the task would invalidate the results. To the contrary, it was concluded that conscious consideration would increase the likelihood of their succeeding at the task. They were essentially required to mimic, which is similar to acting. The participants were therefore allowed to take as much time as they needed and were allowed to listen to the audio as many times as it took for them to get it clear in their minds. They were also allowed to listen to their own translations and were permitted to redo any that they felt they could improve on. None of them were allowed to listen to any of the other participants’ translations, and therefore, all of them translated individually and separately from one another. Because the participants were instructed to mimic the speakers in the corpus, the data are referred to as “mimic translations.” Each participant translated at least five tokens of each SFP. Each token was attached to a different sentence in a different context. Every study within the research had at least four participants, resulting in a minimum of twenty mimic translations being collected for each SFP. The entire set of data included mimic translations for 17 SFPs. The translations of five of these SFPs strongly indicated that they have an English intonational equivalent, while the translations of two others indicated that they possibly have one. Four of the five SFPs that clearly translated as a form of intonation comprised two related pairs: the question particles me1 and aa4 and the evidential particles lo1 and aa1maa3. My initial study focused on defining those four SFPs and describing the forms of their English equivalents. 
The fifth SFP that clearly translated as intonation was the restrictive focus particle zaa3, which is semantically related to the particle ze1, and ze1 is one of the two particles whose translations weakly indicated they had English equivalents. A follow-up study was done based on the assumption that ze1 is polysemous, as reported in the literature
(e.g., Fung 2000), and dialogues were designed specifically to elicit the meaning of ze1 that appears to have an English equivalent. The results indicated that zaa3 and ze1 form yet another pair of semantically related SFPs which both have English equivalents, making three pairs in total.

5.2.4  Data Analysis

Some models of intonation describe it solely in terms of pitch, and pitch is often defined in terms of F0 ('t Hart et al. 1990; Pierrehumbert and Hirschberg 1990; Botinis et al. 2001; Chun 2002). While this is done for practical and understandable reasons, it is important to keep in mind that intonation is more than just pitch, and pitch is not the same thing as F0. Chun (2002: 4) explained that “[w]hile fundamental frequency involves acoustic measurement of what is produced physiologically by speakers, pitch usually refers to how fundamental frequency is perceived by listeners.” Roach (2009: 4:23ff) clearly stated that “[f]undamental frequency is not intonation. Fundamental frequency is a physical counterpart to intonation, but intonation really is in your head, and in your ears. It is not what a computer measures.” Hirst et al. (2000: 52) explained that “linguistic representations” (e.g., pitch) refer to how information is represented in the minds of speakers, while “physical representations” (e.g., F0) refer to the ways in which scientists choose to analyze data. Nevertheless, for practical reasons, linguists often refer to pitch and F0 interchangeably. But to the extent possible, researchers should allow the native speaker’s ear to be the ultimate judge of pitch analysis. Doing so increases the validity of any claims about intonation. Pierrehumbert’s (1980) model of phrasal intonation uses the tones and break indices (ToBI) transcription system for transcribing accents (i.e., word prominence within utterances) and phrasing. Botinis et al. (2001: 280) said Pierrehumbert’s model “is widely regarded as the single most influential work in the field of intonational phonology,” and Chun (2002: 29) added that “Pierrehumbert’s (1980) seminal monograph sets forth the now-standard generative model of intonation.” Despite its innovations and its wide use, I do not believe that ToBI fully captures the shapes of contours, and I tentatively assume tones to be made up of contours. I therefore do not believe ToBI to be the best system for describing forms of intonation and did not use it for this research. ToBI marks the pitch of an utterance as a sequence of high and low tones that are labeled according to their function. This only indicates relative pitch, based on relative F0, and thus does not represent the global shape of a pitch contour in sufficient detail. Peter Roach made the following observations about ToBI and contours:

There is something special about a contour, like fall-rise or rise-fall. It is the contour itself. It’s the global shape of it, not the fact that it starts low and then goes to a high point and then goes to a low point. It is a contour, and contours were not part of the basic equipment of [the ToBI] system; a contour was something at a higher level that you made up out of these building blocks, the low tone and the high tone. And more and more I came to feel that the importance of contours was being neglected in ToBI. (Roach 2009: 2:24ff)

Related to this, Kenneth Pike concluded that

In order to describe an intonation contour it does not suffice to say that it is rising, or falling, or falling-rising. Even the simplest rise has a complex series of relationships to other contours, and complex internal structure. The size of the interval between beginning and ending points, the height of the beginning point relative to the general pitch level of the sentence, paragraph, conversation, or speaker’s norm [are all important]. (Pike 1945: 25)

A contour involves both time and absolute pitch; in order to measure intervals between beginning and ending time, the measurement must include time, and to measure the height of the beginning point, the measurement must include frequency. The ToBI system does not include either time or absolute F0 in its representations and therefore does not provide a detailed representation of an F0 curve, let alone a pitch curve. ToBI represents relative F0 as high or low instead of actual F0 measured in hertz. Just as importantly, it does not account for native-speaker intuitions regarding pitch; it is strictly a mechanical measurement of relative pitch heights. Of course, this does not mean that ToBI notation cannot be used for intuition-based research, and indeed it has, as in, for example, the study by Pierrehumbert and Steele (1989) cited in Sect. 3.1.4. Hirst (1983: 98) proposed an idea for representing the global shapes of F0 contours using “a sequence of target pitches (ti, hi), where ti represents the time value and hi the fundamental frequency for each target pitch.” There is no limit to the number of target pitches that can be measured across an F0 curve, allowing one to mark enough target pitches on the curve to show its actual shape. The F0 curve on paper can be thought of as an infinite number of (ti, hi) values, resulting in a solid line. Even with this method, however, the problem that F0 is not the same thing as pitch remains, and the ultimate goal here, as with any study on the mental representation of intonation, is to describe a pitch contour, not an F0 contour. The fact that acoustic F0 is different from auditory pitch means that the F0 representations of the same pitch contour will vary from one occurrence to the next. This is because other features, such as loudness and length (and perhaps even voice qualities), which are not visible on the F0 graphs, can influence the perception of pitch. 
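Hirst’s representation of a contour as a sequence of target pitches (ti, hi) can be made concrete with a short sketch. The target values below are invented, and joining the targets with straight lines is a simplifying assumption made only for this illustration; it is not claimed to be the interpolation that Hirst proposed.

```python
from bisect import bisect_right

def f0_at(targets, t):
    """Estimate F0 (Hz) at time t (s) from time-sorted (t_i, h_i) target pitches.

    Straight-line interpolation between neighboring targets is an
    illustrative simplification, not part of Hirst's proposal.
    """
    times = [ti for ti, _ in targets]
    if not times[0] <= t <= times[-1]:
        raise ValueError("t lies outside the measured contour")
    j = bisect_right(times, t)
    if j == len(times):  # t coincides with the final target
        return targets[-1][1]
    (t0, h0), (t1, h1) = targets[j - 1], targets[j]
    return h0 + (h1 - h0) * (t - t0) / (t1 - t0)

# Invented rise-fall shape: three targets given as (seconds, hertz).
rise_fall = [(0.0, 200.0), (0.2, 260.0), (0.5, 180.0)]
print(f0_at(rise_fall, 0.1))
```

Adding more targets to the list brings the interpolated line closer to the measured F0 curve, which is the sense in which a sufficiently dense set of (ti, hi) values shows the curve’s actual shape. It remains an F0 curve, however, not a pitch curve.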
Brazil (1997: 3) concluded that it “seems inherently improbable that a human being can make systematic variations on one physical parameter without its affecting others. Changes in loudness and in speed result from intimately connected adjustments to the same speech mechanism as that which determines pitch.” All of these factors contribute to the difficulty of developing technology that can accurately measure and record pitch. We would somehow need to develop hardware and software that senses and records linguistic utterances analogous to how the native-speaker’s ear and mind perceive them. Exacerbating the problem is the fact that, even if machines could be designed to record F0 in a way that correlates with human pitch perception, that would still only be the auditory and mental perception. It would not include the linguistic intuitions that interpret variations such as the ones seen for lexical tones in the form of allotones. The pitch shape of a particular discourse meaning will differ from one occurrence to the next, even for an individual speaker. There are many reasons for this.

Any changes in tempo, pitch range, and so on will result in variations in the shape of a pitch contour, but these different shapes will each have the same mental representation and will represent the same discourse meaning, just as variations of the same lexical tone (because of its interactions with intonation and other suprasegmentals or because of tone sandhi) still represent the same tone underlyingly. Discourse tones’ assimilation with other grammatical and discourse tones will also result in differently shaped allotones, drastically different in some cases. A machine would therefore not only need to be able to perceive linguistic pitch, it would also need to be capable of recognizing allotones. Such machines are unlikely to appear any time soon, if ever, which is why native-speaker intuition still has a central role to play in the analysis of intonational data and which is why I used linguistic intuition for this research. I agree with Fox’s (2000) views about the superiority of “auditory analysis” (i.e., subjective native-speaker intuition) over “instrumental analysis” (i.e., objective recordings of physical sounds):

There is considerable reliance on instrumental analysis of pitch patterns in some recent work. But although a concern for an accurate factual basis is laudable, overreliance on such data can also be unsatisfactory, since it cannot of itself provide a phonological analysis, and by obscuring the difference between relevant and irrelevant features, may even impede it. Thus [one can] support the view that auditory analysis is usually a better basis for the phonological analysis of intonation. (Fox 2000: 269, footnote 2)

Hirst (1977) speculated that perhaps the advances of mechanical speech analysis technology have actually had a negative effect on the study of intonation, because they have caused researchers to describe intonation in terms of enormous quantities of data that the phonetician must put in some kind of order. In the past, phoneticians relied on their intuition, which Hirst claimed is still a vital component of intonational data analysis:

… even such a fairly simple thing as stress, turns out to be an enormously complex affair, depending not only on the by now fairly classic parameters of fundamental frequency, intensity, duration and vowel-quality, but also on the linguist’s knowledge of the language as a system. The problem is a fundamental one, and one it seems which no amount of machine analysis can solve. However much we refine our techniques, and improve our apparatus, there remains the basic fact that the final judge is the human ear … (Hirst 1977: 1–2, emphasis in italics mine)

Even after another quarter century of advances in technology, Gussenhoven (2004: 3) still agreed with Hirst’s assertions, pointing out that “[b]y definition, the best source for obtaining a record [of the pitch of utterances] is the listeners’ perception, since pitch is a perceptual sensation. Unfortunately, listeners lack appropriate conceptualizations and vocabulary to report their sensations.” In other words, while humans are not as good as machines at accurately and consistently reporting what they perceive, machines have not yet been (and perhaps never will be) designed to perceive linguistic audio input in the same way that the speakers of a given language do. Having said all this, I still refer to tones on paper in terms of F0, but this is only used as a complement to my native-speaker intuition, which is capable of naturally
and intuitively analyzing intonational forms, including allotones. The purpose of this research was not to provide a full and accurate phonetic description of the forms of intonation involved, but rather to verify whether or not a specific form of intonation was recognized to exist within multiple translations of a particular SFP. Intuition was the primary source for determining this, and to the extent that the F0 contours looked similar among multiple tokens of tones that I judged as sounding like the same tone, this can be seen as a validation of my perceptions. The F0 contours also allowed me to present the data in print form for this book and to talk about it in terms of falls, rises, and rise-fall-rises. Although the contours shown are not the same as pitch, they are closely related and offer an essential tool for presentation, discussion, and verification. Using Praat, graphic representations of the mimic translations’ contours were created as F0 across time, measured in hertz and seconds, respectively. Within each of the figures that show the F0 contours of the translations, lines above and below a speaker’s median pitch are used to represent a one-octave range. The center line marked with its hertz value represents the speaker’s estimated median pitch, which was calculated as the geometric mean of the speaker’s highest and lowest F0 point from within all of his or her own translations. The scale of the figures is Hz (Logarithmic), meaning that the values are given in hertz but scaled logarithmically. A given distance on the Y-axis therefore corresponds to a pitch interval, which is a log difference rather than a linear distance. This method follows suggestions by Hirst (for details, see De Looze and Hirst 2014). Note that in some cases, the pitch of a mimic translation went beyond the upper limit of the octave pitch range, and in such cases a third line that is one octave above the median pitch was added to the figure. 
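The arithmetic behind these figures can be sketched briefly. The geometric-mean median follows the description above; placing the two range lines half an octave above and half an octave below the median (so that together they span one octave) is an assumption about the figure layout, and the function names and F0 values are invented for illustration.

```python
import math

def median_pitch(f0_min, f0_max):
    """Estimated median pitch: geometric mean of a speaker's lowest and highest F0 (Hz)."""
    return math.sqrt(f0_min * f0_max)

def octave_band(median_hz):
    """Lower and upper lines of a one-octave range centered on the median.

    Centering the band (half an octave each way) is an assumption here.
    """
    half_octave = math.sqrt(2.0)
    return median_hz / half_octave, median_hz * half_octave

def octaves_from_median(f0_hz, median_hz):
    """Pitch interval in octaves: a log difference, as on a logarithmic Hz axis."""
    return math.log2(f0_hz / median_hz)

# Invented speaker range: lowest F0 100 Hz, highest F0 400 Hz.
median = median_pitch(100.0, 400.0)    # 200 Hz
low, high = octave_band(median)        # about 141 Hz and 283 Hz
print(median, round(low, 1), round(high, 1), octaves_from_median(400.0, median))
```

On a logarithmic axis the distance from 100 Hz to 200 Hz equals the distance from 200 Hz to 400 Hz, since each is one octave. This is why the geometric mean, rather than the arithmetic mean, falls at the visual midpoint of a speaker’s range.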
To analyze the data, I first listened for consistencies in intonation patterns that my native English-speaker intuition interpreted as linguistically meaningful. Consistency was looked for both among a single participant’s multiple translations of a given SFP and among the collection of all participants’ translations of that same SFP. When I detected that a form of intonation was repeatedly used among a group of mimic translations, I then looked for consistencies in the shapes of the F0 contours at the relevant position along the F0 contour line, i.e., the position within the utterance where I heard the tone appear. A discernible shape that matched my intuition was judged to be a linguistically meaningful pitch contour. If the shape of an individual F0 contour did not appear to match the majority of those among the participant’s own translations or among those of other participants, my native-speaker intuition was still allowed to determine ultimately whether this token of an intonational contour sounded as if it was the same as the others among the group.4 In most cases, there was an obvious resemblance among all the tokens of F0s that were analyzed as representing the same tone. Interestingly, however, there were a few cases where solely looking at the shape of the F0 contour of a mimic translation would likely have caused one to analyze it as not belonging to the group of tokens representing that tone.

4  Of course, differences among the participants’ F0 readings with regard to pitch key and range were considered irrelevant, especially considering the fact that some of the participants were male and some female.

5.3  Defining Sentence-final Particles

This research required clear and accurate definitions of the SFPs under investigation. The framework adopted for developing these definitions was Wierzbicka’s (1996) Natural Semantic Metalanguage (NSM). More specifically, I used a modified version of Besemeres and Wierzbicka’s (2003: 3) “general model for the investigation of discourse markers.”

5.3.1  The Natural Semantic Metalanguage Theory

The NSM theory was first developed by Anna Wierzbicka in the early 1970s. Although Wierzbicka originated the theory, it should be noted that the NSM program has received a significant amount of input from Cliff Goddard for at least 25 years. The core assumption of the NSM theory is that “natural languages are adequate to represent their own semantics via language-internal paraphrase” (Goddard 2008: 3). The theory hypothesizes that humans are endowed with a set of universal semantic primes and that these primes are overtly lexicalized in all languages. These primes and their accompanying syntax form a subset of the natural languages that they belong to, and this is why this subset of language, which is what this theoretical framework uses as its metalanguage, is called the Natural Semantic Metalanguage. In theory, all words’ semantics can be adequately and accurately expressed using a paraphrase definition that is written entirely in semantic primes. An NSM definition is referred to as an explication. The forms and combinatory rules of the primes differ among languages, but their meanings do not. NSM theory hypothesizes that a sentence, or a multiple-sentence explication, that is written entirely in the semantic primes of one language will be isomorphic with that same sentence (or sentences) written in another language’s semantic primes. In theory, an NSM explication therefore provides a language-neutral means for defining words because it is void of any language-specific meanings. The first tentative list of English semantic primes included a detailed description of 14 “primitives” (Wierzbicka 1972). More than 20 years later, Wierzbicka (1996) provided a description of English’s NSM grammar, with an expanded list of 56 semantic primes. The list of primes presently includes 65 according to Goddard (2011: 66, 2018: 40). The number of primes has gone up or down as the NSM program has evolved.
Any prime’s inclusion on the list is an empirical matter subject to ongoing scrutiny as to whether or not it actually qualifies as a prime. According to the NSM theory, a prime is a morpheme, lexeme, or phraseme with an undecomposable
meaning that exists in all natural languages. For brief introductions to the NSM theory, see Goddard (1994a, 2004, 2008), and for more in-depth treatments, see Wierzbicka (1996), Goddard (2011, 2018), and Goddard and Wierzbicka (2014).
If the hypotheses of the NSM theory are correct, then the NSM explication of a Cantonese SFP will equally define its English intonational counterpart, to the extent that the explication is accurate and to the extent that the SFP and its intonational counterpart have the same meaning. Having said that, it must be noted that even if it is demonstrated that an intonational contour is the best and most accurate English translation of an SFP, this does not prove that their meanings are exactly the same. Nevertheless, if the ambilingual participants consistently translate an SFP into English using the same tone across a variety of contexts, this strongly indicates that the SFP’s and its corresponding tone’s meanings are very closely related to each other, if not exactly the same. The NSM explications of the SFPs in this study are therefore tentatively assumed to define their English intonational counterparts equally well.
An NSM explication can be easily understood without having any background knowledge of NSM theory. This differs from the definitions used in most formal semantic frameworks, which use symbols and notations that require specialist knowledge before one can understand them. It should be noted, however, that linguists who want to use NSM explications in their own research should not be deceived by how simple the explications look. It is necessary to read at least some of the key works within the NSM literature, and a significant number of trial-and-error attempts are required for the accurate formulation of an explication. As easy as they may look, a great deal of time and effort went into developing the explications written for this research.
Although a background of the theory is not necessary to understand the meanings of the SFPs’ explications, it is helpful to understand the approach I used for defining them, an approach that is based on assumptions I make about the nature of their meanings. Even if one does not accept the hypotheses put forward by the proponents of the NSM theory, one can still accept that the words listed as NSM primes can be practically and usefully applied to defining SFPs and intonation. Goddard (2004: 30) said that “[e]ven if one does not ‘buy’ the NSM theory as a whole, it seems to me that it has much to recommend it from a purely practical or heuristic point of view.” The NSM program has, for more than 40 years, searched for a comprehensive list of semantic primes. This search has involved more than 30 languages of a wide variety of language types (Goddard 2004). To the extent that the NSM program has succeeded at discovering semantically simple words that have very close (if not identical) counterparts in most if not all languages, it can be taken as empirical evidence in support of the claim that these so-called semantic primes exist in one form or another (whether as morphemes, lexemes, or phrasemes) in most or all languages. Such evidence makes it reasonable to assume that such words, whether they are absolute primes or not and whether they are perfect equivalents or not, will translate much more readily and accurately cross-linguistically than will words that are semantically more complex and entail meanings that are expressed in only one or a few languages.

5.3.2  Defining Sentence-final Particles with the Natural Semantic Metalanguage

An NSM explication of a discourse-related SFP should be speaker-oriented, i.e., written from the speaker’s perspective. This assumption is based on the fact that SFPs express various speaker stances, knowledge, and beliefs. SFP explications should also be written in a way that defines them independently of the contexts within which they occur. Saying that their meanings are independent of the discourse context does not entail that SFPs’ meanings include no reference to the discourse context. An SFP connects its associated sentence to the discourse in some specific way, and its explication should therefore express this connection explicitly and accurately. The goal is to develop an explication that accurately expresses an SFP’s semantic relationships to any and all discourse contexts where it is found to appear and where it could potentially appear, and to account for why it cannot appear in certain other contexts. At the same time, the explication must not include any meanings that come from the sentences the SFP attaches to or from the contexts in which it appears, a common mistake found in the literature (see Sect. 3.2.2). SFPs’ explications must only include meanings that are intrinsic to the SFPs themselves. In short, an SFP’s explication should include a reference to the antecedents within any and all possible contexts but must not include any actual content that is specific to a single context. To accomplish this, the explication should be context-bound rather than context-specific. This means it should include discourse-related deictic elements whose antecedents are the proposition of the sentence (or a portion thereof), plus one or more elements in the discourse. This is opposed to including portions of the proposition and discourse as part of the explication itself (see Sect. 3.2.2 for examples of this common mistake).
If elements from the sentence’s proposition and the discourse are included as part of the SFP’s explication, then the explication of any given SFP will require an infinite number of variations in order to include every possible sentence that the SFP could conceivably attach to, and within every possible context. The use of deictic elements, in contrast, allows the explication to apply to any and all contexts within which that SFP can be used, while never changing its core meaning. This can be simply illustrated by considering two possible definitions of the English plural morpheme -s. One could say that the morpheme -s in the word “cats” means “more than one cat,” but this is not desirable because this includes “cat” as part of the definition of -s. It would be more accurate to say that the -s in “cats” means “more than one X” and that the antecedent of X is the noun to which -s is attached, and in this case, its antecedent is “cat.” Linguists agree that SFPs are bound morphemes that relate their associated sentences to their associated discourse contexts. Defining an SFP can therefore be seen as analogous to defining a bound morpheme of the type just described, i.e., those whose meanings include a deictic element whose antecedent is the element to which the morpheme is attached. Adapting this to SFPs,
we can include the deictic semantic prime “this” in our explication, and the antecedent of “this” will then be the proposition (or a portion thereof) that the SFP attaches to. By replacing the proposition with the deictic “this,” we avoid something akin to saying that -s means “more than one cat,” which is a definition that would force us to conclude that the meaning of -s changes depending on the context—it can also mean, for example, “more than one beer” or “more than one time.” For the same reason, the meaning of a given SFP should not include a specific proposition, or portion of a proposition, since it can attach to different ones. If some or all of the contents of a sentence are included in an SFP’s explication, then this explication would not accurately define the SFP when it is attached to another sentence. Various descriptions of the SFP aa1maa3, for example, have said that it is used to give (obvious) reason/excuse (Boyle 1970a, b; Kwok 1984; Leung 1992/2005; Matthews and Yip 2011; Yip and Matthews 2001), to remind (Law 2002; Yip and Matthews 2001), or to elaborate (Lee and Law 2000, 2001). Either aa1maa3 has multiple definitions, or it has a single definition with elements that refer to different antecedents in different contexts. I propose it is the latter and that the difference in meanings comes from the fact that the sentences themselves are reasons, reminders, or elaborations, respectively.5 Referring to the proposition with the deictic “this” is a step toward overcoming the problem of multiple definitions. However, adding a single deictic element to an SFP’s explication is not enough because SFPs connect their associated propositions to at least one element in the discourse. SFP explications therefore must also include a deictic element whose antecedent comes from the discourse, and as will be seen in the case of the SFP aa1maa3, two such elements (see Sect. 6.1.2). 
This context-independent way of defining SFPs makes it possible to define them in a way that remains consistent for each of their occurrences. NSM explications have been used by other linguists to define discourse particles in a variety of languages: Chappell (1991) for Mandarin, Goddard (1994b) for Malay (Bahasa Melayu), Wong (1994, 2004) and Besemeres and Wierzbicka (2003) for Singapore English, and H. Leung (2016) for Cantonese. My goal in defining SFPs was the same as Besemeres and Wierzbicka’s (2003: 19) stated goal for defining the sentence particle lah in Singapore English: to “come up with a formula which would make sense in all the contexts in which lah can occur, and which could also explain why in some contexts ... lah cannot be used at all.” To accomplish this goal, they said, “we will be trying to enter the speakers’ minds; and we will test our hypothesis against a wide range of examples of the particle’s use (as well as against native speakers’ intuitions)” (p. 7). I adopted their basic methodology, which they offered as “a general model for the investigation of discourse markers” (p. 3), but I modified it by proposing that deictic elements whose antecedents are the proposition plus one or more discourse elements must be included in a discourse marker’s explication.

5  This still allows for polysemy, in which case, an SFP could have more than one meaning, but in such cases, it will have only a very few meanings at most, each of which will still be independent of the discourse context.

Besemeres and Wierzbicka’s (2003: 21) explication of lah was “I think you can know what I want to say.” A deictic element for the proposition is not included in this explication. They make it clear from their examples that the wh- pronoun “what” does not refer to the proposition, but rather to something else that the speaker is thinking about, which, using the methodology I adopted, would be represented as, “I think you can know I want to say this (D).” The D indicates that the antecedent of “this” is some discourse element D. I will not attempt to propose a modification to their explication of Singapore English’s lah but suggest that, for the reasons given above, it could be improved if it included a deictic word whose antecedent is the proposition (P), and another deictic word whose antecedent is some element D from the discourse. I believe that the D (i.e., the “what”) of their explication must be related to P in some way and that their explication should therefore indicate the nature of this relationship. With this modification, I believe that Besemeres and Wierzbicka’s (2003) method of writing a speaker-oriented explication—one that accounts for all and only the contexts where it can appear—is the best method for defining discourse particles, both segmental and intonational. Following Besemeres and Wierzbicka (2003), the development of each SFP’s explication relied on three key sources: the literature, real-life examples, and native speakers’ intuitions. The third source involved discussions with native speakers, which included their acceptability judgments of constructed examples. Hypothesizing and testing when and why an SFP cannot be acceptably used was a critical part of the process. The entire process was greatly facilitated by me having advanced L2 intuition related to the SFPs, but native speakers were always consulted—none of the judgments of Cantonese data, actual or constructed, were mine alone. 
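The contrast between a context-specific and a context-bound definition, illustrated earlier with the English plural morpheme -s, can be loosely sketched in Python. This is an analogy only, offered under stated assumptions: the class name `ContextBoundDefinition`, its fields, and the simple string substitution are hypothetical devices for illustration, not part of NSM practice or of the methodology described here. The definition text is stored once with a placeholder X, and only the antecedent of X changes from one use to the next.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextBoundDefinition:
    """A definition whose wording is fixed; only the antecedent of X varies.

    Hypothetical illustration: "X" stands in for a deictic element whose
    antecedent is whatever the morpheme is attached to.
    """

    morpheme: str
    template: str  # definition text containing the placeholder "X"

    def resolve(self, antecedent: str) -> str:
        """Fill in the antecedent of X for one particular occurrence."""
        return self.template.replace("X", antecedent)


# One definition covers every noun that -s can attach to.
PLURAL_S = ContextBoundDefinition(morpheme="-s", template="more than one X")

if __name__ == "__main__":
    for noun in ("cat", "beer", "time"):
        print(f"{noun}{PLURAL_S.morpheme}: {PLURAL_S.resolve(noun)}")
    # The stored definition itself is unchanged by any of these uses.
    print(PLURAL_S.template)
```

A context-specific approach would instead need a separate stored definition for every noun, which is the analogue of an SFP explication that would require an unbounded number of variants. An explication with two deictic elements, one for the proposition P and one for a discourse element D, would extend the same idea with two placeholders.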
A useful method for investigating the meanings of SFPs is to construct minimal pairs, which was illustrated above in (1) through (6). Many authors have used this method of contrasting SFPs by attaching them to sentences with identical underlying propositions (e.g., Fung 2000; Fang 2003; Law 2004; Li 2006). However, as explained above, a pair of sentences with different SFPs is rarely a minimal pair at the discourse level because most pairs of SFPs are not interchangeable in the same discourse context, though some are if they are closely enough related. Minimal-pair sentences are therefore a very useful tool for teasing apart the meanings of two SFPs that are so closely related that they can often be used in the same context. Minimal pair sentences allow the linguist to isolate the differences between the two SFPs’ meanings with regard to how each relates the same proposition to the discourse. Examining those contexts within which two related SFPs are not interchangeable reveals critical information about how their meanings differ. The use of minimal pairs was a vital step in my research methodology because the six SFPs reported in Chap. 6 are made up of three pairs of closely related SFPs. Contrasting minimal pairs of semantically related SFPs through manipulation of the discourse context reveals which contexts acceptably allow both SFPs, which allow only one or the other, and which do not allow either one. This helped me to test and refine the NSM explications.

The format used for writing the explications begins with the formula “P + SFP =”, where P refers to the proposition to which the SFP is attached. The speaker-oriented meaning of the SFP is then written below that. In each case, the SFP that is being defined replaces the “SFP” of this formula, for example, “P + me1 =”. The six SFPs reviewed in Chap. 6 divide naturally into the following three pairs based on their functions and meanings: two evidential particles (lo1 and aa1maa3), two question particles (me1 and aa4), and two “only” particles (zaa3 and ze1). These six particles are relatively well documented in the literature, which was very helpful as a starting point for defining them. Regarding frequency of occurrence, each of the six particles lies in the upper half of the list of 35 particles that were found in the corpus. Their frequency ranks and numbers of tokens were as follows: lo1 was the 5th most frequently used, with 1557 occurrences; aa4 was 9th, with 493 occurrences; aa1maa3 was 12th, with 368 occurrences; ze1 was 14th, with 284 occurrences; me1 was 15th, with 281 occurrences; and zaa3 was 16th, with 160 occurrences. These three pairs of SFPs have semantic functions and meanings that have been widely studied in many languages: evidential/epistemic mood markers (lo1 and aa1maa3), question and evaluative mood markers (me1 and aa4), and focus and restriction words and particles that express the meaning “only” (zaa3 and ze1). The core of these SFPs’ meanings is widely assumed to be universally expressed in the world’s languages, though the extent to which their complete meanings are expressed in other languages requires further investigation. The fact that they all translated as discourse tones into English, an unrelated language, indicates a possibility that their meanings are universal.

5.4  Concluding Remarks

The goal of this research was to discover whether any Cantonese SFPs have intonational equivalents in English, and if so, what their forms and meanings are. To test this, five to six audio dialogues were extracted from the Hong Kong Cantonese Corpus for seventeen SFPs. Each dialogue included a token sentence with a target SFP attached. The dialogues were played to ambilingual Cantonese–English participants, who then translated them orally into English, mimicking the tone of voice of the original Cantonese. Through this initial collection of data, it was concluded that five of those SFPs have English intonational equivalents and that two others potentially have intonational equivalents. In a follow-up study, one of those two others was also concluded to have an intonational equivalent. This resulted in six SFPs with intonational equivalents that divide into three semantically related pairs: the evidential particles lo1 and aa1maa3, the question particles me1 and aa4, and the restrictive focus particles zaa3 and ze1. The research and conclusions related to each pair are reported in Sects. 6.1, 6.2, and 6.3, respectively. Explications were developed for these six SFPs using the NSM approach, based on a modified version of Besemeres and Wierzbicka’s (2003: 3) “general model for the investigation of discourse markers.” It is hypothesized that their English equivalents have the same functions and meanings as their SFP counterparts and that these equivalents are tonal morphemes that exist in the mental lexicons of English speakers. The forms of these English tones are described as pitch contours based on native-English intuition and are presented in print as F0 contours. The research results are presented in detail in Chap. 6.

References

Baker, H., & Ho, P. (2006). Teach yourself Cantonese. London: Teach Yourself Books.
Ball, J. D. (1971). Cantonese made easy: A book of simple sentences in the Cantonese dialect, with free and literal translations, and directions for the rendering of English grammatical forms in Chinese (2nd ed.). Taipei: Ch’eng Wen. (Original work published in 1888).
Bartels, C. (1999). The intonation of English statements and questions: A compositional interpretation. New York: Garland.
Bauer, R. S., & Benedict, P. K. (1997). Modern Cantonese phonology. Berlin: Mouton de Gruyter.
Bauer, R. S., Cheung, K., Cheung, P., & Ng, L. (2004). Acoustic correlates of focus-stress in Hong Kong Cantonese. In S. Burusphat (Ed.), Papers from the eleventh annual meeting of the Southeast Asian linguistics society 2001 (pp. 29–49). Tempe, AZ: Arizona State University: Program for Southeast Asian Studies Monograph Series Press.
Bauer, R. S., & Wakefield, J. C. (2019). The Cantonese language. In J. C. Wakefield (Ed.), Cantonese as a second language: Issues, experiences and suggestions for teaching and learning (pp. 8–43). Oxon: Routledge.
Besemeres, M., & Wierzbicka, A. (2003). The meaning of the particle lah in Singapore English. Pragmatics & Cognition, 11(1), 3–38.
Botinis, A., Granström, B., & Möbius, B. (2001). Developments and paradigms in intonation research. Speech Communication, 33(4), 263–296.
Boyle, E. L. (1970a). Cantonese: Basic course volume one. Washington: Foreign Service Institute.
Boyle, E. L. (1970b). Cantonese: Basic course volume two. Washington: Foreign Service Institute.
Brazil, D. (1997). The communicative value of intonation in English. Cambridge: Cambridge University Press.
Butler, Y. G., & Hakuta, K. (2004). Bilingualism and second language acquisition. In T. K. Bhatia & W. C. Ritchie (Eds.), The handbook of bilingualism (pp. 114–144). Malden, MA: Blackwell.
Chan, M. K. M. (2001). Gender-related use of sentence-final particles in Cantonese. In M. Hellinger & H. Bussmann (Eds.), Gender across languages (pp. 57–72). Amsterdam: John Benjamins.
Chao, Y. R. (1932). A preliminary study of English intonation (with American variants) and its Chinese equivalents. Bulletin of the Institute of History and Philology of the Academia Sinica (The Ts’ai Yuan P’ei Anniversary volume: Supplementary volume I), 104–156.
Chappell, H. (1991). Strategies for the assertion of obviousness and disagreement in Mandarin: A semantic study of the modal particle me. Australian Journal of Linguistics, 11(1), 39–65.
Cheung, K.-H. (1986). The phonology of present day Cantonese. Unpublished doctoral dissertation, University College, London.
Chun, D. M. (2002). Discourse intonation in L2: From theory and research to practice. Amsterdam: John Benjamins.
De Looze, C., & Hirst, D. J. (2014). The OMe (Octave-Median) scale: A natural scale for speech melody. Proceedings of the 7th International Congress on Speech Prosody, 910–913. Dublin.
Eberhard, D. M., Simons, G. F., & Fennig, C. D. (Eds.). (2019). Ethnologue: Languages of the world (22nd ed.). Retrieved from http://www.ethnologue.com
Fang, X. (2003). 廣州方言句末語氣助詞 [Sentence-final mood particles of the Cantonese dialect]. Guangzhou: Jinan University Press.
Fox, A. (2000). Prosodic features and prosodic structure: The phonology of suprasegmentals. Oxford: Oxford University Press.
Fox, A., Luke, K.-K., & Nancarrow, O. (2008). Aspects of intonation in Cantonese. Journal of Chinese Linguistics, 36(2), 321–367.
Fung, R. S.-Y. (2000). Final particles in standard Cantonese: Semantic extension and pragmatic inference. Unpublished doctoral dissertation, Ohio State University, Columbus, OH.
Goddard, C. (1994a). Semantic theory and semantic universals. In C. Goddard & A. Wierzbicka (Eds.), Semantic and lexical universals: Theory and empirical findings (pp. 7–29). Amsterdam: John Benjamins.
Goddard, C. (1994b). The meaning of lah: Understanding “emphasis” in Malay (Bahasa Melayu). Oceanic Linguistics, 33(1), 145–165.
Goddard, C. (2004). Semantic primes within and across languages. In D. Willems, B. Defrancq, T. Colleman, & D. Noël (Eds.), Contrastive analysis in language: Identifying linguistic units of comparison (pp. 13–43). Basingstoke: Palgrave Macmillan.
Goddard, C. (2008). Natural semantic metalanguage: The state of the art. In C. Goddard (Ed.), Cross-linguistic semantics (pp. 1–34). Amsterdam: John Benjamins.
Goddard, C. (2011). Semantic analysis: A practical introduction (2nd ed.). Oxford: Oxford University Press.
Goddard, C. (2018). Ten lectures on natural semantic metalanguage: Exploring language, thought and culture using simple, translatable words. Leiden: Brill.
Goddard, C., & Wierzbicka, A. (2014). Words and meanings: Lexical semantics across domains, languages, and cultures. Oxford: Oxford University Press.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.
Guthrie, L. F. (1983). Contrasts in teachers’ language use in a Chinese-English bilingual classroom (pp. 39–52). Washington, DC: National Institution of Education.
Hirst, D. (1977). Intonative features: A syntactic approach to English intonation. The Hague: Mouton.
Hirst, D. (1983). Structures and categories in prosodic representations. In A. Cutler & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 93–156). Berlin: Springer.
Hirst, D., Di Cristo, A., & Espesser, R. (2000). Levels of representation and levels of analysis for the description of intonation systems. In M. Horne (Ed.), Prosody: Theory and experiment: Studies presented to Gösta Bruce (pp. 51–88). Dordrecht: Kluwer Academic Publishers.
Hong Kong Census and Statistics Department. (2017). 2016 Population by-census, main tables. Retrieved from https://www.bycensus2016.gov.hk/en/bc-mt.html
Kwok, H. (1984). Sentence particles in Cantonese. Hong Kong: Centre of Asian Studies, University of Hong Kong.
Ladd, D. R. (1978). The structure of intonational meaning: Evidence from English. Bloomington: Indiana University Press.
Law, A. (2002). Cantonese sentence-final particles and the CP domain. UCL Working Papers in Linguistics, 14, 375–398.
Law, A. (2004). Sentence-final focus particles in Cantonese. Unpublished doctoral dissertation, University College, London.
Lee, T. H., & Law, A. (2000). Evidential final particles in child Cantonese. In E. V. Clark (Ed.), The proceedings of the thirtieth annual child language research forum (pp. 131–138). Stanford: Center for the Study of Language and Information.
Lee, T. H., & Law, A. (2001). Epistemic modality and the acquisition of Cantonese final particles. In M. Nakayama (Ed.), Issues in East Asian language acquisition (pp. 67–128). Tokyo: Kuroshio.
Leung, C. (2005). 當代香港粵語語助詞的研究 [A study of the utterance particles in Cantonese as Spoken in Hong Kong]. Hong Kong: Language Information Sciences Research Centre, City University of Hong Kong. (Original work published in 1992).
Leung, H. H. L. (2016). The semantics of utterance particles in informal Hong Kong Cantonese (Natural semantic metalanguage approach). Doctoral thesis, Griffith University, Australia.
Li, B. (2006). Chinese final particles and the syntax of the periphery. Unpublished doctoral dissertation, Leiden University, Leiden.
Luke, K. K. (1990). Utterance particles in Cantonese conversation. Amsterdam: John Benjamins.
Luke, K. K., & Nancarrow, O. T. (1997). Sentence particles in Cantonese: A corpus-based study. Presented at The Yuen Ren society meeting, University of Washington.
Ma, J. K.-Y., Ciocca, V., & Whitehill, T. L. (2006). Effect of intonation on Cantonese lexical tones. Journal of the Acoustical Society of America, 120(6), 3978–3987.
Matthews, S., & Yip, V. (2011). Cantonese: A comprehensive grammar (2nd ed.). London: Routledge.
Pennington, M. C., & Ellis, N. C. (2000). Cantonese speakers’ memory for English sentences with prosodic cues. The Modern Language Journal, 84(3), 372–389.
Pierrehumbert, J. (1980). The phonetics and phonology of English intonation. Unpublished doctoral dissertation, MIT.
Pierrehumbert, J., & Hirschberg, J. (1990). The meaning of intonational contours in the interpretation of discourse. In P. R. Cohen, J. Morgan, & M. E. Pollack (Eds.), Intentions in communication (pp. 271–311). Cambridge, MA: The MIT Press.
Pierrehumbert, J., & Steele, S. A. (1989). Categories of tonal alignment in English. Phonetica, 46(4), 181–196.
Pike, K. L. (1945). The intonation of American English. Ann Arbor: University of Michigan Press.
Prieto, P., & Roseano, P. (2016). The encoding of epistemic operations in two Romance languages: Intonation and pragmatic markers. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016. Boston: Boston University.
Roach, P. (2009, July). Advantages and disadvantages of the ToBI system: A lecture by Peter Roach. Retrieved from http://www.youtube.com/watch?v=AL-uMriM4ns
Schubiger, M. (1965). English intonation and German modal particles: A comparative study. Phonetica, 12(2), 65–84.
’t Hart, J., Collier, R., & Cohen, A. (1990). A perceptual study of intonation: An experimental-phonetic approach to speech melody. Cambridge: Cambridge University Press.
Valdés, G., & Figueroa, R. A. (1994). Bilingualism and testing: A special case of bias. Norwood, NJ: Ablex.
Wakefield, J. C., & Lee, H. Y. (2018). The grammaticalization of indirect reports: The Cantonese discourse particle wo5. In A. Capone, M. García-Carpintero, & A. Falzone (Eds.), Indirect reports and pragmatics in the world languages (pp. 333–344). Cham: Springer.
Wierzbicka, A. (1972). Semantic primitives (A. Wierzbicka & J. Besemeres, Trans.). Frankfurt am Main: Athenäum-Verl.
Wierzbicka, A. (1996). Semantics: Primes and universals. Oxford: Oxford University Press.
Wong, J. O. (1994). A Wierzbickan approach to Singlish particles. Unpublished Master of Arts thesis, National University of Singapore.
Wong, J. O. (2004). The particles of Singapore English: A semantic and cultural interpretation. Journal of Pragmatics, 36(4), 739–793.
Wu, K. (1989). A linguistic study of interrogation in Cantonese: Comparisions [sic. Comparisons] with English. Unpublished master of arts thesis, University of Hong Kong.
Yau, S. (1980). Sentential connotations in Cantonese. Fangyan, 1, 35–52.
Yip, V., & Matthews, S. (2001). Intermediate Cantonese: A grammar and workbook. London: Routledge.
Zhang, L. (2014). Segmentless sentence-final particles in Cantonese: An experimental study. Studies in Chinese Linguistics, 35(2), 47–60.

Chapter 6

The Results of the Research

Chapters 2 through 4 presented the underlying theories and assumptions that this research is based on, and Chap. 5 described the methodology used. This chapter presents the results relating to six sentence-final particles (SFPs) that were found to have English intonational equivalents. These six SFPs divide into three pairs of related particles: the evidential particles lo1 and aa1maa3; the question particles me1 and aa4; and the “only” particles zaa3 and ze1. The chapter is divided into three main sections, each of which is devoted to one pair of SFPs.1

Each of the chapter’s three sections is organized as follows. An SFP’s meaning is described, and its Natural Semantic Metalanguage (NSM) explication is presented before showing and discussing the data related to its English equivalent. The data comprise the Cantonese-to-English oral translations and their accompanying F0 contours. The target utterances that were translated by the participants are shown in bold within each dialogue. The English translations shown for the target utterances represent the most commonly used wording from the oral translations provided by the ambilingual participants; all other English translations within the dialogues are mine. Because each SFP was translated into English as the same form of intonation by multiple ambilingual translators in multiple contexts, it is assumed that the explication given to each SFP also applies to its English intonational equivalent. To test this assumption, the explication that defines an SFP and its English equivalent is tested against the linguistic examples of the SFP that were presented from the literature and from the data. Each of the three sections then ends with a summary and analysis that contrasts that section’s two related SFPs with their English equivalents.

1  Versions of the study on lo1 and aa1maa3 discussed in Sect. 6.1 are also published in Wakefield (2010, 2012a); versions of the study on me1 and aa4 discussed in Sect. 6.2 are also published in Wakefield (2010, 2014); and a version of the study on ze1 and zaa3 discussed in Sect. 6.3 will be published in Wakefield (in press).

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_6


6.1  Two Evidential Particles: lo1 and aa1maa3

The SFPs lo1 and aa1maa3 both express epistemic and/or evidential modality, depending on how these notions are defined. Many linguists had written about these two particles prior to my research, but neither lo1 nor aa1maa3 had been given a definition that was able to account for all and only the contexts within which it can acceptably occur. Furthermore, the distinction between these two SFPs’ meanings had never been captured and described in a way that explains why in some contexts both are acceptable to use, but in other cases, only one or the other is acceptable. The SFP lo1 has been said to mark a sentence as a reason (Kwok 1984; Deng 1991), being obvious (Kwok 1984; Lee and Law 2001; Yip and Matthews 2001; Yiu 2001), having epistemic modality (Luke 1990; Lee and Law 2001), having a backward-looking (or discourse-linking) feature (Luke 1990; Fung 2000), and more. The SFP aa1maa3 has been said to mark a sentence as a(n) (obvious) reason/excuse (Boyle 1970b; Kwok 1984; Leung 1992/2005; Lee and Law 2000; Matthews and Yip 2011), a reminder (Kwok 1984; A. Law 2002), an elaboration of information (Lee and Law 2001), and more. There is obvious overlap in authors’ descriptions of lo1 and aa1maa3, but the precise nature of the two particles’ relationship to each other had never been made clear. Like all discourse particles, they appear to have multiple functions, which I argue comes from including portions of the sentence or context in their definitions. Luke (1990: 191), for example, concluded that “it would be a futile exercise to try and define an intrinsic or original meaning of [lo1]” and that it is only meaningful in relation to the context, which is a familiar and common argument made about forms of intonation (see Sect. 3.1.1).

Ladd (1978: 142–3) said that “the idea of context-free intonational meanings is hard for many linguists to accept,” but that this “unfavorable reaction … must be seen as part of the larger debate over how to account for context-dependence in general [and] that the problems of accounting for intonational meaning are in this respect no different from the large number of other problems of ‘pragmatics’.” I propose that the method used here successfully overcomes the problem of context-dependence that Luke (1990) observed, by treating SFPs as bound morphemes with definitions that include deictic elements that represent the proposition (or a portion thereof) plus one or more elements from the context (see Sect. 5.3.2).

6.1.1  The Particle lo1

The NSM explication of lo1  Leung (1992/2005: 75) said that lo1 appears in sentences in which the speaker considers the matter (or stated truth) to be obvious and indisputable. Fung (2000) argued that lo1 is seen as expressing obviousness because it encodes the assumption that the listener has a high level of knowledge about the proposition. She said,

In general, conclusions deduced by logical reasoning are regarded as highly objective, and knowledge derived in this way should be readily shared by everyone in the community since the expectation is that any rational human being should be able to derive valid conclusions from the premises given. (Fung 2000: 116)

Based on this, it can be said that lo1 expresses evidential modality. Yip and Matthews (2001: 157) pointed out that lo1 “is often used together with mai6 ‘then’ which suggests that what follows [i.e., what comes between mai6 and lo1] is an obvious conclusion.” They provided this example:

(1)  Lei5 zou6 dak1 m4-hoi1sam1 mai6 wan2 dai6ji6 fan6 gung1 lo1.
     2s do Adv-M NEG-happy then find second CL job LO
     “If you’re not happy in your work, then find another job.”

The Cantonese adverb mai6 (“then/so”) is closely related to lo1 semantically (Lee and Man 1997). It has even been argued that mai6 and lo1 are related syntactically, forming a discontinuous construction of the form mai6…lo1 (Tang 2008).2 The SFP lo1 does not require the presence of mai6, but mai6 cannot be used without lo1. None of the dialogues used for collecting translations of lo1 included the adverb mai6, but some of the examples from the literature do. It is tentatively assumed that mai6 does not change the meaning of lo1 beyond perhaps adding emphasis and that the English equivalent of lo1 is the same for sentences with or without mai6. Something else noteworthy about mai6 is that it translates best as “then” when used in a conditional sentence like (1) but translates better as “so” (i.e., “therefore”) when used in a sentence like (6) below. This is important because it shows that lo1 does not function to mark a sentence as a conditional, as some authors have implied. Deng (1991), for example, showed three conditional sentences with lo1 attached and claimed that lo1 “expresses the idea that, under certain conditions, the outcome will be different” (p. 127, translation that of the author). The problem with analyzing lo1 as a conditional marker based on such examples is that those examples remain conditionals even if lo1 is removed. Deng’s interpretation of what lo1 means was apparently influenced by the fact that its meaning is very compatible with conditional sentences. The reason for this compatibility will be made clear below. Consider Kwok’s (1984: 58–59) illustration of the use of lo1 in the following question–answer dialogue:

(2)  A: Gei2si4 hoi1 coeng4 aa3?
        When open CL(film) SFP
        “When does the film start?”
     B: Loeng5 dim2 bun3 lo1.
        two CL(time) half LO
        “Two thirty, of course. Don’t you know?”

2  A predicate appears between mai6 and lo1, which in (1) is wan2 dai6ji6 fan6 gung1 (“find another job”).


If lo1 were not attached to (2B), then the English translation would be “two thirty,” nothing more. This means that the prepositional phrase adjunct “of course” and the additional question “Don’t you know?” combine to form Kwok’s translation of lo1. This translation seems to capture the general meaning of lo1 but does not capture it precisely and is not its English equivalent, which, according to the results of this study, is a form of English intonation. The translation of (2B) is a typical example of what the English-medium literature on SFPs does throughout, that is, it paraphrases an SFP’s meaning using English lexical words. This is akin to paraphrasing the meaning of English intonation. Saying “Two thirty” with different forms of intonation results in different connotative meanings, and each of these meanings could be paraphrased. SFPs could similarly be removed from a Cantonese sentence and be replaced by a Cantonese paraphrase. In another example, Kwok (1984: 59) translated lo1 into English as “that’s why.” It was a father’s reply to his son, who wondered why he had done poorly on a history test:

(3)  Duk6 dak2 m4-gau3 lo1.
     study ADV-M NEG-enough LO
     “You haven’t studied enough, that’s why.”

Kwok said that lo1 “seems to give the reason for something, or to point out what is obvious” (p.  58). Example (3) was given as an illustration of the former, and example (2) as an illustration of the latter. I agree that lo1 does “point out what is obvious” but argue that it never “give[s] the reason for something” and that the meaning of (3) could be more closely represented by replacing “that’s why” with “of course.” Similar to Kwok’s claim, Deng (1991: 127, translation that of the author) said that lo1 “expresses an explanatory mood” when it is attached to an answer to a question. In a context where an answer is expected, the propositional content of the answer itself is enough to indicate that the sentence is an answer or explanation. The addition of the SFP lo1 only functions to present this explanation as something that is “obvious.” The idea that lo1 expresses obviousness has been stated repeatedly throughout the literature, and it is the core of the NSM explication that I propose for lo1 at the end of this section. Luke (1990) said more about lo1 than any other author, devoting 79 pages to it. Within the framework of conversation analysis, he discussed a variety of examples taken from an audio corpus of casual conversations, interviews, and radio programs. Luke said that lo1 “makes available to conversational participants a means with which they can indicate to each other that the full sense and interactional import of what is being said is to be determined by reading the current utterance in such a way as to link it up with something else” (p. 191). This quote from Luke can be paraphrased as follows: the meaning of “P lo1” (P being a proposition) is determined by linking up P with some element D from the discourse context. Luke explains later that this “something else” (i.e., D) is either shared knowledge or something in the prior discourse, either linguistic or pragmatic.


Luke did not consider his description to be a semantic definition of lo1; rather, he concluded that lo1 “provides nothing more than a loose index, pointing to ways of reading and interpreting,” with no “intrinsic or original meaning” (p. 191). He discussed a wide variety of what he considered to be context-dependent functions, uses, and properties of lo1. I will discuss the functions he listed that I think are the most helpful for understanding the meaning of lo1:

• It contains an epistemology feature.
• It contains a backward-looking feature.
• It confirms an expectation.
• It reports events that follow naturally under given circumstances.
• It formulates suggestions and advice.

Referring to the epistemology feature, Luke (1990: 123) said that “states-of-affairs are presented as simply and unproblematically known, i.e., having a good sound common sense epistemological basis.” Other linguists have also noted this principal feature of lo1. Lee and Law (2001) said that lo1 was one of the SFPs that expresses epistemic modality. Fung (2000) said that it functions to mark the realization of an epistemic state, and Li (2006) said it is both epistemic and discourse related. Closely related to this epistemology feature is the next feature that Luke (1990) mentioned, which is related to Fung’s (2000) observation that lo1 ties the proposition to the discourse. Luke called this the backward-looking feature, and I think it could also be termed an evidential feature. He gave (4) as an example, which was taken from a radio call-in program in which a boy and a girl called in and were both put on the air at the same time. The DJ talked to the girl first and then addressed the boy. When the boy, Kei, heard the DJ say, “And the boy?”, Kei took it as a cue to say his name. The DJ did not hear Kei say his name, however, because it was drowned out by laughter, so he asked Kei for his name again after Kei had already said it:

(4)  DJ:  Laam4zai2 le1?
          boy SFP
          “And the boy?”
     Kei: Aa3-kei4.
          PRT-kei
          “Kei.”
     DJ:  Hai6. Lei4 giu3 mat1 meng2 aa3?
          be you call what name SFP
          “Yes. What’s your name?”
     Kei: Aa3-kei4 lo1!
          PRT-kei LO
          “Kei!”
     DJ:  Aa3-kei4 lo1!::: Ngo5 dou1 mei6 zi1dou3.
          PRT-kei LO::: 1s also not-yet know
          “Kei! I didn’t know (your name) yet.”
     (Luke 1990: 128)


Kei apparently assumed that the DJ should have heard him say his name the first time, not realizing that it was drowned out by the laughter. As a result, when the DJ asked for his name, Kei repeated it with lo1 attached. Luke said this shows that

the particle invites the recipient to look backward in the discourse for some feature in the context in order to establish a link between the present utterance and something that has been said before (in this case the giving of his name the first time round). (Luke 1990: 129)

Luke also pointed out that the DJ expressed an understanding of this meaning of lo1 by repeating Kei’s reply (Aa3-kei4 lo1:::) and by stressing lo1 by lengthening its rime considerably (represented as :::). The DJ then said, “I didn’t know (your name) yet,” clearly showing the DJ’s understanding that Kei expected him to know his name. Luke said this backward-looking feature not only points back to prior utterances for the exact information that the listener seeks but may also point to prior information in the discourse or to “a known state of affairs,” through which the listener can get the answer “by means of an inference” (p. 131). This backward-looking feature could be considered a form of evidential modality, which can be seen as the source of the “common sense” epistemic knowledge that Luke referred to. Aikhenvald (2004: 186) said, “evidentials are part of the encoding of epistemology in the sense of how one knows what one knows.” Looked at in this way, the backward-looking feature is not a separate feature from the epistemology feature, but rather is subsumed within it. Another property of lo1 that Luke mentioned is to confirm an expectation, offering the following exchange as an example. Speaker A believes that speaker B’s girlfriend left him (i.e., left speaker B) for a reason other than simply knowing that he had been a cook before.

(5)  A: Gam2, lei5dei6 dong1si4 fan1sau2 hai6 wai4 me1 jyun4jan1?
        thus 2s-pl that-time separate be for what reason
        “Well, what was the reason you separated at the time?”
     B: Keoi5…
        3s
        “She…”
     A: Ze1, wai4 zing6hai6 zi1dou3 lei5 zou6-gwo3 ceoi4si1…
        PRT for only know 2s do-EXP cook…
        gam2joeng2 aa3?
        thusly SFP
        “I mean, was it only because she knew you had been a cook?”
     B: Hai6 aa3.
        be SFP
        “Yes.”
     A: Soeng1seon3 m4-zing6hai6 gam2 ge3.
        believe NEG-only thus SFP
        “I believe that’s not all it was.”
     B: Ze1, keoi5 waa6 ngo5 hou2 m4-sai3sam1 laa1.
        PRT 3s say 1s very NEG-attentive SFP
        “I mean, she said I wasn’t caring.”
     A: Hai6 lo1!
        be LO
        “Yeah!”
     (Luke 1990: 131–2)


When speaker A says Hai6 lo1! “Yeah!,” he or she conveys the idea that speaker B’s immediately preceding utterance confirms an expectation held by speaker A, which is that there was an additional reason for B’s separation from his girlfriend. This property of lo1 that “confirms an expectation,” as Luke (1990: 131) put it, entails both obviousness and the backward-looking feature. This is because expectations are obvious in the minds of those who hold them, and they stem from prior knowledge. This feature is therefore closely related to the two preceding features. The next use of lo1 that Luke cites is that of “reporting.” He said lo1 is attached to reports of events that can be expected to happen naturally as a result of certain events, or under certain circumstances. This too entails the backward-looking feature since the report P in the sentence “P lo1” follows naturally from some event or circumstance D that exists in the prior discourse (or perhaps pragmatic information). Luke gave the following example of a report:

(6)  Aam1aam1 jau5 gaa3 dik1si2, mai6 zit3-zo2 dik1si2 ceot1lei4 lo1.
     just have CL taxi so catch-PERF taxi out-come LO
     “There just happened to be a taxi, so we caught a taxi.”

In (6), the report “we came by taxi” follows naturally from the circumstance “there just happened to be a taxi.” This is obviously backward looking because the report gets its interpretation of being natural and obvious by looking back in the discourse at the circumstance within which it occurs, namely, there just happening to be a taxi. The final property I will mention of those that Luke listed for lo1 is that of giving suggestions or advice (Luke 1990: 155–62). For an example of this, we can refer back to (1), cited above from Yip and Matthews (2001). Luke said that

lo1-suffixing is a regular feature in advice-givings for it provides a means of establishing a link between a problem or a set of circumstances on the one hand, and a recommended solution on the other. (Luke 1990: 162)

Taking what has been said about lo1 thus far, it makes sense that it would be used for suggestions and advice giving. It presents the suggestion or advice as something that is obvious and directs the listener to look back at something in the prior discourse for the evidence which shows it to be so. Based on the explication I give of lo1 below, this particle could be used in a large number of ways, which is why Luke was able to come up with a fairly long list of functions and uses, almost all of which seem to be correct.3 As a result, he appears to have concluded that it would be impossible to come up with a definition for lo1 that would explain and unify its various functions and uses. Nevertheless, right after saying this he provided a useful and accurate description of its meaning, saying that lo1

invite[s] co-participants to assign a dependency reading to the utterance. In addition, it displays the speaker’s assumption that the co-participant can be relied on to assign those links and connections that are needed for the utterance’s interpretation. (Luke 1990: 192)

3  All except for the function referred to by Luke (1990: 195) as the “completion proposal.” See Wakefield (2010) for details of that function and why I concluded that it was not a function of lo1.

In other words, when a speaker says “P lo1,” he or she assumes that the listener can link P to some discourse element D and thereby interpret P. This is very similar to what Luke was quoted as saying about lo1 above, but it additionally states that the speaker assumes the listener to possess the necessary knowledge required to link P to D. Li (2006) gave an example of lo1, taken from Fung (2000), and she consulted native Cantonese speakers who told her that “[w]ith lo1 the speaker seems surprised by the questioner’s ignorance of the reason, i.e., the speaker thinks that the questioner should have known the answer” (Li 2006: 90). In other words, when a speaker answers a question with “P lo1,” he or she is surprised that the listener is ignorant of P. I argue that this is because the speaker believes it follows naturally, logically, and obviously from knowing the discourse element D: the proposition P can be known from knowing D, and the speaker assumes that the listener knows D, as implied by Luke (1990). Along these lines, Fung (2000: 112) said that “lo1 assumes the listener should have a high level of knowledge of the proposition.” This assumed “high level of knowledge” about P stems from an assumed full knowledge of D, which provides, or can lead to, the full knowledge of P. This description of what seems to be going on in the mind of the speaker is in line with the majority of what the literature says, and this type of description, which is based on speaker-oriented thoughts, forms the basis of my explication of lo1. Based on what the literature says about lo1, on consultations with native speakers, and also drawing on my native-English intuition regarding its English equivalent (see next section), I propose the following speaker-oriented explication:

(7)  P + lo1 =
     a. lei5 ho2ji5 zi1dou3 lei1 joeng6 je5 (P)
        2s can know this CL thing
        “you can know this (P)”
     b. jan1wai6 lei5 zi1dou3 ling6ngoi6 jat1 joeng6 je5 (D)
        because 2s know another one CL thing
        “because you know something else (D)”

P is the proposition that lo1 attaches to, and it is the antecedent of the deictic element lei1 joeng6 je5 (“this thing”) in (7a). D is the element in the discourse that is related to P, and D is the antecedent of the deictic element ling6ngoi6 jat1 joeng6 je5 (“something else”) in (7b). Based on the data presented below, lo1 is proposed to have an English equivalent whose explication is equal to the English translation of (7):

(8)  P + lo1-equivalent intonation =
     a. you can know this (P)
     b. because you know something else (D)
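The way the two deictic slots of (8) are filled can be made concrete with a small sketch. The Python below is only an illustration, not part of the study's method: the class name `Explication` and its `render` method are hypothetical, and the values chosen for P and D are taken from the discussion of example (6) above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Explication:
    """Speaker-oriented explication with two deictic slots, as in (8).

    Hypothetical helper for illustration only.
    """
    p: str  # proposition that the particle (or tone) attaches to
    d: str  # discourse element the speaker assumes the listener knows

    def render(self) -> str:
        # Components (a) and (b) of (8), with the antecedents spelled out.
        return (
            f"a. you can know this ({self.p})\n"
            f"b. because you know something else ({self.d})"
        )


# Example (6): the report (P) and the circumstance it follows from (D).
taxi = Explication(
    p="we caught a taxi",
    d="there just happened to be a taxi",
)
print(taxi.render())
```

Keeping P and D as separate fields mirrors the claim that the explication itself is constant while its antecedents are supplied by the sentence and the discourse.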

The explication of lo1 and of its English equivalent captures everything said about lo1 in the literature except for the statements that I argue are incorrect. For example, it captures Kwok’s (1984) claim that lo1 points out what is obvious, but not her claim that it gives the reason for something, because that comes from the context rather than from the meaning of lo1. Fung (2000) said that a statement “P lo1” cannot begin a conversation but must be preceded by something linguistic or nonlinguistic. My explication accounts for this because lo1 is uninterpretable unless there is some prior discourse element D that functions as the antecedent of “something else.” Fung (2000: 113) said that “[t]he exact logical relationship [between D and P] is not easily captured and needs to be resolved through context.” This is basically true, but it is important to note that there may not be any real-world logic involved because the meaning of lo1 is speaker oriented; its connection to the real world is filtered through the mind of the speaker, which may or may not behave in a logical manner. John may very well expect Mary to know something she does not actually know. For example, John may expect Mary to already know the propositional content (P) of the sentence based on the idea that she knows something else (D), which he said in the prior discourse, but perhaps Mary did not hear him say D, or did not understand D. John’s use of lo1 in such a context is warranted in his mind, even though in Mary’s mind there is no logical relationship between D and P. There is always a logical connection between D and P, but the logic is in the mind of the speaker. Consider a delusional man who thinks everyone can read his mind. When someone asks him a question, he thinks the answer without uttering it aloud. Then when this someone asks him the same question again, he could say it aloud using lo1 (or its English equivalent) because he would assume that the listener had already heard what he said before he said it aloud. This scenario is comparable with example (4) in which the boy, Kei, assumed that the DJ should have heard his name once already. The DJ is similar to a person listening to the delusional man.
He does not understand why the speaker would attach lo1 to his response, but from the speaker’s perspective, it is perfectly logical. In my made-up scenario of the delusional man, we know what the D of the explication is and how it leads to knowing P, because I created the mind of the speaker. In any given case where lo1 is used, the P will normally be given because the proposition that lo1 attaches to is normally uttered aloud. The D, however, must be determined pragmatically. Therefore, in order to test the accuracy of the definition in (7), we must look at a number of occurrences of lo1 and analyze what the D is for each of them. If the explication appears to be an accurate paraphrase of lo1’s meaning in each case, then this can be seen as evidence that the explication is accurate. To this end, after presenting and discussing the English equivalent of lo1 based on the data, the explication of (7) and (8) will be applied to the examples of lo1 cited from the literature above and to the lo1-suffixed sentences that were extracted from the audio corpus for the purpose of collecting data. Unlike the fictitious delusional man, we cannot see into the minds of real-life speakers. Nevertheless, based on the contexts of each dialogue, it is typically quite easy to determine what is likely to be the discourse element D that the speaker believes leads to the knowledge of P. The explications in (7) and (8) can be considered correct to the extent that native speakers of English or Cantonese intuitively sense them to accurately describe the use of lo1 in the data or the use of its English equivalent in the translations, as well as readers’ own uses of lo1 or its English equivalent in their daily lives.


The English equivalent of lo1 based on the data  This subsection discusses the form of the English pitch contour that is proposed to be equivalent in function and meaning to lo1. The lo1-suffixed sentences that were targeted for translation are shown in bold in each dialogue. In the first dialogue shown in (9), the only portion that was translated was speaker A’s second utterance, which was translated as “yeah” with a high-falling tone, represented by the curved line that immediately follows it:

(9)  A: Maths haau2 A zi3 siu2 jiu3 gau2sap6 fan1 ji5soeng6 wo3.
        test most few need ninety point above SFP
        “To get an A in Maths, you need at least ninety-five points.”
     B: Lei5 ding2lung2 co3 loeng5 tai4 zaa3, zi2 ho2ji5.
        2s most wrong two CL SFP only can
        “At the most, all you can get wrong is two questions.”
     A: Hai6 lo1.
        be LO
        “Yeah [high-falling contour].”

Figs. 6.1 through 6.4 show the F0s for the four participants’ mimic translations.4 To my native-speaker ear, as well as the ears of other native English speakers consulted, there is a high-falling pitch that sounds meaningful, and which, more importantly, is recognized as being the same for all four participants’ translations.5 The F0 contours in Figs. 6.2 and 6.3 show a prominent rise before the fall, while it is not clear as to whether or not Figs. 6.1 and 6.4 include a rise. Only the falling portion of each F0 contour sounds prominent, so it is concluded that the rise is not a meaningful part of this tone’s form; it merely shows that speakers may raise their pitch in order to produce the fall.

Fig. 6.1  Female-a: “Yeah” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 122, 173, and 245, against time from 0 to 0.5 s)

Fig. 6.2  Male-a: “Yeah” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 76, 108, and 153, against time from 0 to 0.4 s)

Fig. 6.3  Female-b: “Yeah” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 168, 237, and 335, against time from 0 to 0.5 s)

4  Section 5.2.3 explains why the translations are referred to as mimic translations.
5  The native English speakers included 5 speakers of Standard American English, 3 speakers of Standard Australian English, and 2 native English speakers who were born and raised in Hong Kong.

It is interesting to note that it is unlikely these four F0 curves would be considered as four occurrences of the same tone based on visual evidence alone. The prominence of the rise seen in the translation from male-a (Fig. 6.2), and the extended length of the fall in the translation from male-b (Fig. 6.4) represent individual differences in how this same tone can be uttered. They also vary in the degree to which there is a rise before the fall. These differences do not sound linguistically meaningful. By this I mean that, at the very least, these differences did not prevent me or the other native English speakers consulted from judging them all to contain some element of meaningful intonation that is the same.
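The point that contours can differ in their rise while sharing the same fall can be illustrated numerically. The sketch below is not the analysis used in this study, which relied on listeners' judgments; it simply measures the rise and the fall around an F0 peak in semitones. The function names and the F0 samples are invented for illustration and are not measurements from the participants.

```python
import math


def semitones(f_from: float, f_to: float) -> float:
    """Interval from f_from to f_to in semitones (negative means a fall)."""
    return 12.0 * math.log2(f_to / f_from)


def rise_and_fall(f0_hz: list[float]) -> tuple[float, float]:
    """Return (rise, fall) in semitones around the F0 peak of a voiced stretch."""
    peak = max(f0_hz)
    rise = semitones(f0_hz[0], peak)   # from the onset up to the peak
    fall = semitones(peak, f0_hz[-1])  # from the peak down to the offset
    return rise, fall


# Made-up F0 samples in Hz, NOT data from the study.
with_rise = [180.0, 240.0, 120.0]     # clear rise before the fall
without_rise = [240.0, 200.0, 120.0]  # starts at the peak, then falls

print(rise_and_fall(with_rise))
print(rise_and_fall(without_rise))
```

In this toy case both contours fall by exactly one octave (12 semitones) although only the first has a rise, which is the kind of shared property that a purely visual comparison of the curves can obscure.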

Fig. 6.4  Male-b: “Yeah” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 67, 95, and 134, against time from 0 to 0.5 s)
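The axis values printed with Figs. 6.1 through 6.4 appear consistent with a simple layout: a middle line at a speaker-specific reference pitch, with the outer lines half an octave below and above it, so that the three lines together span one octave. The following sketch reproduces the printed values under that assumption. The half-octave spacing is inferred from the numbers, not stated by the source, and the function name `reference_lines` is hypothetical. The optional fourth line corresponds to the extra line one octave above the middle value that appears in Fig. 6.5.

```python
import math


def reference_lines(middle_hz: float, extra_octave: bool = False) -> list[int]:
    """Reference lines (Hz, rounded) for a log-scaled pitch axis.

    Assumption: the outer lines lie half an octave below and above the
    middle line, so the three lines span exactly one octave.
    """
    half_octave = math.sqrt(2.0)  # frequency ratio of half an octave
    lines = [middle_hz / half_octave, middle_hz, middle_hz * half_octave]
    if extra_octave:
        lines.append(middle_hz * 2.0)  # one octave above the middle line
    return [round(value) for value in lines]


print(reference_lines(173))                     # [122, 173, 245]  (Fig. 6.1)
print(reference_lines(108))                     # [76, 108, 153]   (Fig. 6.2)
print(reference_lines(173, extra_octave=True))  # [122, 173, 245, 346]  (Fig. 6.5)
```

On a logarithmic axis these lines are equally spaced, which is why the same visual distance represents the same musical interval for speakers with very different pitch ranges.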

The translation from male-b was noticeably softer spoken than the translations from the other participants, which is perhaps why it does not show the high-falling contour as prominently as the other translations do. Male-b’s translations were all said with less conviction, which caused his F0 contours to be less pronounced, but his utterances sounded like they carried the same connotative meanings, expressed in the same (though muted) forms as the other participants’ translations. His translations were therefore counted as instances of the same pitch contour. This does not seem to be a dialectal difference, but rather a difference in personal speaking style related to personality, which is a complicating factor that I had not considered before analyzing the data.

The next dialogue shown in (10) involves two people talking about some piece of equipment. Speaker B said that if everything was ready, then she was going to “pull it.” Speaker A did not know what it was that speaker B wanted to pull, so he asked,

(10) A: Lei5 laai1 mat1je5 aa3?
        2s pull what SFP
        “What are you going to pull?”
     B: Laai1 li1 lap1 je5 lo1.
        pull this CL thing LO
        “(Pull) this thing.” (Female-b: “Pull this thing [high-falling contour].”)

The target sentence in (10B) shows the participants’ English translations: “(Pull) this thing.” The F0 contours of these translations are shown in Figs. 6.5 through 6.8. The intonational form used by all four participants was again a high-falling tone. Three of the participants (Figs. 6.5, 6.6, and 6.8) placed this tone on the syllable “this,” but Female-b, in contrast, placed it on the syllable “thing” (Fig. 6.7). This positioning of the pitch contour sounds unnatural to me and to the other native English speakers consulted. For us, it sounds more natural when the lo1-equivalent intonation is placed on “this.” This is where the accented syllable of the intonational phrase would lie if it were neutrally intoned. It is hard to determine why this position seems more natural to us, because there is no contrastive meaning here, so technically this tone does not need to be positioned on the word “this,” but placing it later in the utterance sounded unnatural to me and the other native English speakers. More will be said about female-b’s unique positioning of this tone below.

Fig. 6.5  Female-a: “Pull this thing” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 122, 173, 245, and 346, against time from 0 to 0.7 s)

Fig. 6.6  Male-a: “This thing” (F0 contour; pitch in Hz on a logarithmic scale, with axis values 76, 108, and 153, against time from 0 to 0.6 s)

Hirst (1983b: 96) explained the well-known problem with using an acoustic representation of pitch, which is that F0 “is not a continuously observable parameter of the sound wave, its presence or absence being dependent on the segmental feature of voicing.” Unlike the translations of (9), the F0 contour for translations of (10) breaks apart because the utterance included unvoiced consonants. In addition, the falling portion of the high-falling contour in Fig. 6.5 did not record properly in Praat and therefore does not appear on paper, but it is clearly heard in the recording.
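The way an F0 contour “breaks apart” at unvoiced consonants can be sketched as follows. This is a minimal illustration, assuming a frame-by-frame F0 track in which unvoiced frames are marked as NaN; that representation, the function name `voiced_stretches`, and the sample values are all assumptions made for the example rather than details of the recordings analyzed here.

```python
import math


def voiced_stretches(f0_hz: list[float]) -> list[list[float]]:
    """Split a frame-by-frame F0 track into runs of voiced frames.

    Assumption: unvoiced frames (no measurable F0) are marked with NaN.
    """
    stretches: list[list[float]] = []
    current: list[float] = []
    for value in f0_hz:
        if math.isnan(value):
            if current:  # an unvoiced frame closes the current voiced run
                stretches.append(current)
                current = []
        else:
            current.append(value)
    if current:  # keep a voiced run that reaches the end of the track
        stretches.append(current)
    return stretches


nan = float("nan")
# Made-up track: voiced frames, an unvoiced gap, then voiced frames again.
track = [210.0, 230.0, nan, nan, 190.0, 150.0, nan]
print(voiced_stretches(track))  # [[210.0, 230.0], [190.0, 150.0]]
```

Each inner list corresponds to one visible piece of the contour; the perceived tone, by contrast, is heard as continuous across the gaps.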

Fig. 6.7  Female-b: “Pull this thing” [F0 contour; pitch axis in Hz (logarithmic), marked at 168, 237, and 335; time axis in s]

Fig. 6.8  Male-b: “I’ll pull this thing” [F0 contour; pitch axis in Hz (logarithmic), marked at 67, 95, and 134; time axis in s]
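The pitch axes of the figures in this section are marked at a few reference values per speaker, for example 122, 173, 245, and 346 Hz for Female-a. A short sketch makes the spacing of those values explicit. It assumes, as an interpretation that the text does not state outright, that the middle value is the speaker's median pitch, that the two neighboring values lie half an octave below and above it (together spanning one octave), and that an optional extra line sits a full octave above the median. The function name is hypothetical.

```python
import math


def octave_reference_lines(median_hz, extra_octave_above=False):
    """Return reference pitch values (Hz, rounded) around a median pitch.

    Assumption of this sketch: the lower and upper limits lie half an octave
    (a factor of sqrt(2)) below and above the median, so the range between
    them is exactly one octave on a logarithmic axis.
    """
    half_octave = math.sqrt(2)
    lines = [median_hz / half_octave, median_hz, median_hz * half_octave]
    if extra_octave_above:
        # Extra line for utterances that exceed the one-octave range.
        lines.append(median_hz * 2)
    return [round(value) for value in lines]


print(octave_reference_lines(173, extra_octave_above=True))  # Female-a
print(octave_reference_lines(95))                            # Male-b
```

Under these assumptions the computed values agree with the axis values of the figures for all four speakers, which suggests, but does not prove, that the axes were constructed this way.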

Another thing to note about Fig. 6.5 is that it shows one of the utterances where the pitch went above the upper limit of the octave pitch range, so a third line that is one octave above the median pitch was added to the figure (cf. Fig. 6.1 from the same speaker). Readers should be aware that several figures below also include an extra line to accommodate higher pitch, and in most cases this will not be noted.

In the next dialogue, speaker A described eating “hot pot” in a way that speaker B did not understand. This caused speaker B to ask for clarification about what speaker A was going to eat.

(11)
A: Gam1maan5 ngo5dei6 heoi3 jat1 jan4 jat1 wo1 wo3, … zeng3-m4-zeng3.
   tonight 1s-pl go one person one pot SFP … correct-NEG-correct
   “Tonight we’re going to have a pot a person, nice huh?”
B: Jat1 jan4 jat1 wo1? Sik6 me1 aa3?
   one person one pot eat what SFP
   “A pot a person? What are you eating?”
A: Daa2 bin1lou4 lo1.
   hit side-stove LO
   “Hot pot.” (Female-b: “Hot pot.”)

The target sentence was translated as “hot pot” by all four participants. For all of them again, the translation of lo1 was a high-falling tone. As before, the falling portions of the contours in Figs. 6.9 and 6.10 do not show up, but they are clearly heard. A short line appears at the end of the utterance in Fig. 6.9, but its pitch level cannot be correct because “pot” is uttered at a level that is obviously far below that of “hot,” and at what appears to be below female-a’s median pitch level.

Interestingly, female-b again placed the lo1-equivalent tone on a different syllable from the other three participants; she placed it on “pot” rather than on “hot.” This appears to be a case of individual-speaker variation. Female-b’s lo1-equivalent intonation has the same form (i.e., a high-falling pitch contour), but it is realized at a different position within the intonational phrase, appearing on the final syllable rather than where the nuclear stress of the phrase would be if it were neutrally intoned. This is another complicating factor that I had not considered. The position of her tone seen in Fig. 6.11 is especially unusual because the accented syllable of this single-word utterance is “hot,” and it would seem natural for the lo1-equivalent tone to be placed over an accented syllable.

In order to confirm that female-b’s English intonation was not fundamentally different from that of other native speakers, I constructed two dialogues to elicit her pronunciations of the utterances “pull this thing” and “hot pot” using both neutral and contrastive intonation. For both types of intonation, her nuclear stress was unambiguously placed on “this” and “hot,” respectively, just as would be expected from a native English speaker. The fact that her lo1-equivalent tone was placed on the final syllable of both utterances is perhaps evidence that her mind is treating it as a sentence-final tonal morpheme. While most speakers appear to place such morphemes in the position where the nuclear stress of the sentence’s final intonation phrase would go if the sentence were neutrally intoned, she perhaps places it in the final position, similar to an SFP. This cannot be done for contrastive stress because it works to contrast lexical items and therefore must mark a lexical item by stressing its accented syllable. In contrast, a discourse-related tone like the one under discussion has the flexibility to appear in a different position within the intonation phrase (Fig. 6.12).

Fig. 6.9  Female-a: “Hot pot” [F0 contour; pitch axis in Hz (logarithmic), marked at 122, 173, 245, and 346; time axis in s]

Fig. 6.10  Male-a: “Hot pot” [F0 contour; pitch axis in Hz (logarithmic), marked at 76, 108, 153, and 216; time axis in s]

Fig. 6.11  Female-b: “Hot pot” [F0 contour; pitch axis in Hz (logarithmic), marked at 168, 237, and 335; time axis in s]

Fig. 6.12  Male-b: “Hot pot” [F0 contour; pitch axis in Hz (logarithmic), marked at 67, 95, and 134; time axis in s]

The next dialogue involves two speakers who were looking at a picture of several people. One of the people in the picture was named Ricky, and speaker A asked which one it was:

(12)
A: Bin1-go3 Ricky?
   which-CL
   “Which one’s Ricky?”
B: Lei1 go3 lo1.
   this CL LO
   “This (one).”

The translation of Lei1go3 lo1 from three of the participants was “This one,” and from female-b was “This.” All four again used what sounded like the same high-falling tone according to me and the other native English speakers. Figures 6.14 and 6.15 show a rise before the high-falling tone, while Figs. 6.13 and 6.16 do not. Again, the rise is not noticeable and is therefore not considered to be linguistically meaningful; all of the translations sound like falling tones. Seeing a rise or fall of an F0 contour on paper that is not linguistically meaningful is not uncommon (Roach 2009). Though the pitch rises that appear before the falls in the above translations are not meaningful in and of themselves, they are most likely related to the pitch falls. Virginia Yip (p.c., 2010) pointed out that this could be “a matter of raising one’s pitch sufficiently in order to ‘launch’ a high falling contour.” The rise seen in Fig. 6.16 comes after the target utterance is completed and is the result of a nasal expulsion of air by male-b; this F0 rise is therefore not linguistically relevant.

Based on the data, it is proposed that lo1 has an intonational equivalent in English, which is the high-falling tone that appears in the ambilingual translators’ English translations. It is further proposed that this tone is a discourse-related morpheme that exists in the mental lexicons of English speakers, having the same grammatical category as the Cantonese SFP lo1 and the same (or very similar) function and meaning.

Fig. 6.13  Female-a: “This one” [F0 contour; pitch axis in Hz (logarithmic), marked at 122, 173, and 245; time axis in s]

Fig. 6.14  Male-a: “This one” [F0 contour; pitch axis in Hz (logarithmic), marked at 76, 108, and 153; time axis in s]

I, along with the native English speakers consulted, recognized the form and meaning of this tone and recognized it as a tone that we hear and use in our daily lives. This is very important, because now that this tonal morpheme has been discovered via lo1, it can be analyzed on its own with or without reference to lo1. According to my intuition, lo1-equivalent intonation is normally higher than that of contrastive stress, which is also a lexical, high-falling tone according to Hirst (1983a). To illustrate the difference, consider the following constructed dialogue containing both contrastive and lo1-equivalent intonation. Imagine that John and Mary are coworkers who have been asked by their boss to write a notice about something or other and pin it up on their office’s notice board. John and Mary have agreed that Mary will write the notice, and the following dialogue ensues. As in the translations above, the high-falling tone is shown immediately following the word with which it is associated:

Fig. 6.15  Female-b: “This” [F0 contour; pitch axis in Hz (logarithmic), marked at 168, 237, and 335; time axis in s]

Fig. 6.16  Male-b: “This one” [F0 contour; pitch axis in Hz (logarithmic), marked at 67, 95, and 134; time axis in s]

(13)
[A blue and red pen are both available for writing the notice]
John: “Here, use the red pen.”
Mary: “I don’t want to use the red pen. I want to use the blue one.”
John gets distracted by a phone call and forgets which pen Mary wants to use.
John: “Which pen do you want to use?”
[Mary is thinking: I just told you!]
Mary: “The blue one.”

Just prior to John’s receiving a phone call, Mary had said which pen she wanted to use. She therefore thought that John should know the answer to his own question. This influences the way that Mary intones her response. Both of Mary’s utterances in (13) include the noun phrase “the blue one” uttered with a high-falling tone on “blue.” The first instance of this is an example of contrastive stress, and the second, which I think would naturally be said with a higher pitch than the first, is an example of lo1-equivalent intonation. Whether or not the form of lo1-equivalent intonation consistently and fundamentally differs from emphatic and contrastive intonation is left to future research.

The English equivalent of lo1 is concluded to be the type of high-falling tone that is used in contexts such as (13), where the speaker assumes that the listener knows some discourse-based information D, which logically leads to knowledge of the proposition P to which the intonation is applied. P is therefore something that the speaker thinks should not have to be said, because it is assumed that the listener already has access to this information via logical deduction. Based on the explication in (8), the meaning expressed by the lo1-equivalent tonal morpheme in this context can be illustrated by inserting the antecedents of “this” (P) and “something else” (D) as follows: you can know this (P: I want to use the blue one) because you know something else (D: I already told you I want to use the blue one). Note that the antecedents to P and D are also written from the perspective of the speaker. This will now be done for all the examples of lo1 that were discussed above.

Applying the NSM explication to the examples of lo1 from the literature and the data. Inserting the antecedents that are equal to the P and D of the explication in (7) and (8) allows us to see how well the explication accounts for the meaning of lo1 and its English equivalent in any and all contexts where they appear. The way to do this is straightforward and was illustrated directly above for the context of (13). The larger the number and variety of contexts that the explication is judged to accurately describe, the more reliable we can assume it to be.
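The substitution of antecedents described above is mechanical enough to be sketched in a few lines. The function below is only an illustration: its name is hypothetical, and it reproduces just the paraphrase wording quoted in this section, not the full NSM explication given in (7) and (8).

```python
def explicate_lo1(p: str, d: str) -> str:
    """Fill the paraphrase with a proposition P and a discourse element D.

    P is the proposition that lo1 (or its English equivalent) attaches to;
    D is the "something else" that the listener is assumed to know.
    """
    return (
        f"you can know this (P: {p}) "
        f"because you know something else (D: {d})"
    )


# The context of (13), written from the speaker's perspective.
print(explicate_lo1(
    "I want to use the blue one",
    "I already told you I want to use the blue one",
))
```

Each example discussed below amounts to choosing a P and a D for this template and judging whether the resulting paraphrase fits the context.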
The method for doing this is to identify the “something else” (i.e., the discourse element D) within each dialogue, and then analyze how knowledge of D leads to knowing P (at least in the mind of the speaker). P is normally stated overtly as the sentence to which lo1 is attached, but when only a copula is used, P is silent and must be determined pragmatically. For each example of lo1 discussed below, it is assumed that within an equivalent context in English, the antecedents of D and P will be the same for both lo1 and its English equivalent. It is further predicted that lo1 and its English equivalent will have the same distributional properties with regard to acceptable contexts.

The form and position of the lo1-equivalent tone found in the data came from the participants’ translations. For the examples of lo1 cited in the literature, my native-English intuition is applied to the English translations in order to determine where the tone would be placed by a speaker. The wordings of the English translations have also been modified in some cases from those found in the literature. Example (1) comes from Yip and Matthews (2001) and is repeated here with its English-equivalent tone as (1′); all other examples are similarly repeated. In (1), lo1 is used to give a suggestion or advice, which was one of the properties of lo1 that Luke (1990) listed.

(1′)
Lei5 zou6 dak1 m4-hoi1sam1 mai6 wan2 dai6ji6 fan6 gung1 lo1.
2s do ADV-M NEG-happy then find second CL job LO
“If you’re not happy in your work, then find another job.”


Using the lo1-equivalent tone on the syllable “job” sounds natural and is a very suitable addition to the meaning of this sentence. The proposition that lo1 is attached to is wan2 dai6ji6 fan6 gung1 (“find another job”), and this therefore functions as the antecedent of “this” (i.e., P) in the explication. This P can be known if one knows the discourse element D. In this case, D is straightforward because it is uttered within the conditional sentence. Using the wording of the explication, lo1 in this example means: you can know this (P: to find another job) because you know something else (D: you’re not happy in your work). When lo1 or its English equivalent is used with a conditional sentence, the antecedents of D and P are explicitly stated as “if D, then P.” This follows the typical old-before-new ordering of information in sentences. D is something that the listener is assumed to know, and P is the new information that can be known based on knowing D.

Example (2) from Kwok (1984) shows the question “When does the film start?” and the accompanying response “Two thirty” with lo1 attached. Kwok translated this response into English as “Two thirty, of course. Don’t you know?” Here, I translate it using the lo1-equivalent tone as follows:

(2′)
Loeng5 dim2 bun3 lo1
two CL(time) half LO
“Two thirty.”

In this example, the tonal morpheme lies over the first syllable of “thirty.” In this conversation, we know that P is “[It starts at] two thirty,” but we have to speculate as to what D is because Kwok did not provide any context. It is not difficult to imagine some likely possibilities. Perhaps the speaker, or a third party, had said what time the film was going to start in the prior discourse, in which case D would be something like “I/someone told you the film starts at two thirty.” Another possibility is that the speaker assumed the listener must have seen the time when they bought the tickets, in which case D could be something like “the ticket counter showed that the film starts at two thirty.” In either case, P ([it starts at] two thirty) could be known from knowing D. Example (3) was a father’s reply to his son, who wondered why he did poorly on a history test. Kwok translated lo1 into English by adding “that’s why” at the end of the sentence, which I removed and replaced with a tone on the first syllable of “study,” which, according to me, is the most natural position for the tone to appear. (3′)

Duk6 dak2 m4-gau3 lo1.
study ADV-M NEG-enough LO
“You didn’t study enough.”

As for its meaning based on (8), D most likely comes from the father’s assumption that his son possesses a particular type of common knowledge. Specifically, the father probably thinks that his son knows something like this: “doing poorly on tests is caused by insufficient studying.” Stating it in the words of the explication then becomes: you can know this (P: you didn’t study enough) because you know something else (D: doing poorly on tests is caused by insufficient study). The use of lo1 here does not give the reason for doing poorly on the test, as Kwok suggested, because that is done by the proposition itself. Instead, lo1 expresses that the father assumes his son can know the reason because he already knows what causes poor test results. The reason is therefore considered by the speaker to be obvious and it should not need to be stated.

The D in example (4) is straightforward. It comes from the fact that the boy Kei had said his name once and assumed that the DJ should have heard it. However, because the DJ did not hear Kei say his name over the sound of laughter, he asked Kei what his name was. Kei then repeated his name with lo1 attached.

(4′)
Aa3-Kei1 lo1
PRT-Kei LO
“Kei.”

In this case, lo1 and its English equivalent mean: you can know this (P: [my name is] Kei) because you know something else (D: I just said “Kei” after you asked for my name). The English translation of (4′) is a single-syllable utterance, and in such cases, the tone obviously has no place to appear other than on that single syllable.

Dialogue (5) comes from Luke (1990: 131), who said that this particular use of lo1 expressed the speaker’s “recognition that the answer [given just prior by the listener] has provided evidence which confirms an expectation.” I agree with his explanation, which can be accounted for by the NSM explication of lo1. The example in (5) provides an informative contrast with example (9), which was the first example from the data. The two examples are related because both of them use the structure hai6 lo1 (be LO), which consists of only a copula and the SFP, and both translate into English as the lo1-equivalent tone used on “yeah.” This use of lo1 is especially interesting and informative because both P and D must be figured out pragmatically. The hai6 lo1 structure appears to be the only case where P is not overtly uttered along with lo1.

In dialogue (5), speaker A said that s/he believed there was a reason that speaker B separated from his girlfriend in addition to the reason speaker B had already given, which was that she knew he had been a cook before. Speaker B confirmed this expectation of speaker A by stating another reason. Speaker A then responded by saying hai6 lo1 (“yeah”), which indicated that his expectation had been confirmed.

(5′)
A: Soeng1seon3 m4-zing6hai6 gam2 ge3.
   believe NEG-only thus SFP
   “I believe that’s not all it was.”
B: Ze1, keoi5 waa6 ngo5 hou2 m4-sai3sam1 laa1.
   PRT 3s say 1s very NEG-attentive SFP
   “I mean, she said I wasn’t caring.”
A: Hai6 [P] lo1.
   be LO
   “Yeah, [P].”


The P of (5) is construed as an ellipsis in both Cantonese and English. In this example, the elided proposition refers to the same speaker’s previous statement: “I believe that’s not all it was.” Speaker A’s thought process in relation to speaker B can be stated in this way: in addition to your girlfriend knowing that you used to be a cook, you know some other reason that she left you. When speaker B stated an additional reason (i.e., that his ex-girlfriend had said he was not caring), this proved to speaker A that speaker B knew some other reason. This then provided speaker A with a D that licensed the use of lo1 in this context. By saying hai6 [P] lo1, speaker A was saying this to speaker B: you can know this (P: knowing you’d been a cook wasn’t all it was) because you know something else (D: she said you weren’t caring).

In contrast to the use of hai6 lo1 in (5), its use in (9) does not express the confirmation of an expectation. Lee and Law (2001: 83) said that hai6 lo1 is a “formulaic expression … which indicates the speaker’s agreement with an earlier comment made by the hearer” and referred to it as an “agreement formula.” While hai6 lo1 in (9) does express agreement, we saw from example (5) that it can also express something entirely different, so we cannot refer to hai6 lo1 as a formulaic expression that always expresses agreement. While Lee and Law’s description of hai6 lo1 works to explain (9) and Luke’s description works to explain (5), the NSM explication of lo1 works to explain both (5) and (9).

In (9), speaker A’s reply to B expresses that he or she agrees with what B just said and also expresses the idea that speaker B’s comment was obvious because it follows naturally and logically from what speaker A said prior to that. All four of the native-bilingual participants translated the target sentence hai6 lo1 as “yeah” with high-falling intonation. Again, lo1 is construed as being attached to an elided clause.

(9′)
A: Maths haau2 A zi3 siu2 jiu3 gau2sap6 fan1 ji5soeng6.
   test most few need ninety point above
   “To get an A in Maths, you need at least ninety-five points.”
B: Lei5 ding2lung2 co3 loeng5 tai4 zaa3, zi2 ho2ji5.
   2s most wrong two CL SFP only can
   “At the most, all you can get wrong is two questions.”
A: Hai6 [P] lo1.
   be LO
   “Yeah, [P].”

P is the elided element and is pragmatically understood to have the meaning of speaker B’s immediately preceding statement. D is speaker A’s first utterance. What speaker A conveys with hai6 lo1 here is that speaker B’s utterance (P) can be known as a result of knowing speaker A’s first utterance (D). Using the explication, speaker A is saying: you can know this (P: at the most, all you can get wrong is two questions) because you know something else (D: to get an A in Maths, you need at least ninety-five points).

It is not necessary for our purposes here to propose the syntactic properties of the elided element; all that is necessary is to state that its meaning is equal to something stated in the preceding discourse in both (5) and (9). In both Cantonese and English, the elided element is assumed to be a clause (though perhaps only a verb phrase (VP) in Cantonese) that is phonetically null. In English, the tone has no place to appear other than on “yeah,” which is assumed to precede the null clause to which this tone adds its meaning.

Neither Luke’s (1990) nor Lee and Law’s (2001) description of hai6 lo1 can account for how it is used in both (5) and (9). Furthermore, neither of their explanations can account for why hai6 lo1 can be used as a reply to a question, which is neither the confirmation of an expectation nor an expression of agreement. Consider a scenario in which two coworkers anxiously arrive 10 minutes late to a room where they thought a company meeting had been scheduled to take place, and, to their surprise, the room is empty. If speaker A asks, Dim2gaai2 mou5 jan4 ge2? (why NEG person SFP; “Why isn’t anybody here?”), then it would be perfectly natural for speaker B to respond by saying hai6 lo1 (“yeah”).

A description of hai6 lo1 (“yeah”) that appears to work for this context, as well as those of (5) and (9), is one which is based on the explication of (7) and (8) and which construes an elided proposition (P): hai6 [P] lo1 (“yeah, [P]”). When hai6 lo1 is used as an agreement formula, then P is semantically equivalent to what the listener has just said. When it is used to confirm an expectation, then P is most likely equal to something that was said previously by the speaker himself or herself, as in (5). In response to a question, P is something like “that’s a good question,” and, for the example just given, D is “a meeting should be taking place in this room now, but there is nobody here.” In this case, it would be natural to follow hai6 lo1 or “yeah” with a repeat of the question: “Why isn’t anybody here?”

In the next dialogue, repeated here as (10′), speaker B believed that the answer to speaker A’s question should have already been known to A. All four of the participants’ translations expressed this same type of evidential/epistemic knowledge in English by using the lo1-equivalent tone.

(10′)
A: Lei5 laai1 mat1je5 aa3?
   2s pull what SFP
   “What are you going to pull?”
B: Laai1 li1 lap1 je5 lo1.
   pull this CL thing LO
   “(I’ll pull) this thing.”

In this dialogue, we can only speculate about exactly what the “something else” (D) is that is assumed to be known by the listener. This is because it was not evident from the audio recording. It seems very likely that D is one of two things in speaker B’s mind: (1) it is something that he assumed to be commonly known information regarding the particular piece of machinery they were talking about or (2) it is pragmatic information, perhaps in the form of having previously pulled (or pointed to) that “thing” in the presence of the listener. Speaker B assumed that speaker A had access to the information D (in the form of (1) or (2) as just stated) and that speaker A could therefore know that it was “this thing” that speaker B was going to pull. In dialogue (11), speaker A appears to have assumed that “having a pot a person” should have been understood by the listener to mean “eating hot pot” and that clarification should therefore not have been required. This meaning was conveyed through the use of lo1-attachment.

(11′)
Daa2 bin1lou4 lo1.
hit side-stove LO
“Hot pot.”

This example is straightforward. P is “hot pot,” and the D that leads obviously to this knowledge in the mind of the speaker (though apparently not in the mind of the listener) was the listener having just been told that they would be having “a pot a person.”

Speaker B in (12) apparently thought that speaker A should have already known which person in the picture was Ricky and indicated so by attaching lo1 to his reply:

(12)
Lei1 go3 lo1.
this CL LO
“This one.”

As was the case for (10), we also need to speculate as to exactly what D is in (12) because, again, it was not verbalized. In this case, the D in the speaker’s mind could be that he had just pointed to which person in the picture was Ricky, or it could be the assumption that the listener should have recognized which person in the picture was Ricky because Ricky was a mutual acquaintance that the listener knows, and therefore should recognize. In all of the examples from the literature and the data, the NSM explication of (7) and (8) appears to be an accurate paraphrase of the meaning that both lo1 and its English equivalent add to those sentences within their given contexts. This can be seen as evidence that lo1 and its English equivalent both have a core meaning that does not change regardless of the context and that they share the same (or a very similar) meaning. The next subsection looks at the second member in this pair of evidential SFPs.

6.1.2  The Particle aa1maa3

The NSM Explication of aa1maa3. Much less has been written about aa1maa3 than about lo1. I found no definitions for aa1maa3 in any dictionaries and saw it mentioned in only two textbooks. Boyle (1970b: 327) said that this SFP means “‘that’s why’ in a response sentence which gives [an] explanation of why something occurred, [and that] aa1maa3 adds the connotation (cheerfully without impatience) that the whole thing is pretty obvious.” This compares to descriptions of lo1 as expressing an obvious reason for something, but Boyle adds the idea of patient cheerfulness without offering any explanation as to why she thinks this. She gave the following dialogue as an example:

(14)
A: Jan1wai6 soeng1fung1, keoi5 m4-lai4 dak1.
   because cold 3s NEG-come can
   “She has a cold, so she can’t come.”
B: Dim2gaai2 keoi5 m4-lai4 dak1?
   why 3s NEG-come can
   “Why isn’t she coming?”
A: Keoi5 soeng1fung1 aa1maa3.
   3s cold AA-MAA
   “She has a cold, that’s why.”

In the final line of (14), aa1maa3 is translated as “that’s why,” which, according to Boyle’s (1970b) description, includes the connotation that it is obvious. Kwok (1984: 58) claimed that the particle lo1 either “seems to give the reason for something, or to point out what is obvious,” and Boyle’s description of aa1maa3 implied that it does both of these things at the same time. I argued against Kwok’s claim that lo1 can be used to give the reason for something, saying that it only points out what is obvious. I think Boyle’s description of aa1maa3 begins to capture the key difference between these two particles: lo1 is used to point out the obvious, while aa1maa3 is used to point out the obvious and to do something else, which often appears to be “reason giving,” though I will argue that that is not actually what it is.

Boyle’s (1970b) claim that aa1maa3 entails the connotation of cheerfulness and patience cannot be accurate because, although aa1maa3 is compatible with these attitudes, it is also compatible with opposing attitudes. When a speaker lengthens the rime of the second syllable of aa1maa3, and especially when nasal and breathy qualities of voice are used, an attitude of impatience and displeasure is implied. It is therefore more accurate to say that aa1maa3 is neutral regarding these attitudes and that such attitudes are expressed through suprasegmentals that are said across the utterance, including the segments of aa1maa3; such attitudes (i.e., cheerful vs. uncheerful; patient vs. impatient) are not part of its inherent meaning.

Yip and Matthews (2001: 157) said that “aa1maa3 draws attention to something which should [already] be known, typically in response to a question.” This again implies that aa1maa3 is typically used to express an obvious reason. They gave the following example:

(15)
A: Dim2gaai2 gam3 ci4 zung6 mei6 faan1 lai4 aa3?
   why so late still not-yet return come SFP
   “How come she’s still not back?”
B: Zung6 hoi1-gan2 wui2 aa1maa3.
   still open-PROG meeting AA-MAA
   “Because she’s still in the meeting (of course).”

The particle aa1maa3 often attaches to sentences that are “reasons,” or “responses to questions,” but it will be seen in some examples below that this is not always the case. Yip and Matthews (2001) gave the following three examples of “other common uses” of aa1maa3:


Teaching or reminding people of rules and facts which may or may not be obvious, as when a parent is teaching her child the social expectation that one greets (giu3) a person [using the relevant kinship term]: (Yip and Matthews 2001: 158)

(16)

Lei5 jiu3 giu3 jan4 aa1maa3.
2s need call person AAMAA
“You should address people, you know.”

To make an excuse, as when a parent tries to explain why the children are so naughty: (ibid)

(17)

Keoi5 zung6 sai3 aa1maa3. Dim2 sik1 gam3 do1 je5 aa3?
3s still small AAMAA how know so much thing SFP
“S/he’s still young. How could s/he know much?”

To correct a mistake or faulty information, as when one gets on the wrong bus and the bus driver says: (ibid)

(18)

M4-hai6 sap6 hou6 aa3, jing1goi1 daap3 jat1 hou6 aa1maa3.
NEG-be ten number SFP should ride one number AAMAA
“You should take number one, not number ten.”

It is evident from the examples seen thus far that Yip and Matthews’s (2001: 157) description of aa1maa3 as “draw[ing] attention to something that should [already] be known” appears to work for all of its uses. In contrast, Boyle’s (1970b: 327) saying that it means “that’s why” does not work for all of its occurrences. We can add “that’s why” to the English translations of (14) and (15), as well as to that of (17) if it is a response to a question, but not to the English translations of (16) or (18), which do not explain why something occurred. Presumably in (16) the child is being reprimanded for not addressing an elder. The proposition “you should address people” is not an explanation as to why the child did not address the person. And in (18), the proposition “you should take number one” is not an explanation as to why the passenger mistakenly got on the number ten bus instead.

Many authors concluded that aa1maa3 means (or indicates) an obvious reason. Matthews and Yip (2011: 340) said aa1maa3 is an SFP “indicating obvious reason, excuse, etc.,” Leung (1992/2005: 76) said that it points out an obvious reason that is based on a subjective view of the speaker, and Kwok (1984: 61) said that both aa1maa3 and lo1 are “used to point out the reason for something” and that aa1maa3 “perhaps [has] the additional meaning of ‘you should be aware of it’ or ‘I have already told you the reason’.” However, examples (16) and (18) clearly show that aa1maa3-suffixed sentences do not necessarily express a reason for something. Nevertheless, the meaning of aa1maa3 is very well suited for attaching to reasons, and it expresses that the proposition is obvious. This has therefore caused numerous authors to say it has the meaning of “obvious reason.” Because of examples like (16) and (18), the explication of aa1maa3 must not define it as “an obvious reason,” but it must account for the fact that it so readily attaches to what are construed as obvious reasons.

A. Law (2002) classified aa1maa3 as a “reminder” particle, but again this does not apply to all of its uses. Examples (14) and (16) could be construed as reminders, as well as perhaps (18). However, examples (15) and (17) cannot reasonably be analyzed as reminders. In example (15), the listener could be reminded that the woman being spoken of had needed to attend a meeting, but not that she was still in a meeting. In this scenario, the speaker could assume that the listener should know the woman is still in the meeting, but the speaker would assume the listener to have acquired this knowledge pragmatically (i.e., that the listener would have used his or her assumed knowledge related to the woman’s attending a meeting and not having returned yet, and then come to the conclusion that she was still in the meeting); this is different from assuming that the listener knew, sometime prior to the speech time, that the woman was still in the meeting, but had forgotten and therefore needed to be “reminded” of this. In example (17), the speaker is not reminding the listener that the speaker’s child is young; rather he or she is using the child’s age as an excuse for the child’s behavior.

It is clear from the examples that it is insufficient, and therefore inaccurate, to define aa1maa3 as being (or only attaching to) a “reason” and/or a “reminder,” but its definition must account for the fact that it is compatible with these types of sentences. I will now consider whether or not aa1maa3 can be defined as “obvious,” a word that shows up repeatedly in the literature.
Yip and Matthews (2001: 158) said that aa1maa3 can be used for “[t]eaching or reminding people of rules and facts which may or may not be obvious,” and Lee and Law (2001: 84) similarly said that it “draws the hearer’s attention to information that may or may not be obvious” (emphasis in italics mine for both quotes). I disagree with this and argue that the speaker assumes that the information aa1maa3 attaches to is obvious, in the same sense that the information lo1 attaches to is obvious. Based on all the examples of aa1maa3, I propose that like lo1, it can only attach to a proposition P that the speaker assumes the listener can know because he or she knows something else that can lead to knowing P.  It is only acceptable to use aa1maa3 on propositions that the speaker believes the listener knows, either because it was stated in the discourse or because it can be figured out pragmatically. This is the definition that was given to lo1 in (7) and (8), and my explication of aa1maa3 therefore entails this meaning. The SFP aa1maa3 relates P to the discourse in the same way that lo1 does, but it also relates P to the discourse in an additional way. Related to this, Lee and Law (2001: 84) said that aa1maa3 must satisfy “a minimal level of informativeness,” and concluding this to be its core function, they referred to it as an “informativeness and elaboration marker” (p. 82), giving the following as an example:

(19)
A: Dim2gaai2 keoi5 gam3 hoi1sam1 aa3?
   why 3s so happy SFP
   “Why is s/he so happy?”
B: Keoi5 jeng4-zo2 maa5 aa1maa3.
   3s win-PERF horse AAMAA
   “(Because) s/he won a bet on a horse.”

In (19), speaker B elaborates and provides information, giving the reason as to why the person is so happy. Contrary to Lee and Law’s (2001) implication that the proposition need not be obvious, however, this context is suitable for aa1maa3-attachment if and only if speaker B in (19) assumes that speaker A, the listener, has some knowledge about the proposition (i.e., about the person having won a bet on a horse). If not, then aa1maa3 is unacceptable in this context. It appears that this is always true for aa1maa3-attachment, just as it always is for lo1-attachment. This is all there is to the meaning of lo1, but there is more to the meaning of aa1maa3 than just this; it additionally connects P back to something in the discourse that is different from the D that is included in the explication of lo1. I will call this other discourse element D2. It is aa1maa3’s linking of P to D2 that caused Lee and Law to conclude that the proposition P must have a “minimal level of informativeness” in order for it to license aa1maa3 to attach to it. This means that aa1maa3 is only acceptable in contexts that include a discourse element D2 to which P refers. Lee and Law (2001) said that aa1maa3 typically marks a proposition that elaborates a reason, as in (19), but that it is not restricted to this. It can also “be used to elaborate on the event by specifying the action, participant, time, place, manner or cause of the event” (p. 85, emphasis in italics mine). What they refer to as “the event” is what I refer to as D2, but, contrary to their analysis, I think that P does more than merely “elaborate” on D2. I propose that aa1maa3 marks P as something that the speaker hopes will influence the listener’s beliefs about D2. These are some of the examples that Lee and Law (2001) gave to demonstrate their point:

(20)
A: Lei5 kam4jat6 heoi3-zo2 bin1dou6 aa3?
   2s yesterday go-PERF where SFP
   “Where did you go yesterday?”
B: Ngo5 kam4jat6 faan1-zo2 hok6haau6 aa1maa3.
   1s yesterday return-PERF school AAMAA
   “I went to school yesterday.”

(21)
A: Bin1go3 jeng4-zo2 luk6hap6coi2 aa3?
   who win-PERF six-combine-lottery SFP
   “Who won the Mark-six (lottery)?”
B: Mary jeng4-zo2 aa1maa3.
   Mary win-PERF AAMAA
   “Mary won.”

(22)
A: De1di6 fong3-zo2 di1 mat1je5 hai2 syu1gaa2 soeng6min6 aa3?
   daddy put-PERF CL(pl) what at bookcase top SFP
   “What did daddy put on the bookcase?”
B: Keoi5 fong3-zo2 go3 faa1zeon1 aa1maa3.
   3s put-PERF CL vase AAMAA
   “He put a vase (on it).”

Lee and Law (2001: 85) said that “the exact interpretation [of the elaboration] is contextually determined.” This is surely correct, and the explication of aa1maa3, like those of other SFPs, accounts for this by including deictic elements whose references come from the context. Lee and Law’s analysis is not sufficient, however, because it does not explain how the interpretation of an aa1maa3 sentence is contextually determined; it merely states that this is so. We could perhaps look at their analysis as an expansion of Boyle’s (1970b) description, saying that aa1maa3 can mean not only “that’s why” but also “that’s where,” “that’s who,” “that’s what,” and so on. In contrast, the NSM explication will show precisely how the interpretation is contextually determined. Lee and Law’s (2001) description also does not explain why aa1maa3 is compatible with their example sentences. It merely demonstrates that it is. It should also be pointed out that, in each of the answers in (19B–22B), we could replace aa1maa3 with lo1, with some other compatible SFP, or with no SFP, and the sentences would still be acceptable. We would not want to conclude from this that lo1 or some other SFPs are also “informativeness and elaboration markers” simply because they appear to “mark” propositions that provide elaborating information in relation to an event in the prior discourse. Lee and Law’s description does not help us to understand the difference in interpretation if we were to exchange aa1maa3 for lo1 in each of those sentences. Lee and Law (2001: 86) said that “[t]he proposition expressed by the utterance [that aa1maa3 attaches to] must have some level of propositional complexity or informativeness.” They claimed that, for this reason, it cannot attach to the answer of an A-not-A polar question or a disjunctive question, giving these two examples that use asterisks to mark the aa1maa3-suffixed answers as ungrammatical, regardless of whether the answer is positive or negative:

(23)
A: Lei5 jat1-zan6 gaan3 hoi1-m4-hoi1 wui2 aa3?
   2s one-CL time open-NEG-open meeting SFP
   “Are you going to the meeting in a little while?”
B: *Hoi1 aa1maa3 / *M4-hoi1 aa1maa3
   open AAMAA / NEG-open AAMAA
   “Yes” / “No”

(24)
A: Lei5 jat1zan6gaan1 heoi3 hoi1 wui2 ding6 faan1 uk1kei2?
   2s one-CL-moment go open meeting or return home
   “Are you coming to the meeting later, or going home?”
B: *Heoi3 hoi1 wui2 aa1maa3 / *Faan1 uk1kei2 aa1maa3
   go open meeting AAMAA / return home AAMAA
   “I’m coming to the meeting” / “I’m going home.”


I, along with native Cantonese speakers that I consulted, disagree with these judgments if speaker B believes that the listener, speaker A, has been given this information already. If speaker B in either (23) or (24) had already provided this information to speaker A earlier, then attaching aa1maa3 to those propositions used as answers to those questions is acceptable, especially if followed by a lo1-suffixed “I just told you”:

(23-B′)
Hoi1 aa1maa3 / M4-hoi1 aa1maa3 (tau4sin1 mai6 gong2-zo2 lo1.)
open AAMAA / NEG-open AAMAA just MAI say-PERF LO
“Yes/No. (I just told you.)”

(24-B′)
a. Heoi3 hoi1 wui2 aa1maa3. (Ngo5 tau4sin1 mai6 gong2-zo2 lo1.)
   go open meeting AAMAA 1s just MAI say-PERF LO
   “I’m going to the meeting. (I just told you.)”
b. Faan1 uk1kei2 aa1maa3. (Ngo5 tau4sin1 mai6 gong2-zo2 lo1.)
   return home AAMAA 1s just MAI say-PERF LO
   “I’m going home. (I just told you.)”

In these two modified examples, aa1maa3 is attached to the exact same propositions, which are answering the exact same questions. The only difference is that a suitable context has been provided within which aa1maa3 can be used. Lee and Law’s “level of propositional complexity” argument does not explain why their aa1maa3-suffixed answers to polar and disjunctive questions can be made acceptable in this way. I argue that aa1maa3-suffixing is licensed in these modified examples because aa1maa3 is now attached to information that the listener has heard before. This relates to the portion of aa1maa3’s meaning that is the same as that of lo1, which restricts it to being used with propositions that are logically linked to something that the speaker assumes the listener already knows (in this case, having already heard the answer in the prior discourse). If the speaker assumes that the listener has just asked for information about which he or she has no knowledge, then the speaker cannot answer the question using lo1 or aa1maa3. This was the context that Lee and Law (2001) assumed for their examples in (23) and (24). Despite the fact that aa1maa3-attachment actually can be used to answer an A-not-A polar question or a choice-type question, there is still something to Lee and Law’s (2001) “level of propositional complexity” argument. They said that aa1maa3 “is discourse-bound in that it must be a response to an antecedent event, given in the immediately preceding context, either linguistically or non-linguistically” (p. 84). Here, they are referring to D2, which they said P “elaborates” on. However, as implied in the previous paragraph, I think their judgments of (23) and (24) are related to there being no D (i.e., the D from the explication of lo1) rather than there being no D2. I demonstrated in (23-B′) and (24-B′) that if a D is added to the context, which in this case is “I just told you P,” then aa1maa3-attachment becomes acceptable.
However, Lee and Law are correct about the need for a D2 as well (what they referred to as the “antecedent event”). I propose that, in addition to referring to a discourse element D, the same way lo1 does, aa1maa3 also refers to a discourse item D2, and the information in the attached


proposition P is presented to the listener in an attempt to influence the listener’s beliefs in relation to this D2. If Lee and Law (2001: 84) are correct in saying that aa1maa3 “must be a response to an antecedent event” (emphasis in italics mine), then aa1maa3 can only be used when the proposition is related to some D2 which is in the mind of the speaker, and which the speaker believes is also in the mind of the listener. I think Lee and Law are right, and I agree with them that this discourse element can be either linguistic or pragmatic, but I do not agree that it is necessarily “given in the immediately preceding context” (ibid.). The antecedent event in (16), for example, is listener-based knowledge that comes from the long-term habitual practice of telling the child to address people, not something in the immediately preceding context. Another example is the question in (21), which could be changed to “Who won the lottery last year?” The answer to that could still use aa1maa3-suffixing, so long as the speaker assumes that the listener should remember something that happened a year ago, or perhaps remember the consequences of that year-old event, for example, Mary having bought a new house and car. Lee and Law (2001) constructed an example to demonstrate that the antecedent event (i.e., D2) can be a nonlinguistic event. The speaker is a teacher who sees a student working on a math problem and then says,

(25)
Zing3ming4 lei1 jat1-bou6 fan6 sin1 aa1maa3.
prove this one-CL part first AAMAA
“Prove this part first.”

Lee and Law (2001) interpreted (25) to be a response to an implicit “how” question, which licenses the particle’s use. They said the “fact that the particle can be used in responses to implicit questions lends its use to problem solving situations or ones in which an effective course of action is recommended, as if in answer to a[n] implicit ‘what to do’ question” (p. 87). I agree with much of their analysis of this example. First of all, it demonstrates that the speaker-oriented meaning of an SFP refers to antecedents that, while stemming from the discourse context, ultimately exist in the mind of the speaker. In addition to that, their example shows that aa1maa3 can only be used when the speaker has some discourse element D2 in mind to which the listener’s knowing P is relevant. The teacher could only attach aa1maa3 to the proposition “prove this part first” if he or she assumed some D2 to which it was related; in this case, it was the assumption that the student wanted or needed to know how to do the problem. However, this is not the whole story. It is only acceptable to use aa1maa3 if there is also some discourse element D that the teacher believes the student knows and which can lead to the student’s knowledge of P (i.e., the definition of lo1 subsumed within aa1maa3). It is only appropriate for the teacher to attach aa1maa3 to “prove this part first” if he or she assumes that the student has been taught to do this before. Speakers use aa1maa3-attachment when they want the listener to think the same thing about D2 that the speaker himself or herself does. This is why it attaches so appropriately to reasons, excuses, suggestions, and advice. It is also why it attaches to reminders, but, crucially, only to reminders that include information which the


speaker believes will influence the listener’s thoughts about some discourse element D2. It would be inappropriate to attach aa1maa3 to an out-of-the-blue reminder such as, for example, “You need to buy milk.” In sum, aa1maa3 only attaches to a proposition P that the speaker believes the listener can know from knowing D, and, at the same time, it only attaches to a P that the speaker believes contains information that can influence the listener to hold the same beliefs about D2 that the speaker does. Based on this, the NSM explication I propose for aa1maa3 is as follows:

(26)
P + aa1maa3 =
a. lei5 ho2ji5 zi1dou3 lei1 joeng6 je5 (P)
   2s can know this CL thing
   “you can know this (P)”
b. jan1wai6 lei5 zi1dou3 ling6ngoi6 jat1 joeng6 je5 (D)
   because 2s know another one CL thing
   “because you know something else (D)”
c. ngo5 soeng2 lei5 ji4gaa1 lam2-haa5 lei1 joeng6 je5 (P)
   1s want 2s now think-DM this CL thing
   “I want you to think about this (P) now”
d. zi1hau4,
   after
   “after this,”
e. lei5 m4-wui5 lam2 lei1 joeng6 je5 (D2)
   2s NEG-will think this CL thing
   “you will not think this (D2)”
f. lei5 wui5 lam2 ling6ngoi6 jat1 joeng6 je5 (P2)
   2s will think another one CL thing
   “you will think something else (P2)”

The English version of this explication is equal to the English translation of (26) and is shown here in (27). It is proposed to define aa1maa3’s English-equivalent tone, which is described in the following subsection.

(27)
P + aa1maa3-equivalent intonation =
a. you can know this (P)
b. because you know something else (D)
c. I want you to think about this (P) now
d. after this,
e. you will not think this (D2)
f. you will think something else (P2)

Lines a and b do not require any explanation; they are the same as the explication of lo1 and its English equivalent given in (7) and (8). The demonstrative “this” in line d refers to line c, so “after this” means “after you think about P.” The discourse element D2 is a particular belief or stance that the speaker assumes the listener to hold based on some prior evidence, which may be linguistic or nonlinguistic. P2 is a


belief or stance that the speaker holds, and this stance contrasts with the stance of D2. Ultimately, the function of aa1maa3 is to influence the listener to change his or her belief/stance from D2 to P2. All of the Ps and Ds of this explication are propositional in nature, and all of them are in the mind of the speaker. The key difference between them is that a D stems from something in the discourse, while a P originates from the mind of the speaker. At first glance, one might assume that because the meaning of lo1 is embedded within the meaning of aa1maa3, lo1-attachment should be allowed in every context where aa1maa3-attachment is allowed, and aa1maa3 should have a more restrictive use than lo1. It is not quite that simple, however, and this will be discussed in Sect. 6.1.3 where the acceptable uses of these two SFPs are compared and contrasted.

The English equivalent of aa1maa3 based on the data  In this first dialogue, speaker A explains that as soon as he looks at something he has bought, he can tell whether it is fake or not. Speaker B thought this was odd because he assumed it meant that speaker A bought something even though he knew that it was counterfeit. Speaker B therefore asked, “You still bought it?” What speaker A actually meant was that he looked at it and saw that it was fake after he bought it, and he explained this by saying, “I didn’t know (that it was fake) when I bought it.”

(28)
A: Jat1 mong6 ji5ging1 zi1dou3 hai6 gaa2 ge3.
   one look already know be fake SFP
   “As soon as I look (at something) I know it’s fake.”
   Ngo5 maai5-gwo3 di1 gaa2 je5 zau6 hai6.
   1s buy-EXP CL(pl) fake thing then be
   “That’s how it is when I buy something that’s fake.”
B: Lei5 dou1 maai5?
   2s also buy
   “You still bought it?”
A: Ngo5 m4-zi1 aa1maa3, maai5 ge3 si4hau6.
   1s NEG-know AAMAA buy PRT time
   “I didn’t know when I bought it.”

The English translation of A’s last utterance is shown to have a rise-fall tone associated with the word “know” because three of the ambilingual participants’ translations were heard to have a prominent rise-fall pitch contour. Each of these sounded like the same tone to me and to the native English-speaker informants (Figs. 6.17 through 6.20). There are several noteworthy facts about these translations, one of which is that male-b’s translation showed a rise-fall of the F0 contour on paper associated with the word “know,” but it did not sound like an instance of the tone heard in the other three translations. Therefore, only three of the four translations are considered to have translated aa1maa3 as the same tone contour, which looks and sounds like a rise-fall. Another interesting observation is that the rise-fall tone is clearly associated with the word “know” in both female-a’s and female-b’s translations, even though on paper it looks like the negated auxiliary has an even more pronounced token of the same tone. The tones on “didn’t” sound like a falling tone which is used on auxiliaries in English to emphasize the whole proposition. The final thing worth noting is that male-a phrased his translation as a hypothetical question and placed the tone over the word “I” rather than the word “know.”

Fig. 6.17  Female-a: “Yeah, but I didn’t know when I bought it” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.18  Male-a: “Well how would I know?” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.19  Female-b: “But I didn’t know” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.20  Male-b: “I didn’t know” [pitch (Hz, logarithmic) against time (s)]

In this next dialogue, speaker B talks about a boat tour out of Australia that takes tourists to see whales. This does not sound like an enjoyable tour to speaker A, who says she is not interested in whales.

(29)
A: Ngo5 deoi3 king4jyu4 mou5 mat1 hing3ceoi3.
   1s towards whale NEG any interest
   “I’ve no interest in whales.”
B: Aa3, ngo5 soeng2 tai2-haa5 wo3.
   PRT 1s want look-DM SFP
   “Oh, I’d like to see them.”
A: Hai6 me1?
   be SFP
   “Really?”
B: Hai6 aa3.
   be SFP
   “Yeah.”
   Jan1wai6 mei6 gin3-gwo3 gam3 daai6 tiu4 king4jyu4 aa1maa3.
   because never see-EXP so big CL whale AAMAA
   “Because I’ve never seen a whale that big.”
A: Ngaau5-m4-ngaau5 jan4 gaa3?
   bite-NEG-bite person SFP
   “Do they bite?”


The intonations are again similar for the two females and male-a, and they are recognizably the same as were the intonations for the translations of (28). In these three participants’ translations, the intonation on the nuclear stress placed over “whale” sounds like a rise-fall pitch contour, and the F0 curves all show this shape. Once again male-b’s translation did not sound the same, and his F0 contour was quite different from the other participants’ contours. His intonation here sounded quite neutral, like the canonical intonation of a declarative clause used to make a statement, or what Stockwell (1972: 87–8) referred to as the “‘neutral’ or ‘normal’ or ‘colorless’ intonation contour for any sentence, serving as a baseline against which all other possible contours are contrastable, and thereby meaningful.” Possible reasons for this will be discussed below (see Figs. 6.21 through 6.24).
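The visual half of this comparison, namely whether an F0 curve has a rise-fall shape, can be sketched in code. The Python sketch below is an illustration only and is not the procedure used in this study, where the judgments were primarily auditory. The function names, the 2-semitone threshold, and the sample F0 values are all assumptions introduced here. It converts F0 samples to semitones (a logarithmic scale, like the pitch axes of the figures) and checks for an interior peak flanked by a sufficient rise and fall.

```python
import math


def hz_to_semitones(f0_hz, ref_hz):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def is_rise_fall(f0_hz, min_excursion_st=2.0):
    """Return True if the contour rises to an interior peak and then falls.

    f0_hz: F0 samples in Hz over the stressed syllable (hypothetical input).
    min_excursion_st: assumed minimum size, in semitones, of both the rise
        from the first sample and the fall to the last sample.
    """
    if len(f0_hz) < 3:
        return False
    # Express every sample in semitones relative to the first sample.
    st = [hz_to_semitones(f, f0_hz[0]) for f in f0_hz]
    peak = max(range(len(st)), key=st.__getitem__)
    # A peak at either edge is a plain fall or a plain rise, not a rise-fall.
    if peak == 0 or peak == len(st) - 1:
        return False
    rise = st[peak] - st[0]
    fall = st[peak] - st[-1]
    return rise >= min_excursion_st and fall >= min_excursion_st


# Invented sample values, not measurements taken from the figures.
rise_fall_like = [180.0, 200.0, 230.0, 210.0, 170.0]
declarative_like = [180.0, 178.0, 175.0, 170.0, 165.0]

print(is_rise_fall(rise_fall_like))
print(is_rise_fall(declarative_like))
```

As male-b’s translation of (28) shows, a rise-fall shape on paper does not guarantee that the contour is heard as the same tone, so a check of this kind could at most supplement listening.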

Fig. 6.21  Female-a: “Because I’ve never seen such a big whale before” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.22  Male-a: “Because I’ve never seen a whale that big” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.23  Female-b: “Because I haven’t seen such a big whale before” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.24  Male-b: “I haven’t seen such a big whale before” [pitch (Hz, logarithmic) against time (s)]

In the third and final corpus dialogue related to aa1maa3 that will be shown here, two people are talking about a former classmate. Speaker B has just told speaker A that the former classmate plans to quit his job as a marketing assistant. Speaker A asks why, and speaker B then gives the reason:

(30)
A: Dim2gaai2 aa3?
   why SFP
   “Why?”
B: Keoi5 wan2 dou2 hok6haau6 ceng2 keoi5 aa1maa3.
   3s find complete school hire 3s AAMAA
   “(Because) s/he found a school that’s gonna hire him/her.”


Yet again male-b’s translation (Fig. 6.28) did not sound the same as the others, and it again sounded like canonical declarative intonation. Perhaps for the translations of aa1maa3, male-b was not as good at “acting” as the other participants, failing to accurately mimic the Cantonese sentences. Another possibility is that male-b does not have this tone in his mental lexicon, but that seems unlikely because I recognize it as being part of my own lexicon, and the other three participants each produced the same rise-fall pitch contour, just as they did in their translations of (28) and (29). For the translation of (30), female-a and male-a placed the rise-fall tone on the rime of the syllable “hire,” and female-b placed it on the rime of “-ploy” (see Figs. 6.25 through 6.28). Based on the data and on my native-speaker intuition, I conclude that aa1maa3 has an intonational equivalent in English, which is a tonal morpheme in the shape of a rise-fall. It is proposed to have the meaning of the NSM explication shown in (27). It attaches to a proposition that the speaker considers to be obvious and knowable, and it functions to change the listener’s stance about something. It can therefore only be used in a context where the speaker has evidence of the listener holding a particular stance/belief (i.e., D2), and it is used in an attempt to get the listener to change this stance/belief (i.e., to change from thinking D2 to thinking P2). In this sense, it can be thought of as a “persuasion” tone.
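The two contextual requirements argued for in this section (a D from which the listener can know P, and a D2 that P is meant to change) can be summarized as a small check. The Python sketch below is only an illustration of that summary: the class, field, and function names are assumptions introduced here, and the checks encode necessary conditions rather than a full account of when either particle or the English tone is acceptable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Context:
    """What the speaker assumes about the listener (hypothetical structure)."""
    d: Optional[str] = None   # D: known to the listener; P can be known from it
    d2: Optional[str] = None  # D2: stance or belief the listener is assumed to hold


def meets_lo1_condition(ctx: Context) -> bool:
    """lo1 needs a D from which the listener can know P."""
    return ctx.d is not None


def meets_aa1maa3_conditions(ctx: Context) -> bool:
    """aa1maa3 needs the same D plus a D2 that P is meant to change."""
    return ctx.d is not None and ctx.d2 is not None


# Answer to the polar question in (23), first without and then with a prior telling.
no_prior = Context(d=None, d2="You don't know if I am going to the meeting")
told_before = Context(d="I just told you",
                      d2="You don't know if I am going to the meeting")

# An out-of-the-blue reminder such as "You need to buy milk" has no D2.
# The D value on this line is invented for illustration.
out_of_the_blue = Context(d="I told you this morning", d2=None)

for ctx in (no_prior, told_before, out_of_the_blue):
    print(meets_lo1_condition(ctx), meets_aa1maa3_conditions(ctx))
```

The sketch treats D and D2 as simply present or absent; in the analysis itself both are propositions in the mind of the speaker, and their content matters for interpretation.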

Fig. 6.25  Female-a: “’Cause he found a school that’s gonna hire him” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.26  Male-a: “He managed to find a school that would hire him” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.27  Female-b: “’Cause he found a school willing to employ him” [pitch (Hz, logarithmic) against time (s)]

Fig. 6.28  Male-b: “She found a school that will hire her” [pitch (Hz, logarithmic) against time (s)]

Applying the NSM explication to the examples of aa1maa3 from the literature and the data  This section proposes the antecedents of P, P2, D, and D2 of the explications given in (26) and (27) for each of the examples cited from the literature and the dialogues used for the translations. As was done above for lo1, each example will be repeated using the same number followed by a prime (′). The first example came from Boyle (1970b), with a simple context that she most likely constructed herself. In Boyle’s dialogue, the speaker has just told the listener that some person cannot come because he or she has a cold. Then the listener asks why he or she cannot come, and the speaker repeats the reason with aa1maa3 attached:

(14′)
Keoi5 soeng1fung1 aa1maa3.
3s cold AAMAA
“S/he has a cold.”

Other than when an SFP is attached to the copula, P is given because it is the proposition to which the SFP is attached, which in this case is: “S/he has a cold.” D is straightforward in this constructed context, because in the preceding discourse, the speaker has just said that this person has a cold. By asking “Why can’t s/he come,” the listener has expressed a lack of knowledge as to why this person cannot come. This is the source of D2, which, from the speaker’s perspective in relation to the listener, is “you don’t know why s/he can’t come.” Stated in the words of the explication, the use of aa1maa3 in this context expresses the following:

you can know this (P: S/he has a cold)
because you know something else (D: I told you s/he can’t come because s/he has a cold)
I want you to think about this (P) now
after this,
you will not think this (D2: you don’t know why s/he can’t come)
you will think something else (P2: you know it is because s/he has a cold that s/he can’t come)
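The slot-filling just carried out for (14′) follows the same pattern in every example of this section, so it can be sketched as a template. The Python sketch below is purely illustrative and is not part of the NSM method itself: the function name and its parameter names are assumptions introduced here, the fixed wording comes from (27), and the slot values come from the discussion of (14′).

```python
def render_explication(p, d, d2, p2):
    """Fill the four slots of the explication in (27) and return it as text."""
    lines = [
        f"a. you can know this (P: {p})",
        f"b. because you know something else (D: {d})",
        "c. I want you to think about this (P) now",
        "d. after this,",
        f"e. you will not think this (D2: {d2})",
        f"f. you will think something else (P2: {p2})",
    ]
    return "\n".join(lines)


# Slot values for (14'), as proposed in the text.
example_14 = render_explication(
    p="S/he has a cold",
    d="I told you s/he can't come because s/he has a cold",
    d2="you don't know why s/he can't come",
    p2="you know it is because s/he has a cold that s/he can't come",
)
print(example_14)
```

Only P, D, D2, and P2 vary from one example to the next; the wording of lines a through f stays constant, which is what allows a single explication to cover reasons, excuses, reminders, and answers to questions.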

The aa1maa3-equivalent tone on “cold” sounds natural, and it is proposed to express the same meaning. The next four examples are all from Yip and Matthews (2001). The context of the first one is a response to the question: “How come she’s still not back?”

(15′)
Zung6 hoi1-gan2 wui2 aa1maa3.
still open-PROG meeting AAMAA
“She’s still in the meeting.”

For this and some of the remaining examples in this section, I will not say what P is, which is always equal to the sentence to which aa1maa3 or its English equivalent is attached. There was not enough context given for (15′) to know what D is, but a likely possibility is that the listener had previously been told about the meeting. The use of the adverb “still” indicates that the speaker assumed the listener to have knowledge about the meeting. D is likely something like “You know she’s been attending a meeting and that she hasn’t yet returned.” Based on this knowledge, P can obviously be known. D2 again stems from the listener’s question and is something like, “You


don’t know why she’s still not back.” The P2 that the speaker wants the listener to think instead of D2 is this: “Still being in the meeting is the reason she’s not back.” The context of the next example is apparently right after a child has refused or neglected to address an older person using the appropriate kinship term.

(16′)
Lei5 jiu3 giu3 jan4 aa1maa3.
2s need call person AAMAA
“You need to address people/this person.”

In (16′), D is straightforward based on what is known about Hong Kong Chinese parents’ child-rearing behavior. It is something like “I have told you many times to address people.” There are two possible contexts for this example, which will be contrasted in Sect. 6.1.3. For now, let us assume that the adult to be addressed is still present. Since the child did not address the adult, the D2 that the child is assumed to be thinking based on contextual evidence is likely this: “You don’t need to address this person.” Yip and Matthews presented (16) as an example of aa1maa3 being used to teach or remind people of rules and facts. Based on the explication in (26), aa1maa3-attachment works as an appeal to get the listener to change from thinking D2 to instead think P2, which in this case is something like “You need to address this person.” This is something more than just teaching or reminding. It is an attempt to persuade the child and influence his or her beliefs. The context in (17′) is one in which a parent is trying to excuse his or her child’s behavior.

(17′)
Keoi5 zung6 sai3 aa1maa3. Dim2 sik1 gam3 do1 je5 aa3?
3s still small AAMAA how know so much thing SFP
“S/he’s still young. How could s/he know much?”

The listener can know P (“S/he’s still young”) from knowing D, which is either having been told the age of the child and/or seeing that the child is young. D2 is probably something like “It is bad for this child to behave this way.” The speaker wants the listener to change from thinking D2 to instead think (P2: it is acceptable for this child to behave this way because s/he’s still young). The next example is a bus driver telling a passenger that he or she has gotten on the wrong bus; it is a different bus from the one that goes to where he or she wants to go.

(18′)
M4-hai6 sap6 hou6 aa3, jing1goi1 daap3 jat1 hou6 aa1maa3.
NEG-be ten number SFP should ride one number AAMAA
“You should take bus number one, not number ten.”

For (18′), D could either be “This is common knowledge” or something like “It says this on the bus signs.” After stating P (“You should take bus number one”), the speaker said “not number ten,” making it obvious that D2 is, “You should take number ten”; the speaker wants the listener to think something else instead: “You should take number one.” In this case, P and P2 are the same.


The remaining examples are all from Lee and Law (2001). All of the aa1maa3-suffixed sentences in (19) to (23) are responses to questions, as were examples (14) and (15). In such a context D2 comes from the question, that is, the question’s expression of a lack of knowledge. For wh- questions, D2 is “You don’t know why/where/who/etc. …” For polar questions, D2 is “You don’t know if …” By using aa1maa3 or its English equivalent, the speaker is saying to the listener: “If you think about this (P), you will no longer think you don’t know the answer to your question.” The first of Lee and Law’s (2001) examples that I discussed is a response to the question, “Why is s/he so happy?”

(19′)
Keoi5 jeng4-zo2 maa5 aa1maa3.
3s win-PERF horse AAMAA
“S/he won a bet on a horse.”

Here D is perhaps: “Someone told you that s/he won a bet on a horse.” The listener’s question creates D2 in the mind of the speaker: “You don’t know why s/he’s so happy.” The speaker wants the listener to think something else: (P2: you know that winning the horse bet is the reason s/he is so happy). Example (20) was a response to the question, “Where did you go yesterday?” D is perhaps “I told you this before” or “I go there every day.” D2 is “You don’t know where I went yesterday,” and P2 is “You know I went to school yesterday.”

(20′)
Ngo5 kam4jat6 faan1-zo2 hok6haau6 aa1maa3.
1s yesterday return-PERF school AAMAA
“I went to school yesterday.”

Example (21) is an answer to “Who won the Mark-six (lottery)?” D is something like “I/Someone told you Mary won.” D2 is “You don’t know who won the Mark-six,” and P2 is “You know Mary won.”

(21′)
Mary jeng4-zo2 aa1maa3.
Mary win-PERF AAMAA
“Mary won.”

Example (22) is an answer to “What did daddy put on the bookcase?” D is probably “You saw him put a vase on it.” D2 is “You don’t know what daddy put on the bookcase,” and P2 is “You know he put a vase on it.” (22′)

Keoi5 fong3-zo2 go3 faa1zeon1 aa1maa3.
3s put-PERF CL vase AAMAA
“He put a vase (on it).”

Example (23) is the answer to the question “Are you going to the meeting in a little while?” D is what I show in parentheses as a follow-up statement to the answer:


“I just told you.” D2 is “You don’t know if I am going to the meeting,” and P2 is “You know I am/am not going to the meeting.” (23′)

M4/Hoi1 aa1maa3. (Ngo5 tau4sin1 mai6 gong2-zo2 lo1.)
NEG/open AAMAA 1s just MAI say-PERF LO
“Yes/No . (I just told you.)”

The two responses in (24) are the two possible replies to the choice-type question: “Are you coming to the meeting later or going home?” For both a and b, D is the same as it was for (23): “I just told you.” D2 is “You don’t know whether I am coming to the meeting later or going home,” and P2 is “You know I am going to the meeting” and “You know I am going home,” respectively for a and b. (24′)

a. Heoi3 hoi1 wui2 aa1maa3. (Ngo5 tau4sin1 mai6 gong2-zo2 lo1.)
    go open meeting AAMAA 1s just MAI say-PERF LO
    “I’m going to the meeting. (I just told you.)”
b. Faan1 uk1kei2 aa1maa3. (Ngo5 tau4sin1 mai6 gong2-zo2 lo1.)
    return home AAMAA 1s just MAI say-PERF LO
    “I’m going home . (I just told you.)”

In this last example from Lee and Law (2001) in (25), I added the generic subject “you” in the translation because I think the hypothetical teacher is more likely to be providing information than issuing a command, which means the sentence is more likely to be a declarative than an imperative. (25′)

Zing3ming4 li1 jat1 bou6fan6 sin1 aa1maa3.
Prove this one part first AAMAA
“You prove this part first .”

Here D is “I taught you before to prove this part first.” D2 is something like “You don’t know how to do/approach/start this problem,” and P2 is “You know to prove this part first.” The first example from the data is (28), where P is “I didn’t know (that it was fake).” The speaker assumes this can be known from knowing D, which is probably something like “I (or people in general) wouldn’t buy something if I knew it was fake.” The listener created a D2 in the mind of the speaker by incredulously asking “You still bought it (even though it was fake)?” The speaker thinks that after the listener thinks about P, then he will no longer think this (D2: I was stupid enough to buy something that I knew was fake); the listener will instead think something else (P2: my having bought it is understandable because I didn’t know at the time that it was fake). (28′)

Ngo5 m4-zi1 aa1maa3, maai5 ge3 si4hau6.
1s NEG-know AAMAA buy PRT time
“I didn’t know when I bought it.”


In the next example where two people are talking about whales, the listener has just said that she has no interest in seeing whales. The speaker then expressed an interest in going on a tour to see whales and wanted the listener to understand why. Here P is “I’ve never seen a whale that big,” which can be known from knowing D, which is that the speaker just said she would like to see some whales. (29′)

Jan1wai6 mei6 gin3-gwo3 gam3 daai6 tiu4 king4jyu5 aa1maa3.
because not yet see-EXP so big CL whale AAMAA
“Because I’ve never seen a whale that big.”

As with example (28), the use of aa1maa3 in (29) is an appeal for understanding. In this case, however, the speaker does not want the listener to understand why she did something that seems stupid (i.e., to buy something that was fake), but rather why she wishes to do something toward which the listener has just expressed a lack of interest. The speaker thinks that if the listener thinks about her never having seen a whale that big (i.e., P), then she (the listener) will no longer think this (D2: my going on such a tour is a bad idea). Instead she will think something else (P2: my going on such a tour is a good idea). The D2 of this discourse stems from the listener having said she had no interest in whales, implying that she probably thinks anyone who goes on such a tour would be wasting their time and money.

In the third example from the data, the speaker has just told the listener that a former classmate of theirs plans to quit his or her job. The speaker wants the listener to understand that the former classmate had a good reason for doing so, which is this (P: he found a school that’s gonna hire him). We have to speculate as to what the D is that leads to knowing P, but it is probably something like “our classmates commonly get hired by schools.” Choosing this to represent D is supported by the fact that after the speaker said P, the listener said, Go3-go3 dou1 gaau3 syu1 (CL-CL all teach book) “Everybody’s teaching.” D2 is perhaps “he may have quit his job for no good reason,” and what the speaker wants the listener to think instead of D2 is P2, which is something like “He had a good reason for quitting his job.” (30′)

Keoi5 wan2 dou2 hok6haau6 ceng2 keoi5 aa1maa3.
3s find achieve school hire 3s AAMAA
“(Because) s/he found a school that’s gonna hire him/her.”

The following subsection contrasts the uses of lo1 versus aa1maa3 and their English equivalents, providing more evidence to support the validity of their proposed NSM explications.

6.1.3  Summary and Analysis

In all of the examples above for lo1 and aa1maa3 and their English equivalents, the NSM explications were explained for contexts where they are acceptable. This was done in order to see if the explications can account for what, on the surface, appears to be multiple meanings and functions, or in other words, to demonstrate that a single definition works for every context in which these SFPs and their English counterparts occur. It is important and significant to note that I, along with other native English speakers consulted, recognize both of the lo1- and aa1maa3-equivalent tonal morphemes that were used by the informants in their mimic translations. I use these tones myself and believe that they have the meanings expressed in the NSM explications that are proposed for them in (8) and (27). It is of course possible that these are not exact equivalents between English and Cantonese but are merely very close approximations. Regardless, it is an interesting discovery, and these tones can now be studied independently of the Cantonese SFPs used to discover them. The accuracy of the explications, and the degree to which they are the same in both languages, can be determined by their ability to explain the linguistic facts in both Cantonese and English. Even more informative are tests on contexts where they are predicted to be unacceptable based on their proposed meanings.

This subsection demonstrates that the explications of lo1 and aa1maa3 given in (7) and (26), respectively, appear to succeed at accounting for the contexts where these two SFPs can and cannot be used. The acceptability of attaching one or the other of these SFPs to a sentence is determined by the status of P, P2, D, and D2. If no D exists in the context, then neither lo1 nor aa1maa3 is acceptable because neither is interpretable without this antecedent from the discourse. This relates to Fung’s (2000) observation that lo1 cannot attach to a sentence that begins a conversation. I agree with this and argue that the same is true of aa1maa3. Based on the explications, the following predictions about acceptability can be made. 
If D (i.e., evidence of P) exists in the discourse context, but not D2 (i.e., a listener-oriented stance that the speaker wants to change), then only lo1 is acceptable because aa1maa3 additionally requires the discourse element D2 as an antecedent in order to license its use. When a D2 does exist in the context, then aa1maa3 is acceptable to use. The SFP lo1 is also acceptable to use if there is a D2, but only when P and P2 are approximately the same, because in such cases, aa1maa3 does not express a meaning that drastically differs from lo1—which SFP the speaker chooses to use in such cases depends on whether he or she simply wants to point out that the listener can know P (i.e., lo1), or whether he or she wants to persuade the listener to think that P2, in which case aa1maa3 is required. If, on the contrary, P2 is significantly different from P, then lo1 is not very appropriate and aa1maa3 should be used instead. The greater the difference between P and P2, the less appropriate lo1 sounds. These predictions about their contextually based distributions can be shown as follows, where the question marks indicate semantic unacceptability: (31)

a. ??lo1/ ??aa1maa3    if there is no D
b. lo1/ aa1maa3        if P and P2 are approximately the same
c. ?lo1/ aa1maa3       if P and P2 differ significantly
d. lo1/ ??aa1maa3      if there is no D2
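The four rows of (31) amount to a small decision rule over three properties of the context. As a rough illustration only, that rule could be sketched as follows. The names `Judgment` and `predict` are hypothetical and not part of the analysis, and the boolean `p2_differs` is a simplifying assumption: the difference between P and P2 is described above as a matter of degree, not a yes/no property.

```python
from enum import Enum


class Judgment(Enum):
    """Acceptability marks in the notation of (31)."""
    ACCEPTABLE = ""
    MARGINAL = "?"
    UNACCEPTABLE = "??"


def predict(has_d: bool, has_d2: bool, p2_differs: bool = False) -> dict:
    """Return the predicted judgment for lo1 and aa1maa3 in one context.

    has_d      -- the speaker assumes the listener knows some D leading to P
    has_d2     -- the context supplies a stance D2 the speaker wants to change
    p2_differs -- P2 differs significantly from P (boolean simplification)
    """
    if not has_d:        # row (31a): neither particle is interpretable
        lo1, aa1maa3 = Judgment.UNACCEPTABLE, Judgment.UNACCEPTABLE
    elif not has_d2:     # row (31d): aa1maa3 lacks its D2 antecedent
        lo1, aa1maa3 = Judgment.ACCEPTABLE, Judgment.UNACCEPTABLE
    elif p2_differs:     # row (31c): lo1 is marginal, aa1maa3 is preferred
        lo1, aa1maa3 = Judgment.MARGINAL, Judgment.ACCEPTABLE
    else:                # row (31b): both particles are acceptable
        lo1, aa1maa3 = Judgment.ACCEPTABLE, Judgment.ACCEPTABLE
    return {"lo1": lo1, "aa1maa3": aa1maa3}


def show(label: str, **context) -> str:
    """Format one prediction in the style of a row of (31)."""
    marks = predict(**context)
    return f"{label}. {marks['lo1'].value}lo1/ {marks['aa1maa3'].value}aa1maa3"
```

Calling `show("c", has_d=True, has_d2=True, p2_differs=True)` reproduces the marks of row (31c). The absence of D is checked first because, on the account given here, neither particle can be interpreted without it, whatever the status of D2.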

To the extent that the English equivalents of these particles have the same meanings, it is predicted that the same contextually based distributions will apply, which can be illustrated as follows, based on the explications in (8) and (27):

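(32)

a. ??[lo1-equivalent tone]/ ??[aa1maa3-equivalent tone]    if there is no D
b. [lo1-equivalent tone]/ [aa1maa3-equivalent tone]        if P and P2 are approximately the same
c. ?[lo1-equivalent tone]/ [aa1maa3-equivalent tone]       if P and P2 differ significantly
d. [lo1-equivalent tone]/ ??[aa1maa3-equivalent tone]      if there is no D2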

Question marks are used rather than asterisks because these minimal-pair distributions are based on semantics. The distributions are obviously not due to the syntactic properties of the SFPs because both SFPs are always sentence final, both SFPs attach to the same types of clauses, and none of the examples include any other SFPs, which means there are no potential co-occurability problems in relation to other SFPs. There is only one question mark in front of lo1 and its English equivalent in (31c) and (32c) because the acceptability of lo1 will vary depending on the extent to which P and P2 differ from each other, plus the extent to which the context indicates that the purpose of the utterance is to persuade the listener to think P2. In such cases, it will sound odd to merely say, “You can know P.” This is illustrated with examples below. Starting with contexts that have no D, consider examples (23) and (24). These are the examples that Lee and Law (2001) used to demonstrate that aa1maa3 cannot attach to the answer of an A-not-A or choice-type question. Their conclusion was based on the idea that speakers use A-not-A questions when they have no knowledge as to whether the answer is “A” or “not-A” and that they use choice-type questions when they have no knowledge as to which of the choices provided in the question is correct. If the speaker believes that the listener indeed has no knowledge of anything that can lead to knowing the answer to the question, then there is no D, and neither lo1 nor aa1maa3 can be attached to the answer. However, if the speaker believes the listener does have some knowledge that can lead to knowing the answer, then either lo1 or aa1maa3 can be attached to the answer. In (23′) and (24′), I demonstrated that aa1maa3 can be used in Lee and Law’s examples if the speaker thinks he or she has previously told the listener what the answer is—and in both of those examples, aa1maa3 could be replaced by lo1. 
These examples probably require the addition of “I just told you” in order to allow the use of aa1maa3 or lo1, but this is not surprising since the listener had just used a question that indicated a lack of knowledge about P. Consider the following choice-type question in (33) that can be answered very naturally using either lo1- or aa1maa3-attachment. In this example, it is easier to think of a D than it was in Lee and Law’s example. This is because it is natural for a speaker to assume that a listener should possess some knowledge related to what day of the week it is—D could be “Yesterday was Thursday,” for example. (33)

A: Gam1jat6 sing1kei4sei3 ding6 sing1kei4ng5 aa3?
    today Thursday or Friday SFP
    “Is today Thursday or Friday?”
B: Sing1kei4ng5 lo1/ aa1maa3.
    Friday LO/ AAMAA
    “It’s Fri- / day.”


Contrast (33) with a context in which two friends are driving somewhere and get lost. They stop just before a T-intersection, not knowing whether to turn right or left. The passenger gets out of the car and asks a store owner for directions to their destination. After getting back into the car, the following question-answer dialogue takes place: (34)

Driver: Zyun3 zo2 ding6 jau6 aa3?
    turn left or right SFP
    “(Do we) turn right or left?”
Passenger: Zyun3 zo2 ??lo1/ ??aa1maa3
    turn left LO AAMAA
    “(We turn) left ?? /?? .”

It is inappropriate for the passenger to attach either lo1 or aa1maa3 to his or her answer because he or she knows that the driver, who stayed in the car, did not hear the store owner’s directions. The passenger therefore knows that the driver has no knowledge of any D that could lead to knowing P: “we turn left.” The same is true of the two SFPs’ English equivalents in the English version of this dialogue; using either of those tones on the word “left” sounds inappropriate. It is probably the fact that both SFPs require the existence of a D that led Lee and Law (2001) to conclude that aa1maa3 cannot attach to the answers of certain types of questions. Rather than being related to the question type, however, I propose that acceptability instead depends on whether or not the speaker assumes that the listener knows some D that can lead to the knowledge of P (i.e., the answer to the question). It is difficult to find examples that have no D because D is something that exists in the mind of the speaker and therefore can potentially be part of almost any context. However, when the listener cannot understand why he or she is expected to know something which can lead to the knowledge of P, then he or she will consider it inappropriate for the speaker to use either lo1 or aa1maa3. We saw an example of this for lo1 with the DJ in dialogue (4), who did not know why he should be expected to have any knowledge related to the name of the boy. We also see this in Lee and Law’s conclusion regarding A-not-A and choice-type questions because listeners who ask such questions probably will not think they should be expected to know something that can lead to knowing the answer; otherwise they would not ask the question. Perhaps the easiest way to demonstrate the need for a D is to construct a context that uses lo1, aa1maa3, or either of their English equivalents to start a conversation. 
Fung (2000) said that a statement “P lo1” cannot begin a conversation but must be preceded by something linguistic or nonlinguistic. This is also true for aa1maa3, as well as for the English equivalents of lo1 and aa1maa3. This supports the argument that some discourse element D is required to license the use of either particle or their counterparts in English.

Next, I will discuss the contexts of (31b) and (32b), where P and P2 are approximately the same. The dialogues of (14), (15), and (18) through (25) are each examples of this. As predicted, it would be acceptable to replace aa1maa3 with lo1 in each of those dialogues. It is up to the speaker to determine which meaning he or she prefers to express. It should be noted that lo1 and its English equivalent often sound more abrupt than aa1maa3 and its English equivalent and may therefore sound ruder in comparison. This makes sense, since lo1’s meaning is similar to matter-of-factly saying “you should know this (P),” while aa1maa3’s meaning additionally includes, “if you think about this (P), you will change your mind about something related to it (D2).”

In the discourse contexts of (31c) and (32c), there is an obvious D2 and P2 is significantly different from P. When this is the case, it sounds odd, rude, or nonsensical to simply say “You can know P,” and it therefore sounds inappropriate to replace aa1maa3 with lo1. Dialogues (16) and (17) are examples of this. The dialogue in (16) is interesting because it can be construed with a P2 that is either different from or the same as P, depending on whether or not the person to be addressed by the child is still present. If the person is present then jan4 (“person”) will be interpreted to mean “this person,” but if the person is not present, then jan4 will be interpreted to mean “people in general.” Therefore, if the person is still present, then P is “You need to address this person”; if the person is no longer present, then P is “You need to address people (in general).” Let us consider the former case first, where P is interpreted as an instruction to address this person now. In this context, using lo1 is acceptable, though less polite than using aa1maa3. Using these two SFPs in (16) would mean the following, with the meaning of lo1-suffixing represented by the first two lines only:

you can know this (P: you need to address this person)
because you know something else (D: I have told you many times to address people)
I want you to think about this (P) now
after this, you will not think this (D2: you don’t need to address this person)
you will think something else (P2: you need to address this person)

The P and P2 in this context are the same, which is why both particles are allowed. In either case, the utterance functions to remind the child to do something, but aa1maa3 is much more likely to be used because it additionally functions to persuade the child to think for himself or herself that he or she needs to perform this culturally required act. Using lo1 after the person to be addressed has left would sound odd. In this case, P can only mean “You need to address people (in general).” The D2 in this case relates to the child not having addressed someone prior to the parent’s making this utterance, and the P2 that is related to this D2 is quite different from P:

(P: you need to address people (in general))
I want you to think about this (P) now
after this, you will not think this (D2: you did not need to address that person)
you will think something else (P2: you needed to address that person, but you did not do so)
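The relationship between the two paraphrases used for (16) can be made concrete with a short sketch: the lo1 reading consists of the first two lines, and the aa1maa3 reading adds three more that refer to D2 and P2. The code below is an illustration only. It reproduces the informal paraphrase given in this discussion, not the full NSM explications in (7) and (26), and the names `Context` and `explicate` are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Context:
    """The four discourse elements referred to in the paraphrases."""
    p: str        # the proposition the particle attaches to
    d: str        # what the listener already knows that leads to P
    d2: str = ""  # the stance the speaker wants the listener to drop
    p2: str = ""  # what the speaker wants the listener to think instead


def explicate(particle: str, ctx: Context) -> list[str]:
    """Build the paraphrase lines for 'lo1' or 'aa1maa3' in a context."""
    if particle not in ("lo1", "aa1maa3"):
        raise ValueError(f"unsupported particle: {particle}")
    # Both particles share the evidential part: P is knowable from D.
    lines = [
        f"you can know this (P: {ctx.p})",
        f"because you know something else (D: {ctx.d})",
    ]
    # Only aa1maa3 adds the persuasive part, which needs D2 and P2.
    if particle == "aa1maa3":
        lines += [
            "I want you to think about this (P) now",
            f"after this, you will not think this (D2: {ctx.d2})",
            f"you will think something else (P2: {ctx.p2})",
        ]
    return lines
```

Filling `Context` with the values given for the case where the person is still present yields the five-line paraphrase for aa1maa3, and its first two lines are exactly the lo1 reading. This mirrors the claim that aa1maa3 includes what lo1 expresses and adds the appeal to change the listener's stance.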

To recap, when the adult to be addressed is present, using lo1-attachment on (16) tells a child that he or she knows he or she needs to address this person and can function to get the child to do so. However, using aa1maa3 is far more likely because this works to get the child to think for himself or herself that he or she needs to do this; aa1maa3 is therefore better for teaching the child because it functions to influence his or her thoughts. In the other scenario, when (16) is said after the adult has left, the utterance will be interpreted as a reprimand for not having addressed the person. In this case, it sounds odd to only say, “you know you need to address people.” Instead, using aa1maa3 to get the child to think he or she should have done something that he or she neglected to do is much more suitable to the context. In both cases, the English equivalent of aa1maa3 seems better to me than the equivalent of lo1. This is harder to prove in English, however, because emphatic intonation, which sounds similar to lo1-equivalent intonation, is appropriate to use in both contexts. In Cantonese, emphasis would be expressed by lengthening the rime of the verb giu3 (“address”).

For the English equivalents of these two SFPs, consider a comparable context that is more appropriate to an English-speaking culture. A parent and child arrive at a restaurant and run into some friends who are just leaving the restaurant. The friends give the child a piece of candy and the child takes it, unwraps it, and eats it without saying “thank you.” The parent then says to the child, “You need to say thank you.” Just as in our Chinese example, it seems that both lo1- and aa1maa3-equivalent tones can be used on the word “thank” of this sentence if the people are still present, with the aa1maa3-equivalent tone sounding more polite and appropriate. If, on the contrary, the parent said this after the people left, then using aa1maa3-equivalent intonation sounds appropriate, while lo1-equivalent intonation sounds odd; again, however, emphatic intonation would be appropriate. The other dialogue with a P2 that is different from P is (17). 
After saying P (“S/he’s still young”), the speaker says, “How can s/he know so much,” which indicates that P was said for the purpose of influencing the listener’s views about the child’s behavior. It does not make sense for P2 to be approximately the same as P in this context,6 and I propose that P2 is something like “It is acceptable for this child to behave this way because s/he’s still young.” Since P2 is significantly different from P, lo1 is not as appropriate as aa1maa3 in this context. According to my intuition, the same is true of lo1-equivalent intonation for the English translation of this dialogue. (Note that lo1 and its English equivalent sound more natural if “S/he’s still young” is said in response to a question like “Why did s/he do that?” In this case, P and P2 would be the same.)

6  Readers should recall that P2 is a belief or stance that the speaker assumes the listener does not hold, and therefore P2 cannot be “S/he’s still young,” because the listener clearly knows and believes this.

The following constructed dialogue further illustrates the contrast between a context in which P is the same as P2, and one in which P is different from P2. Consider a bilingual husband and wife who are each native speakers of the other’s L2. They have the practice of alternating the language that they speak in order to keep each other proficient in their L2s; Mondays, Wednesdays, and Fridays are Cantonese days, and Tuesdays, Thursdays, and Saturdays are English days. They have been doing this for a while and sometimes one or the other forgets which language they are supposed to speak on a given day. One Friday morning, the first thing the husband says to the wife is, “Do we speak Chinese or English today?” The wife responds in one of two ways shown in (35a) and (35b): (35)

Husband: Gam1jat6 gong2 Zung1man4 ding6 gong2 Jing1man4 aa3?
    today speak Chinese or speak English SFP
    “Do we speak Chinese or English today?”
Wife: a. Zung1man4 lo1/ aa1maa3.
    Chinese LO AAMAA
    “Chinese / .”
b. Gam1jat6 sing1kei4ng5 ?lo1/ aa1maa3.
    today Friday LO AAMAA
    “Today’s Fri- ? / day.”

For the wife’s response in (35a), P2 is roughly the same as P, but in (35b) it is different. In (35b), the use of aa1maa3 expresses: “after you think about this (P: today’s Friday), you will not think this (D2: you don’t know whether we speak Chinese or English today); you will think something else (P2: you know we speak Chinese today).” If lo1 is used in (35b), then it only expresses, “You can know this (P: today is Friday) because you know something else (D: yesterday was Thursday).” This does not directly address the husband’s question and is therefore not very appropriate. If, as in dialogue (33), the husband had asked, “What day of the week is it?”, then attaching lo1 to the answer “Today’s Friday” would be fine because P and P2 would be the same. According to my judgment, the same distribution applies to the English equivalents of lo1 and aa1maa3 in this dialogue. Pragmatically, lo1-suffixing works here because the wife knows the husband should be able to make a connection between the day of the week and which language they are to speak, but the effort required to make this connection makes lo1 less appropriate than aa1maa3.

The following constructed example in (36) also demonstrates a context in which there is an obvious D2 that is related to a P2 that is different from P. It again shows that lo1-attachment, though logically possible, does not fit into such dialogues very well. The context is right after someone drops some food onto the table at a restaurant and then picks it up to eat it. This person’s friend says, “You’re still gonna eat that?” and the person gives a reply using either lo1- or aa1maa3-attachment: (36)

A: Lei5 zung6 sik?
    2s still eat
    “You’re still gonna eat that?”
B: Zoeng1 toi2 gon1zeng6 ?lo1/ aa1maa3.
    CL table clean LO AAMAA
    “The table’s clean ? / .”


In speaker B’s reply, lo1-attachment is technically logical because it is easy to think of a D, which could be something like “You see that the table is clean.” However, using lo1 sounds odd because it only expresses that the table is obviously clean without connecting this to speaker A’s question. Speaker A’s question implied that he or she thinks it is bad to eat food after it has dropped onto the table. This creates an obvious D2 in the mind of the speaker. It therefore makes sense for the speaker to use aa1maa3, which means, “After you think about this (P: the table is clean), you will not think this (D2: the table is dirty and food that has dropped onto it shouldn’t be eaten); you will think something else (P2: the table is clean and it is okay to eat something that has dropped onto it).” I indicate in the English translation of (36B) that the English equivalent of lo1 also sounds odd. Again, this is not as obvious as it is in Cantonese because it can be mistaken for emphatic intonation, which is acceptable in this context and which has a very similar form. Emphatic meaning expressed by lengthening of the adjective (gon1zeng6) is also acceptable to use in the Cantonese version of this dialogue. I believe that a high-falling tone on “clean” in English sounds more acceptable than using lo1 in Cantonese because a high-falling tone will be interpreted as emphatic intonation. Note that it is only after having discovered lo1-equivalent intonation that this distinction between an emphatic tone and an evidential tone can be argued for.

In the next constructed example in (37), both the father’s and the daughter’s replies have the same structure (i.e., hai6 lo1), which is what Lee and Law (2001) referred to as the “agreement formula.” Interestingly, only the daughter’s reply can be construed as agreement, and therefore only her reply can acceptably use lo1; the father’s reply can only use aa1maa3. 
While the daughter’s reply can acceptably use either lo1 or aa1maa3, the expressed meaning of one versus the other is drastically different: (37)

Father (to daughter): Waa4, Lei5 jyut6 lei4 jyut6 fei4 aa3.
    PRT 2s more come more fat SFP
    “Wow, you’re getting fat.”
Mother (to father): Lei5 zou6 me1 waa6 keoi5 fei4 aa3?
    2s do what say 3s fat SFP
    “Why’d you say she’s fat?”
Daughter: Hai6 [PA] lo1./ [PB] aa1maa3.
    be LO AAMAA
    “Yeah , [PA].” / “I am (getting fat).”
Father: Hai6 [PA] ??lo1./ [PB] aa1maa3.
    be LO AAMAA
    “Yeah ?? , [PA].” / “She is (getting fat).”

It makes sense for the daughter, but not the father, to express agreement using hai6 lo1. When hai6 lo1 is said in response to a question, the null PA will be interpreted as “That’s a good question.” The father’s statement is what triggered the mother’s question, and her question challenged his statement. This therefore commits the father to a stance that prevents him from agreeing that the mother’s question was a good one; it makes no sense for him to say: “That’s a good question.” This is why lo1 is preceded by two question marks on the father’s response.7 The D in the mind of the daughter is either something like “I don’t look fat” or if she believes she does look fat, then it could be something like “It is not nice to say someone looks fat; you shouldn’t say I look fat.” When lo1-attachment is used in the “agreement formula,” there is no D2 about which the speaker wants to influence the listener’s beliefs. Without any D2, aa1maa3 is uninterpretable. It is nonsensical for the father or daughter to respond to the mother’s question with the meaning: “I want you to think about this (P) now (that is a good question); after this you will not think this (D2: ?)” I cannot think of anything from this context that could function as D2 here if PA is construed as “That’s a good question.” This explains why aa1maa3 can never be used to express agreement. If the father or daughter uses aa1maa3 instead of lo1, then P changes and becomes what is shown as PB in (37), which is something like “She is/I am getting fat.” (“She is” vs. “I am” represent the father’s and daughter’s perspectives, respectively.) Now the interpretation of the elided proposition is based on the father’s original statement and is knowable from a different D, which is related to a perceived change in the daughter’s physical appearance. 
This is no longer an agreement formula, and stemming from the mother’s question, there is now a D2: “You (the mother) think she is/I am not getting fat.” With aa1maa3 attachment, the daughter is saying, “I want you to think about this (PB) now (I am getting fat); after this, you will not think this (D2: I’m not getting fat), you will think something else (PB-2: I am getting fat).” The father can also say the same thing from his perspective, replacing “I am” with “She is.” It is beyond question that in a context such as this, changing hai6 lo1 (be LO1) to hai6 aa1maa3 (be AA1MAA3) changes the meaning from an expression of agreement to one of persuading the listener to believe something—in this case, to believe that the daughter is getting fat. This is clear evidence that there is a null proposition present in “hai6 + SFP” sentences and that changing one SFP for another changes the interpretation of the null proposition’s semantic content. The fourth and final situation is (31d) and (32d), which is a context that has no D2. In this case, aa1maa3-attachment should not be allowed because its explication requires a D2 in order for it to be interpretable. It was just demonstrated above in relation to (37) that what Lee and Law (2001) referred to as the “agreement formula” is an example of this, which results in the following distribution: hai6 + lo1/ ??aa1maa3 (“Yeah / ?? ”).

7  The father could use lo1-attachment in a joking manner but would need to immediately restate the mother’s question, addressing it to himself: “Yeah , [PA]. Why did I say she’s getting fat?” The proposition P is still “that’s a good question,” and this is in fact a conceivable tactic that a quick-thinking father could use in an attempt to retract his initial statement.


6.2  Two Question Particles

This section discusses the meaning of the two Cantonese question SFPs me1 and aa4. NSM explications are proposed for the two particles, and their English intonational equivalents are presented based on the ambilinguals’ translations. The discussion and analysis of the two tones draws on the literature’s discussion of English rising question tones, especially Gunlogson’s (2003) detailed study of what she referred to as rising declaratives. Adopting her analysis, it is assumed that rising question tones do not type the sentence as an interrogative. Throughout this section, authors will be quoted in ways that use the terms “interrogative” and “question” interchangeably, but readers should keep in mind that here these SFPs are analyzed as question particles only, that is, not as interrogative particles (see Chap. 7 for a detailed discussion). Gunlogson proposed a definition for the sentence-final rising tone in English, and in this section, I will propose a refinement of her definition, giving distinct meanings to a mid-rise versus a high-rise tone. The Cantonese interrogative SFP maa3 has been referred to as a question particle, but Cantonese linguists agree that the particle maa3 types the sentence as an interrogative. Based on the assumption that maa3 is strictly a grammatical particle with no semantic content, it is considered to be very different from the two question particles under discussion here and is therefore excluded from the discussion.

6.2.1  The Particle me1

The NSM Explication of me1  Most dictionaries use the words “surprise” and/or “doubt” to define me1. Meyer and Wempe (1947: 375) said that me1 is an “[i]nterrogative expressing surprise.” P. Huang (1970: 421) defined me1 as “indicating surprise, doubt, etc.” Ball (1888/1971: 114) said it is “interrogative, or expressing some surprise as well, as—‘Is it so?’” Lau (1977: 558) said that it “transforms statements into questions that indicate doubt or surprise.” Zhang (1999) said that me1 questions the validity of something, which is a form of doubt. In the literature, several authors defined me1 in ways that are similar to these dictionary definitions (e.g., Chan 1955: 283; Huang and Kok 1973: 94; Lau 1973: 71; Yip and Matthews 2000: 130). Boyle (1970a: 64) said, “me1 is an interrogative sentence suffix indicating [a] surprised question” and then paraphrased its meaning by saying that it turned a statement into a question: “with the force of ‘What?! I can hardly believe it!’” Paraphrases of this sort are a step in the direction of an NSM explication and are very useful to non-native speakers. Boyle gave an example, the context of which was a question addressed to a listener who had just claimed to know which variety of Chinese some people were speaking. The speaker understood this to mean that the listener was claiming to know Shanghainese and said the following: (38)

Lei5 sik1 gong2 soeng6hoi2waa2 me1?!
2s know speak Shanghainese ME
“You can speak Shanghainese?!”


Boyle used an exclamation mark in addition to a question mark. This is a good way to distinguish this type of rising declarative from those that are equivalent to the question particle aa4, and I will use this convention for all of the English translations of me1 that follow. Baker and Ho (2006) gave an excellent description of the form of the English equivalent of me1, which is the only one of its kind for any SFP throughout the literature as far as I know. What makes their description unique is the fact that it describes an English intonational equivalent of an SFP in terms of its pitch, though it does so in informal, layman’s terms: “If you want to express great incredulity in a question in English (You can speak 57 languages fluently?!) you raise your voice almost to a squeak at the end of the question” (p. 40). To be more precise, me1-equivalent English intonation is a rise in pitch used on a declarative clause, changing it from a statement into a question that has a specific connotative meaning. Baker and Ho (2006: 40) went on to explain and paraphrase the meaning of me1, saying that it “indicates great surprise, astonishment, near disbelief, surely that’s not the case, is it?, do you mean to say that...?” Related to this, Matthews and Yip (2011: 360) said that me1 “denot[es] surprise and [is] used to check the truth of an unexpected state of affairs,” and they gave this example: (39)

Lei5 zou6-gwo gam3 do1 ci3 dou1 m4-sik1 ge3 me1?
2s do-EXP so many time still NEG-know PRT ME
“After doing it so many times you still don’t know?!”

Chan (2001: 59) contrasted me1 with maa3 by citing the following minimal-pair sentences originally from Deng (1991): (40)

a) Lei5 sik1 keoi5 maa3?
   2s know 3s MAA
   “Do you know her?”
b) Lei5 sik1 keoi5 me1?
   2s know 3s ME
   “You know her?!”

Chan said the following about (40a–b):

[S]entence [(40a)] is a fairly neutral, information-seeking question … In changing the particle to me1, sentence [(40b)] conveys the speaker’s startled reaction or surprise. The context in which it is uttered is not neutral; the speaker is seeking some kind of confirmation. It might, for instance, be asked by a young woman with a hint of jealousy to her boyfriend upon seeing an attractive stranger wave to them. (Chan 2001: 59)

The functions and meanings of me1 about which linguists are in virtually unanimous agreement can be summarized as (1) it is question forming and (2) it expresses surprise and/or doubt. This is very helpful, but because me1 is a discourse particle that links the sentence to the discourse, we are still missing a description of how this particle connects the proposition P to some element D in the discourse. Unlike the interrogative particle maa3, which forms neutrally intoned interrogatives, me1 is not neutral, as can be seen from Chan’s (2001) examples in (40a–b). The SFP me1 is used when the speaker has a particular belief (i.e., a presupposition) about the proposition (Kwok 1984). Capturing a good portion of the meaning expressed in my proposed explication for me1 below, Leung (1992/2005: 78) said that it “attaches to sentences which state the opposite of what the speaker knows or assumes, and the speaker is asking for confirmation” (translation that of the author). The use of me1 implies that, prior to the time of speech, the speaker believed that the proposition P “is not the case.” A me1-suffixed sentence is uttered after a speaker has become aware of some form of evidence that P is the case. If the speaker has not been fully convinced (or not convinced at all) by this evidence, then he or she remains doubtful of the proposition at the time of speech. If the speaker has been fairly or fully convinced that the proposition is true, then his or her stance has been changed from disbelief to belief, and the speaker is thus surprised. The definition of me1 must include a discourse element D that acts as potential evidence to change the speaker’s mind about the proposition that the speaker believed to be false. Those speakers who are “surprised” are more convinced by the evidence D than are those speakers who remain “doubtful.” The explication I propose for me1 is as follows, where P is the proposition to which me1 attaches, and D is a discourse element that functions as a form of evidence that challenges the validity of P: (41)

P + me1 =
a. ngo5 lam2 ho2lang4 hai6 gam2joeng2 (P)
   1s think maybe be like this
   “I think maybe it’s like this (P)”
b. jan1wai6 jau5 je5 faat3sang1 (D)
   because have thing happen
   “because something happened (D)”
c. lei1 joeng6 je5 faat3sang1 (D) zi1cin4, ngo5 lam2: m4-hai6 gam2joeng2 (P)
   this CL thing happen before 1s think NEG-be this way
   “before this happened (D), I thought: it isn’t like this (P)”
d. ngo5 soeng2 zi1dou3
   1s want know
   “I want to know”
e. jan1wai6 gam2joeng2, ngo5 soeng2 lei5 gong2 di1 je5
   because like this 1s want 2s say CL thing
   “I want you to say something because of this”

The degree of possibility expressed by the prime “maybe” in line a. will vary. It depends on the degree to which the something that happened in line b. is taken as evidence to indicate that “it is like this.” As a result, me1 may express anything from extreme surprise to extreme doubt. Which stance the speaker maintains will be determined by the context, as well as by the speaker’s preconceived beliefs in relation to both D and P. It also appears that changes in voice or pitch qualities, as well as facial expressions, are used with me1 to express the degree of the speaker’s stance change. My impression is that small stance changes, which maintain doubt in the proposition, are accompanied by a lower pitch key and a scowl, while large stance changes, which express surprise that the proposition is likely to be the case, are accompanied by a higher pitch key, a raising of the eyebrows, and a widening of the eyes. Whether or not this is actually the case is left for future research to determine. Line e. is what makes me1 a question particle; the speaker is requesting information. Like polar questions, me1-suffixed questions are usually answered with either a positive or negative form of the verb, and their English equivalents are normally answered with “yes” or “no.” The English translation of the explication in (41) defines its English-equivalent tone, shown here in (42). The data that provide evidence of its equivalent form in English are presented in the following section: (42)

P + me1-equivalent intonation =
a. I think maybe it’s like this (P)
b. because something happened (D)
c. before this happened (D), I thought: it isn’t like this (P)
d. I want to know
e. I want you to say something because of this⁸

⁸ It is interesting and informative to compare this NSM explication with the one that Wong (2004: 782) proposed for the Singapore English particle meh:

a. at a time before now, I thought something
b. something happened now
c. because of this,
d. I think I can’t think like this anymore
e. I think I have to think like this (P)
f. I don’t know
g. I want to know
h. because of this, I want you to say something about it to me now

Wong’s explication for meh is notably similar to my explication of me1. This is not surprising since it has been proposed that meh was borrowed from Cantonese “as a package,” including its form, tone and meaning (Lim 2007: 463). I argue that my explication in (41) better captures the meaning of me1 than does Wong’s (2004) explication for meh, but this could of course be because the two particles do not have precisely the same meaning. The key difference between the two explications relates to the stance change of the speaker. Wong’s lines d. and e. indicate that meh expresses that the speaker has changed his or her stance from thinking something different from P to now thinking P. This contrasts with my line (42a) that indicates me1 expresses that the speaker maybe, but not necessarily, now thinks P. Wong’s explication, in essence, articulates a stance change from “I thought X before” to “now I can’t think X; I must think Y.” My explication, on the contrary, expresses a stance change related to a single proposition, going from “I thought not X before” to “now I think maybe X.”

In addition to “doubt” and “surprise,” some authors have said that me1 can also express “disbelief” (e.g., Kwok 1984: 88; S. Law 1990: 18). The explications in (41) and (42) technically do not express this meaning. Line c. expresses disbelief but as a prior stance only. The current stance expressed by me1 and its English equivalent is line a., which is not disbelief. If the speaker still maintains disbelief at the time of speech, then his or her stance has not changed from c. to a. and therefore does not mean exactly what the explication says. I propose that there are two ways in which me1 might be analyzed as an expression of disbelief. The first is based on pragmatics. In many cases, speakers use me1 in such a way as to indicate that their stance has not been affected much at all by the evidence D. In such cases, the “maybe” in line a. is extremely weak, and an interpretation that me1 is expressing disbelief is understandable. The second way that me1 can express disbelief involves a closely related particle that uses a high-falling rather than a high-level tone. I will call this related SFP me12 and tentatively propose that a core part of its meaning expresses: “it is not like this (P).” There is a clear difference between the pronunciations of me1 and me12, as well as a clear difference in meaning, which is why I conclude that they are two particles with different meanings. In addition to having a high-falling tone, the vowel of me12 is significantly lengthened. Rather than analyze these as two separate particles as I have done, H. Huang (1989: 414–5) suggested that intonation affects the meaning of me1, saying that “[w]hen it expresses a typical question, it is pronounced with a high level tone; when it is used to express a retort, then a high-falling tone is used” (translation that of the author). In this section, I will not try to settle the issue as to whether me12 is a separate particle or is a combination of me1 and a fixed form of intonation. The remainder of this section relates only to me1; how me12 relates to me1, and whether or not it has an English equivalent form, is left for future research.

The English Equivalent of me1 based on the data

This section discusses the form of the English pitch contour that is proposed to be equivalent in function and meaning to me1. The meanings of me1 and its English equivalent in relation to these dialogues will be discussed in the next section.
In the first dialogue with a me1-suffixed sentence, two speakers are talking about what a third person has chosen to study at university. Speaker B has just told speaker A that this person will study physiotherapy. (43)

A: Bin1gaan1 ju1 aa3, lam2-zyu6 gaan2?
   which-CL U(niversity) SFP think-ASP choose
   “Which university is he planning to choose?”
B: Poly ze1maa3, dak1.
   SFP only
   “PolyU is the only choice.”
A: Hai6 me1? Dak1 Poly jau5 dak1 duk6?
   be ME only have can study
   “Really?! PolyU is the only place you can study that?”

For this and every dialogue that follows, all of the informants translated me1 as a rising pitch contour. In each case, the contour begins on the nucleus of the sentence’s final intonational phrase and continues upward across the remaining syllables to the end of the phrase/sentence (Figs. 6.29 through 6.32).
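
Because the F0 figures plot pitch on a logarithmic Hz axis, equal vertical distances correspond to equal musical intervals, so the size of a rise can be compared across speakers with different pitch ranges. The following is a minimal sketch of that idea, assuming the standard conversion of 12 semitones per doubling of frequency. The function name `semitone_rise` is hypothetical, and the input values are the tick labels printed on the pitch axis of Fig. 6.29 in the source, not measurements of any informant's contour.

```python
import math


def semitone_rise(f0_start_hz: float, f0_end_hz: float) -> float:
    """Return the interval from f0_start_hz to f0_end_hz in semitones.

    Positive values indicate a rise, negative values a fall.
    Assumes the standard relation of 12 semitones per octave (doubling of Hz).
    """
    if f0_start_hz <= 0 or f0_end_hz <= 0:
        raise ValueError("F0 values must be positive")
    return 12.0 * math.log2(f0_end_hz / f0_start_hz)


# Tick labels from the pitch axis of Fig. 6.29 (illustrative input only).
axis_labels_hz = [122, 173, 245, 346]

# Interval between each pair of neighbouring tick labels.
steps = [
    semitone_rise(low, high)
    for low, high in zip(axis_labels_hz, axis_labels_hz[1:])
]
print([round(step, 1) for step in steps])
```

Each step between neighbouring tick labels comes out at roughly six semitones, that is, about half an octave, which is what one would expect from evenly spaced lines on a logarithmic axis.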

[Figs. 6.29 through 6.32: F0 contours, with Pitch (Hz logarithmic) on the vertical axis and Time (s.) on the horizontal axis]

Fig. 6.29  Female-a: “Really?!”
Fig. 6.30  Male-a: “Really?!”
Fig. 6.31  Female-b: “Really?!”
Fig. 6.32  Male-b: “Really?!”

As was the case for the translations of lo1 and aa1maa3, male-b’s translations were closer to a canonical form of neutral, declarative-like intonation than were those of the other informants. Nevertheless, male-b’s translations were recognizably rising declaratives that clearly sounded like questions and were therefore considered to be tokens of the same form of intonation as those of the other informants. Note that, like female-a and male-a, male-b also raised his pitch on this utterance enough to require adding an extra pitch line to the figure. Interestingly, female-a’s pitch level still extended above the figure’s border. In the next dialogue, speaker B is surprised to hear that speaker A has never been to Australia. (44)

A: Ngo5 zeoi3 soeng2 heoi3 Ou3zau1 Lau2sai1laan4 go2-bin6.
   1s most want go Australia New Zealand that-side
   “I want to go to Australia and New Zealand the most.”
   Jan1wai6 dou1 mei6 heoi3-gwo.
   because still not-yet go-PERF
   “Because I’ve never gone before.”
B: Hai6 aa4?
   be SFP
   “Really?”
   Ou3zau1 lei5 dou1 mei6 heoi3-gwo3 me1?
   Australia 2s also not-yet go-PERF ME
   “You’ve never been to Australia?!”
A: Mei6 aa3.
   not-yet SFP
   “Not yet.”
B: Ngo5 ji5wai4 lei5 zing6hai6 Nau2sai1laan4 mei6 heoi3-gwo3 tim1.
   1s think 2s only New Zealand not-yet go-PERF PRT
   “I thought it was only New Zealand that you hadn’t been to before.”


The tone placed over the two syllables of “either” or “-stralia” rises significantly (Figs. 6.33 through 6.36):

[Figs. 6.33 through 6.36: F0 contours, with Pitch (Hz logarithmic) on the vertical axis and Time (s.) on the horizontal axis]

Fig. 6.33  Female-a: “You haven’t been to Australia either?”
Fig. 6.34  Male-a: “You’ve never been to Australia?!”
Fig. 6.35  Female-b: “You haven’t been to Australia?!”
Fig. 6.36  Male-b: “You’ve never been to Australia?!”

In the next dialogue, speaker B had told speaker A about two mutual acquaintances going to dinner together. Speaker A wants to hear the rest of the story, but speaker B says he cannot say any more about it. (45)

A: Lei5 gai3zuk6 gong2 maai4 lok6heoi3, bat1jyu4.
   2s continue speak finish descend how-about
   “Why don’t you finish telling me (your story about the dinner).”
B: M4-dak1 ge3.
   NEG-can SFP
   “I can’t.”
A: M4-gong2 dak1 gaa3 me1?
   NEG-speak can SFP ME
   “You can’t talk about it?!”

Again all the translations consistently use a rising tone. This is an interesting example because it shows how a discourse tone can interact with contrastive intonation. Female-a, male-a, and female-b all used the rising me1-equivalent tone across the entire VP, beginning with the verb, which was “tell” or “talk.” Male-b’s translation used the rising tone on the single word “me,” which places it into focus and contrasts it with “other people.” This was not part of the context, but it was reasonable and logical for male-b to add this meaning to the context. If there were a context where the subject “you” were contrasted with another person, then it would sound very natural to begin the rising tone on the word “you.” Regardless of where it begins, it will always continue to rise until the end of the sentence (Figs. 6.37 through 6.40).

[Figs. 6.37 through 6.40: F0 contours, with Pitch (Hz logarithmic) on the vertical axis and Time (s.) on the horizontal axis]

Fig. 6.37  Female-a: “What you can’t tell me?!”
Fig. 6.38  Male-a: “What you can’t talk about it?!”
Fig. 6.39  Female-b: “We can’t talk about it?!”
Fig. 6.40  Male-b: “What you can’t tell me?!”

In the next dialogue, the speaker is talking about another person and the listener expresses recognition, indicating that she knows who the speaker is talking about. The speaker doubts that the listener actually knows who he is talking about and questions the listener in order to confirm: (46)

Lei5 zi1 bin1go3 me1?
2s know who ME
“You know who he is?!”

This final translation gets the same results (Figs. 6.41 through 6.44).

[Figs. 6.41 through 6.44: F0 contours, with Pitch (Hz logarithmic) on the vertical axis and Time (s.) on the horizontal axis]

Fig. 6.41  Female-a: “You know who that is?!”
Fig. 6.42  Male-a: “You know who I’m talkin’ about?!”
Fig. 6.43  Female-b: “You know who he is?!”
Fig. 6.44  Male-b: “You know who he is?!”

Based on the data, it can be concluded that me1 has an intonational equivalent in English, which is a high-rising tone that appears in sentence-final position. It is hypothesized to have the definition proposed in (42). As for its form, its starting point depends on what information is put into focus. An example of this can be seen in Figs. 6.37 and 6.40, where, respectively, female-a places focus on the VP “tell me” and male-b places focus on the object “me.” It is interesting to note that the contrastive tone used on “me” has assimilated with the me1-equivalent tone by changing from a High tone to a Low tone, allowing the me1-equivalent tone to begin at a low point, so that it can rise. The starting point of this rising tone not only assimilates with another discourse tone, but also with lexical stress. For example, if it begins on a multisyllabic word, it will begin on whichever syllable carries that word’s primary stress, for example, “You saw a(n) ELephant/goRIlla/chimpanZEE?!”

Applying the NSM explication to the examples of me1 from the literature and the data

Based merely on looking at the limited contexts provided in the literature, it is not always possible to determine whether the speaker is doubtful or surprised that P might be the case. Regardless, the explication remains the same. None of the examples from the literature include enough context to know for sure what D is, but as was done in the previous section, this can be reasonably construed pragmatically. The lack of context can also make it unclear whether me1 is being used to express surprise, doubt, or something in between. Which meaning is expressed in the first example in (38), for instance, depends on the speaker’s original stance regarding the listener’s ability to speak Shanghainese. If he or she knows the listener well and has never heard about his or her ability to speak Shanghainese, then he or she will not readily accept that he or she does based on casual evidence. (38′)

Lei5 sik1 gong2 Soeng6hoi2waa2 me1?!
2s know speak Shanghainese ME
“You can speak Shanghainese?!”

This example can be inserted into the explication as follows:

I think maybe it’s like this (P: you can speak Shanghainese)
because something happened (D: you said those people are speaking Shanghainese)
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

Using a high-rising tone on the final syllable of “Shanghainese” in (38′) sounds natural and, based on my native-English intuition, sounds like it expresses this meaning. Matthews and Yip (2011) said this next example demonstrated that me1 expresses “surprise.” No context was given, but a likely possibility is that the speaker saw the listener try unsuccessfully to do something, which then becomes D. The first part of the sentence indicates that the speaker believes the listener has done this task many times before, and this is why the speaker previously “thought: it isn’t like this” (P: you still don’t know how to do it). (39′)

Lei5 zou6-gwo gam3 do1 ci3 dou1 m4-sik1 ge3 me1?
2s do-EXP so many time still NEG-know PRT ME
“After doing it so many times, you still don’t know how to do it?!”

Again we have to construct a likely possibility for D for this example. We can assume that the speaker saw the listener try and fail to do whatever task is being referred to. This failure then causes the speaker to think maybe the listener still does not know how to do this task even though he has done it many times before. The proposed explication of me1, based on this context, appears to capture what me1 expresses, which is as follows:

I think maybe it’s like this (P: you still don’t know how to do it)
because something happened (D: you failed when you tried to do it)
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

The English equivalent rising tone also works for the English translation of the context of (39). It sounds most natural to begin the tone either on “still” or on “do,” depending on whether or not the speaker wants to put “still” into focus. In the last example cited above from the literature, Chan (2001) constructed a context in which a girl, who is standing next to her boyfriend, sees an attractive girl wave to them. The jealous girl then asks her boyfriend this question: (40′)

Lei5 sik1 keoi5 me1?
2s know 3s ME
“You know her?!”

I will assume that the jealous girlfriend (i.e., the speaker) considers the attractive girl’s wave to be sufficient evidence to conclude that her boyfriend knows this attractive girl, causing the speaker’s knowledge to change from disbelief to belief, resulting in surprise:

I think maybe it’s like this (P: you know her)
because something happened (D: that girl waved to us)
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

In (40), D is based on something that is seen rather than something that the listener has said. In this case, the relationship between D and P is entirely pragmatic, but this poses no problem for the application of the explication. The me1-equivalent tone will begin on “know” unless the context additionally includes a contrast between knowing this girl rather than knowing someone else, in which case the rising tone would only be on “her.”


The first example used from the data to elicit translations of me1 was (43). The speaker had just been told that The Hong Kong Polytechnic University is the only university that offers a degree in physiotherapy. The speaker apparently assumed that other universities in Hong Kong also offered degrees in physiotherapy and responded by attaching me1 to the copula, which translates into English using a high-rising tone on “really.” (43′)

Hai6 [P] me1? Dak1 Poly jau5 dak1 duk6?
be ME only have can study
“Really, [P]?! PolyU is the only place you can study that?”

As explained above, whenever an SFP is used with the copula, the sentence is analyzed as having an elided predicate, and this phonologically null predicate is the proposition that functions as the antecedent of P in the SFP’s explication. In this case, the proposition is uttered immediately after uttering hai6 + me1. The D in the context of this example is obviously the listener having just said PolyU was the only choice for studying physiotherapy, and the proposition P relates directly to that statement. This example, along with the two following examples, illustrates a common relationship between P and D when using a question particle, which is that P is embedded in D as the complement of the verb “say,” that is, D is “you said P”:

I think maybe it’s like this (P: PolyU is the only place you can study that)
because something happened (D: you said PolyU is the only choice [for studying that])
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

Whether the example in (43) expresses surprise or doubt depends on the speaker’s original stance about the proposition, plus the degree to which he considered the listener a reliable source for knowing the truth about it. In the next example in (44), the context indicates that the speaker is probably surprised rather than doubtful. (44′)

Ou3zau1 lei5 dou1 mei6 heoi3-gwo3 me1?
Australia 2s also not-yet go-EXP ME
“You’ve never been to Australia?!”

Under normal circumstances, there is usually no reason to doubt someone’s claim about never having been to a certain place before. Therefore, the speaker probably had no reason to doubt the validity of D (you said you’ve never been [to Australia]) as a source of evidence regarding P (you’ve never been to Australia). Another reason to assume that the speaker is surprised is that she precedes her question in (44) with “Hai6 aa4” (“Really?”), which is a question that uses the question particle aa4, which, I will argue in Sect. 6.2.2, can indicate surprise but not doubt. Another clue from the context is the fact that the speaker followed up her me1-suffixed question with a statement that indicated she believed the listener had never been to Australia: “I thought it was only New Zealand that you hadn’t been to before.” Here is how the explication explains this particular example of me1-suffixing:

I think maybe it’s like this (P: you’ve never been to Australia)
because something happened (D: you said you’ve never been [to Australia])
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

In the next example, it is not possible to know for sure where the meaning of this use of me1 lies along the continuum from surprise to doubt. There is not enough known about the “dinner story” that the listener said he cannot talk about, such as whether it might contain potentially embarrassing information about the participants involved. Also the relationship between the speaker and listener is unknown. (45′)

M4-gong2 dak1 gaa3 me1?
NEG-speak can SFP ME
“You can’t talk about it?!”

The use of me1 and its English equivalent in this context again appears to mean what the explication says, which can be tested by inserting the antecedents of “this” and “something happened,” which are, respectively, (P: you can’t talk about it) and (D: you said you can’t [talk about it]). In the final example from the data, the speaker is asking the listener to say whether or not she knows who he is talking about, because she responded as if she knew. We would again need to know the speaker’s prior stance regarding the proposition in order to determine whether he is expressing doubt or surprise. (46′)

Lei5 zi1 bin1go3 me1?
2s know who ME
“You know who he is?!”

Again we can insert the antecedents of P and D into the explication and see that it appears to express what me1 and its English equivalent mean:

I think maybe it’s like this (P: you know who he is)
because something happened (D: you responded as if you know who I’m talking about)
before this happened, I thought: it isn’t like this (P)
I want to know
I want you to say something because of this

The following subsection now looks at the meaning of the related question particle aa4, proposing an explication for it and showing its English equivalent tone based on translations from the ambilingual participants.


6.2.2  The Particle aa4

The NSM explication of aa4

S. Law (1990), Leung (1992/2005), and Sybesma and Li (2007) all said that aa4-attachment does not always result in question formation, which means there appear to be two aa4 particles, one that is question-forming and one that is not. Leung (1992/2005) said that the vowel of the nonquestion-forming aa4 is often realized as a schwa and that this nonquestion version is used to express dissatisfaction or feelings of resentment. All following references to aa4 refer to the question-forming version of this particle. Dictionary definitions and literature descriptions of aa4 are quite similar to those of me1, saying that aa4, like me1, forms questions, expresses surprise, and seeks confirmation. Huang and Kok (1973: 2), for example, said that aa4 is a “sentence particle making [a] statement into [a] yes/no question; [it] expresses surprise or disbelief, [and] asks for confirmation of [a] surprising statement.” Several authors concluded that the core function of aa4 is to form questions that seek confirmation (S. Law 1990; Luke and Nancarrow 1997; Leung 1992/2005). Related to this, Chao (1969: 102) said that “[t]he difference between me1 and aa4, both of which change a preceding statement into a question, is that the former asks, ‘Is it true, do you mean to say that....?,’ while the latter merely asks, ‘Do I hear you right? am I repeating your statement correctly?’” This indicates that me1 challenges the truthfulness of something, while aa4 merely seeks to confirm it. Matthews and Yip (2011: 359) said that aa4 expresses surprise or disapproval, and they gave this example: (47)

Lei5 haa6 go3 lai5baai3 fong3gaa3 aa4?
2s next CL week take-leave AA
“You’re going on leave next week?”

Matthews and Yip (2011: 360) argued that “[t]his form of question tends to presuppose a positive answer, being used to check the validity of an assumption.” This reflects the key difference between aa4, which “presuppose[s] a positive answer,” and me1, which expresses doubt and is therefore “used to check the truth of [P]” (ibid.). Their observation that aa4 presupposes a positive answer while me1 does not reveals a difference in the epistemic stance of the speaker when using one particle versus the other: only me1 expresses that the speaker held a prior stance of “it is not like this (P).” In other words, if the speaker formerly held the stance “not P,” then he or she will use me1, and if he or she did not hold this stance, then he or she will use aa4. Yip and Matthews (2001) said,

me1 and aa4 turn a statement into a question of a particularly loaded kind. Me1 indicates surprise that something should be the case (“How can this be true?”), [while] aa4 suggests surprise and often an element of disapproval (“If this is true I don’t think much of it”). (Yip and Matthews 2001: 114)


Their paraphrase “How can this be true” for me1 indicates an element of doubt, while their paraphrase “If this is true…” for aa4 does not. They gave the following two examples of aa4-suffixing: (48)

Gam3 cin2 ge3 dou6lei5 dou1 m4-ming4 aa4?
so shallow PRT principle even NEG-understand AA
“You can’t even understand such a simple principle?”

(49)

Keoi5 dou3 ji4gaa1 dou1 m4-hang2 jyun4loeng4 lei5 aa4?
3s up to now even NEG-willing forgive 2s AA
“He’s still unwilling to forgive you even now?”

Based on the observations in the literature and consultations with native Cantonese speakers, I propose this explication for aa4: (50)

P + aa4 =
a. ngo5 lam2 ho2lang4 hai6 gam2joeng2 (P)
   1s think maybe be this way
   “I think maybe it’s like this (P)”
b. jan1wai6 jau5 joeng6 je5 faat3sang1 (D)
   because have CL thing happen
   “because something happened (D)”
c. ngo5 soeng2 zi1dou3
   1s want know
   “I want to know”
d. jan1wai6 gam2joeng6, ngo5 soeng2 lei5 gong2 di1 je5
   because this way 1s want 2s say CL thing
   “I want you to say something because of this”

According to this, the only difference between the meanings of aa4 and me1 is reflected in the fact that the explication of aa4 does not contain line c. of the explication of me1: “before this happened (D), I thought: it isn’t like this (P).” In other words, unlike me1, the SFP aa4 does not express a prior stance that P was not the case. This accounts for Kwok’s (1984) observation that aa4 conveys a lesser degree of surprise than me1. A challenge to one’s beliefs about something, as opposed to merely being made aware of something, will naturally result in a greater degree of surprise and, in some cases, will cause one to maintain doubt. The English translation of the explication in (50) defines aa4’s English-equivalent tone, shown here in (51). The data that provide evidence of its English-equivalent form are presented in the following section: (51)

P + aa4-equivalent intonation =

a. I think maybe it’s like this (P)
b. because something happened (D)
c. I want to know
d. I want you to say something because of this
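The relationship between the two explications can be made concrete with a small sketch. The Python below is only an illustration and is not part of the NSM analysis itself: the names AA4_EXPLICATION, ME1_EXPLICATION, PRIOR_STANCE, instantiate and extra_lines are hypothetical, and the sketch assumes, following the description above, that the explication of me1 is the explication of aa4 with the prior-stance line added as its line c.

```python
# Illustrative sketch only: explications modeled as ordered lists of lines.
# All names here are hypothetical and chosen for this example.

PRIOR_STANCE = "before this happened (D), I thought: it isn't like this (P)"

AA4_EXPLICATION = [
    "I think maybe it's like this (P)",
    "because something happened (D)",
    "I want to know",
    "I want you to say something because of this",
]

# Assumption: me1 = aa4 plus the prior-stance line inserted as line c.
ME1_EXPLICATION = AA4_EXPLICATION[:2] + [PRIOR_STANCE] + AA4_EXPLICATION[2:]


def instantiate(explication, p, d):
    """Fill the placeholders (P) and (D) with a concrete proposition and trigger."""
    return [
        line.replace("(P)", f"(P: {p})").replace("(D)", f"(D: {d})")
        for line in explication
    ]


def extra_lines(richer, poorer):
    """Return the lines of `richer` that are absent from `poorer`, in order."""
    return [line for line in richer if line not in poorer]


if __name__ == "__main__":
    # Print the aa4 explication filled in for example (48).
    for line in instantiate(
        AA4_EXPLICATION,
        "you can't even understand such a simple principle",
        "you said something",
    ):
        print(line)
    # Show the single line that separates me1 from aa4.
    print(extra_lines(ME1_EXPLICATION, AA4_EXPLICATION))
```

Treating each explication as an ordered list keeps the comparison transparent: the two particles share every line except the one that encodes the speaker's prior stance.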


The English equivalent of aa4 based on the data  Translations were elicited for dialogues that include aa4 in the audio corpus. All of the F0 contours showed a sentence-final rising tone, and it was therefore concluded that aa4 has an intonational equivalent in English, which is a rising tone. To my native-English ear, this tone sounded like it did not rise as high as the me1-equivalent tone. I therefore refer to the English equivalent of me1 as a “high-rising” question tone and to the English equivalent of aa4 as a “rising” question tone. However, it was not possible to confirm this on paper based on the translations of me1 and aa4, which all came from different contexts (interested readers can see Wakefield (2010, 2014) for the graphs and discussion of the aa4 translations). In order to test my impression that there was a difference between the two rising tones, I constructed three minimal-pair dialogues and asked two native Cantonese speakers to confirm that they sounded natural and then to record themselves acting these dialogues out. Each minimal-pair dialogue consisted of the same dialogue acted out and recorded two times, once using me1 and once using aa4. This allowed translations of the same sentence within the same context using me1 versus aa4 to be elicited from the ambilingual participants. One of those minimal-pair dialogues is shown here (see Wakefield (2010, 2014) for more minimal-pair dialogues and their F0 graphs).9 (52)

A: Ngo5 baan1 gei1 ting1jat6 loeng5 dim2 bun3 hei2fei1 aa3.
   1s CL plane tomorrow two CL half rise-fly SFP
   “My plane leaves at 2:30 tomorrow.”
B: Lei5 ting1jat6 zau2 aa4/ me1?
   2s tomorrow leave AA/ ME
   “You’re leaving tomorrow?/ ?!”

It is not possible to use aa4 and me1 together. Therefore, showing both SFPs together in line (52B) indicates that this dialogue had two versions, one in which the response from speaker B had the SFP aa4 attached, and one in which it had the SFP me1 attached. The punctuation at the end of the English translation, that is, “?/?!”, also indicates two separate translations. The F0s of the four participants’ translations for both versions of (52) are shown in Figs. 6.45 through 6.52, with each participant’s two translations shown side-by-side; the aa4 translations are on the left and the me1 translations are on the right. For all four participants, the translations of their me1-suffixed sentences (shown on the right) rose more than did the translations of their aa4-suffixed sentences. All participants began their tones on the second syllable of “tomorrow,” except for female-a’s translation of me1, where she began it on the first syllable of “leaving.” This creates a difference in focus between the two sentences, but the contrast in the degree of the rise of tones remains relevant. According to Praat, the contrast in the

9  Female-b was not initially available for these follow-up translations to contrast me1 and aa4. Her F0 graphs are therefore not included in Wakefield (2010). She later became available and these translations were then collected from her and were discussed in Wakefield (2014).

Fig. 6.45  Female-a: “Oh you’re leaving tomorrow?” (aa4) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

Fig. 6.46  Female-a: “Oh you’re leaving tomorrow?!” (me1) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

maximum F0 for female-a’s translation of aa4 versus me1 was 397.5 Hz and 485.6 Hz, respectively. This same respective contrast exists between the maximum F0’s of male-a’s two translations (210.8 Hz and 260.7 Hz) and of male-b’s two translations (171.2 Hz and 221.4 Hz). This is significant because even slight differences in frequency can be detected by the human ear. Tests of perception have shown that “0.1% changes in frequency can be heard, e.g., with synthetic speech, for a tone around 1,000 Hz, listeners can detect a 1-2 Hz difference” (Chun 2002: 11). Another influencing factor related to detection of pitch differences is that there does not appear to be a “one-to-one relationship between frequency and pitch, i.e., a tone that is judged to be twice as high as another tone does not necessarily have twice

Fig. 6.47  Male-a: “You’re leaving tomorrow?” (aa4) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

Fig. 6.48  Male-a: “You’re leaving tomorrow?!” (me1) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

the Hz value” (ibid). This means that the difference in the degree to which the F0 curves rise between the me1- and aa4-equivalent translations is likely to be perceived as greater than the actual physical difference in Hz values. There may also be some voice quality factors that influence the perception of the pitches shown in Figs. 6.45 through 6.52. Female-b’s translations of aa4 were interesting and surprising and are worth discussing. Each of her translations of aa4 for the dialogues from the corpus used a rising tone, which matches all the other ambilingual participants’ translations, but her translations of aa4 for the constructed minimal-pair dialogues were unique; they all used a falling tone. There is evidence that two other participants also considered a falling question tone to be close to the meaning of aa4. Readers may recall that the participants were allowed to redo any translation that they felt they could improve

Fig. 6.49  Female-b: “You’re leaving tomorrow?” (aa4) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

Fig. 6.50  Female-b: “You’re leaving tomorrow?!” (me1) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

on, and some of the recordings of their initial attempts were kept. Upon reviewing the data, it was discovered that male-a (one time) and female-b (two times) used a falling queclarative tone as a translation of aa4, but they then replaced those translations with rising tones after reconsidering. A possible reason that female-b consistently used a falling tone in her aa4 translations of the minimal pairs, and that two others leaned towards doing so as well, is that the minimal-pair context may have caused them to sense a need to clearly distinguish the translations of aa4 from the translations of me1. Remember, none of the participants translated the aa4-suffixed sentences from the corpus dialogues as anything other than a rising tone. Based on my intuition, the falling tones that were used in the participants’ translations of aa4 clearly sound like question tones and can therefore be considered examples of what Geluykens (1987, 1988) referred to as a falling queclarative, which

Fig. 6.51  Male-b: “You’re leaving tomorrow?” (aa4) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

Fig. 6.52  Male-b: “You’re leaving tomorrow?!” (me1) [F0 contour; pitch in Hz (logarithmic scale) against time in seconds]

is a declarative clause with a falling tone that functions to ask a question. Based on my intuitional sense of its meaning, this falling question tone is tentatively proposed to differ from the aa4-equivalent intonation in its level of commitment to the proposition. A possible explication for this tone was proposed in Wakefield (2014: 144) as follows: (53)

P + falling question tone =

a. I know it’s like this (P)
b. because something happened (D)
c. I want you to say something because of this


Line a. of this explication replaces line a. of the explication of the aa4-equivalent tone given in (51), that is, “I think maybe it’s like this” is replaced by “I know it’s like this.” This indicates that the event D referred to in line b., whether a speech act or something pragmatic, was considered by the speaker to be sufficient evidence to conclude that P is true. Line c. of (51), that is, “I want to know,” is no longer necessary because line a. already indicates that the speaker knows P. Line d. of (51) and line c. of (53), which are both the same, are what make these tones question tones; they request the listener to provide information in relation to P and D.

Applying the NSM explication to the examples of aa4 and its English equivalent from the literature  The first example from the literature was (47) from Matthews and Yip (2011): (47′)

Lei5 haa6 go3 lai5baai3 fong3gaa3 aa4?
2s next CL week take-leave AA
“You’re going on leave next week ?”

If the aa4-equivalent tone is used without contrastive intonation, then it will begin its rise on the syllable “leave” and continue to the end of the sentence. If the tone begins on “next” or on “week,” then it will place “next” or “week,” respectively, into focus, contrasting them with appropriate alternatives. These contrastive meanings could only be uttered if the appropriate contrasts are understood to be present in the discourse. The D in this context is most likely “You/someone told me P.” Note that if a contrastive focused meaning is added, this focus would simply be added to the antecedents of P and D. The explication still works to express the meaning of aa4 or its English equivalent in the given context. The meaning of aa4 in this context, based on the explication of (51), is as follows:

I think maybe it’s like this (P: you’re going on leave next week)
because something happened (D: you said that you’re going on leave next week)
I want to know
I want you to say something because of this

The next two examples were both from Yip and Matthews (2001): (48′)

Gam3 cin2 ge3 dou6lei5 dou1 m4-ming4 aa4?
so shallow PRT principle even NEG-understand AA
“You can’t even understand such a simple principle ?”

For this example, it is very unlikely that the antecedent of D is “you said P.” One way to represent the antecedent of D is as “you said something,” where “something” refers to something the listener said which the speaker considers to be evidence of P:

I think maybe it’s like this (P: you can’t even understand such a simple principle)
because something happened (D: you said something [that indicates you don’t understand])
I want to know
I want you to say something because of this

The final example was (49): (49′)

Keoi5 dou3 ji4gaa1 dou1 m4-hang2 jyun4loeng4 lei5 aa4?
3s up to now even NEG-willing forgive 2s AA
“He’s still unwilling to forgive you even now ?”

In this example, D could be “you said P” or, as in (48), “you said something,” where “something” refers to some evidence of P. It could also be something said by the third person “he” who is referred to in P. The aa4-equivalent tone would most naturally occur on the final syllable “now,” because “even now” expresses a contrastive meaning with “before now.” Using the explication to show the meaning that is expressed by aa4 and its English equivalent in this context is straightforward.

6.2.3  Summary and Analysis

Based on (1) the translations of the me1 and aa4 dialogues from the corpus, all of which included rising tones, (2) the minimal-pair translations, most of which included rising tones (all for me1 and most for aa4), and (3) my native-speaker intuition that hears these two rising tones as distinct from each other, I conclude that the me1- and aa4-equivalent English discourse tones have distinct rising forms. The pitch of the me1-equivalent tone rises more than does the aa4-equivalent tone; these two tones are therefore referred to as a high-rising tone and a mid-rising tone, respectively. In order to contrast the meanings between me1 and aa4 (and by extension their English equivalents), let us consider the minimal-pair dialogue from (52), which was the following response to the conversational participant having said “My plane leaves at 2:30 tomorrow.” (52′)

Lei5 ting1jat6 zau2 aa4/ me1?
2s tomorrow leave AA/ ME
“You’re leaving tomorrow ?/ ?!”

Expressing this meaning in terms of the explications looks like this, where the line in square brackets applies only to the meaning of me1:

I think maybe it’s like this (P: you’re leaving tomorrow)
because something happened (D: you said you leave at 2:30 tomorrow)
[before this happened, I thought: it isn’t like this (P)]
I want to know
I want you to say something because of this


Based on this, the only difference between these two particles is that me1 is only used when the speaker believed the negative form of the proposition prior to the time of speech. This is a speaker-oriented stance that is only known to the speaker. For this reason, these two question particles and their English equivalents are exchangeable in the vast majority of contexts where they can be used. The only difference is that me1 reveals a prior stance on the part of the speaker, while aa4 does not. However, it is possible to construct contexts where the speaker did or did not believe a negative form of the proposition prior to using one of these question particles. Such contexts are exploited below to test the validity of the explications proposed for these two particles. The discovery of these two question tones is a meaningful contribution to the study of the forms and meanings of rising declaratives. The most thorough and detailed study of the meaning of rising declaratives, as far as I know, is Gunlogson’s (2003), which will be discussed in some detail below. Me1, aa4, and their high- and mid-rising English equivalents are all assumed to be question particles, functioning to request information about the proposition. We can more specifically say that these particles form “polar questions” in the “functional sense of soliciting a yes/no response from a knowledgeable addressee” (Gunlogson 2003: 68), which is precisely the intended function of the last two lines of both particles’ explications. In the last line (i.e., “I want you to say something because of this”), the deictic “this” refers to the immediately preceding line (i.e., “I want to know”), and the null complement of “know” relates to P, expressing that the speaker wants to know if P is or is not the case. 
The penultimate line “I want to know” is therefore what makes sentences using me1, aa4, or their English equivalents polar questions in the sense proposed by Goddard (2002: 4) for yes/no questions: “if it’s like this, I want you to say it’s like this; if it’s not like this, I want you to say it’s not like this.” Eliciting a response as to whether P is or is not the case makes a sentence that includes me1 or aa4 a polar question, but at the same time, it is something more. The last line of the explications requests the listener to “say something because of this (i.e., because of the speaker wanting to know whether or not P is the case).” In many contexts, it is naturally understood that the speaker is asking for more of a response than simply “yes, P is the case” or “no, P is not the case.” Consider (40), for example, where the jealous girlfriend asked her boyfriend if he knew the attractive girl who just waved at him: Lei5 sik1 keoi5 me1 (“You know her?!”). The speaker is undoubtedly requesting the listener to say something more than only “yes, I do” or “no, I don’t.”

I will now discuss the sarcastic and rhetorical uses of these two question particles, because such uses have been talked about in the literature. Above I said that me1 was polysemous and suggested that its additional meaning can be thought of as a separate particle that can be referred to as me1₂. It appears at first glance that the explications I have proposed for me1 and aa4 cannot account for their being used sarcastically or rhetorically, and that perhaps more related particles should be proposed in order to account for the sarcastic/rhetorical uses of these two SFPs. I do not think this is necessary and suggest that conversational participants can pick up these meanings pragmatically. In other words, the core meanings of the question


particles and their rising tone equivalents do not change, just as the core meanings of the words in a sentence that is said ironically or sarcastically do not change—the difference in overall meaning is entirely pragmatic. Consider a context that can use either particle sarcastically. Suppose a couple have dinner with a friend and the friend’s new girlfriend. After they have finished eating and said goodbye, the couple is now separated from their friend and his girlfriend. The following two-line dialogue occurs: (54)

woman: Waa4, keoi5 leoi5pang4jau5 gam3 ai2 ge3.
       PRT 3s girlfriend so short PRT
       “Wow, his girlfriend is so short.”
man:   Lei5 hou2 gou1 aa4/ me1?
       2s very tall AA/ ME
       “And you’re tall ?/ ?!”

In (54), the man obviously does not believe the proposition “you’re tall.” However, his utterance is a request for information, albeit an insincere one. The use of me1 or aa4 or their equivalent rising tones in English expresses the meaning “I think maybe you’re tall because you said she’s short.” It is understood pragmatically that the speaker does not actually think that maybe the listener is tall. This sarcastic use of me1 and aa4 is not like any of the examples seen thus far. As used in (54), it literally expresses the following, where the line in square brackets applies only to me1:

I think maybe it’s like this (P: you’re tall)
because something happened (D: you said his girlfriend’s so short)
[before this happened, I thought: it isn’t like this (P)]
I want to know
I want you to say something because of this

The man’s utterance implies the understanding of a common-knowledge belief about the unspoken rules of criticism, namely that a speaker should not criticize someone for having quality X unless the speaker does not have this quality X. In this case, only tall (i.e., not short) people should criticize other people for being short. Of course, this does not imply that the man thinks it is proper for tall people to say this; he may or may not. The D of this context is obviously the woman’s utterance, and both the man and the woman understand that her utterance does not influence his stance about her height, that is, he does not now think that maybe she is tall as a result of her saying that the girlfriend was short. He is therefore not serious about wanting to know if P is the case. The sarcastic meaning is understood pragmatically based on logic and on the context and does not come from the meaning of the question particle. Therefore, there is no need to assume polysemy for the question particles or their tonal equivalents.

The fact that me1 and aa4 are “commonly used in rhetorical questions” (Matthews and Yip 2011: 360) appears to pose a problem for defining them as question particles based on the idea that a question is a request for information about the proposition, because rhetorical questions do not solicit a verbal response. We can get around


this if we include the solicitations of responses that listeners are expected to provide to themselves, which is the function of a rhetorical question. Rhetorical questions are used to get the listener to think about, or say to himself or herself, a response to the question, rather than to say it aloud. (It could be argued that sarcastic questions such as (54) are also rhetorical in the sense that the listener may just think about it rather than respond verbally.) Consider these examples of rhetorical uses of me1 and aa4 from Matthews and Yip (2011: 360; the English translations are the author’s): (55)

Zung6 sai2 lei5 gong2 me1?
still need 2s say ME
“I need you to tell me .”

(56)

Ngo5 sai2 keoi5 lei5 ngo5 aa4?
1s need 3s care 1s AA
“I need him to care about me .”

Even without knowing the details of the contexts for (55) and (56), and therefore not knowing what D is, it appears obvious that the speaker is not asking for a verbal response to the question. It is rhetorical; the speaker wants the listener to answer it in his or her own mind, and in both cases, the speaker wants the listener to answer it in the negative. Whether or not a question is rhetorical is determined pragmatically from the context, and so is the understanding that the response should be negative. Gunlogson’s (2003) insightful observations about rising declaratives were helpful for further analyzing the meanings and contextual distributions of the me1- and aa4-equivalent tones. She considered all three sentences in (57a–c) to have the same propositional content and said that they form two minimal pairs: (57)

a. Is it raining?
b. It’s raining?
c. It’s raining.
(Gunlogson 2003: 8)

Gunlogson said (57a) and (57b) form a minimal pair because they contrast “only in syntactic form,” and (57b) and (57c) form another because they are “identical except for intonational contour.”10 Using minimal pairs such as these, Gunlogson constructed a large number of contexts to show how rising declaratives pattern in relation to interrogatives and falling declaratives. She then proposed the following two generalizations to explain the linguistic facts:

10  My own conclusions based on my research differ from Gunlogson’s (2003). I agree that (57a) and (57b) are syntactically distinct clause types, one being an interrogative and the other a declarative. However, this is not their only difference; (57b) additionally includes a tonal morpheme that is absent from (57a).

(58)

Declaratives [rising or falling] express a bias that is absent with the use of interrogatives; they cannot be used as neutral questions.

(59)

Rising declaratives, like interrogatives, fail to commit the Speaker to their content. (Gunlogson 2003: 99)

Related to (58), Gunlogson (2003: 54) said that rising declaratives “cannot readily be used as questions ‘out of the blue,’ with no particular context, as interrogatives can be.” This is accounted for in the explication for me1, aa4, and their English equivalents, because they refer to a discourse element D that is the antecedent of “something happened.” Gunlogson (2003: 16–8, 54–5) gave a number of examples of out-of-the-blue contexts in which nothing had happened (i.e., where there was no D) to cause the speaker to think that maybe P was the case. She demonstrated that rising and falling declaratives are not compatible with such contexts. Here is one example she gave: (60)

[initiating a phone conversation]
a. Is Laura there?
b. ??Laura’s there?
c. ??Laura’s there.
(Gunlogson 2003: 55)

She contrasted those contexts with others that included a discourse element D, and in such cases, as would be expected, rising declaratives are acceptable. Here is one of her examples: (61)

A: Maria’s husband was at the party.
B’s reply:
a. Is Maria married?
b. Maria’s married?
c. ??Maria’s married.

The rising declarative in (61B-b) translates very well as a me1- or aa4-suffixed question, as would be expected. It is easy to see how the meaning of either me1- or aa4-equivalent intonation fits here by inserting D and P into the explications of (42) and (51), respectively: “I think that maybe it’s like this (P: Maria’s married) because something happened (D: you said Maria’s husband was at the party).” The sentence in (61B-b) represents two separate tonal morphemes with different meanings, so there should actually be four sentences shown in (61) rather than three, and this is true for all of Gunlogson’s contexts that can acceptably use rising declaratives. The English equivalent of me1 in (61B-b) would only be used in this context if the speaker originally thought: “Maria’s not married”; otherwise the tone would be the English equivalent of aa4. To account for there being two different rising declaratives with distinct meanings, we could add this to go along with Gunlogson’s generalizations in (58) and (59): (62)

The meaning of high-rising, but not mid-rising, declaratives entails a prior belief in the negative form of their content.
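The way the generalization in (62) combines with the requirement for a discourse element D can be summarized as a small decision procedure. The Python sketch below is a deliberate simplification offered for illustration only: the function name rising_declarative_for and its two Boolean parameters are hypothetical, and the sketch ignores pragmatic uses such as sarcasm and rhetorical questions as well as the acceptability of interrogatives.

```python
from typing import Optional


def rising_declarative_for(has_trigger_d: bool, held_not_p_before: bool) -> Optional[str]:
    """Pick the rising declarative that fits the speaker's stance, if any.

    has_trigger_d:      something happened (D) that suggests P to the speaker.
    held_not_p_before:  before D, the speaker thought "it isn't like this (P)".

    Returns None when there is no D, since rising declaratives are not
    used "out of the blue" (compare example (60)).
    """
    if not has_trigger_d:
        return None
    if held_not_p_before:
        # Generalization (62): only the high-rising tone entails a prior
        # belief in the negative form of the content.
        return "high-rising (me1-equivalent)"
    return "mid-rising (aa4-equivalent)"


if __name__ == "__main__":
    # Context (61): A says "Maria's husband was at the party."
    print(rising_declarative_for(True, True))    # speaker had thought Maria was unmarried
    print(rising_declarative_for(True, False))   # speaker had no prior stance
    print(rising_declarative_for(False, False))  # no D: no rising declarative
```

The prior stance is known only to the speaker, so the second argument models a private belief rather than anything observable in the discourse.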


Gunlogson (2003: 10) defined the intonation of a rising declarative as “non-falling from the nuclear pitch accent to the terminus and ending at a point higher than the level of the nuclear accent.” The problem is that this describes both me1- and aa4-equivalent intonation, as well as any other potential tonal morphemes in English that have a rising shape and are used on either declaratives or interrogatives; it does not distinguish among them. The findings of this study do not conflict with Gunlogson’s conclusions, but they suggest that there could be a more detailed, refined account of the linguistic facts than what Gunlogson provided. One way to add detail is to divide the rising declaratives in her examples into two, with one representing the me1-equivalent tone and the other the aa4-equivalent tone, as is done here in this modification of example (57): (57′)

a. Is it raining? (rising interrogative)11
b1. It’s raining?! (high-rising me1-equivalent tone)
b2. It’s raining? (mid-rising aa4-equivalent tone)
c. It’s raining.

Gunlogson’s (2003) generalizations do not account for the differences in meaning between (57′b1) and (57′b2). She also did not show any contexts in which a rising declarative patterns separately from both an interrogative and a nonrising declarative. To illustrate this, consider the following context where neither a neutrally intoned rising interrogative nor a neutrally intoned declarative is acceptable: (63)

[B was outside one minute ago and the sky was blue]
A: Lok6-gan2 jyu5 wo3
   fall-PROG rain SFP
   “It’s raining.”
B’s response:
a. ??Hai6-m4-hai6 lok6-gan2 jyu5 aa3?
   be-NEG-be fall-PROG rain SFP
   ??“Is it raining?”
b1. Lok6-gan2 jyu5 me1? (surprise, doubt)
   fall-PROG rain ME
   “It’s raining ?!”
b2. ?Lok6-gan2 jyu5 aa4?
   fall-PROG rain AA
   ?“It’s raining ?”
c. ??Lok6-gan2 jyu5 aa3.
   fall-PROG rain SFP
   ??“It’s raining.”

11  There are also nonrising interrogatives that pattern differently from rising interrogatives, but Gunlogson (2003) purposely left those out of her minimal-pair contrasts.

In the English translations, there are two rising declaratives. The me1-equivalent one in (63 B-b1) is the only sentence that is fully acceptable in this context. The explication of me1 can account for this because it includes the proposition: “before this happened (i.e., before you said, ‘It’s raining’), I thought: it is not raining.” This is because just one minute prior, the speaker had been outside under a blue sky. The SFP aa4 and its English equivalent are not as acceptable, indicated by a single question mark preceding the sentence. Even though (63 B-b2) does not express that the speaker is committed to the proposition, it indicates that the speaker can readily accept the proposition based on speaker A’s statement and is just asking for confirmation. This is an unlikely stance to take just one minute after seeing that the sky was blue. Here is another example of a context where neither a neutrally intoned rising interrogative nor a nonrising declarative sounds natural. In this case, the aa4-equivalent tone is acceptable, but the me1-equivalent tone is not: (64)

[Speaking to a person who lives in the same block of flats as the speaker, and who the speaker occasionally sees in the elevator on the way to work. The listener is wearing a uniform and it is the time s/he normally leaves for work, making it obvious (pragmatically) that s/he is going to work.]
a. ??Lei5 hai6-m4-hai6 faan1 gung1 aa3?
   2s be-NEG-be go work SFP
   ??“Are you going to work?”
b1. ??Lei5 faan1 gung1 me1?
   2s go work ME
   ??“You’re going to work ?!”
b2. Lei5 faan1 gung1 aa4?
   2s go work AA
   “You’re going to work ?”
c. ??Lei5 faan1 gung1 aa3.
   2s go work SFP
   ??“You’re going to work.”

The aa4-equivalent tone in (64b2) is the only sentence that sounds natural in this context. The explication of me1 accounts for this because it includes the meaning: “before this happened (D: I saw you in the elevator dressed for work at the time you normally go to work), I thought: it isn’t like this (P: you’re going to work).” This would be very odd, because the speaker knows this is the time that the listener normally goes to work. A related context in which the use of me1 or its English equivalent would sound natural arises if this scenario is changed to a Sunday, a day when the speaker would have a reason to assume that the listener should not be going to work. In that case, me1-suffixing and its English equivalent sound perfectly natural for the context, and aa4 and its equivalent do not. Examples such as (63) and (64) demonstrate that there exist at least two tonal particles in English that form rising declaratives, and it is proposed that the difference in their meanings is captured in the


explications proposed for them here, a difference which is expressed by the generalization in (62).
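As a supplementary check on the formal contrast between the two tones, the maximum F0 values reported earlier in this section for the minimal pair in (52) can be re-expressed as musical intervals. The Python sketch below is an illustration only and was not part of the original analysis: it assumes the standard conversion of a frequency ratio to semitones, 12 × log2(f2/f1), and the names semitone_interval, MAX_F0_HZ and intervals_by_speaker are hypothetical. Female-b is omitted because no maximum F0 values are reported for her.

```python
import math


def semitone_interval(f_low_hz: float, f_high_hz: float) -> float:
    """Size of the interval between two frequencies, in semitones."""
    return 12.0 * math.log2(f_high_hz / f_low_hz)


# Maximum F0 (Hz) of each speaker's aa4- and me1-equivalent translation of (52),
# as reported in the text: (aa4 maximum, me1 maximum).
MAX_F0_HZ = {
    "female-a": (397.5, 485.6),
    "male-a": (210.8, 260.7),
    "male-b": (171.2, 221.4),
}


def intervals_by_speaker(pairs):
    """Map each speaker to the me1-over-aa4 interval, rounded to 2 decimals."""
    return {
        speaker: round(semitone_interval(aa4_max, me1_max), 2)
        for speaker, (aa4_max, me1_max) in pairs.items()
    }


if __name__ == "__main__":
    for speaker, interval in intervals_by_speaker(MAX_F0_HZ).items():
        print(f"{speaker}: me1 maximum is {interval} semitones above aa4 maximum")
```

Under this conversion, each of the three speakers' me1 maxima lies between three and five semitones above the corresponding aa4 maximum, far more than the roughly 0.1% change in frequency that the passage quoted from Chun (2002) describes as detectable.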

6.3  Two “Only” Particles: ze1 and zaa3

The SFPs ze1 and zaa3 are obviously related, and the literature has frequently compared and contrasted them with each other. This section will therefore discuss this pair of particles simultaneously rather than in separate subsections as was done for the other two pairs of particles in Sects. 6.1 and 6.2. The bulk of the discussion focuses on ze1, which is semantically more complex, but zaa3 is also discussed to the extent that it informs the description of ze1.

6.3.1  The Particles ze1 and zaa3

The NSM explications of ze1 and zaa3  Fung (2000: 30–73) discussed seven SFPs that all have the unaspirated alveolar affricate onset /z/, two of which are ze1 and zaa3. She proposed that each of these seven particles has the core semantic feature [+restrictive], which functions to delimit, and further proposed that ze1 and zaa3 both also have the feature [+exclusive] and that ze1 additionally has the semantic feature [+diminutive], which functions to downplay. Other authors have similarly concluded that these two SFPs each express the meaning “only” and that ze1 additionally has a downplaying function (e.g., Matthews and Yip 2011; Sybesma and Li 2007). Sybesma and Li (2007: 1754) pointed out that most authors “agree that zaa3 conveys ‘only’ in the neutral sense of ‘not more than that’ or ‘and not something else as well’.” Sybesma and Li also explained that zaa3 can convey either (1) a restrictive, delimiting meaning (i.e., not more than a certain amount on a scale) or (2) an exclusive meaning (i.e., only this, and not something else). This is also true for English’s “only” and its closest counterparts in Cantonese: zing6hai6 or zi2hai6 (“only”). Whether a restrictive or an exclusive meaning is implied is pragmatically determined by listeners based on the context and/or on the content and wording of the proposition. For example, the statement “I only ate one piece of pie” will very likely be interpreted as scalar: not more than one piece. In contrast, the statement “I only ate pie” will be interpreted as exclusive: not pie and something else (e.g., not also cake). Kwok (1984) suggested that ze1 indicates something is not excessive, while zaa3 indicates it is insufficient. 
Saying that ze1 indicates something is not excessive is complementary to other authors claiming it expresses “downplaying.” However, the meaning “insufficient” is not inherent to the semantics of zaa3, as Kwok suggested; it is a pragmatic meaning that is understood from the context. Consider the following contrasts:

(65)

[In response to: “How much money do you have?”]
a. Sap6 man1 zaa3.
   ten dollar ZAA
   “Only ten dollars.” (not as much as I want/need)

[In response to: “How much is it?”]
b. Sap6 man1 zaa3.
   ten dollar ZAA
   “Only ten dollars.” (less than one would expect)

[In response to: “Why are you wasting your money on that?”]
c. Sap6 man1 ze1.
   ten dollar ZE
   “(It’s) only ten dollars.” (not enough to matter)

(modified from Fung 2000: 59–60)

The zaa3-suffixed response in (65a) seems to support Kwok’s (1984) claim that zaa3 indicates the amount is “insufficient.” However, when this same zaa3-suffixed sentence is said in reply to the question in (65b) (i.e., “How much is it?”), then the amount “ten dollars” is not understood to be an insufficient amount. This indicates that zaa3 only entails the neutral meaning “only,” and the implication that the amount expressed in the proposition is unfortunately less than the speaker wanted (i.e., 65a), or thankfully less than expected (i.e., 65b), is determined pragmatically based on the discourse context. In contrast to (65a) and (65b), the use of ze1 in (65c) downplays the amount, indicating that it is not a large enough amount to be considered as a waste of one’s money. The SFP ze1 is used when the speaker thinks that the listener believes something is excessive. This can be demonstrated by replacing zaa3 with ze1 in both (65a) and (65b). The use of ze1 is acceptable in the context of (65b), but not in the context of (65a). The listener’s question in (65b) (i.e., “How much is it?”) would usually imply that the listener assumes the price will be higher than desired. This licenses the use of ze1 to downplay the amount, so it is acceptable in this context. In contrast, the listener’s question in (65a) (i.e., “How much money do you have?”) would not normally imply that the listener thinks the amount is excessively high.12 Therefore, downplaying the amount does not make sense in this context, rendering the use of ze1 here unacceptable. The contrast between zaa3, which neutrally expresses the meaning “only,” and ze1, which expresses “only” plus “downplaying,” is shown in the following proposed definitions: (66)

P + zaa3 =
   hai6 gam3 do1 (X), m4 hai6 do1di1
   be this much/many NEG be more
   “it is this much/many (X), (it is) not more”

(67)
P + ze1 =
a. hai6 gam3 do1 (X), m4 hai6 do1di1
   be this much/many NEG be more
   “it is this much/many (X), (it is) not more”
b. lei5 lam2-gan2 m4-hou2 ge3 je5
   2s think-PROG NEG-good GEN thing
   “you are thinking something bad (D)”
c. m4-hai6 gam2joeng2
   NEG-be like this
   “it is not like this”

12 Of course, a context could be constructed so that this question implies an excessive amount. For example if the speaker asks it right after the listener says he or she bought a new Porsche.

These explications are updated from earlier versions that appeared in Wakefield (2012b). One problem with those earlier versions was that both included the sentence “it’s not more than this.” Leung (2016: 240) pointed out that the comparative “than” is not an NSM prime and that the phrase “(not) more than X” therefore violates NSM grammar.13 The new definition in (66) corrects this mistake. The “not more” portion of the definition can be interpreted to mean either not more in degree/amount, giving a scalar reading, or not more in addition, giving an exclusive reading. The first line of the definition of ze1 (i.e., 67a) indicates that the meaning of zaa3 is entailed within the meaning of ze1. The antecedent of “this” in (66) and (67a) is referred to as X; it is contained within the proposition P to which zaa3 or ze1 is attached. X can be the entire VP or some constituent within the VP, and whatever it is must be figured out pragmatically by the listener based on the context and/or common sense reasoning; which portion of the VP is singled out by zaa3 or ze1 does not appear to be marked in any way, intonationally or otherwise. In (65), the X that is the antecedent of “this” is sap6 man1 (“ten dollars”). Other examples will be discussed below. The English translations of the explications in (66) and (67) are proposed as explications, shown here in (68) and (69), that define the English equivalents of ze1 and zaa3. The data that provide evidence of their English-equivalent forms are presented in the following section.

(68)
P (including “only”) + zaa3-equivalent intonation =
   it is this much/many (X), (it is) not more

(69)
P (including “only”) + ze1-equivalent intonation =
a. it is this much/many (X), (it is) not more
b. you are thinking something bad (D)
c. it is not like this

13 Leung (2016), who also proposed an NSM explication of zaa3, did not conclude, as most authors have, that zaa3 is a neutral expression of the meaning “only,” and she added this additional line to her definition: “someone can feel something because of this.” I do not agree that zaa3 expresses this, but interested readers can refer to Leung (2016, Chap. 7) to see her explanation. The first line of Leung’s explication for zaa3 also did not include “much/many” and added the meaning of similarity using “like,” resulting in: “it is like this, (it is) not more.” While this can refer to a state or an event, it does not seem that this accurately refers to an amount, such as the “ten dollars” of the example in (65). In contrast, wording it as I have in (66) and (67) (i.e., “it is this much/many, (it is) not more”) can refer to an amount, or to a state or event. For example, if a boy was explaining to a jealous girlfriend that he was only chatting with that girl she saw him with, nothing more, then he could attach zaa3 or ze1 to a statement such as “Ngo5dei6 king1gai2 ge3 zaa3/ze1” (“We were (only) chatting”), and it could be followed by “Hai6 gam3 do1” (“Nothing more (than that)”). This indicates that the activity “chatting,” which is an event, can be thought of in terms of an amount, i.e., not more serious than this. Or it could also be interpreted with an exclusive reading, where conversing is an event with more than one possible interpretation: chatting, flirting, dating, courting, etc. And here zaa3/ze1 are used to express that the event of these two people talking had only one of those possible meanings or interpretations, i.e., “chatting.”

The two evidential particles (i.e., lo1 and aa1maa3) and the two question particles (i.e., me1 and aa4) discussed in the previous sections all translated into English as a single pitch contour. In contrast, ze1 and zaa3 both translate as “only” (or “just”) plus a pitch contour. This means that these two morphemes in Cantonese each translate into English as two morphemes. Arguably, it is even three morphemes if we include the emphatic intonation that consistently appeared in the translations on “only,” but it must be noted that the constructed dialogues emphasized ze1 through the use of lengthening of the vowel, so emphatic intonation on “only” is perhaps not a translation of something that is inherent to ze1 itself. Putting aside the appearance of emphatic intonation in the dialogues and their translations, the contrast between Cantonese ze1 and its English equivalent can be shown as follows, where […X…] represents the proposition containing a focused element X: (70)

[[…X…] ze1]       Cantonese
[[…only X…] ]     English

The semantics of ze1 is actually more complicated than what is shown in (67) because ze1 does not always include the meaning of “only” and does not always include the meaning of “downplay” (see Fung (2000: 48ff) for a detailed discussion). The following dialogue illustrates this:14 (71)

A: Keoi5 dak6dang1 gik1 haam3 keoi5.
   3s deliberately incite cry 3s
   “S/he deliberately made her cry.”
B: Gam2 hou2 seoi1 ze1.
   thus very bad ZE
   “That’s really bad.”

The ze1-suffixed response in (71B) does not include the meaning “only,” nor does it downplay the situation. To account for this, the present study assumes that ze1 is polysemous (e.g., Fung 2000). Its other meaning(s) and any possible English equivalent(s) of those other meaning(s) are beyond the scope of this study, which relates only to the meaning of ze1 as paraphrased in (67).

14 I thank the anonymous reviewer who provided this example.

The English equivalents of ze1 and zaa3 based on the data  Wakefield (2010) included a preliminary study on the English equivalents of ze1 and zaa3. Based on translations from the ambilingual participants, it was concluded that zaa3 has an English equivalent in the form of the word “only” plus focus stress on the accented syllable of the word, or head word of the phrase, that is pragmatically understood to be put into focus by zaa3. In contrast to zaa3, no strong claim could be made about the English equivalent of ze1 because it did not consistently translate into English using the same form of intonation. It was proposed that the inconsistency in the translations was related to ze1 being polysemous. Nevertheless, two of the fifteen translations of ze1 used a rise-fall-rise pitch contour, and it was suggested, based on the author’s intuition, that this form of pitch in English has the meaning of “downplay,” which, aside from “only,” is the most common meaning ascribed to ze1 in the literature.

One of the rise-fall-rise translations from Wakefield (2010: 250) is shown here in Fig. 6.53. The translated sentence, which was “It was only two,” shows a high-falling tone on the first syllable of “only,” and a rise-fall-rise contour on the rime of the syllable “two”:

Fig. 6.53  Male-a: “It was only two” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)

The rise-fall tone that shows on paper associated with “only” is heard only as a fall, so the rise was concluded not to be linguistically significant. It is merely a rise in pitch to enable the fall. The aim of the study reported in this section was to test the validity of Wakefield’s (2010) tentative conclusions. For the moment, let us assume that the meaning of ze1 was translated accurately into English as depicted in Fig. 6.53 and that ze1 is accurately defined by (67). Based on this assumption, we could conclude that English expresses the meaning of Cantonese ze1 through the use of an emphatically stressed “only” plus a rise-fall-rise tone on the element that “only” has put into focus. This translation of ze1 is defined by (69a), which defines “only,” plus (69b-c), which defines the downplaying meaning of the rise-fall-rise tone. Further data were collected in order to test whether more evidence could be obtained in support of this conclusion. For this purpose,

dialogues that included a ze1-suffixed response to a question were constructed. Based on the assumption that ze1 is polysemous, each dialogue was purposely designed to elicit a translation of the meaning of ze1 as defined in (67), rather than one of its other meaning(s). In other words, ze1-suffixed sentences were placed in contexts intended to elicit translations of the meaning of “downplay.” This follow-up study involved two parts. For the first part, three of the four ambilingual translators who provided the translations shown above in Sects. 6.1 and 6.2 participated (male-b was the only one who chose not to participate, for whatever reason). These three participants each provided Cantonese-to-English mimic translations of ze1-suffixed sentences from three constructed dialogues. Later, four more ambilingual translators were recruited, and two additional question–answer dialogues were constructed, recorded, and translated by them; these four participants therefore each translated five sentences. The three dialogues constructed for the first stage of the study are shown below in (72) to (74), and the two additional dialogues constructed for the second stage are shown in (75) and (76). A total of 29 ze1-suffixed sentences were translated; the first three participants translated three sentences each (= 9 tokens), and the other four participants translated five sentences each (= 20 tokens). In each of the dialogues, the target sentence (i.e., the ze1-suffixed sentence) that is shown in bold is speaker B’s response to speaker A. (72)

A: Lei5 baai2 gam3 do1 ho2lok6 hai2dou6, dim2 gau3 wai2…
   2s put so much coke here how enough space
   …baai2 kei4taa1 je5 aa3?
   put other thing SFP
   “If you put that much coke here, how will there be enough space to put other stuff?”
B: Hai6 dak1 saam1-gun3 ze1.
   be only three-CL ZE
   “There’re only three cans.”

(73)
A: Waa4! Lei5 tai2 jat1tou3 hei3 ho2ji5 tai2 gam3 do1 ci3 ge2.
   wow 2s watch one-CL movie can watch so many time SFP
   “Wow! You can watch the same movie so many times!”
B: Ngo5 tai2-gwo3 loeng5 ci3 ze1.
   1s watch-EXP two time ZE
   “I’ve only seen it twice.”

(74)
A: Waa4! Lei5 jam2-zo2 gam3 do1 aa4?
   wow 2s drink-PERF so much SFP
   “Wow! You’ve drunk so much!”
B: Ngo5 jam2-zo2 hou2 siu2 ze1.
   1s drink-PERF very little ZE
   “I only drank a little!”

(75)
A: Waa4, lei5 sau3-zo2 gam3 do1 aa3! Jau5 mou5 si6 aa3?
   Wow 2s thin-PERF so much SFP have not-have thing SFP
   “Wow, you’ve lost so much weight! Are you okay?”
B: Lei5 gam3 kwaa1zoeng1 gaa3. Ngo5 sau3-zo2 gei2 bong6 ze1.
   2s so exaggerate SFP 1s thin-PERF few pound ZE
   “You exaggerate so much. I’ve only lost a few pounds.”

(76)
A: Lei5 zou6 me1 gam2joeng2 zing2 ngo5 aa3? Lei5 zi1-m4-zi1…
   2s do what thus fix 1s SFP 2s know-not-know
   …hou2 tung3 aa3?
   very hurt SFP
   “Why are you doing that to me? Do you know that really hurts?”
B: Ngo5 waan2-haa5 ze1.
   1s play-DM ZE
   “I’m just playing.”

Speaker A’s utterances all imply that he or she thinks the amount or action being referred to is excessive to the point of being bad, and speaker B’s response implies that it is not excessive and therefore not that bad. It was predicted that this would cause the ambilingual translators to translate the meaning of downplay into English, that is, to downplay the number of coke cans in (72), the number of movie viewings in (73), the amount that was drunk in (74), the amount of weight lost in (75), and the seriousness of the action in (76). In Wakefield (2010: 249–251), only two out of fifteen translations of ze1 used a rise-fall-rise contour, one of which is shown above in Fig. 6.53. Those two tokens of the rise-fall-rise contour came from different participants, meaning that none of the participants produced more than one token of this pitch contour. In this follow-up study, twenty-two out of twenty-nine of the translations of ze1 used a rise-fall-rise contour, and all but one participant produced it more than once, as shown here in Table 6.1:

Table 6.1  Number of times each participant used a rise-fall-rise contour
male-a:    3 out of 3 translations
female-a:  0 out of 3 translations
female-b:  3 out of 3 translations
female-c:  4 out of 5 translations
female-d:  5 out of 5 translations
female-e:  5 out of 5 translations
female-f:  2 out of 5 translations
total:     22 out of 29 translations

The possible reasons as to why female-a did not use the rise-fall-rise contour in any of her translations, and why female-f only used it in two of her five translations, are discussed in Sect. 6.3.2. The F0 contours of the 22 translations that translated as “only/just” plus a rise-fall-rise tone are shown below in Figs. 6.54 through 6.75, and these are then followed by a discussion of what they show.

Fig. 6.54  Male-a: “There’re only three cans” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)
Fig. 6.55  Male-a: “I’ve only seen it twice” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)

Figs. 6.54 through 6.56 show male-a’s translations. Fig. 6.54 shows a rise-fall-rise tone associated with the rime of the syllable “cans”; Fig. 6.55 shows the same tone on the rime of “twice”; and Fig. 6.56 shows this same tone associated with the two syllables of the word “little.” The initial rise in Fig. 6.56 occurs on the rime of the first syllable of the word, and the following fall-rise is on the rime of the second syllable, which is an alveolar lateral approximant. The peak between the first rise and the fall coincides with the alveolar tap onset of the second syllable. These three figures show that the word “only” plus a rise-fall-rise tone was used in each of the

male participant’s translations, providing some empirical evidence in support of the claim that the meaning of ze1 is expressible in English and that this is the form it takes.

Fig. 6.56  Male-a: “I only drank a little” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)
Fig. 6.57  Female-b: “It’s only three cans” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)

None of female-a’s translations used a rise-fall-rise contour. Possible reasons for this are discussed in Sect. 6.3.2. Female-b’s first translation in Fig. 6.57 shows a relatively subtle rise-fall-rise contour associated with the syllable “cans,” and it sounds like a somewhat rushed and muted version of this form of pitch. Her two other translations were not so muted. Fig. 6.58 shows the tone associated with the syllable “twice,” and her final translation in Fig. 6.59 shows a rise-fall-rise associated with the two syllables of “little.” Just as with male-a’s translation shown in Fig. 6.56, the peak after the first

rise coincides with the onset of the second syllable, which in this case was realized as an aspirated alveolar plosive.

Fig. 6.58  Female-b: “I’ve only watched it twice” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)
Fig. 6.59  Female-b: “I only drank a little” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)

Female-c did not use a rise-fall-rise contour in her first translation but she did in the remaining four that she translated, which are shown in Figs. 6.60 through 6.63. In Fig. 6.60, the tone is shown on the syllable “twice.” Fig. 6.61 shows it to be associated with the two syllables of “little.” In Fig. 6.62, it is associated with the syllable “pounds,” and in Fig. 6.63, it is on the two syllables “playing.” Female-d used a rise-fall-rise tone in all five of her translations, which are shown in Figs. 6.64 through 6.68. The rise-fall-rise tone on “playing” in her final translation in Fig. 6.68 looks on paper to be almost a straight line, but the rise-fall-rise shape of this tone can clearly be heard in the audio. Female-e also used the rise-fall-rise contour in all of her translations, and all of them show up very clearly on paper as seen in Figs. 6.69 through 6.73. Female-f only used the rise-fall-rise tone in two of her translations. The first one in Fig. 6.74 looks subtle but is clearly heard as a rise-fall-rise. The second translation of hers, which is shown in Fig. 6.75, is clearly a rise-fall-rise both on paper and in the audio. Based on the data, it is proposed that the English equivalents of zaa3 and ze1 both include the word “only” (or “just”) with emphatic intonation, plus a high-falling tone or a rise-fall-rise tone, respectively, on the element that is put into focus by “only.” Further conclusions about the data are discussed in Sect. 6.3.2.

Figs. 6.60 through 6.75 (F0 contours; pitch in Hz on a logarithmic scale, plotted against time in s):

Fig. 6.60  Female-c: “I’ve only watched it twice”
Fig. 6.61  Female-c: “I’ve only drank a little”
Fig. 6.62  Female-c: “I’ve only lost a few pounds”
Fig. 6.63  Female-c: “I’m only playing”
Fig. 6.64  Female-d: “It’s only three cans”
Fig. 6.65  Female-d: “I’ve only watched it twice”
Fig. 6.66  Female-d: “I only drank a little”
Fig. 6.67  Female-d: “I’ve only lost a few pounds”
Fig. 6.68  Female-d: “I’m just playing”
Fig. 6.69  Female-e: “It’s just three cans”
Fig. 6.70  Female-e: “I’ve only seen it twice”
Fig. 6.71  Female-e: “I only drank a little”
Fig. 6.72  Female-e: “I only lost a few pounds”
Fig. 6.73  Female-e: “I’m just playing”
Fig. 6.74  Female-f: “I only drank a little”
Fig. 6.75  Female-f: “I’m just playing”

Applying the NSM explications to the examples of ze1 and zaa3 from the literature and the data  As was done for the particles in previous sections, the pitch contours that were proposed to be the equivalents of ze1 and zaa3 are shown here informally with a pitch contour that follows the syllable(s) with which it is associated. We start with the examples from (65a/b) and (65c), repeated as follows with oral English translations of zaa3 and ze1, respectively: (65′)

a/b. Sap6 man1 zaa3.
     ten dollar ZAA
     “On ly ten do llars.”
c.   Sap6 man1 ze1.
     ten dollar ZE
     “(It’s) on ly ten dollars .”

In (65′a/b), zaa3 is shown to translate as “only” with an emphatic falling tone on its first syllable, plus a high-falling tone on the first syllable of “dollars.” In (65′c), ze1 is shown to translate as “only” with a falling tone, plus a rise-fall-rise tone on both syllables of “dollars.” For (65′a/b), we can paraphrase the meaning that zaa3 expresses in those contexts based on the explication in (66) as follows:

it is this much/many (X: ten dollars), (it is) not more

The context of (65a) was a response to the question: “How much money do you have?” and the context of (65b) was a response to the question: “How much is it?” In both cases, zaa3 merely expresses that the amount is not more than what is stated within the proposition. Whether the amount is considered to be unfortunately insufficient or luckily not too large is determined pragmatically. In both contexts, English expresses this meaning as shown in the translation of (65′a/b). For (65′c), the meaning expressed by ze1 in that context can be paraphrased as follows based on the explication in (67):

it is this much/many (X: ten dollars), (it is) not more
you are thinking something bad (D: buying this is a waste of my money)
it is not like this

The context of (65c) was a response to the question: “Why are you wasting your money on that?” This question creates a discourse element D, which is that the listener who has just asked this question thinks the item must cost more than it is worth and is therefore a waste of money. The speaker thinks the item is worth ten dollars and likely assumes that the listener must think it costs more than this, so he or she downplays the amount by attaching ze1 to the response. English expresses this meaning as shown in the translation of (65′c). The dialogues in (72) to (75) are all similar to (65c); each is a ze1-suffixed response to a question that implies the listener is thinking something bad, and the response therefore downplays the amount of the item being discussed. Each is shown here with the ambilingual participants’ English translation showing the rise-fall-rise tone following the syllables that it was realized on. The meaning of ze1 in each context is then shown based on the explication. (72′)

Hai6 dak1 saam1 gun3 ze1.
be only three CL ZE
“There’re on ly three cans .”
it is this much/many (X: three cans), (it is) not more
you are thinking something bad (D: it is so many cans that it will not leave enough room for other things)
it is not like this

(73′)
Ngo5 tai2-gwo3 loeng5 ci3 ze1.
1s watch-EXP two times ZE
“I’ve on ly seen it twice .”
it is this much/many (X: two times), (it is) not more
you are thinking something bad (D: I have seen it many times)
it is not like this

(74′)
Ngo5 jam2-zo2 hou2 siu2 ze1.
1s drink-PERF very little ZE
“I on ly drank a little .”
it is this much/many (X: a little), (it is) not more
you are thinking something bad (D: I drank a lot)
it is not like this

(75′)
Ngo5 sau3-zo2 gei2 bong6 ze1.
1s thin-PERF few pound ZE
“I’ve on ly lost a few pounds .”
it is this much/many (X: a few pounds), (it is) not more
you are thinking something bad (D: I lost many pounds)
it is not like this

The final example in (76) is comparable to the example discussed in footnote 13. Here, the speaker has been doing something that is painful to the listener, and this pain-causing event can be interpreted in more than one way. One possible interpretation is that the speaker intentionally wants to hurt the listener, and the listener’s questions imply that he or she may think this is the case (i.e., “Why are you doing that to me? Do you know that really hurts?”). This becomes the antecedent D in the discourse for “thinking something bad.” The first line of the explication (i.e., “it is this much/many, (it is) not more”) can refer to an amount, as in the above examples, or it can refer to a state or event. In this case, it is an event in the form of the action of causing pain, and this event can have more than one possible interpretation. It could be interpreted as an intention to cause pain, or as some other intention, in which case the pain caused is accidental. Here, the speaker attaches ze1 to a sentence that states a benign action (i.e., “playing”), and this indicates that the resulting pain is incidental rather than intentional. In this way, ze1 has an exclusive reading, that is, only the action of causing pain accidentally while playing, and not the action of causing pain intentionally. (76′)

Ngo5 waan2-haa5 ze1.
1s play-DM ZE
“I’m just playing .”
it is this much/many (X: playing), (it is) not more
you are thinking something bad (D: I am intentionally hurting you)
it is not like this

Both ze1 and zaa3 are acceptable in these contexts and are probably exchangeable in all contexts where ze1 expresses the meaning of “only” plus downplay. Which one is used depends on whether or not the speaker decides to additionally express the meaning of downplay as expressed by the second and third lines of ze1’s explication. The same holds for their English intonational equivalents.


6.3.2  Summary and Analysis

The ze1-suffixed sentence in (72B) includes the word dak1 (“only”), so the translation of that sentence does not show that ze1 entails the meaning “only.” However, the ze1-suffixed sentences in (73B) through (76B) do not contain any word or morpheme in addition to ze1 that could be translated into English as “only,” which indicates that all of the ambilingual translators treat “only” (or “just”) as part of the meaning of ze1. One could argue that the English word “only” is all that ze1 means and is therefore a complete translation of it, but this would be inconsistent with the literature’s description of ze1 as meaning something more than the SFP zaa3. The SFP zaa3, which expresses the neutral meaning “only,” translated consistently into English as “only” plus a high-falling tone on the focused element. Since English translations of ze1 were discovered to differ from translations of zaa3 only with respect to the shape of the tone that appears on the focused element, it seems reasonable to conclude that this different form of English intonation is how English expresses the difference in meaning between ze1 and zaa3. Unlike the other six participants, none of the translations from female-a included the rise-fall-rise tone. However, her intonation was not neutral, nor was it like the translations of zaa3. She used what sounded like defensive intonation, as if to indicate that she had done nothing wrong (or bad) and did not like the fact that the conversational participant had implied that she had. Her overall pitch height, or key, was raised on the translations of her ze1-suffixed sentences. Fig. 6.76 shows her translation of dialogue (73). The mean pitch in Fig. 6.76 is 303 Hz, with a range from 207 Hz to 349 Hz. This contrasts with a 6.7-second extract from female-a telling a story using neutral intonation, which had a mean pitch of 184 Hz and a range of 77 Hz to 283 Hz.
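The size of this key shift can be made concrete by converting the frequency ratios to semitones, the usual interval measure on a logarithmic pitch axis. The following is only an illustrative sketch and not part of the original analysis: the function name `semitones` is hypothetical, and the only inputs are the Hz values reported above.

```python
import math

def semitones(f_hz: float, ref_hz: float) -> float:
    """Interval from ref_hz up to f_hz in semitones (12 per octave)."""
    return 12 * math.log2(f_hz / ref_hz)

# Values reported in the text for female-a (Hz).
translation_mean, translation_min, translation_max = 303, 207, 349
neutral_mean, neutral_min, neutral_max = 184, 77, 283

# How far the mean pitch of the translation sits above the neutral mean.
key_shift = semitones(translation_mean, neutral_mean)
# Width of each sample's pitch range, expressed as an interval.
translation_span = semitones(translation_max, translation_min)
neutral_span = semitones(neutral_max, neutral_min)

print(f"mean raised by {key_shift:.1f} semitones")
print(f"translation range spans {translation_span:.1f} semitones")
print(f"neutral extract spans {neutral_span:.1f} semitones")
```

On these figures the mean is raised by roughly 8.6 semitones. The comparison is only indicative, since a single short sentence is being set against a 6.7-second narrative extract.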
Fig. 6.76  Female-a: “I only watched it twice” (F0 contour; pitch in Hz on a logarithmic scale, plotted against time in s)

In addition to raising the pitch key, she also used emphatic intonation on “twice.” No further attempt will be made here to describe the meaning of female-a’s intonational forms other than to say that they sounded suitable to the contexts of the

dialogues. There are different possibilities as to why she used a different form of intonation from the other participants: she may express the meaning of “downplay” in a different form, or she may have added something additional to the discourse context in her mind, causing her to replace the connotative meaning of ze1 with some other connotative meaning. In other words, 1) it could be an individual speaker difference in relation to the intonational form of this meaning or 2) it could be that the constructed question–answer dialogues did not work to elicit the meaning of ze1 (as defined in (67)) from female-a, while these dialogues did work to elicit this meaning from the other six participants. Table 6.1 shows that, other than female-a, all the other participants translated ze1 as a rise-fall-rise tone plus the word “only” (or “just”) in some or all of their translations; female-c did not do so in one of her five translations and female-f did not do so in three of her five translations. The reason these two participants did not use the same translation each time is perhaps due to the difficulty of the task. Hirst (1983b: 97, footnote) noted that “[i]t is a remarkable thing in itself that many (though not all) untrained speakers are capable of reproducing the intonation of a sentence on a meaningless sequence of syllables.” If merely mimicking the intonation of an utterance is remarkable, then translating a semantically abstract segmental particle’s meaning from one language into another language that expresses that meaning as a form of intonation is beyond remarkable. There are all sorts of things that could interfere with the cognitive processes involved. It is therefore very interesting that six of the seven participants of this study each used a rise-fall-rise tone to translate ze1 more than once, and they did so without hearing each other’s translations.
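The counts behind these statements can be checked directly against Table 6.1. The short sketch below simply re-tallies the table; the dictionary layout and variable names are illustrative choices, not part of the study.

```python
# Counts copied from Table 6.1:
# participant -> (rise-fall-rise tokens, translations produced).
table_6_1 = {
    "male-a": (3, 3),
    "female-a": (0, 3),
    "female-b": (3, 3),
    "female-c": (4, 5),
    "female-d": (5, 5),
    "female-e": (5, 5),
    "female-f": (2, 5),
}

total_rfr = sum(used for used, _ in table_6_1.values())
total_tokens = sum(n for _, n in table_6_1.values())
# Participants who produced the contour more than once.
repeat_users = [p for p, (used, _) in table_6_1.items() if used > 1]

print(f"{total_rfr} of {total_tokens} translations used the contour")
print(f"{len(repeat_users)} of {len(table_6_1)} participants used it more than once")
```

The tally reproduces the totals given in the text: 22 of 29 translations, with six of the seven participants using the contour more than once.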
This empirical evidence supports the claim that the particle ze1 has a (near) equivalent in English of the form discovered through the ambilinguals’ translations and that this form of intonation is therefore a tonal morpheme. It cannot be ignored that the F0 shape of the rise-fall-rise contour shown in the figures above varies a great deal from one occurrence to the next. Significant differences in duration seem to be allowed, along with significant differences in the distance between the peak of the initial rise and the valley of the fall. A drastic example is seen in Fig. 6.68 and Fig. 6.72; the duration of the rise-fall-rise tone in Fig. 6.68 is roughly twice as long as in Fig. 6.72, while the distance between the peak and valley is significantly less in Fig. 6.68 than it is in Fig. 6.72. The shape of the rise-fall-rise tone is more subtle in some cases than in others, but there are good reasons to classify the various rise-fall-rise shapes in the figures above as allotones of a single tone. One reason is that they all come from the same source, which is a translation of ze1. Another reason is that the pitch of the rise-fall-rise tone shape can be heard in each of the translations (though the two translations shown in Figs. 6.57 and 6.64 are harder to detect). Based on the evidence of the translations, combined with the author’s native English-speaker intuition about this intonational form, it is proposed that the grammar of English includes this rise-fall-rise pitch contour and that it functions to express the meaning that is paraphrased in lines b and c of the definition in (67), which is the meaning of ze1 less the meaning of “only” in (67a). The following definition is proposed accordingly:

6.3  Two “Only” Particles: ze1 and zaa3 (77)

P+ a. b.

187

= you think something bad (D) it is not like this

The rise-fall-rise tone of (77) only expresses the downplaying portion of the meaning of ze1. Now that this English pitch contour and its associated meaning have been discovered through translation from Cantonese, they can be examined independently of the SFP ze1. Unlike ze1, the tone does not entail the meaning “only” and should therefore be attachable to propositions which do not have this meaning. Consider the following constructed sentences in (78), which seem to the author to be appropriate for this form of intonation. They are said by a mother to her son, who has just complained about having to look after his younger brother for an hour. The mother understands that the older sibling considers this amount of time to be too long, and this task to be too demanding. She therefore uses this tone on two sentences: one that downplays the amount of time and one that downplays the degree of inconvenience.

(78)  a. It’s only for an hour
      b. It’s not gonna kill you

The rise-fall-rise contour sounds natural on both the single syllable “hour” in (78a) and the two syllables “kill you” in (78b). Only (78a) includes the adverb “only,” but the explication in (77) seems to reflect the meaning of downplay that is present in both sentences in (78), indicating that this downplaying tone does not need to cooccur with “only.” The meaning of the tone used in these sentences can be paraphrased as follows based on (77). The meaning of “only” is added to (78′a) as its first line:

(78′) a. it is this (X: for an hour), not more
         you think something bad (D: You will have to babysit a very long time)
         it is not like this
      b. you think something bad (D: babysitting will kill you)
         it is not like this

In both sentences, the rise-fall-rise tone downplays something that the speaker assumes the hearer to believe. In this context, both sentences relate to babysitting. What is shown in italics inside parentheses is an approximate paraphrase of what the speaker interprets the hearer’s stance to be (i.e., what D is), and the stance “babysitting will kill me” is of course figurative, not literal.

There has been some discussion in the literature of a rise-fall-rise tone, but the analysis and/or the tones themselves differ from what has been proposed here. Sag and Liberman (1974: 420) described a rise-fall-rise contour, calling it the “contradiction contour.” Their example of this contour is realized across the entire sentence: “Elephantitis isn’t incurable.” I believe this is different from the tone discussed here. Their example appears to include two tones: a high tone on the fourth syllable of elephanTItis and a rising tone that starts on the second syllable of incurable. According to Sag and Liberman, the meaning expressed is that of “contradiction” (i.e., contradicting the presupposed stance of the listener), and this tone, or combination of tones, is therefore different from the tone under discussion here. If, by contrast, a rise-fall-rise contour were used on the final three syllables of “incurable,” then this would be the same tone as is being discussed here, rather than two separate tones:

(79)  Elephantitis isn’t incurable

This tone sounds natural used on the final three syllables of “incurable.” Based on the explication in (77), the meaning it expresses in this sentence is this:

(79′) you think something bad [elephantitis is incurable]
      it is not like this

Another example in the literature comes from Hirschberg and Ward (1992), who attributed the meaning “uncertainty” to a rise-fall-rise contour. In their examples, this contour was associated with a noun or a noun phrase. In the example shown here, the tone is associated with the syllable “cat”:

(80)  A: Did you feed the animals?
      B: I fed the cat

According to my intuition, this could possibly be construed as expressing “uncertainty” if there is no stress on “fed.” If “fed” is stressed, then the meaning of (80B) seems to be that of the downplay contour under discussion here. This is made more obvious if it is said in response to the accusation: “Why didn’t you feed the animals?” Using the explication, this tone expresses the following in addition to the meaning of the proposition itself, which states that the cat was fed by the speaker:

      you think something bad [none of the animals got fed]
      it is not like this

The meaning expressed is that the situation is not as bad as the listener implied with his or her question. Through the use of this downplaying tone the speaker indicates that not all the animals have gone unfed; the cat was fed. And since the cat has been fed the situation is not as bad as the listener thinks. The apparent difference in meaning when there is no stress on “fed” could indicate that this rise-fall-rise tone has more than one meaning. It is possible that this contour is polysemous (or homophonous with one or more tones of the same form). The only argument that is being made here is that the meaning of ze1 as defined in (67) is similar enough to one of the meanings of the English rise-fall-rise contour, so as to cause ambilinguals to use it quite consistently as a translation of ze1, and that this further warrants concluding that this contour has the same (or a very similar) meaning as the downplaying portion of ze1. The fact that both ze1 in Cantonese and the rise-fall-rise contour in English may be polysemous is not relevant, nor should it be surprising.


The form of the tone discovered in this study is comparable with what Pierrehumbert and Steele (1989) referred to as the L∗+H_L_H% tone. Their study found that speakers make a binary, categorical (rather than a gradient) distinction between this tone and what they referred to as an L+H∗_L_H% tone, in which the stress is associated with the first peak (indicated by H∗). In other words, speakers make a categorical distinction between a rise-fall-rise tone and a fall-rise tone. Gussenhoven (1984) likewise found that English speakers distinguish between these two contours. The rise-fall-rise tone of this study should therefore not be confused with a fall-rise, or with what some have analyzed as a pitch accent plus a high boundary tone. It is proposed to be different from a fall-rise even when the fall-rise is preceded by a relatively low level of pitch within the utterance, which would then look like a rise-fall-rise on paper. The rise-fall-rise of the present study is tentatively analyzed as a single tone with a specific global shape, and as such is not considered to be made up of three separate components (i.e., not as L∗+H_L_H%). The difference between the two contours discussed in the preceding paragraph is therefore not which of four tones is associated with an accented syllable, but rather whether or not the rise at the beginning of the contour is considered to be a component of the contour. Under the analysis being used here, the initial rise is part of the rise-fall-rise tone contour. In contrast, if there was a rise that appeared as part of a similar looking F0 contour on paper, but this rise was not heard as a rise by native speakers, then this contour would not include an initial rise as part of its composition. The consequence of adopting this analysis is that F0 contours on paper are insufficient evidence on their own; native-speaker intuition must also be consulted. 
It would be possible to use something like the Tones and Break Indices (ToBI) system and describe the tone under discussion as four tones rather than a single contour. This could still be analyzed as a single morpheme because, just as the four segments of the word “land” form a single morpheme, so could four tones combine to form a single morpheme. We could say that the rise-fall-rise contour is four tones (L+H+L+H) rather than a single tone that forms a rise-fall-rise contour. This would contrast with a fall-rise contour, which could then be analyzed either as three tones (H+L+H) or as a single contour. When the two different forms look the same on paper, then instead of saying it is a difference of syllable accent placement, it would be analyzed as either four tones or three. We would not show the difference as L∗+H+L+H versus L+H∗+L+H, but would instead show it as L+H+L+H versus H+L+H, respectively. In other words, the L which shows up on paper before the H∗ would not be considered a component of the morpheme. Even though it is possible to think of the rise-fall-rise contour as four tones making up a single morpheme, it is nevertheless proposed that it is contour shapes that are meaningful and that the tone described here is therefore a single tone. Roach’s (2009) position is adopted here (see Sect. 5.2.4), and it is assumed that a combination of low and high tones does not fully capture the form and quality of a pitch contour. A strong form of evidence in support of this stance comes from Sect. 6.2, which describes two distinct rising tones with distinct meanings. The ToBI system would describe both rising tones using the same notation, even though the intuition of native speakers distinguishes them as having two distinct forms and meanings.
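
The idea that contours differing in duration and excursion can still share one global shape can be illustrated with a small sketch. The Python function below is not part of this study's method; it is a hypothetical illustration that reduces a sequence of F0 samples to a string of rises and falls, ignoring movements smaller than an assumed threshold (`min_step`, in Hz). The sample values are invented for the example, and an F0 trace alone cannot settle which tone a speaker produced; native-speaker judgment is still required.

```python
from typing import List, Optional


def contour_shape(f0: List[float], min_step: float = 5.0) -> str:
    """Reduce an F0 track (in Hz) to a label such as "rise-fall-rise".

    A change of direction is only counted once the pitch has moved at
    least `min_step` Hz away from the last extreme. The threshold is a
    hypothetical value chosen for illustration.
    """
    moves: List[str] = []
    direction: Optional[str] = None
    extreme = f0[0]  # most extreme value reached in the current movement
    for value in f0[1:]:
        if direction is None:
            # No movement detected yet: wait for a large enough change.
            if abs(value - extreme) >= min_step:
                direction = "rise" if value > extreme else "fall"
                moves.append(direction)
                extreme = value
        elif direction == "rise":
            if value > extreme:
                extreme = value
            elif extreme - value >= min_step:
                direction = "fall"
                moves.append(direction)
                extreme = value
        else:  # currently falling
            if value < extreme:
                extreme = value
            elif value - extreme >= min_step:
                direction = "rise"
                moves.append(direction)
                extreme = value
    return "-".join(moves) if moves else "level"


# Invented sample tracks: one long and shallow, one short and steep.
long_shallow = [100, 105, 110, 115, 120, 118, 114, 110, 108, 112, 116, 120]
short_steep = [100, 140, 90, 130]
fall_rise = [140, 90, 130]

print(contour_shape(long_shallow))
print(contour_shape(short_steep))
print(contour_shape(fall_rise))
```

Under these assumptions the first two tracks receive the same label despite their different durations and peak-to-valley distances, while the third is labeled as a fall-rise. The sketch only captures what is visible on paper; it cannot tell whether an initial rise is heard as part of the tone.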

6.4  Concluding Remarks

The observed pitch patterns of this research are assumed to be the surface reflex of the interaction of these underlying tonal morphemes with other accentual and boundary phenomena. A tone’s shape will vary from one occurrence to the next based on such interactions, and these variations can be seen as allotones of the same tone. The data show that discourse tonal morphemes can vary significantly in duration and height and that they can be associated with one or more syllables, and even with more than one word. It is beyond the scope of this book to propose the precise phonological properties of these tonal morphemes, and it is suggested that further research involving native English-speaker judgments be conducted with this goal in mind. It is very possible that parameters other than pitch are included in the makeup of these tones. There are inherent complications associated with using the kinds of translations that were used as the main source of data. Nevertheless, the translations can be judged as having a high degree of validity based on the fact that there was a significant amount of consistency among them. An SFP was concluded to have an intonational equivalent in English only if participants showed some consistency among their own translations, which came from different sentences in different contexts, and consistency with the other participants’ translations. Even though translations of sentences from naturally occurring data are not themselves naturally occurring data, there is still a sense in which these translations were arguably natural data, even the translations of constructed data. This is because the ambilingual participants were not told, and therefore did not know, that their task was to translate SFPs; they only knew that they were translating sentences.
The intonation of the translations can therefore be seen as a product of their subconscious knowledge relating to the forms and meanings of the SFPs and their English tonal counterparts. Another potential complication is isolating the forms and meanings of the SFPs from any nonlinguistic emotions that were expressed by suprasegmentals in the dialogues. However, I believe this was not a serious problem because it is assumed that the SFPs themselves translated as tones, while any nonlinguistic suprasegmentals that were present would presumably have forms such as voice quality, pitch key, and pitch range and would be similar in both English and Cantonese. Even if these changes in pitch key and range affected the pitch contours, the use of multiple translations from different contexts should have reduced the likelihood that the same suprasegmentals would have the same influence on each occurrence of the same pitch contour. The evidence of this study is comparable with, but more rigorous than, Liberman’s (1979: 96) examination of “the meaning of [a pitch contour in order] to demonstrate that there is some real linguistic entity here.” He argued for it being “a sort of intonational word, a unit of meaning” (p. 97, emphasis his), and he went on to say:

Like any such argument, ours is essentially an appeal to intuition. It is well known that the meaning of a more conventional sort of word, e.g., “game,” is difficult to state with theoretical precision, yet everyone will agree that there is a word “game,” and that it does mean something. This agreement is based on our ability to recognize this word as an element of any utterance in which it may occur, as an abstract feature which is common to these otherwise quite different utterances, and which contributes something towards their final interpretation. All we require … is that the reader be convinced that there exists an intonational unit, a “tune,” an abstract feature which is common to the otherwise rather different examples we have cited, and which contributes something to their communicative value.

My arguments are also an appeal to intuition: first is my argument that the SFPs of this research have context-independent meanings along the lines of what was proposed for them in their NSM explications; and second is my argument that the pitch contours discovered in the ambilinguals’ translations also have those (or very similar) context-independent meanings. The data that these arguments are based on support the claim that intonation is morphemic and is therefore part of speakers’ lexicons. If true, then these tonal morphemes must be located in the syntax. In light of this, the next chapter proposes how and where intonation fits into the syntax of sentences.

References

Aikhenvald, A. (2004). Evidentiality. Oxford: Oxford University Press.
Baker, H., & Ho, P. (2006). Teach yourself Cantonese. London: Teach Yourself Books.
Ball, J. D. (1971). Cantonese made easy: A book of simple sentences in the Cantonese dialect, with free and literal translations, and directions for the rendering of English grammatical forms in Chinese (2nd ed.). Taipei: Ch’eng Wen.
Boyle, E. L. (1970a). Cantonese: Basic course volume one. Washington: Foreign Service Institute.
Boyle, E. L. (1970b). Cantonese: Basic course volume two. Washington: Foreign Service Institute.
Chan, Y. K. (1955). Everybody’s Cantonese (4th ed.). Hong Kong: Chung Yuen Printing Press.
Chan, M. K. M. (2001). Gender-related use of sentence-final particles in Cantonese. In M. Hellinger & H. Bussmann (Eds.), Gender across languages (pp. 57–72). Amsterdam: John Benjamins.
Chao, Y. R. (1969). Cantonese primer. New York: Greenwood Press.
Chun, D. M. (2002). Discourse intonation in L2: From theory and research to practice. Amsterdam: J. Benjamins.
Deng, S. (1991). 廣州方言常見的語氣詞 [Some common particles of Cantonese]. [Fangyan], 2, 126–132.
Fung, R. S.-Y. (2000). Final particles in standard Cantonese: Semantic extension and pragmatic inference. Unpublished doctoral dissertation, Ohio State University, Columbus, OH.
Geluykens, R. (1987). Intonation and speech act type: An experimental approach to rising intonation in queclaratives. Journal of Pragmatics, 11(4), 483–494.
Geluykens, R. (1988). On the myth of rising intonation in polar questions. Journal of Pragmatics, 12(4), 467–485.
Goddard, C. (2002). Yes or no? The complex semantics of a simple question. In P. Collins & M. Amberber (Eds.), Proceedings of the 2002 Conference of the Australian Linguistic Society. http://www.als.asn.au
Gunlogson, C. (2003). True to form: Rising and falling declaratives as questions in English. New York: Routledge.
Gussenhoven, C. (1984). On the grammar and semantics of sentence accents. Dordrecht: Foris.
Hirschberg, J., & Ward, G. (1992). The influence of pitch range, duration, amplitude and spectral features on the interpretation of the rise-fall-rise intonation contour in English. Journal of Phonetics, 20(2), 241–251.
Hirst, D. (1983a). Interpreting intonation: A modular approach. Journal of Semantics, 2(2), 171–182.
Hirst, D. (1983b). Structures and categories in prosodic representations. In A. Cutler & D. R. Ladd (Eds.), Prosody: Models and measurements (pp. 93–156). Berlin: Springer.
Huang, P. P. (1970). Cantonese dictionary: Cantonese-English, English-Cantonese. New Haven: Yale University Press.
Huang [黃皇宗]. (1989). 廣州話教程 [Guangzhou dialect text book]. 廣州: 中山大學出版社.
Huang, P. P., & Kok, G. P. (1973). Speak Cantonese book I (3rd ed.). New Haven, CT: Far Eastern Publications, Yale University.
Kwok, H. (1984). Sentence particles in Cantonese. Hong Kong: Centre of Asian Studies, University of Hong Kong.
Ladd, D. R. (1978). The structure of intonational meaning: Evidence from English. Bloomington: Indiana University Press.
Lau, S. (1973). A Cantonese-English and English-Cantonese glossary to accompany “Elementary Cantonese”. Hong Kong: The Government Logistics Department.
Lau, S. (1977). A practical Cantonese-English dictionary. Hong Kong: The Government Printer.
Law, S.-P. (1990). The syntax and phonology of Cantonese sentence-final particles. Unpublished doctoral dissertation, Boston University, Boston, MA.
Law, A. (2002). Cantonese sentence-final particles and the CP domain. UCL Working Papers in Linguistics, 14, 375–398.
Lee, T. H., & Law, A. (2000). Evidential final particles in child Cantonese. In E. V. Clark (Ed.), The proceedings of the thirtieth annual child language research forum (pp. 131–138). Stanford: Center for the Study of Language and Information.
Lee, T. H., & Law, A. (2001). Epistemic modality and the acquisition of Cantonese final particles. In M. Nakayama (Ed.), Issues in East Asian language acquisition (pp. 67–128). Tokyo: Kuroshio.
Lee, T. H., & Man, P. (1997). Notes on an evidential conditional particle in Cantonese [粵語的一個條件表証標記]. Presented at the 1997 YR Chao Center Annual Seminar, City University of Hong Kong.
Leung, C. (2005). 當代香港粵語語助詞的研究 [A study of the utterance particles in Cantonese as Spoken in Hong Kong]. Hong Kong: Language Information Sciences Research Centre, City University of Hong Kong.
Leung, H. H. L. (2016). The semantics of utterance particles in informal Hong Kong Cantonese (Natural semantic metalanguage approach). Doctoral thesis, Griffith University, Australia.
Li, B. (2006). Chinese final particles and the syntax of the periphery. Unpublished doctoral dissertation, Leiden University, Leiden.
Liberman, M. (1979). The intonational system of English. New York: Garland.
Lim, L. (2007). Mergers and acquisitions: On the ages and origins of Singapore English particles. World Englishes, 27(4), 446–473.
Luke, K. K. (1990). Utterance particles in Cantonese conversation. Amsterdam: John Benjamins.
Luke, K. K., & Nancarrow, O. T. (1997). Sentence particles in Cantonese: A corpus-based study. Presented at The Yuen Ren society meeting, University of Washington.
Matthews, S., & Yip, V. (2011). Cantonese: A comprehensive grammar (2nd ed.). London: Routledge.
Meyer, B. F., & Wempe, T. F. (1947). The student’s Cantonese-English dictionary (3rd ed.). New York: Field Afar Press.
Pierrehumbert, J., & Steele, S. A. (1989). Categories of tonal alignment in English. Phonetica, 46(4), 181–196.
Roach, P. (2009, July). Advantages and disadvantages of the ToBI system: A lecture by Peter Roach. Retrieved from http://www.youtube.com/watch?v=AL-uMriM4ns
Sag, I., & Liberman, M. (1974). Prosodic form and discourse function. Papers from the 10th regional meeting (pp. 416–427). Chicago Linguistic Society.
Stockwell, R. P. (1972). The role of intonation: Reconsiderations and other considerations. In D. Bolinger (Ed.), Intonation: Selected readings (pp. 87–109). Middlesex: Penguin Books.
Sybesma, R., & Li, B. (2007). The dissection and structural mapping of Cantonese sentence final particles. Lingua, 117(10), 1739–1783.
Tang, S.-W. (2008). 粵語框式虛詞「咪……囉」的句法特點 [Syntactic properties of the discontinuous construction mai … lo in Cantonese]. 《中國語言學集刊》 [Bulletin of Chinese Linguistics], 3(1), 72–79.
Wakefield, J. C. (2010). The English equivalents of Cantonese sentence-final particles: A contrastive analysis. Unpublished doctoral thesis, The Hong Kong Polytechnic University, Hong Kong.
Wakefield, J. C. (2012a). A floating tone discourse morpheme: The English equivalent of Cantonese lo1. Lingua, 122(14), 1739–1762.
Wakefield, J. C. (2012b). It’s not so bad: An English tone for “downplaying.” Presented at the third international symposium on tonal aspects of languages (TAL-2012), Nanjing.
Wakefield, J. C. (2014). The forms and meanings of English rising declaratives: Insights from Cantonese. Journal of Chinese Linguistics, 42(1), 109–149.
Wakefield, J. C. (in press). It’s not as bad as you think: An English tone for downplaying. In W. Gu (Ed.), Studies on tonal aspects of languages. Hong Kong: Journal of Chinese Linguistics Monograph.
Wong, J. O. (2004). The particles of Singapore English: A semantic and cultural interpretation. Journal of Pragmatics, 36(4), 739–793.
Yip, V., & Matthews, S. (2000). Basic Cantonese: A grammar and workbook. London: Routledge.
Yip, V., & Matthews, S. (2001). Intermediate Cantonese: A grammar and workbook. London: Routledge.
Yiu, C. Y. (2001). Cantonese final particles “LEI”, “ZYU” and “LAA”: An aspectual study. Unpublished master of philosophy thesis, The Hong Kong University of Science and Technology.
Zhang, L. (1999). Gang shi Guangzhou hua ci dian. Xianggang: Wan li ji gou, wan li shu dian.

Chapter 7

The Syntax of Intonation

Previous chapters provided arguments and evidence in support of the hypothesis that intonation comprises morphemes. If this hypothesis is correct, then intonation is in the syntax, an idea that goes at least as far back as Hirst (1977), whose seminal study tried to bring English intonation into the framework of generative syntax. He said that most phoneticians have only worked toward what Chomsky referred to as observational adequacy, being “concerned merely to give an account of the primary data that is the input to the acquisition device” (Chomsky 1964: 29). Hirst (1977) explained that even with this comparatively easier goal of observational adequacy, there has not been a great deal of success regarding intonation because, unlike the situation with segmental phonemes, there has not been a lot of agreement about the forms and functions of intonational features. Addressing this issue, Chaps. 2 and 3 outlined a proposal for how the forms and functions of intonation can be analyzed in a way that conforms to the hypothesis that it is morphemic. Chapters 4 and 6 then reviewed the large and growing amount of evidence in support of this hypothesis. This chapter now tentatively proposes how intonation might be represented in the syntactic structure of the sentence. My proposal is straightforward. I postulate that intonational morphemes are located in the same syntactic slots as their segmental counterparts. Based on this idea, this chapter discusses the syntax of some of the grammatical tones discussed in Sect. 4.1, as well as the syntax of the discourse tones discussed in Sect. 4.2 and in Chap. 6. How and where they fit into the syntactic structure is postulated based on what has been said in the literature about their segmental counterparts.

7.1  Background Information

Some readers may not be familiar with the terminology and notations used in current models of generative syntax. With this in mind, I have written a basic introduction as an appendix at the back of the book for the purpose of enabling readers not familiar with current models of generative syntax to follow the arguments presented below. Readers who feel they need such an introduction may want to read the appendix now before proceeding with the rest of the chapter.

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_7

7.1.1  Intonation and Syntax

This subsection briefly explains some of the issues related to how and why intonation has been considered problematic for syntax. Hirst (1983) and Selkirk (1984) pointed out that some linguists had presented intonation as a counterexample to the Extended Standard Theory (EST) of generative grammar. This is because EST theorized that the phonological and semantic components of language each interact directly with the syntax, but that they do not interact directly with each other. This results in what is referred to as the Y-Model of grammar (Fig. 7.1). Restating this using more recent Minimalist terminology, we can say that Phonological Form (PF) and Logical Form (LF) (i.e., the semantic component) interact directly with the computational system (i.e., the syntactic component), but not directly with each other. In a review of the guiding ideas of the Minimalist Program, Chomsky (1995: 220) said, “[w]e thus adopt the (nonobvious) hypothesis that there are no PF-LF interactions relevant to convergence—which is not to deny, of course, that a full theory of performance involves operations that apply to the (π, λ) pair.” By “convergence,” he meant “fully interpretable,” and the two elements within “the (π, λ) pair” refer to the PF and LF representations of a sentence, respectively. In other words, while some operations might apply equally to both the PF and LF levels, the two levels still do not interact directly with each other. These levels are considered to be the interfaces that link the computational system to the two performance systems: the PF level links it to the articulatory-perceptual performance system, which is related to the interpretation and production of sound (or the signs of sign languages), and the LF level links it to the conceptual-intentional performance system, which is related to meaning.
Fig. 7.1  Y-Model of grammar

Hirst (1983) explained that if intonation, a phonological element of language, gets a direct semantic interpretation without any involvement of the computational system, then this would indeed be a counterexample to EST, as well as the more recent versions of Chomsky’s theory. However, if all intonational forms are in fact morphemes, as argued and evidenced in this book, then this issue is resolved. Intonational morphemes, like their segmental counterparts, are in the lexicon and enter the syntax in the same way all morphemes do: they are included among the set of lexical items selected for generating a sentence. This set of selected lexical items is called the numeration. Analyzing intonation as comprising morphemes simultaneously solves another theoretical issue, which is that if intonation did not comprise morphemes, it would also be a problem for Chomsky’s (1995: 228) inclusiveness condition. This condition states that any derived structure that converges should be “constituted of elements already present in the lexical items selected for [the numeration]; no new objects are added in the course of computation apart from rearrangements of lexical properties.” This means that if a derived structure includes semantic elements whose phonological forms are intonational, and if those elements did not originate from the numeration, then intonation would be a counterexample to the inclusiveness condition. Here again, if intonation is morphemic, this issue is resolved.

7.1.2  Cartographic Syntax

Cartographic syntax, as the name implies, is an attempt to map out the syntactic structures of sentences. Cinque and Rizzi (2010: 63) said that “[t]he cartographic studies can be seen as an attempt to ‘syntacticize’ as much as possible the interpretive domains.” It is also an attempt to discover a universal structure that applies to all languages (cf. Wiltschko 2014). With these goals in mind, Rizzi’s (1997) split-CP hypothesis divided what was traditionally analyzed as a single complementizer phrase (CP) into a number of functional projections based largely on the ordering of constituents seen in the left periphery of Italian sentences. He placed the force phrase (ForceP) at the top of this structure, arguing that ForceP is headed by a clause-typing operator (or an overt morpheme in some languages). Rizzi also proposed a number of functional projections below ForceP. Related to this idea that interpretable features must be in the syntax, Aboh (2010, 2016) further proposed that the semantic features [+Focus] and [+Top] are in the numeration. He explained that Focus and Topic markers in Gungbe are segmental particles, while in Hungarian, they are tonal morphemes. These segmental or tonal particles head the Focus phrase (FocP) and a Topic phrase (TopP), respectively, and cause focused and topicalized phrasal constituents to raise to the specifier (SPEC) positions of those phrases. If these features were not in the numeration, Aboh argued, then the existence of the semantic notions Topic and Focus in any derived structure would violate the inclusiveness condition. In other words, everything that is semantically interpretable must be in the numeration and must occupy a syntactic slot. Rizzi (2001) further developed his original structure from Rizzi (1997), and this newer structure is adopted as a starting point for the arguments made here:

(1)  Force > (∗Top) > Inter > (∗Top) > Foc > (∗Top) > Fin > TP
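
The ordering in (1) can be read as a constraint on which sequences of functional heads are well formed. The following Python sketch is a hypothetical illustration only: it assumes that the fixed heads (Force, Inter, Foc, Fin, TP) are each optional but must appear in that order, and that Top is optional and recursive in any of the three positions marked (∗Top). The function name and the list encoding are assumptions made for the example, not part of Rizzi's proposal.

```python
from typing import List

# Fixed heads of the hierarchy in (1), from highest to lowest.
FIXED_ORDER = ["Force", "Inter", "Foc", "Fin", "TP"]


def respects_hierarchy(heads: List[str]) -> bool:
    """Check a top-down sequence of heads against the ordering in (1).

    Assumptions for this sketch: every fixed head may be omitted, and
    Top may recur any number of times, but only below a fixed head
    that is higher than Fin (i.e., below Force and above Fin).
    """
    rank = -1  # index in FIXED_ORDER of the lowest fixed head seen so far
    fin_rank = FIXED_ORDER.index("Fin")
    for head in heads:
        if head == "Top":
            if not (0 <= rank < fin_rank):
                return False
            continue
        if head not in FIXED_ORDER:
            return False
        position = FIXED_ORDER.index(head)
        if position <= rank:
            return False  # head repeated or out of order
        rank = position
    return True


print(respects_hierarchy(["Force", "Top", "Inter", "Top", "Foc", "Top", "Fin", "TP"]))
print(respects_hierarchy(["Force", "Top", "Top", "Foc", "Fin", "TP"]))
print(respects_hierarchy(["Foc", "Force", "Fin", "TP"]))
print(respects_hierarchy(["Force", "Fin", "Top", "TP"]))
```

Under these assumptions the first two sequences are accepted and the last two are rejected, the third because Foc precedes Force and the fourth because Top appears below Fin.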


A speech act layer is still missing from this structure, leaving nowhere for discourse particles to go unless they are analyzed as clause-typers, in which case they would be located at the head of ForceP. Some linguists have in fact analyzed Chinese sentence-final particles (SFPs) as clause-typers (see Sect. 7.3.2), but this analysis is problematic because many SFPs can attach to more than one type of clause. Haegeman (2009: 14) observed the same thing in relation to West Flemish discourse markers, motivating her to argue that they therefore “are not in Force but they select Force.” It is therefore assumed here that discourse particles are located inside two speech act layers that were first added above ForceP by Speas and Tenny (2003). They proposed an evidential phrase and an evaluative phrase based on data from a number of languages. Inspired by their idea, Haegeman and Hill (2013) and Haegeman (2014) gave a detailed proposal for two speech act phrases above ForceP for discourse particles in West Flemish and Romanian. Linguists have also proposed something similar for Cantonese SFPs (e.g., Sybesma and Li 2007; Tang 2015) and Mandarin SFPs (e.g., Paul 2014). All of this work in cartographic syntax has provided evidence for a detailed syntactic structure that enables me to propose how and where discourse tones fit into this universal structure by contrasting them with their segmental equivalents. This will be done in Sect. 7.3.2. First, I will address the comparatively easier task of proposing the syntactic properties of grammatical tones.

7.2  T  onal Morphemes That Function as Grammatical Particles The tonal morphemes that function as grammatical particles are the least controversial. There is debate about the phonological properties of individual tones, but there appears to be no debate about the fact that they are morphemes. Arguing that they occupy syntactic slots is therefore relatively uncontroversial. As far as I know, only one controversial idea about these tones has been proposed in this book, which is that grammatical tonal morphemes should be categorized as a subset of intonation; I am not aware of any linguists thus far having labeled the kinds of tonal morphemes I am talking about here as intonation. My reason for relabeling them as intonation is given in Sects. 2.2 and 3.3 and will not be repeated here. The goal here is to discuss how these tonal particles fit into the syntactic structure. Many tonal particles have segmental counterparts in their own languages or in closely related languages. When this is the case, it is reasonable to assume that such tonal particles occupy the same syntactic slots as their segmental counterparts. For example, Cheng and Kula (2006) said, Subject relatives [in Bemba] have the option of either being marked by a pre-prefix segmental relative marker … or by a tonal strategy that places a low tone on the subject marker. We propose to treat this low tone as a tonal morpheme that is functionally equivalent to its segmental counterpart. (Cheng and Kula 2006: 33)


An example of Bemba’s relative marker tone is shown here in (2b), which contrasts with its segmental counterpart in (2a):

(2)
a. Ba-kafúndisha á-bá-léé-lolesha pansé ni ba-Mutale
   2PFX-teacher 2REL-2SM-TNS-look 16outside COP 2PFX-Mutale
   “The teacher who is looking outside is Mr Mutale”
b. Ba-kafúndishá bà-léé-lolesha pansé ni ba-Mutale
   2PFX-teacher 2REL.2SM-TNS-look 16outside COP 2PFX-Mutale
   “The teacher who is looking outside is Mr Mutale”
(Cheng and Kula 2006: 34; note: the numbers refer to agreement classes)

In (2a), the relativization of the subject is marked by the preprefix relative marker á-. This contrasts with (2b), where relativization is marked by a Low tone morpheme that appears on the subject marker bá-, changing its tone to bà-.¹ It is reasonable to assume that this tonal relative marker is located in the same syntactic slot as its segmental counterpart á-; both forms are functionally equivalent and appear phonologically in essentially the same linear position.

Another example is the tonal degree operator that was discussed in Sect. 4.1, which appears in the complementizer phrases of certain Northern Norwegian degree questions. This degree operator has a segmental counterpart in Icelandic, a closely related language (Svenonius and Kennedy 2006). The following two sentences (repeated for convenience from Sect. 4.1) show the contrast between Icelandic (3a), which uses the segmental degree operator hvað (“what”), and Northern Norwegian (3b), which uses a tonal morpheme equivalent:

(3)
a. Hvað ertu gammall?
   what are.you old
   “How old are you?”
b. Er du gammel?
   are you old
   “Are you old?” / “How old are you?”
(Svenonius and Kennedy 2006: 134)

1  Some readers may notice from looking at (2b) that the final vowel of the head noun changes to a high tone when the relative clause is marked with a low tone. Cheng and Kula (2006) argued that this high tone on the noun’s final vowel is not what marks the relative clause. See Cheng and Kula (2006: 34 and 40ff) for details.

Svenonius and Kennedy said that the sentence in (3b) is interpreted in one of two ways depending on where the prosodic peak falls: if it is associated with the predicative adjective gammel, the sentence is interpreted as a yes/no question; if it is associated with a word further to the left, the sentence is interpreted as a degree question. This sentence is therefore ambiguous on paper because the (non)existence of the tonal degree operator is not indicated in writing. This is a problem with tonal morphemes in general, especially the discourse morphemes discussed in the next section, because it is not always clear, even for linguists, how and where to add them into a written sentence.

It makes sense that the prosodic peak in (3b) moves leftward toward the front of the sentence when it is a degree question. The front of the sentence is where the degree operator would appear if it were segmental. Svenonius and Kennedy referred to this as a null operator and assumed that it occupies the same syntactic slot that it would if it were segmental, arguing that the degree operators in both of these languages originate inside the AdjP and move into the complementizer phrase, the only difference being that one is null and one is segmental. Rather than referring to it as being “null,” perhaps it is more accurate to analyze the Northern Norwegian operator as a tonal morpheme that interacts with the prosodic peak, shifting the peak leftward. Following their analysis of the syntax, this tonal morpheme is assumed to have raised up from the AdjP, the same as Icelandic’s hvað.

Another example of a grammatical tonal morpheme is a sentence-final tone used in some languages to form unbiased interrogatives, that is, unbiased (or neutral) in the sense that they express no additional connotative meaning and can be used to begin a conversation. Two languages that use such a tone are Gungbe and French (Aboh and Pfau 2010). The two sentences in (4a) and (4b), repeated from Sect. 4.1, illustrate an example of the sentence-final Low tone that forms interrogatives in Gungbe. Note the contrast between the tones on the verb “come” (wá ➔ wȃ):

(4)

a. Sέtò kò wá.
   Seto already come
   “Seto arrived already.”
b. Sέtò kò wȃ?
   Seto already come.INTER
   “Has Seto arrived yet?”
(Aboh and Pfau 2010: 92)

This sentence-final Low tone in Gungbe types the clause as a polar interrogative. Aboh and Pfau noted that French similarly has a polar interrogative tonal particle:

(5)
a. Pierre est parti?
   Peter is.3s leave [rising interrogative tone]
   “Did Peter leave?”
(Aboh and Pfau 2010: 117)

This neutrally-intoned (i.e., unbiased) interrogative tone is unlike the rising tones that form biased questions in English. Aboh and Pfau (2010: 118) concluded that this interrogative tone in French is “realized by a null morpheme that triggers rising intonation,” but I think it could perhaps be better described as a tonal morpheme whose form is a rising tone. Sentence-final interrogative tones like those found in Gungbe and French are the tonal counterparts of segmental sentence-final polar interrogative particles seen in many languages, for example, Fongbe à, Mandarin ma, Japanese ka, Polish czy, Turkish mi, Bengali ki, Korean nunya, and so on. I propose that tonal and segmental interrogative particles occupy the same syntactic slot, the details of which are laid out in Sect. 7.3.2.

Many of the examples of grammatical tonal morphemes that were discussed in Sect. 4.1 are similarly hypothesized to occupy syntactic positions that can be determined based on direct analogy to segmental counterparts, whether in the same languages’ present or past, or in other languages. Proposing the syntax of many grammatical tones is fairly straightforward, but there are complicated cases such as, for example, the second versus third person distinction in the Bantu language Chimwiini. This distinction is made with a High tone on either the final or penultimate position of the noun object, even when the object is embedded between two verb-modifying phrases. It is beyond the scope of this chapter to attempt to resolve difficult cases like this, but it is nevertheless assumed that this tone in Chimwiini occupies a specific syntactic slot, even though in this particular case, its position is difficult to determine. This example shows that there are limits to how far we can reliably use the method of determining a tone’s syntactic properties based on a comparison with a segmental counterpart.

Another complicating factor is that the same grammatical feature is not necessarily marked by the same category in all languages. Definiteness, for example, is marked by the use of a bare classifier preceding the noun in Cantonese, by a definite article in English, by a suffix in Swedish, and in yet other ways in other languages. The idea, though, is that most tonal morphemes are likely to have at least one segmental counterpart in some language.
For any tonal morpheme that does not have a clear and obvious segmental counterpart, there is no reason to conclude that it therefore does not occupy any syntactic position, but rather that its syntactic properties are just more difficult to determine.

7.3  Tonal Morphemes That Function as Discourse Particles

Unlike the grammatical tonal particles just discussed, discourse tonal particles are included within what linguists have traditionally labeled intonation. Many linguists do not consider such discourse tones to be morphemic. Therefore, arguing that they are included in the syntactic derivation of sentences is somewhat controversial. The approach taken here is the same as for grammatical particles; the syntactic positions of discourse tonal particles are hypothesized to be the same as their segmental counterparts. There is a complicating factor, however, because most of these tones do not have obvious segmental counterparts. But some of them do, as was demonstrated in Chaps. 4 and 6. Based on that evidence, it is assumed by logical extension that all discourse tones are morphemes with meanings that could theoretically be expressed by a segmental particle.

In order to make the following arguments clear, I must first explain how and why I distinguish the term interrogative from the term question, and will then propose a universal structure for polar interrogatives. After that I will propose a structure for main clauses that includes discourse particles and, by logical extension, discourse tones. Readers should be reminded that I consider polar interrogative particles to be grammatical particles rather than discourse particles. They were discussed in the preceding section where they belong, but their syntax is discussed here in this section for the purpose of clearly distinguishing (grammatical) interrogative particles from (discourse) question particles.

7.3.1  The Syntax of Polar Interrogative Particles

Cartographic syntacticians have traditionally assumed that clause-typing particles, including polar interrogative particles and other types of question particles, head ForceP. For example, some linguists placed the Cantonese polar interrogative particle maa3 in ForceP along with the question-forming particles me1 and aa4, arguing that they all have the feature [+Q] and therefore type the clause as an interrogative (e.g., A. Law 2002, 2004; Sybesma and Li 2007).

There are three things about this analysis that I propose differently here. First, I adopt Rizzi’s (2001) structure in (1) and therefore assume that interrogative particles head InterP instead of ForceP, while clause-typing particles that mark other clause types are still assumed to be located in ForceP. Second, I include particles that are realized phonologically as tones.² And third, I make a distinction between interrogative particles and question-forming discourse particles, which contrasts with many linguists who have used (and still use) the terms interrogative and question interchangeably.

I assume the only Cantonese SFP that functions as a polar interrogative particle is maa3. The particles me1 and aa4, along with others, including the widely documented Cantonese rising tonal particle, are distinguished as question particles. Interrogative particles (e.g., Cantonese maa3, Fongbe à, Japanese ka, French and Gungbe interrogative tones, etc.) change a declarative into an interrogative. They are strictly grammatical particles with no semantic content. This contrasts with question-forming particles, such as sentence-final me1 and aa4 in Cantonese, or huh in English, as well as the rising question tones that are used to form the rising declaratives found in many languages. I argue that unlike interrogative particles, these question particles do not have the clause-typing feature [+Interrogative].
Note that I use the feature [+Interrogative] rather than [+Q] in order to distinguish between a “question,” which is a speech act, and the clause type “interrogative,” which is a grammatical feature. Interrogative particles, whether segmental or tonal, are [+Interrogative] and assumed to head InterP. Question particles, in contrast, head a speech act phrase above ForceP.
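
The distinction just drawn can be restated schematically. The following Python sketch is offered only as an illustration under stated assumptions, not as part of the analysis: the names Particle and head_of, and the label "SpeechActP" (shorthand for "a speech act phrase above ForceP"), are hypothetical conveniences, and the classifications are simply those given in the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    form: str            # segmental form, or a description of the tone
    language: str
    tonal: bool          # True if realized phonologically as a tone
    interrogative: bool  # True if it carries the clause-typing feature [+Interrogative]


def head_of(p: Particle) -> str:
    """Return the projection the particle is assumed to head.

    "SpeechActP" is a hypothetical label for a speech act phrase above ForceP.
    """
    return "InterP" if p.interrogative else "SpeechActP"


# Classifications taken from the discussion above.
PARTICLES = [
    Particle("maa3", "Cantonese", tonal=False, interrogative=True),
    Particle("me1", "Cantonese", tonal=False, interrogative=False),
    Particle("aa4", "Cantonese", tonal=False, interrogative=False),
    Particle("sentence-final Low tone", "Gungbe", tonal=True, interrogative=True),
    Particle("rising question tone", "English", tonal=True, interrogative=False),
]

for p in PARTICLES:
    print(f"{p.language} {p.form}: heads {head_of(p)}")
```

The point of the sketch is that the tonal/segmental property and the [+Interrogative] property are independent of each other: only the latter determines the assumed syntactic position.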

2  This is not entirely new. Sybesma and Li (2007) included tones in their structure, but these were the tones associated with SFPs, which are arguably lexical tones. They did not include any tonal discourse particles. More along the lines of what is being proposed here, Tang (2006) argued that the rising question tone in Cantonese is an SFP that occupies the same syntactic slot as the question particle me1.


Kwok (1984) said that the Cantonese question particles me1 and aa4 form questions that are different from semantically neutral questions such as those formed in Cantonese using maa3 and in English using subject-auxiliary inversion (SAI). She said that me1- and aa4-suffixed questions are not semantically neutral but are instead like intonation questions that imply that the speaker holds a particular belief about the proposition. Bailey (2010) similarly said that English can form questions either with SAI or with intonation but that the pragmatics of each are different. Related to this, Gunlogson (2003) referred to English intonation questions as rising declaratives, saying they are questions with presuppositions. Using numerous examples, she demonstrated that rising declaratives do not share the same discourse-context distributions with interrogatives that are formed by SAI, and she proposed the following generalization to explain the linguistic facts:

Declaratives [rising or falling] express a bias that is absent with the use of interrogatives; they cannot be used as neutral questions. (Gunlogson 2003: 99)

Gunlogson (2003: 54) said that rising declaratives “cannot readily be used as questions ‘out of the blue,’ with no particular context, as interrogatives can be.” She illustrated this with the example shown in (6), where the question mark at the end of the clause in (6b) represents a rising tone on a declarative clause.

(6)
[context: initiating a phone conversation]
a. Is Laura there?
b. ??Laura’s there?
c. ??Laura’s there.
(Gunlogson 2003: 55)

For the context in (6), the rising declarative question in (6b) pairs with the neutrally-intoned declarative in (6c) because they both “express a bias,” i.e., they include a presupposition which prevents them from being used to initiate a phone conversation. This indicates that (6b) is a declarative rather than an interrogative. The following examples from Hirst (1983) provide further evidence that a rising tone does not change a declarative clause into an interrogative.

(7)
a. Did he buy something?
b. Did he buy anything?

(8)
a. He bought something.
b. *He bought anything.

(9)
a. He bought something?
b. *He bought anything?
(Hirst 1983: 176)

The question marks in (9a) and (9b) represent rising intonation. Hirst explained that the two ungrammatical sentences (8b) and (9b) pair together as unacceptable sentences because the rising intonation of (9a) and (9b) does not change those declarative sentences into interrogatives. The evidence that comes from unacceptable sentences like (6b) and (9b) counters Allan’s (2006: 7) claim that the sentences in a pair such as (10a) and (10b) “are uncontroversially formally distinct clause-types.”

(10)
a. John’s gone to New York.
b. John’s gone to New York?

Allan supported his claim by citing two examples where intonation is used to mark clause type:

In Navajo, the imperative has the same morphosyntax as the declarative, but it is prosodically distinct. In Korean declarative, interrogative, imperative and propositive (let’s constructions) are formally identical in the so-called intimate and polite speech levels, but are distinguished by prosody. It follows that prosody is a defining characteristic in the definition of clause-type. (Allan 2006: 7)

At least two things about Allan’s statements are noteworthy. First, it is not accurate to say that sentences like (10a) and (10b) are “uncontroversially” distinct clause types because some linguists have convincingly argued otherwise based on evidence. Second, while I agree that intonation can be used to mark clause types, as Allan argued with examples from Navajo and Korean, it does not follow from this that the rising tone in (10b) must necessarily have this function rather than a speech act function. The Gungbe and French examples in (4b) and (5), respectively, show that intonation does in some cases function to mark clause types. However, in other cases intonation is used to add semantic content to a sentence without changing clause type. Following the arguments put forth by Hirst (1983) and Gunlogson (2003), I assume (10a) and (10b) are both declarative clauses and that the intonation of (10b) is a discourse tone with semantic content rather than a clause-typer. Put simply, the rising tone illustrated in (10b) changes a statement into a question but does not change a declarative into an interrogative. Siemund (2001: 1012) made a claim similar to that of Allan (2006), saying that “most languages and maybe all seem able to mark polar interrogatives solely by intonation”; this statement indicates that Siemund probably would have agreed with Allan that (10b) is an interrogative clause. Contrary to that view, I assume that most and maybe all languages can form questions through the use of intonation, but only some languages can form interrogatives with intonation. Section 6.2 showed the similarity in discourse distributions between English rising tones and the Cantonese question particles me1 and aa4. For example, neither me1 nor aa4 could attach to a sentence like (6b) in a Cantonese equivalent of that phone conversation dialogue. 
Based on that, plus consistent translations from native bilinguals that rendered me1-suffixed and aa4-suffixed sentences as rising declaratives in English, it was argued that me1 and aa4 are both equivalent (or very similar) in meaning to English’s high-rising and mid-rising question tones, respectively, which attach to declarative clauses. It is therefore hypothesized that the syntactic position of the particles associated with English rising question tones is the same as that of me1 and aa4, the details of which will be spelled out below.


In some contexts, it looks as though interrogatives and rising declaratives pair together, as in the following example:

(11)
[context: a reply to someone saying “Maria’s husband was at the party”]
a. Is Maria married?
b. Maria’s married?
c. ??Maria’s married.
(Gunlogson 2003: 73)

The pairing of (11a) with (11b) is actually not problematic for the arguments being made here because they only pair together when the interrogative in (11a) is uttered with a tone that expresses a bias; if (11a) is uttered with neutral intonation void of any presupposition, then it pairs with the neutrally-intoned declarative in (11c). This is because the preceding utterance in the discourse entails the proposition “Maria is married.” Therefore, the only type of question acceptable in this context is one that includes a bias such as “I didn’t know she was married” or “I thought she wasn’t married.” This bias can be expressed with a discourse tone on the interrogative sentence in (11a), and this tone is proposed to head a speech act-level phrase above ForceP. In other words, (11a) is a rising interrogative with a rising tone that differs from the rise of a neutrally-intoned polar interrogative. (11a) and (11b) both include the bias necessary for asking this question in this context, while (11c), if it is neutrally intoned, has no such bias and is therefore unacceptable.³

3  It must be noted that nonquestioning biases would also be acceptable here. For example, it would be possible to utter a falling declarative that included an intonational meaning expressing disappointment, perhaps something that implied “(Oh damn,) Maria’s married (there go my dreams of marrying her).” (I thank Daniel Hirst for pointing this out.) But this again is different from an unbiased declarative, which would not be acceptable here.

I will now propose a syntactic structure that is hypothesized to apply to all neutral polar interrogatives. Included along with this will be a proposal as to how all the phonological forms of polar interrogatives found in the world’s languages fit onto this structure. Siemund (2001: 1012), in his survey of interrogative constructions, said that “[t]he strategies for marking polar interrogatives in the languages of the world vary within clearly fixed bounds.” Although the bounds are fixed and limited, on the surface there still appears to be a significant number of ways to form polar interrogatives. Siemund (2001) and Dryer (2013) both provided a list of polar interrogative types found in the world’s languages, and their lists are combined in (12). Different terminologies from the two authors that appear under the same letter are considered to refer to the same thing:

(12)
a. intonation patterns (Siemund 2001); interrogative intonation (Dryer 2013)
b. interrogative particles (Siemund 2001); question particles (Dryer 2013)
c. tags (Siemund 2001)
d. a change in the order of constituents (Siemund 2001); interrogative word order (Dryer 2013)
e. disjunctive structures (Siemund 2001)
f. verbal inflection (Siemund 2001); interrogative verb morphology (Dryer 2013)
g. the absence of a declarative morpheme (Dryer 2013)
h. no formal marking of polar questions (Dryer 2013)

I will first trim this list down to what is shown below in (13), and then I will argue that the differences among all of these forms of interrogatives relate to only two things: the phonological form of the interrogative particle and the type of movement involved. As pointed out above, I am referring exclusively to neutral polar interrogatives that contain no bias and no focused element, i.e., they unbiasedly question the entire proposition.

With this in mind, we can remove from (12a) any forms of intonation that can be analyzed as question tones, such as the English examples illustrated in (6b), (9a), (10b), and (11b). Interrogative intonation is restricted here to refer only to tones such as the Gungbe and French examples in (4b) and (5). Siemund (2001) did not appear to make this distinction, but Dryer (2013) said he restricted his list to “unbiased questions” as opposed to “leading questions,” so to the extent that he did so, the examples of interrogative intonation that he lists should be what I am referring to as interrogative tonal particles, and of the 955 languages he looked at, 173 were listed as having interrogative intonation.⁴

4  This number should not be accepted without some reservation because Dryer’s (2013) data were primarily from secondary sources, which presumably varied in their methods and in their detail of description.

Because this list is restricted to interrogatives, the term “question particles” can be removed from (12b). This makes the categories of (12a) and (12b) combinable into the single category “interrogative particles” as shown in (13a) below. This is based on the idea that tonal morphemes and segmental particles are two forms of the same thing and can therefore be listed together as particles, either tonal or segmental. The tags of (12c) are excluded entirely from the list; not only do they express a bias, but they are also assumed (not uncontroversially) to involve a separate clause. The categories (12d–g) remain in the list as (13b–e), but the term “disjunctive structures” is relabeled “A-not-A constructions” because that is what Siemund (2001) was referring to. Finally, (12h) is removed because I presume it to be an inaccurate analysis of the one language in Dryer’s (2013) list that was said to have no marking for interrogatives. Macaulay (1996, cited in Dryer 2013) was the source of this information about Chalcatongo Mixtec, a language from Mexico, and that description may have missed whatever it is that marks interrogatives in that language, whether it be tonal, or as I speculatively consider in Chap. 8, gestural. The now reduced list is as follows:

(13)
a. interrogative particles (tonal or segmental)
b. interrogative word order
c. A-not-A constructions
d. verb morphology
e. absence of a declarative morpheme
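
The trimming of (12) down to (13) amounts to three operations: merging, dropping, and relabeling. The short Python sketch below restates those steps purely as an illustration; the variable and function names (TYPES_12, MERGE, DROP, RELABEL, reduce_list) are hypothetical, and the labels in TYPES_12 are abbreviated versions of the entries in (12).

```python
# Abbreviated labels for the eight categories in (12).
TYPES_12 = {
    "a": "interrogative intonation",
    "b": "interrogative/question particles",
    "c": "tags",
    "d": "interrogative word order",
    "e": "disjunctive structures",
    "f": "interrogative verb morphology",
    "g": "absence of a declarative morpheme",
    "h": "no formal marking of polar questions",
}

# (12a) and (12b) are combined into one category.
MERGE = {
    "a": "interrogative particles (tonal or segmental)",
    "b": "interrogative particles (tonal or segmental)",
}
# Tags (12c) and "no formal marking" (12h) are excluded.
DROP = {"c", "h"}
# Relabeled (or shortened) categories.
RELABEL = {"e": "A-not-A constructions", "f": "verb morphology"}


def reduce_list(types):
    """Apply the merge, drop, and relabel steps in alphabetical order of (12)."""
    reduced = []
    for key in sorted(types):
        if key in DROP:
            continue
        label = MERGE.get(key) or RELABEL.get(key) or types[key]
        if label not in reduced:  # merged categories are listed only once
            reduced.append(label)
    return reduced


for letter, label in zip("abcde", reduce_list(TYPES_12)):
    print(f"{letter}. {label}")
```

Run as written, the loop prints the five categories in the same order as (13).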

Based on this list of neutral polar interrogative types, I will now explain how the syntax of each type can be described in terms of the universal structure proposed by Rizzi (2001) shown in (1) above. The relevant portions of the structure are shown here in (14). For ease of presentation, I have left out the phrases above InterP, as well as those between InterP and TP.

(14)

It is hypothesized that every type of interrogative involves a particle (PRT) with a [+Interrogative] feature that is base generated at the head of InterP. The only differences among the various types of interrogatives are the phonological form of the [+Interrogative] particle and the type of movement involved. Each possibility will be illustrated and discussed in relation to the interrogative types listed in (13).

The first possibility is no movement. In this case, the interrogative particle will almost always be segmental.⁵ This case represents those languages that have a sentence-initial interrogative particle. An example of this is the following sentence from Tzotzil, a Mayan language of Mexico. The interrogative particle la is base generated at the head of InterP, and since no movement takes place, it remains in sentence-initial position.

(15)
La k’ol Aa Teeko chjaay?
PRT be youth Diego at.home
“Is Diego at home?”
(Aissen 1987, cited in Bailey 2013: 33)

The structure for this looks no different from (14), but a note is added to indicate that the particle is segmental, as shown in (16).

5  One exception to this is (13e), in which a phonologically null interrogative particle’s presence is made known by the absence of a segmental declarative particle.


(16)

There are two reasons for assuming that the particle must be segmental. First, if it had no phonological form, its presence would not be known without any movement of constituents.⁶ Second, it appears that a tonal particle cannot be used without movement; I am not aware of any language that uses an interrogative tonal particle in the sentence-initial position. In theory, since segmental particles can appear either sentence-initially or sentence-finally, tonal particles should also be able to appear in either location. The reason they do not appear sentence-initially could be related to phonological rather than syntactic factors. Zhang (2001: 23) said that “prosodic-final position … has extra duration due to final lengthening … [and] thus it should be a preferable position for contour tones.” Segmental particles, in contrast, are not affected by this phonetic factor.

6  Carnie (2013: 217), for example, suggested that English has a null interrogative particle in sentence-initial position, and the auxiliary moves to its position in order to make the presence of the particle known.

It should be noted that languages forming interrogatives according to the structure shown in (16) do not necessarily end up with the interrogative particle in sentence-initial position. In a language like Russian, for example, the interrogative particle will be preceded by the element being questioned, which is topicalized by raising to a TopP that lies above InterP. This is true even for neutral questions, such as (17a), which topicalize the verb.

(17)
a. Čital li ty ètu knigu?
   read PRT you this book
   (Topicalizing the verb čital (“read”) gives a neutral reading “Have you read this book?”)
b. Ty li čital ètu knigu?
   you PRT read this book
   (Topicalizing the pronoun ty (“you”) gives the reading “Have you read this book?”)
c. Ètu li knigu ty čital?
   this PRT book you read
   (Topicalizing the demonstrative ètu (“this”) gives the reading “Have you read this book?”)
(Comrie 1984, cited in Siemund 2001: 1014)

The next possibility is either a segmental or tonal particle that causes TP to raise to the SPEC position of InterP, resulting in a sentence-final position for the particle.


(18)

An example of this is the following sentence from the Nigerian language Mupun. The interrogative particle -e is base generated at the head of InterP and the TP Wu naam un (“He saw them”) raises to the SPEC of InterP, placing -e in the sentence-final position.

(19)
Wu naam un-e?
3-male see 3pl-PRT
“Did he see them?”
(Frajzyngier 1993, cited in Bailey 2013: 33)
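
The two possibilities discussed so far differ only in whether TP raises past the particle. The following Python sketch is a deliberately simplified illustration of that contrast, using the sentences in (15) and (19). The function name linearize and its parameters are hypothetical, the treatment of a hyphen-initial particle as a clitic is an assumption made for display purposes only, and topicalization to TopP (as in the Russian examples in (17)) is not modeled.

```python
from typing import List


def linearize(particle: str, tp: List[str], tp_raises: bool) -> str:
    """Return the surface word order for a particle base generated in the head of InterP.

    tp_raises=False: nothing moves, so the particle stays sentence-initial.
    tp_raises=True:  TP raises to SPEC of InterP, leaving the particle sentence-final.
    """
    if not tp_raises:
        return " ".join([particle] + tp)
    if particle.startswith("-"):
        # Assumption: a hyphen-initial particle is written attached to the preceding word.
        return " ".join(tp) + particle
    return " ".join(tp + [particle])


# (15): no movement, sentence-initial particle
print(linearize("La", ["k'ol", "Aa", "Teeko", "chjaay"], tp_raises=False))

# (19): TP raises, sentence-final particle
print(linearize("-e", ["Wu", "naam", "un"], tp_raises=True))
```

The sketch treats the particle's base position as constant and derives the two surface orders from movement alone, which is the claim being made in the text.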

Dryer (2013) said that some sentence-final interrogative particles are clitics, and this may be the case for the particle -e in (19). An example of a particle that is clearly phonologically independent is Mandarin’s ma. The sentence-final particle in the Gungbe example in (4b) is similar to (19), except that the particle is a tone: the TP Sέtò kò wá (“Seto arrived already”) has raised to SPEC of InterP, and the [+Interrogative] Low tone that was base generated in Inter⁰ interacts with the high tone of wá, changing it to wȃ. French also has a sentence-final interrogative tone as exemplified in (5), and Dryer (2013) lists 173 languages that he claims use a tone to form what he called “unbiased questions.”

An interesting possible example of an interrogative tonal particle is seen in the language Burunge, spoken in Tanzania. For declarative clauses, the final vowel of the sentence is whispered, while in interrogatives, it is voiced. Dryer (2013) speculated that this may be an example of marking an interrogative by the absence of marking it as a declarative, but another possibility is that voicing is evidence of a sentence-final tone because the phonological production of tone requires voicing.⁷

7  I thank Lian-Hee Wee (p.c., 2019) for suggesting this possible analysis.

We have now accounted for the types of interrogatives in (13a). Those types listed in (13b–c), that is, interrogative word order and A-not-A constructions, are proposed to have the following structure:

(20)

This is head-to-head movement rather than movement of a phrase into a SPEC position. The head (shown as X in (20)) could be an auxiliary, a verb, or a null functional head with a [+Interrogative] feature. What Siemund (2001) and Dryer (2013) were referring to by “interrogative word order” was primarily SAI, but in some cases, it is the verb rather than the auxiliary that moves. The French sentence in (21) demonstrates an example of verb raising, while its English translation illustrates raising of the auxiliary:

(21)
Achetait-elle le journal?
bought-she the paper
“Did she buy the paper?”
(Haegeman 2006: 177)

In both the French and English versions of (21), the contents of T⁰ have moved up to Inter⁰. In the French sentence, the verb achetait first raises from V⁰ to T⁰ for purposes of tense, and then moves from there to Inter⁰. In English, however, verbs do not raise to T⁰. Instead the auxiliary did is inserted in T⁰, so that it can raise from there to Inter⁰ to form an interrogative clause (Haegeman 2006: 171–178). It is assumed that SAI (or verb raising) involves a null interrogative particle base generated at the head of InterP, which then attracts the contents of T⁰ to its position. Movement from T⁰ to Inter⁰ is required in order to make the presence of the null particle known (cf. Carnie’s (2013: 217) proposal for SAI, the only difference being that he suggested the contents of T⁰ move into the head of CP rather than the head of InterP).

An example of a null functional head raising to Inter⁰ is what Siemund (2001) called “disjunctive structures,” which refers to A-not-A interrogatives, where A is a verb or an adjective. In addition to having a sentence-final interrogative particle, Chinese uses A-not-A structures to form polar interrogatives. This is demonstrated in the following Cantonese example:

(22)
Lei5 heoi3-m4-heoi3 (aa3)?
2s go-NEG-go SFP
“Are you going?”


The V-not-V structure heoi3-m4-heoi3 is how this clause is phonologically realized as a polar interrogative.⁸ What actually types the clause, however, is a null particle in Inter⁰. According to Huang et al. (2009: 254–5), A-not-A interrogatives involve an interrogative functional head made up of the feature [+A-not-A], which is located in the same position where negation would appear in a negative sentence and “the A-not-A constituent moves to an appropriate position in CP at LF.” P. Law (2006: 99) offered a similar proposal that involves “an abstract feature [+Q]” that is base generated adjoined to VP and raises to SPEC of CP in LF. Slightly modifying Huang et al.’s (2009) analysis, I hypothesize that there is a [+Interrogative] functional head associated with the A-not-A constituent; it heads a functional phrase adjoined to VP (i.e., where NegP would be located in a negative sentence) and raises from there to Inter⁰ in LF. This hypothesis is also represented by (20) because an A-not-A sentence’s structure and movement are similar to the T⁰ to Inter⁰ movement of SAI and verb raising, except that the element that raises to Inter⁰ originates in a different position, and it raises covertly in LF rather than in PF.

The next form of polar interrogatives is verb morphology (i.e., (13d)). There are at least three proposals for describing the syntax behind this. The first possibility is that the main verb raises to Inter⁰. In this case, the syntax is similar to (20) above, with the only difference being that the interrogative particle, instead of being null, is a bound morpheme that attaches to the verb after it raises, as shown in (23). The motivation for raising is the same as what has been proposed for verbs raising to T⁰ in French in order to attach to verbal suffixes; the suffix is phonologically dependent and therefore needs to attach to something in order to be pronounceable. Any constituents that may appear in front of the raised verb can be analyzed as topicalized elements that have raised to the SPEC position of one or more TopPs above InterP.

(23)

A second proposal for explaining verb morphology is that the verbal affix attaches to the verb in V0. In this case, the syntax would be represented by (20) and would be the same as for A-not-A constructions. A null interrogative particle that heads a functional phrase attached to VP raises to Inter0 in LF. In other words, the

8 The sentence-final particle aa3 merely softens the question and makes it less blunt; it is optional and is therefore not what is typing the clause.


7  The Syntax of Intonation

only difference between a verbal-affix interrogative and an A-not-A interrogative is that the presence of the functional head that raises to Inter0 is made known by an affix attached to the verb rather than by an A-not-A construction. The third proposal for verbal morphology is that the bound morpheme is a clitic rather than an affix. In this case, the constituent that raises would be a phrase with the sentence’s main verb in its final position, and it would raise to SPEC of InterP, as shown here: (24)

Dryer (2013) said it is not always easy to distinguish a suffix from a clitic, so it would require a case-by-case investigation to distinguish whether a given language’s verbal suffix in polar interrogatives is a case of (23), (24), or possibly even (18) for verb-final languages. The final form of polar interrogatives listed in (13) is the absence of a declarative particle. For this type of interrogative, a null interrogative particle is presumably base generated in Inter0 whenever a declarative particle is absent from Force0. It seems unusual that a declarative would be more marked than an interrogative, which is likely why only 4 of the 955 languages that Dryer (2013) looked at marked interrogatives in this way. This is illustrated in (25), where Force0 is shown as empty, making known the presence of a null particle in Inter0. (25)

Summarizing the syntax of polar interrogatives, it is hypothesized that languages universally use the structure shown in (14). The only differences among languages are the phonological form of the [+Interrogative] particle that is base generated in


Inter0 (or that raises to this position in the case of A-not-A sentences), and the type of movement involved, if any. Intonation’s place within this analysis relates to the structure in (18), where a tonal particle appears linearly at the end of a TP that has raised into the SPEC position of InterP. Having proposed how intonation fits into the syntax of neutrally-intoned polar interrogatives, I will next propose how it fits into the syntax of sentences that include discourse particles, which are assumed to head two speech act layers above ForceP.

7.3.2  The Syntax of Discourse Particles

Sentence-final tones can type clauses as interrogatives, but such tones are different from those tones (rising or otherwise) that function to ask biased questions. Question tones add meaning to a sentence beyond neutral, unbiased questioning, and they do so without changing the clause type. Carnie (2013) differentiated between English SAI interrogatives and intonation questions as follows:

[A sentence] with subject/aux inversion is a request for information, [while an intonation question] is an expression of doubt and a request for confirmation. How such phonological licensing is encoded into the syntactic tree is very controversial. One solution is that, like wh-questions and yes/no questions, echo questions and intonational questions involve a special complementizer. We can indicate this as C[+Intonation]. The [+Intonation] feature doesn’t trigger any movement, but it instructs the phonology to put a rising intonation curve on the clause that follows the C. (Carnie 2013: 383)

Note that Carnie placed this [+Intonation] complementizer in C0, which is the same position he suggested for the null polar interrogative particle that causes the auxiliary to raise to C0. It is not clear whether he thought this rising intonation changes the clause to an interrogative or not, but here it is assumed that it does not do so for the reasons given above. It is further assumed that a rising tone that expresses a biased question can be used on either a declarative or an interrogative clause, as illustrated here: (26)

a. Did you go to the party?
b. You went to the party?!
c. Did you go to the party?!

The sentence in (26a) represents a neutrally-intoned polar interrogative. In (26b) and (26c), the combination of a question mark and an exclamation mark indicates the use of a high-rising question tone; both of these sentences are biased questions and their rising question tones appear in different clause types. Since a biased question tone can be used with either a declarative or an interrogative, there must be two separate syntactic positions involved for forming an interrogative versus forming a question: one position for a null interrogative particle that attracts the auxiliary to its position, as represented in (20), and a separate position for an intonational question particle. The same thing is illustrated by (11a) and (11b), where a biased question


tone is used on both an interrogative and a declarative, respectively. Based on the fact that question tones can appear in both types of clauses, it is assumed that the discourse particles associated with these tones are located above ForceP. This relates to Haegeman’s (2009: 14) argument stated above, which is that because West Flemish discourse markers can appear in more than one type of clause, they “are not in Force but they select Force.” A tonal question particle is likewise assumed to select Force as its complement.

A number of linguists have proposed one or more functional projections above ForceP. For example, a single AttitudeP based on Mandarin SFPs was hypothesized by Paul (2014). Speas and Tenny (2003), based on particles in several languages, proposed two functional phrases: a higher evaluative mood phrase (EvalP) and a lower evidential mood phrase (EvidP). Based primarily on West Flemish discourse markers, Haegeman (2014) proposed a higher speaker-oriented speech act phrase (SAP) and a lower hearer-oriented SAP. Several proposals have been made based on Cantonese SFPs, and since this language has one of the largest (if not the largest) inventories of SFPs, it is a good source of evidence for proposing their syntax. For this reason and because the discourse tones discovered in Chap. 6 are linked directly to Cantonese SFPs, the syntactic proposals based on Cantonese SFPs are particularly relevant here.

In earlier work, Cantonese SFPs were placed in C0 (e.g., S-P. Law 1990; Tang 1998). A. Law (2002, 2004) later placed them in ForceP based on the hypothesis that they are all clause-typers having the feature [±Q]. This is similar to Huang et al.’s (2009) argument that Mandarin SFPs are all clause-typers. There is a problem with this conclusion because most Mandarin SFPs can attach to more than one type of clause (e.g., Li and Thompson 1981; Li 2006), indicating that they are located above ForceP rather than in ForceP. Huang et al.
(2009: 35) said that “[w]hat remains unclear is why [a clause-typer] in Chinese never occurs with embedded clauses. Possibly, there are unidentified discourse functions that ma, ba, and ne perform that are associated only with matrix clauses.” I assume that their speculation about SFPs having discourse functions is correct and that Chinese SFPs therefore lie above ForceP. Based on this, we can propose that SFPs do not occur in embedded clauses because the discourse phrases above ForceP only project in main clauses. Related to this, Tang (2010: 61) said that SFPs “express discourse meanings, and are related to the discourse context at the time of speech, [while clause types] are classified according to sentence-internal meanings, and are determined by the grammatical properties of the sentence independent of the context” (translation that of the author). It seems reasonable to propose that embedded clauses only include functional projections that relate to sentence-internal meanings, and that this is why discourse particles never occur in embedded clauses.

Referring to Speas and Tenny’s (2003) work, Tang (2015) examined the Cantonese hearsay SFP wo5 and placed it inside an evidential phrase. Heim et al. (2016) looked at particles in Canadian English, Cantonese, and Medumba, a Bantu language. They proposed two speech act phrases for each language, with a higher “call on addressee” phrase and a lower “speaker commitment” phrase. They proposed that the Canadian question SFP “eh” and the Cantonese question SFPs me1 and ho2 all occupy both of these speech act phrases. In the case of “eh,” they argued


that it is two morphemes: one whose form is the vowel segment of “eh” and the other whose form is the rising intonation used in conjunction with that vowel. For Cantonese me1 and ho2, they argued that these are single morphemes spanning over both functional heads (i.e., spanning à la Williams 2003 and Svenonius 2012, both cited in Heim et al. 2016: 120).9 Their arguments were based on the idea that “eh,” me1, and ho2 all include meanings related to both “call on addressee” and “speaker commitment.” My definition of me1 in Sect. 6.2 arguably agrees with this idea that both meanings are expressed by me1, but I believe the focus is on “speaker commitment” and propose that me1 only occupies the lower SAP. Heim et al. (2016) argued that in Medumba, two segmental discourse particles can simultaneously occur, and that each one occupies one of the two speech act phrases. This agrees with Haegeman’s (2014) and Haegeman and Hill’s (2013) conclusions for West Flemish and Romanian, two other languages that also use two discourse particles simultaneously within a sentence. Haegeman also agreed on the ordering of the phrases proposed by Heim et al. What Heim et al. referred to as “call on addressee,” Haegeman (2014: 135) referred to as “catching the addressee’s attention,” and both authors said that this is the higher of the two phrases and that particles expressing meanings related to speaker commitment head the lower of the two phrases. Adopting Haegeman’s phrase names, I call the higher one a speaker-oriented speech act phrase (SAPS), and the lower one a hearer-oriented speech act phrase (SAPH), resulting in the following structure: (27)

9 However, they did entertain the possibility that these SFPs, like “eh,” can be divided into two morphemes, with their tones being analyzed as intonational rather than lexical (cf. Sybesma and Li 2007).


The Cantonese SFPs described in Chap. 6, and the forms of English intonation that they were each linked to, are all hypothesized to head the lower SAPH. These function to express the speaker’s stance to the hearer and to link the propositional content of TP to the discourse. They are all assumed to be base generated at the head of SAPH, and ForceP raises to SPEC of SAPH, resulting in a sentence-final position for the particle. The question particles me1 and aa4, for example, head SAPH, and a declarative ForceP raises to its SPEC position. The same is assumed for the high-rising and low-rising tones in English that are comparable with me1 and aa4, respectively. Carnie (2013) said there is no movement involved in English rising-tone questions, but that a [+Intonation] feature associated with the question particle results in a rising tone at the end of the sentence. This is possible, but I will tentatively assume that ForceP raises to SPEC of SAPH for both Cantonese and English declarative questions. The same is assumed for Cantonese’s epistemic particles of obviousness lo1 and aa1maa3, and the diminutive particle ze1, plus all of their intonational counterparts in English (see Chap. 6 for details).

Some (and perhaps all) of English’s discourse intonation is assumed to comprise discourse tonal particles that head the functional phrase SAPH. Related to this, Sect. 3.2.1 mentioned a number of linguists who claimed that Cantonese has sentence-level tones that can be classified as SFPs, and I assume that these also head SAPH. Leung (1992/2005: 80–83), for example, described six pitch contours in Cantonese that he referred to as SFPs, though he said nothing about their syntax. Law (1990: 172) argued that rising declaratives10 are formed by what she referred to as a floating high-level tone particle, although I disagree with her further claim that it “occur[s] at the boundary of the utterance node” and is therefore not in the syntax.
Sybesma and Li (2007) argued that four of the Cantonese tones that occur with SFPs (i.e., tones 1, 3, 4, and 5) are morphemes rather than lexical tones and proposed syntactic slots for them, two of which they placed in the higher of two functional phrases above ForceP. Yip (2002) said that phrase-level tones in Cantonese can be thought of as discourse particles. And fully in line with my hypothesis here, Tang (2006) said that the rising question tone in Cantonese is an SFP that occupies the same syntactic slot as me1. Making the same kind of argument for English discourse particles should not be any more controversial. While I do not agree with Sybesma and Li’s (2007) analysis for several reasons,11 it is very thought-provoking and raises the still-debated question as to whether the

10 She referred to them as echo questions.
11 Their analysis is based on the idea that all the segments of all the Cantonese SFPs are morphemes (or what they called “minimal meaningful units” (MMUs)). The onset, rime, coda and tone are all MMUs, which implies that every Cantonese SFP is actually a cluster of SFPs, with each MMU occupying a different syntactic slot in a split-CP. It is an interesting idea, but it has a few problems: it seems unlikely that Cantonese would be the only language that does this; not every Cantonese SFP can map its segments onto their proposed syntactic structure (e.g., aa1maa3); the meanings of some of the particles do not seem to be a combination of the meanings of their MMUs; and they placed the question particles me1 and aa4 in ForceP along with the interrogative particle maa3, which is something I have argued against.


Table 7.1  Properties of four “huh” particles in American English

Properties                                     huh-interesting   huh-shocked   huh-confirming agreement   huh-confirming prediction
Sentence-initial                               √                 √             X                          X
Sentence-final                                 X                 X             √                          √
Can be used as interjection                    √                 √             X                          X
Discourse intonation on associated sentence    √                 √             X                          X

tones of SFPs are lexical or intonational in nature. This is a question not only for SFPs in tonal languages like Cantonese but also for SFPs in intonational languages like English. Should the different tones associated with American English’s “huh” and Canadian English’s “eh” be analyzed like lexical tones or, as Heim et al. (2016) concluded, like intonational morphemes? Let us consider this by looking at the particles that can optionally function as interjections, which were argued by Haegeman (2014) to always occupy the higher SAPS. Haegeman (2014: 135) “characterize[d] the higher SAP[S] as ‘dynamic’ and ‘directional’: it relates the utterance to an addressee as the one for whom the utterance is intended.” When such particles are used as discourse particles, there is no pause between them and their associated sentence. When they are used as interjections, they are assumed to be a separate utterance, appearing in isolation, or before or after a sentence with a sufficient pause between it and the sentence. The arguments that follow are more speculative than those above, but I believe that hypothesizing some possibilities can be useful for guiding future research. A particle like American English’s “huh” is an interesting case to consider because it can use more than one tone contour, and these contours clearly change its meaning, and even determine whether or not it is a particle that can function as an interjection. (28)

a. Huh [high-falling], I didn’t know that. (interesting)
b. Huh [high-rising], she left you?! (shocked)
c. It tastes good, huh? [high-rising] (confirming agreement)
d. You like it, huh? [dipping-rising low tone] (confirming prediction)

Giving preliminary, rough definitions based on my own intuition, “huh” can be used in at least the four ways shown in (28), which can be described as follows: sentence-initially with a high-falling tone to mean something like “that’s interesting/surprising” (28a), sentence-initially with a high-rising tone to mean roughly “I find this hard to believe/I’m shocked” (similar to sentence-initial “What?!”) (28b), sentence-finally with a high-rising tone to mean something like “I think it is like this; do you agree?” (28c), and sentence-finally with a dipping-rising low tone to mean something like “I think it is like this; I thought before now that it would be like this” (28d). For ease of discussion, I will refer to these as huh-interesting, huh-shocked, huh-confirming agreement, and huh-confirming prediction, respectively. There are some noteworthy facts about these four particles:


• Although huh-shocked and huh-confirming agreement are both described as high-rising, their phonological properties appear to me to be clearly different, though a phonetic analysis would need to be conducted to confirm this; regardless, they should be analyzed as different tones because they obviously express different meanings.
• huh-interesting and huh-shocked are always sentence-initial and can be used in isolation as interjections.12 In contrast, huh-confirming agreement and huh-confirming prediction can only appear in sentence-final position and cannot be used as interjections.
• According to my intuition, a clause that follows huh-interesting or huh-shocked can include its own discourse-related intonation, but a clause preceding huh-confirming agreement or huh-confirming prediction cannot. For example, the sentence following huh-shocked in (28b) can use the rising question tone that is equivalent to the Cantonese question SFP me1 (see Sect. 6.2), but the sentence preceding huh-confirming agreement in (28c) does not seem to be able to use a tone that adds connotative meaning.13 The only potential intonation used on the sentences that precede huh-confirming agreement and huh-confirming prediction appears to be limited to tones related to emphasis or focus, and this should be allowed because those tones would head FocP, which is lower down in the structure.

These differences can be summarized as shown in Table 7.1. Haegeman (2014) said the following in relation to the connection between sentence-initial particles and interjections:

… only a [discourse marker (DM)] that can be initial can also constitute an utterance by itself. Final DMs cannot appear in isolation—i.e., as “interjections.” The generalisation extends to Dutch and to the Italian dialects analysed by Penello and Chinellato (2008a, b). Anticipating the discussion, the outcome of my analysis is that only DMs that are merged in the higher Speech Act Projection (cf. section 5) can be used as interjections. (Haegeman 2014: 118, footnote 2)

Based on Haegeman’s analysis, I presume that huh-interesting and huh-shocked occupy SAPS and that huh-confirming agreement and huh-confirming prediction occupy the lower SAPH. In line with this, Tang (2011) proposed an extra projection above the SFP projection, arguing that this is where Cantonese interjections lie. A

12 It is of course possible to utter a sentence and then to use huh-interesting or huh-shocked as an interjection afterwards, but only so long as there is a sufficiently long pause, which is taken to mean it is a separate utterance rather than a case of huh-interesting or huh-shocked appearing in sentence-final position.
13 Consistent uttering of the same form of intonation can be tricky, and we need to be careful not to change a tone and then see it as evidence of a tone being able to appear where it cannot. In this case, if a rising tone “huh” follows a rising-tone question version of (28c), then this “huh” must be uttered like huh-shocked, not huh-confirming agreement, and it additionally requires a pause between it and the preceding sentence. In other words, it is okay to utter huh-shocked as an interjection in isolation after uttering a sentence using a high-rise question tone, but huh-confirming agreement cannot be uttered as an SFP following such a sentence.


particle is used as an interjection when the ForceP below it is null, but as a discourse particle when the ForceP is pronounced. Accordingly, American English’s huh-interesting and huh-shocked, which can be used as interjections, occupy SAPS. This leaves the lower SAPH empty, allowing for it to be filled by a tonal discourse particle, which is why the clause that follows can use a discourse tone. This matches what is shown in Table 7.1. Sentence-final huh-confirming agreement and huh-confirming prediction, in contrast, occupy the lower SAPH, preventing a discourse tone from being used on the clause because the “huh” particle has already occupied the position. The clause is assumed to raise to SPEC of SAPH. Tones related to focus or emphasis are still free to occupy FocP. The structure in (29) shows the syntax for an English sentence with a sentence-initial segmental particle (e.g., “huh” or some other initial particle) and a sentence-final contour. (29)

I tentatively assume that no sentence-initial discourse tonal particle is ever used, which matches what also appears to be the case with regard to tonal interrogative particles. Many languages, such as West Flemish, can use two segmental particles simultaneously; for example, the sentence in (30) below shows the use of the attention-seeking particle zé, which uses a rising tone, in combination with the evidential particle zè, which uses a falling tone. The structure in (29) proposes how two particles can be used in a single sentence. In English, this would typically be a sentence-initial segmental particle used in combination with a sentence-final tonal


particle. In a language like West Flemish, it is not uncommon for two segmental particles to appear in a single sentence. When this happens using the two ze particles, the particle that heads the higher SAPS (i.e., zé) can be sentence-initial, while the one heading SAPH (i.e., zè) must be sentence-final; this involves movement of ForceP up to SPEC of SAPH as shown in (29). Another option is for the higher particle zé to appear sentence-finally after the lower particle zè, as shown below in (30b). In this case, after ForceP has raised to SPEC of SAPH, SAPH then raises to SPEC of SAPS, resulting in both particles appearing in sentence-final position with the higher one appearing furthest to the right. Haegeman (2014: 123) referred to the tones associated with the particle ze as “rising intonation” and “falling intonation,” implying that the intonation is something added to the particles. She concluded that attention-seeking zé with rising intonation occupies the higher SAPS, and evidential zè with falling intonation occupies the lower SAPH. As such, zé can either appear in sentence-initial position or in sentence-final position after zè. (30)

a. Zé, k’een gedoan zè.
   zé I-have done zè
   “I have finished, see.”
b. K’een gedoan zè, zé. (Haegeman 2014: 123)
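The derivation of the two orders in (30) can be sketched computationally. The following is a minimal Python sketch and not part of the analysis itself: the class `Phrase` and the functions `linearize` and `raise_complement` are hypothetical names introduced only for illustration, ForceP is treated as an unanalyzed string, and every phrase is assumed to be spelled out in the order SPEC, head, complement.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Phrase:
    """A phrase spelled out in the order SPEC, head, complement (assumption)."""
    label: str
    head: Optional[str] = None      # pronounced head, or None if silent
    spec: Optional["Node"] = None   # material that has raised to SPEC
    comp: Optional["Node"] = None   # complement; None once it has moved


Node = Union[Phrase, str]


def linearize(node: Optional[Node]) -> List[str]:
    """Spell out a structure from left to right."""
    if node is None:
        return []
    if isinstance(node, str):
        return [node]
    words = linearize(node.spec)
    if node.head:
        words.append(node.head)
    return words + linearize(node.comp)


def raise_complement(phrase: Phrase) -> Phrase:
    """Move the complement of `phrase` into the SPEC of that same phrase."""
    return Phrase(phrase.label, phrase.head, spec=phrase.comp, comp=None)


force_p = "k'een gedoan"  # ForceP, left unanalyzed in this sketch

# In both orders, ForceP raises to SPEC of SAPH, so zè ends up clause-final.
sap_h = raise_complement(Phrase("SAPH", head="zè", comp=force_p))

# (30a): zé heads SAPS and nothing raises further, so zé is sentence-initial.
order_a = linearize(Phrase("SAPS", head="zé", comp=sap_h))

# (30b): SAPH additionally raises to SPEC of SAPS, so zé follows zè.
order_b = linearize(raise_complement(Phrase("SAPS", head="zé", comp=sap_h)))

print(" ".join(order_a))
print(" ".join(order_b))
```

Under these assumptions, the only difference between the two orders is whether the second raising step applies; the sketch says nothing about what motivates either movement.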

An obvious question that arises is the nature of the four tones in (28a–d) that are used on the syllable “huh,” as well as the rising and falling tones used with the West Flemish particles in (30). A tonal morpheme can occupy a syntactic slot distinct from the slot that is occupied by the segmental morpheme(s) across which the tone is phonologically realized; in other words, it can be a floating tone. In example (4), for instance, the verb wá (“come”) is located at the head of the verb phrase, even after the verb-final TP has raised and the interrogative tonal particle has changed the verb’s tone to wȃ. The syntactic location of the interrogative tonal particle is not the same as that of the verb; it is located at the head of InterP. If we argue something similar for “huh,” then it will be analyzed as a two-particle cluster: a segmental particle plus a tonal particle. In the case of sentence-final huh-confirming agreement and huh-confirming prediction, it could be argued that the tonal particle occupies the higher SAPS, but for sentence-initial huh-interesting and huh-shocked, our structure in (29) does not have any position above SAPS that the tonal particle could be said to occupy. One possible solution is to hypothesize yet another functional phrase higher up in the structure, but this is not an ideal conclusion if we can avoid it because it adds more to the structure than has been proposed thus far in most of the studies cited above. Another possible solution is to hypothesize that the tones associated with English’s “huh” are functioning as lexical tones that merely distinguish the various versions of “huh” from one another. This again is not an ideal proposal in relation to a nontonal language and in relation to tones that intuitively sound intonational rather than lexical, but it is nevertheless a possibility. A third hypothesis is that the


tones are actually affixal in nature. According to this idea, a tone is analyzed as a meaningful morpheme, in line with what has been argued throughout this book, but it is attached like a tonal affix to a segmental morpheme, combining its meaning with that segmental morpheme and occupying the same syntactic slot.14 These three hypotheses can be simply stated as follows: (31)

A tone associated with a segmental discourse particle is
a. a lexical tone;
b. a tonal morpheme located in a higher functional projection;
c. an affixal morpheme attached to the segmental particle.

The nature of the tones associated with discourse particles in lexical-tone languages has also been debated. Many linguists assume that the tones associated with Cantonese SFPs are lexical tones and therefore assume the hypothesis in (31a) for Cantonese particles. Sybesma and Li (2007) alternatively adopted (31b) by concluding that the tone of an SFP in Cantonese is a morpheme that heads its own functional phrase. Other linguists have proposed that the tones of SFPs are intonational in nature (e.g., Wu 2008; Ding 2012), which also implies support for (31b). Heim et al. (2016) explicitly chose hypothesis (31b) in relation to the particle “eh” in Canadian English. They said when it is used with rising intonation, it comprises two morphemes with the segmental “eh” morpheme occupying the lower SAPH and the rising-tone morpheme occupying the higher SAPS. Citing Sybesma and Li (2007), Heim et al. also entertained the possibility that (31b) applies to the Cantonese particles me1 and ho2, but their ultimate conclusion was that these particles are each single morphemes that occupy both speech act phrases. A problem with analyzing the SFP me1 as occupying both speech act phrases, whether as two morphemes or as a single morpheme that spans both, is that it can be used in combination with the sentence-initial particle mat1 (“what”). The particle mat1 appears sentence-initially, and it arguably functions in part to get the attention of the hearer; it can therefore be analyzed as heading the higher SAPS. (32)

Mat1 ngo5 san1 joeng5-zo2 zek3 gau2 me1?
what 1s new raise-PERF CL dog ME
“What, I have a new dog?! (Actually I don’t)” (Heim et al. 2016: 121)

14 Yet another possibility is that the segments of “huh” merely function to provide a means of articulating the tonal particle; in fact, according to my intuition, “hmm” can be used in place of “huh” without a change in meaning, so long as the tone remains the same. If so, this now means that only the tone is a morpheme, and this is no longer a two-particle cluster. This hypothesis is difficult to apply to other interjections such as “whoa” and “what,” however, which have semantic content of their own, even if bleached; I therefore do not consider this hypothesis a plausible syntactic solution for the vast majority of particles that include a form of intonation.


If the particle me1 occupies both SAPH and SAPS, then example (32) implies that the higher phrase is filled twice. We would thus need to propose a third speech act phrase above SAPS. Hypothesis (31b) is also problematic for interjections. If we accept Haegeman’s (2014) argument that interjections all occupy the higher SAPS, then there is no syntactic position higher up for any intonation associated with interjections to occupy. We cannot argue that it is only the tones of interjections that occupy the higher SAPS phrase and that the particle made up of the interjection’s segments occupies the lower SAPH phrase. Arguing that interjections occupy both phrases in this way would mean that when they are used as sentence-initial particles, they could not be used in combination with a sentence-final particle, whether tonal or segmental, but examples such as (28a–b) and (30a) show that they can be used in combination with SFPs, either tonal or segmental. Accepting hypothesis (31b) therefore forces us to add more phrases to the structure. Hypothesis (31c) allows for a particle and its associated tone to be analyzed as two morphemes heading a single phrase. This would allow for sentences like those in (28) and (30), as well as for tones on interjections, without adding additional phrases to our structure.

If the proposal that intonation is morphemic is accepted, then analyses of the syntax of discourse particles will have to resolve it by (1) analyzing the tones used with segmental particles as lexical tones, even for nontonal languages, (2) increasing the number of functional phrases above ForceP, (3) analyzing the tones as affixes, or (4) proposing some other way I have not thought of. Consider the West Flemish examples in (30). If we adopt (31a) and assume these tones to be lexical, then we are assuming that languages that do not generally use lexical tone nevertheless do so with discourse particles.
If we adopt (31b) and assume the tones to be morphemes heading their own functional phrases, then the examples in (30) involve four functional phrases above ForceP rather than two. If we alternatively adopt (31c) and assume that these tones are affixal in nature, then they are still analyzed as meaningful morphemes, but each combines with ze to form a single lexical item heading a single functional phrase. A similar decision needs to be made about all such cases. Gussenhoven (2004: 46), for example, concluded that the tones associated with certain Dutch and Bengali discourse particles “constitute morphemes in their own right, and, unlike lexical tone, do not form part of the representation of the segmentally represented morphemes. These toned particles are thus polymorphemic expressions.” For the tones he was referring to, he rejected the possibility of (31a). If he is correct, then there is a question as to where and how the tones of these particles fit into the syntactic structure. Calhoun and Schweitzer (2012) looked at intonational forms in English that are associated with single words (e.g., English’s “yeah”) or fixed phrases, and Schweitzer et al. (2015) did the same for German and English. Their conclusions appear to support (31c). They speculated as to whether the intonation is stored separately in the lexicon or together with the word, and proposed it was the latter. This is similar to the debate about whether tense inflections are stored separately in the lexicon and

attach to verbs that move into their syntactic position (or vice versa) or whether each verb has a separate entry in the lexicon for each of its inflected forms (e.g., “walk,” “walked,” and “walks” would each be analyzed as separate lexical entries). Schweitzer et al. (2015: 79) said, “we think it is more likely, and consistent with our results, that the word(s), contour, and pragmatic function are stored as a combined unit if they co-occur frequently enough.” (31c) is perhaps the best hypothesis among those I have proposed, but further research needs to be done before any strong claims can be made one way or the other.

7.4  Prosodic Structure

Something must be said here about the existence within language of prosodic structures that do not align directly with syntactic structures. In Chap. 2, prosodic structure was analyzed as not belonging to intonation based on the assumption that it is a form of “non-morphemic linguistic prosody.” It was suggested in Sect. 2.1 that prosodic phrasing is merely a by-product of the syntactic structure, linearization, and prosodic well-formedness (cf. Bennett and Elfner 2019). In other words, syntactic structure is marked by means other than intonation or prosody, and prosodic phrasing is merely a phonological phenomenon that has no semantic or grammatical interpretation. Hirst (1993: 781) said some linguists had “explored the possibility that prosodic structure can be derived from surface syntactic structure by a unified set of primitives combining language-specific parameters and universal constraints.” If true, then this supports the idea that prosodic structure is a by-product of syntax that entails no additional semantic or grammatical content, such content having already been marked (or represented in the mind) by other means. If, on the contrary, prosodic structure does contribute interpretable features to sentences, then it would need to be analyzed as morphemic using arguments along the same lines as those given above for grammatical and discourse tones. One argument against claiming that the tones related to prosodic phrasing are morphemic is that they do not appear to have segmental counterparts, which one would expect if they were morphemic. If we assume they are not morphemic but that they do have semantic content, then they would constitute a counterexample to the inclusiveness condition and to the hypothesis that the PF and LF levels of grammar do not interact directly.
Selkirk (1986, quoted in Hirst 1993: 782) said that “intonational phrasing appears to be subject to semantic well-formedness constraints rather than to conditions based on surface syntactic structure.” In an attempt to address this, Hirst (1993) formulated a “Mapping Rule” that was proposed to show how the syntactic and prosodic structures are linked.

(33)  Mapping Rule15
Map a syntactic structure exhaustively onto a linear sequence of [prosodic] phrases such that
a. the left end of each [prosodic] phrase corresponds to the left end of a major syntactic constituent
b. the [prosodic] phrase is no longer than the corresponding syntactic constituent
(Hirst 1993: 786)

This rule relates to the following kinds of prosodic phenomena illustrated by Steedman (2000). Each set of brackets indicates the beginning and ending of a prosodic phrase:

(34)
a. (The absent-minded professor) (was avidly reading) (about the latest biography) (of Marcel Proust).
b. (Marcel proved) (completeness).
c. *(Three mathematicians) (in ten derive a lemma).
d. *(Seymour prefers the nuts) (and bolts approach).
e. *(They only asked whether I knew the woman who chaired) (the zoning board).
(Steedman 2000: 649–650)
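
The two clauses of the Mapping Rule in (33) can be read as a mechanical check on word spans, and the following Python sketch illustrates that reading for (34b–d). It is only an illustration under stated assumptions: the function name, the span encoding, and the lists of "major syntactic constituents" for each sentence are hypothetical choices made here and are not taken from Hirst (1993) or Steedman (2000). Where several constituents share a left edge, the sketch compares the prosodic phrase with the longest one, and it does not check that the phrases cover the sentence exhaustively.

```python
from typing import List, Tuple

# A span is (start, end) over word positions, with end exclusive.
Span = Tuple[int, int]


def satisfies_mapping_rule(constituents: List[Span], phrases: List[Span]) -> bool:
    """Check a prosodic phrasing against clauses (a) and (b) of rule (33)."""
    for start, end in phrases:
        # Clause (a): the phrase must start where some major constituent starts.
        aligned = [c for c in constituents if c[0] == start]
        if not aligned:
            return False
        # Clause (b): the phrase may not extend past the end of the longest
        # constituent that starts at the same position (an assumed reading).
        if end > max(c_end for _, c_end in aligned):
            return False
    return True


# Assumed constituents for "Marcel proved completeness":
# TP (0,3), subject NP (0,1), T' (1,3), object NP (2,3).
MARCEL_CONSTITUENTS = [(0, 3), (0, 1), (1, 3), (2, 3)]

# Assumed constituents for "Three mathematicians in ten derive a lemma":
# TP (0,7), subject NP (0,4), PP "in ten" (2,4), NP "ten" (3,4),
# T' (4,7), object NP (5,7).
MATH_CONSTITUENTS = [(0, 7), (0, 4), (2, 4), (3, 4), (4, 7), (5, 7)]

# Assumed constituents for "Seymour prefers the nuts and bolts approach":
# TP (0,7), subject NP (0,1), T' (1,7), object NP (2,7).
NUTS_CONSTITUENTS = [(0, 7), (0, 1), (1, 7), (2, 7)]

if __name__ == "__main__":
    # (34b): (Marcel proved) (completeness)
    print(satisfies_mapping_rule(MARCEL_CONSTITUENTS, [(0, 2), (2, 3)]))
    # (34c): (Three mathematicians) (in ten derive a lemma)
    print(satisfies_mapping_rule(MATH_CONSTITUENTS, [(0, 2), (2, 7)]))
    # (34d): (Seymour prefers the nuts) (and bolts approach)
    print(satisfies_mapping_rule(NUTS_CONSTITUENTS, [(0, 4), (4, 7)]))
```

Under these assumed constituent lists, the phrasing in (34b) passes both clauses, (34c) fails clause (b) because the phrase starting at "in" runs past the end of the PP, and (34d) fails clause (a) because no listed constituent starts at "and".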

If we assume X′ levels to be “major syntactic constituents,” then the Mapping Rule in (33) accounts for (34a–d). The left end of each prosodic phrase in (34a–b) corresponds to the left end of a major syntactic constituent (all are phrase level, except for “was avidly reading...,” which is a T′). The right ends of the prosodic phrases in (34a–b) do not extend beyond the ends of the syntactic phrases that begin where the prosodic phrases begin. The unacceptability of (34c) is accounted for by the fact that the prosodic phrase that begins at the left end of the prepositional phrase “in ten” is longer than this PP. The unacceptability of (34d) is accounted for because the prosodic phrase beginning on “and” does not begin at the left end of that conjoined construction (i.e., “and” is not the beginning of a major syntactic constituent). The Mapping Rule cannot account for the unacceptability of (34e), however, because the prosodic phrase beginning at the left end of the entire TP should be allowed to end immediately prior to the left end of any major category within the sentence, and it does so before the noun phrase (NP) “the zoning board.” It is possible that the unacceptability of a sentence like (34e) is a matter of context and processing rather than grammar. It seems acceptable to include a pause before “the zoning board” if the speaker is trying to remember the name of the board, or even for dramatic or comic effect. It also seems to become acceptable if

15  I changed his “intonational phrase” to “prosodic phrase” in order to align it with the terminology I use here, based on the tentative assumption that this relates to what I call prosody and not to what I call intonation.

the NP object of “chaired” is lengthened by, for example, adding a relative clause modifier after “board.”16 Perhaps the reason (34e) appears unnatural when written down with the prosodic phrasing shown is simply a processing issue, akin to when questioned constituents are moved up from too far down within a multiclausal structure; people normally judge such sentences as ungrammatical even though they are grammatical. Even if we assume that the Mapping Rule, or some other rule or constraint (cf. Truckenbrodt 1999; Wagner 2005; Selkirk 2011; Güneş 2015), successfully matches prosodic structure to the surface syntactic structure, we must determine whether or not the prosodic phrasing of a sentence like (34b) “(Marcel proved) (completeness)” includes some interpretable feature that is not present in the alternative prosodic phrasing “(Marcel proved completeness).” If so, then based on my proposal about the nature of intonation, this feature enters the numeration and is projected in the syntactic structure. An obvious question then arises as to where such features are located. Is the prosodic break preceding “completeness” merely a type of Focus marker (or an NP marker) that results in the beginning of a new prosodic phrase at the left end of NP? If so, this would explain why the first prosodic phrase that begins at the left end of the TP ends where it does, and in fact would explain why all prosodic phrases begin at the left ends of major constituents: they mark the following constituent. Steedman (2000) went beyond arguing that the left ends of prosodic phrases mark the following syntactic constituents. He suggested that within the sentence “(Marcel proved) (completeness),” the prosodic phrase “Marcel proved” can be analyzed as a syntactic surface constituent and that this sentence therefore has two possible structures depending on which prosodic phrasing is used.

[Marcel [proved completeness]]        [[Marcel proved] completeness]
Steedman (2000: 652)

The structure on the left represents a traditional TP structure consisting of an NP subject and a VP predicate. The structure on the right, in contrast, shows the TP to be made of the constituents “Marcel proved” and “completeness.” Steedman (2000: 652) said, “More complex sentences like ‘Marcel says that Harry proved completeness’ may have many surface structures for each reading.” He then went on to argue for this using coordination examples of the kind “X proved, and Y prove, completeness.” This conclusion is not ideal, however, because it forces us to reanalyze all sentences as potentially having multiple surface structures, and it would require accepting unconventional phrases made up of, in this case, subject-verb.

16  I thank Daniel Hirst (p.c., 2019) for suggesting that the length of the NP may have an effect on acceptability.

It is beyond the scope of this chapter to attempt any definitive answers about this long-debated issue. I will merely list what I see as the three possibilities and state that I tentatively assume the first to be correct:

1. The tones associated with prosodic phrasing contribute no semantic content or grammatical functions to the sentence beyond what is already present in the numeration;
2. The tones associated with prosodic phrasing do contribute semantic content and/or grammatical functions to the sentence, and they are morphemes with functional features that enter N and project in the syntax;
3. The tones associated with prosodic phrasing do contribute semantic content and/or grammatical functions to the sentence, and they are a counterexample to the inclusiveness condition and the hypothesis that there are no direct interactions between PF and LF.

It is worth adding a note about the prosody associated with topics. Even if we assume that prosodic structure is not morphemic, it is still possible that topicalized phrases are marked by intonation (i.e., not merely marked by nonmorphemic prosody). Topics always appear to be marked in some way, which is not surprising considering Aboh’s (2010) argument that [+Top] is in the numeration. The question is how we should analyze a tone that appears to be marking a topic. Is it a tonal morpheme that is optional, just as segmental topic markers are optional in Chinese, or is it merely incidental prosody that is a by-product of an already marked structure? The existence of segmental topic particles offers evidence that topics are marked by morphemes, but topic particles in Mandarin (e.g., a, ne, me, ba) are optional. In some cases, a pause is used where the particle would appear, and in some cases, there is no particle and no pause. Topic particles are assumed to head TopP and have a [+Top] feature, and when no overt particle is used, the [+Top] feature is assumed to be contained within a null operator.
In all cases, the topicalized phrase moves to SPEC of TopP.17 In English, which has no topic particles, topicalized phrases often include an optional tone that marks them off from the rest of the sentence. Gussenhoven and Jacobs (2014: 149) gave the example “Once we’re in China, we can practice our Chinese” and said that a high tone is likely to be used on the last syllable of “China” to mark the boundary that separates the topic from the comment. Note that they said “likely,” indicating that this tone is optional like the topic particles of Mandarin. There is a question as to whether this tone should be analyzed as the phonological representation of a [+Top] particle that heads a TopP, which would then mean it is a tonal counterpart to topic particles like those found in Mandarin, or whether it is merely a by-product of something that has already been marked due to the topicalized phrase having raised to SPEC of TopP.

17  Some linguists argue that Chinese and other so-called topic prominent languages can have “dangling topics,” which are assumed to have no syntactic relationship to the rest of the sentence, only a semantic relationship. I instead adopt Shi’s (2000) conclusion that dangling topics do not exist.

7.5  Concluding Remarks

This chapter argued that intonation is not a counterexample to the hypothesis that the PF and LF levels of grammar do not interact directly, nor is it a counterexample to the inclusiveness condition. This is based on the hypothesis that intonation is part of the lexicon and enters the syntax the same as all lexical items and all interpretable functional features. A proposal was put forth as to how and where intonation is located in the structure of a sentence. This was done by hypothesizing that tonal morphemes occupy the same syntactic slots as their segmental counterparts. For grammatical tones, segmental counterparts are often found in the language’s own history and/or in closely related languages. In the case of functional categories at the left periphery of the sentence, comparable segmental counterparts can be found in other languages, and the syntactic positions of the tonal morphemes can then be postulated based on where their counterparts with similar discourse functions are located inside a proposed universal structure that includes two speech act phrases above ForceP. A decision must be made about the nature of the tones that are associated with discourse particles, and further research will be required to test which of the hypotheses in (31), or perhaps some other, best accounts for the linguistic facts cross-linguistically. Of course, it is possible that these tones have a different nature in different languages, or even among particles within a single language. Ideally, we want to end up with a theory that produces a universal structure and accurately accounts for the linguistic facts. The nature of the tones associated with prosodic structure must also be determined. Are they merely a reflection of the semantic content and syntactic structure that already exist in the sentence, or do they add structure and/or meaning to the sentence? Here, I have assumed the former, but this is a difficult question that linguists have been discussing for decades and will no doubt continue discussing for years to come.

References

Aboh, E. O. (2010). Information structuring begins with the numeration. IBERIA, 2(1), 12–42.
Aboh, E. (2016). Information structure: A cartographic perspective. In C. Féry & S. Ishihara (Eds.), The Oxford handbook of information structure (pp. 147–164). Oxford: Oxford University Press.
Aboh, E. O., & Pfau, R. (2010). What’s a wh-word got to do with it? In P. Benincà & N. Munaro (Eds.), Mapping the left periphery (pp. 91–124). Oxford: Oxford University Press.
Allan, K. (2006). Clause-type, primary illocution, and mood-like operators in English. Language Sciences, 28(1), 1–50.
Bailey, L. R. (2010). Sentential word order and the syntax of question particles. Newcastle Working Papers in Linguistics, 16, 23–43.
Bailey, L. (2013). Question particles: Thai, Japanese and English. Linguistica Atlantica, 32, 34–51.
Bennett, R., & Elfner, E. (2019). The syntax-prosody interface. Annual Review of Linguistics, 5, 151–171.
Calhoun, S., & Schweitzer, K. (2012). Can intonation contours be lexicalised? Implications for discourse meanings. In G. Elordieta & P. Prieto (Eds.), Prosody and meaning: Interface explorations (pp. 271–328). Berlin: de Gruyter Mouton.

Carnie, A. (2013). Syntax: A generative introduction (3rd ed.). Malden, MA: Wiley-Blackwell.
Cheng, L., & Kula, N. C. (2006). Syntactic and phonological phrasing in Bemba relatives. ZAS Papers in Linguistics, 43, 31–54.
Chomsky, N. (1964). Current issues in linguistic theory. The Hague: Mouton.
Chomsky, N. (1995). The minimalist program. Cambridge, MA: The MIT Press.
Cinque, G., & Rizzi, L. (2010). The cartography of syntactic structures. In The Oxford handbook of linguistic analysis (pp. 51–66). Oxford: Oxford University Press.
Ding, P. S. (2012, March). Utterance-final particles with grammaticalized intonation in Cantonese. Presented at the workshop on innovations in Cantonese linguistics (WICL), Columbus, Ohio.
Dryer, M. S. (2013). Polar questions. In M. S. Dryer & M. Haspelmath (Eds.), The World Atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology.
Güneş, G. (2015). Deriving prosodic structures. PhD thesis, University of Groningen, Netherlands.
Gunlogson, C. (2003). True to form: Rising and falling declaratives as questions in English. New York: Routledge.
Gussenhoven, C. (2004). The phonology of tone and intonation. Cambridge: Cambridge University Press.
Gussenhoven, C., & Jacobs, H. (2014). Understanding phonology (4th ed.). London: Routledge.
Haegeman, L. (2006). Thinking syntactically: A guide to argumentation and analysis. Malden, MA: Blackwell.
Haegeman, L. (2009). The cartography of discourse markers in West Flemish. University of Ghent/FWO.
Haegeman, L. (2014). West Flemish verb-based discourse markers and the articulation of the speech act layer. Studia Linguistica, 68(1), 116–139.
Haegeman, L., & Hill, V. (2013). The syntactization of discourse. In R. Folli, C. Sevdali, & R. Truswell (Eds.), Syntax and its limits (pp. 370–390). Oxford: Oxford University Press.
Heim, J., Keupdjio, H., Lam, Z. W.-M., Osa-Gómez, A., Thoma, S., & Wiltschko, M. (2016). Intonation and particles as speech act modifiers: A syntactic analysis. Studies in Chinese Linguistics, 37(2), 109–129.
Hirst, D. (1977). Intonative features: A syntactic approach to English intonation. The Hague: Mouton.
Hirst, D. (1983). Interpreting intonation: A modular approach. Journal of Semantics, 2(2), 171–182.
Hirst, D. (1993). Detaching intonational phrases from syntactic structure. Linguistic Inquiry, 24(4), 781–788.
Huang, C.-T. J., Li, Y. A., & Li, Y. (2009). The syntax of Chinese. Cambridge: Cambridge University Press.
Kwok, H. (1984). Sentence particles in Cantonese. Hong Kong: Centre of Asian Studies, University of Hong Kong.
Law, S.-P. (1990). The syntax and phonology of Cantonese sentence-final particles. Unpublished doctoral dissertation, Boston University, Boston, MA.
Law, A. (2002). Cantonese sentence-final particles and the CP domain. UCL Working Papers in Linguistics, 14, 375–398.
Law, A. (2004). Sentence-final focus particles in Cantonese. Unpublished doctoral dissertation, University College, London.
Law, P. (2006). Adverbs in A-not-A questions in Mandarin Chinese. Journal of East Asian Linguistics, 15(2), 97–136.
Leung, C. (2005). 當代香港粵語語助詞的研究 [A study of the utterance particles in Cantonese as Spoken in Hong Kong]. Hong Kong: Language Information Sciences Research Centre, City University of Hong Kong.
Li, B. (2006). Chinese final particles and the syntax of the periphery. Unpublished doctoral dissertation, Leiden University, Leiden.
Li, C. N., & Thompson, S. A. (1981). Mandarin Chinese: A functional reference grammar. Berkeley: University of California Press.

Paul, W. (2014). Why particles are not particular: Sentence-final particles in Chinese as heads of split CP. Studia Linguistica, 68(1), 77–115.
Rizzi, L. (1997). The fine structure of the left periphery. In L. Haegeman (Ed.), Elements of grammar: Handbook in generative syntax (pp. 281–337). Dordrecht: Kluwer Academic.
Rizzi, L. (2001). On the position of int(errogative) in the left periphery of the clause. In G. Cinque & G. Salvi (Eds.), Current studies in Italian syntax (pp. 287–296). Amsterdam: Elsevier.
Schweitzer, K., Walsh, M., Calhoun, S., Schütze, H., Möbius, B., Schweitzer, A., & Dogil, G. (2015). Exploring the relationship between intonation and the lexicon: Evidence for lexicalised storage of intonation. Speech Communication, 66, 65–81.
Selkirk, E. O. (1984). Phonology and syntax: The relation between sound and structure. Cambridge, MA: The MIT Press.
Selkirk, E. O. (2011). The syntax-phonology interface. In The handbook of phonological theory (2nd ed., pp. 435–484). Malden, MA: Wiley-Blackwell.
Shi, D. (2000). Topic-comment constructions in Mandarin Chinese. Language, 76(2), 383–408.
Siemund, P. (2001). Interrogative constructions. In M. Haspelmath, E. König, W. Oesterreicher, & W. Raible (Eds.), Language typology and language universals: An international handbook (Vol. 2, pp. 1010–1028). Berlin: Walter de Gruyter.
Speas, P., & Tenny, C. (2003). Configurational properties of point of view roles. In A. M. Di Sciullo (Ed.), Asymmetry in grammar: Volume 1: Syntax and semantics (pp. 315–344). Amsterdam: John Benjamins.
Steedman, M. (2000). Information structure and the syntax-phonology interface. Linguistic Inquiry, 31(4), 649–689.
Svenonius, P., & Kennedy, C. (2006). Northern Norwegian degree questions and the syntax of measurement. In M. Frascarelli (Ed.), Phases of interpretation (pp. 133–161). Berlin: Mouton de Gruyter.
Sybesma, R., & Li, B. (2007). The dissection and structural mapping of Cantonese sentence final particles. Lingua, 117(10), 1739–1783.
Tang, S.-W. (1998). Parametrization of features in syntax. Unpublished doctoral dissertation, University of California, Irvine, CA.
Tang, S.-W. (2006). 粵語疑問句「先」的句法特點 [Syntactic properties of sin in Cantonese interrogatives]. 《中國語文》 [Zhongguo Yuwen], 312(3), 225–232.
Tang, S.-W. (2010). 漢語句類和語氣的句法分析 [A syntactic analysis of clause types and mood in Chinese]. 漢語學報 [Hanyu Xuebao (Chinese Linguistics)], 29(1), 59–63.
Tang, S.-W. (2011). Cartographic syntax of interjections in Cantonese. Presented at the fifth international conference on formal linguistics (ICFL-5), Guangdong University of Foreign Studies, Guangzhou.
Tang, S.-W. (2015). Cartographic syntax of pragmatic projections. In A. Li, A. Simpson, & W.-T. D. Tsai (Eds.), Chinese syntax in a cross-linguistic perspective (pp. 429–441). New York: Oxford University Press.
Truckenbrodt, H. (1999). On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry, 30(2), 219–255.
Wagner, M. (2005). Prosody and recursion. Unpublished doctoral thesis, Massachusetts Institute of Technology.
Wee, L.-H. (2019). Phonological tone. Cambridge: Cambridge University Press.
Wiltschko, M. (2014). The universal structure of categories: Towards a formal typology. Cambridge: Cambridge University Press.
Wu, W.-L. (2008). An acoustic phonetic study of the intonation in sentence-final particles in Hong Kong Cantonese. Asian Social Science, 4(2), 23–29.
Yip, M. (2002). Tone. Cambridge: Cambridge University Press.
Zhang, J. (2001). The effects of duration and sonority on contour tone distribution: Typological survey and formal analysis. PhD, University of California, Los Angeles.

Chapter 8

Conclusions and Implications

Two main hypotheses have been put forward in this book. First is the definition given in Sect. 2.2, which states that intonation is “a suprasegmental form that has semantic content or a grammatical function.” This categorizes discourse intonation together with grammatical tonal morphemes, much in the same way that the term particles is used occasionally in a way that loosely categorizes various types of grammatical and discourse particles together. Categorizing all tonal morphemes together in this way emphasizes that discourse intonation is assumed here to be morphemic in exactly the same way that tonal particles have always been assumed to be, and by logical extension, to be morphemic in the same way that all segmental particles are assumed to be. This leads to the second main hypothesis of the book, which is that intonation and segmental particles are two forms of the same thing. Excluded from the definition of intonation are any nonlinguistic suprasegmental forms that express human emotions, as are the tones associated with prosodic structure, which are tentatively analyzed as a phonetic by-product of syntactic information that is already present in the sentence. This way of viewing intonation differs from previous analyses. Those linguists who do not consider discourse intonation to have semantic content will need to determine whether they feel the evidence presented in this book (including the reviews of others’ works) is sufficient enough to convince them otherwise. In contrast, linguists who can readily accept the idea that intonation is morphemic will merely see this as a recategorization of intonation’s features and functions. For these latter linguists, the proposals I have made will not appear very drastic, except perhaps the syntactic proposals in Chap. 7. 
If the ideas presented in this book are accepted as a working hypothesis for the nature and properties of intonation, then this will have implications for the goals and methods of future research related to intonation. The ultimate goal of intonational studies will be to isolate meaningful (or grammatical) suprasegmental forms from all others. These are the only type of suprasegmentals that fall under the label intonation; all others are subtypes of nonmorphemic prosody. Of course, studies on intonation cannot be conducted in isolation from the other categories of prosody, but the ultimate goal is to distinguish one from the other, both in form and in meaning.

© Springer Nature Singapore Pte Ltd. 2020. J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9_8

Unfortunately, I do not see how this hypothesis about intonation could be falsified, but data that demonstrate suprasegmental forms to have segmental morpheme counterparts provide a convincing form of empirical evidence that further strengthens the hypothesis. Looking for and recording more such cases is therefore worthwhile. My own research and that of others, reviewed in Chaps. 4–6, show how such research has been and can be done, and hopefully new and more effective methods will be discovered in the future. It is challenging research, even when using a language as ideal as Cantonese. This is because not all discourse particles have intonational counterparts. Preliminary analysis of several Cantonese sentence-final particles (SFPs) indicates that they appear not to translate into English as an identifiable form of intonation. One or more of these particles may possibly have tonal equivalents in English that I did not discover. Some may be polysemous, which could have caused the ambilingual participants to translate the same particle in different ways and prevented me from discovering a recurring intonational form among the relatively small sampling of translations. Another possibility is that these SFPs have no intonational equivalents in English, but that they may have yet-to-be-discovered intonational counterparts in other languages. Discovering tonal counterparts to segmental particles between any pair of languages is a worthwhile contribution to this research project. One assumption implicit in this research is that discourse particles and their floating tone counterparts have definable, context-independent meanings. If not, we would never find any consistent, one-to-one correspondence between a particle in language A and a tonal morpheme in language B. The idea that each particle and floating tone has a definable meaning does not mean that each form has only a single meaning; many are polysemous, and others are homophonous.
It also does not mean that every particle has an equivalent form in all other languages. It is possible that some discourse-related meanings may be universal, but presumably many are specific to one or a few languages. This complex situation is partially illustrated in the following table, though more possibilities exist. In language A, particle 1 is polysemous, having the three meanings i, ii, and iii. Language B does not include a tone that expresses meaning i. Meaning ii is expressed in language B as tone 1, and meaning iii is expressed as tone 2. In language A, the form of particle 2 differs from the form of particle 1, and particle 2 expresses meaning iv. In language B, meaning iv is expressed in the form of tone 3, which has the same form as tone 2.

Table 8.1  Example of how polysemy of particles and homophony of tones can affect intonational research

Discourse particles in language A        Tones in language B
Particle 1: polysemous meaning i         No tone with meaning i
Particle 1: polysemous meaning ii        Tone 1 has meaning ii
Particle 1: polysemous meaning iii       Tone 2 has meaning iii
Particle 2: meaning iv                   Tone 3 has meaning iv, but is homophonous with tone 2

Linguists who wish to conduct contrastive-based
research to discover the forms and meanings of tonal morphemes must be aware of these kinds of possibilities when analyzing the data. The most difficult way to discover and analyze forms of discourse intonation is through direct observation of suprasegmentals, but this still has to be done because we cannot always use segmental particles as a tool. Probably only a minority of meaningful discourse tones have segmental counterparts, and even in those cases where such pairings exist, it is not easy to discover them in whatever language(s) they exist, and without the use of ambilinguals, it is difficult to demonstrate that both members of the tone/particle pair share the same meaning. In those cases where it is convincingly done, however, the meaning of the segmental particle can indirectly reveal the meaning of the intonational form to which it corresponds. It will always be necessary in intonational research to conduct studies that analyze intonation directly. To the extent possible, I think it is best to start with a meaning and to use elicitation and interpretation tasks. I believe this is easier and more effective than exploratory studies that analyze naturally-occurring data in an attempt to discover matches between form and meaning, though this is sometimes where one must start. It is difficult to discover the form and meaning of any given discourse tone through the observation of naturally-occurring data alone, due to the multiple complicating factors mentioned throughout this book. These complicating factors include multiple suprasegmental forms co-occurring, the existence of homophonous and/or phonologically similar tones, the accidental inclusion of meanings from the sentence and/or the discourse in the meaning of a tone, or incorrectly assuming that all tones of the same (or a similar) form should be analyzed as the same tone. 
Consider the following remarks from Féry (2017):

[O]ne has to be careful to keep apart the role of the contour in a certain context, in its association with a piece of discourse, and the contour itself. It is in general a simple exercise to find other options of how different meanings are assigned to the same tonal contour, depending on the context … and contexts can be easily found where a tonal pattern assumed to be associated with a specific meaning has a completely different function …. To this date, it has not been shown that a tone or a sequence of tones is obligatorily associated with a specific meaning all by itself. Rather, it must be associated with the text or be licensed in the context in a certain way to be meaningful. (Féry 2017: 168–70)

In addition to considering content from the discourse to be part of a contour’s meaning, Féry (2017) made what I believe is a common false assumption, which is that two occurrences of the same intonational form should be assumed to be the same tone. Homophony, polysemy, and similar (but different) forms should all be considered possible for intonation because these phenomena are regularly found in segmental forms of language, including discourse particles. If native speakers agree that two tokens of a tone sound like the same tone, but they appear to have different meanings, then I believe we should not automatically conclude from this that intonational forms have no context-independent meaning. Instead, we should consider the possibility that either they are two homophonous (or phonologically similar) tones, or they represent a single tone which has been given a meaning that mistakenly includes meanings from the different contexts in which it occurs, and its definition therefore needs to be revised. Research should be guided by native-speaker intuition and careful analysis of the contexts in an attempt to look for a common meaning that is expressed by a given tone, and we must always consider the possibility that we do not yet have an accurate definition of the tone, and/or that we are actually looking at instances of more than one tone. Chapter 3 discusses these issues. Perhaps gesture should also be analyzed as one of intonation’s forms and should therefore be included as a component of this research project. Sandler (2010) discussed signed intonation in sign languages. Since gesture is a form of signing, it should not be hard to accept the possibility that gesturing may be used as a form of intonation in nonsigned languages. One speculative example relates to the Mexican language Chalcatongo Mixtec, which Dryer (2013) reported as having “no formal marking of polar questions … [having sentences that can] be either a declarative sentence or an interrogative sentence, with no difference in intonation associated with the two meanings.” This was the only language out of 955 listed by Dryer that did not mark polar interrogatives. It seems unlikely that a language would not differentiate declaratives from polar interrogatives, as this would presumably cause communication problems. Assuming that some form of verbal marking was not missed by the field researcher(s), one speculative explanation is that the difference is marked gesturally. Bolinger (1983: 157) claimed that “[i]ntonation belongs more with gesture than with grammar,” but since intonation is analyzed here as belonging entirely with grammar, perhaps gesture can be seen as belonging mostly with paralinguistic features, but having some forms that belong to intonation.1 In other words, perhaps some gestures are morphemic in nonsigned languages just as they are in sign languages.
1  This even complies with the definition of intonation given here, so long as we allow ourselves to include gestures under the label “suprasegmentals.”

It is possible that gestures, even in nonsigned languages, may be used in a regular manner to express certain connotative meanings, to distinguish clause types, or to distinguish homophonous forms of intonation. If this is the case, then under the assumption that emotion-related and linguistic-related suprasegmentals belong to two separate systems, a potential object of study might be to separate gestures that are linked to one system from those that are linked to the other.

In addition to affecting the research questions and methods applied to intonational research, the theoretical assumptions made here also have implications for how intonation ought to be studied within psycholinguistics and language acquisition. Intonation is seen here as part of the lexicon, which is a core component of the Language Faculty. This differs significantly from assuming that intonational meaning is entirely pragmatic in nature, which would mean it belongs to the nonlinguistic aspects of human cognition that merely interact with language rather than being a component of language. Assuming intonation to comprise morphemes means it is represented in the mind, and acquired by children, in the same way that all morphemes are represented and acquired.

There are also implications for the study and development of strong artificial intelligence (Strong AI). Strong AI refers to the ultimate goal of making the “intellectual” capacity of machines identical to that of humans, a core component of which is language. If intonation is lexical, then in theory, it can potentially be programmed into a machine in the same way as morphemes, and in fact, this must be done in order for a machine’s programming to accurately represent human language. At the very least, this book’s conclusions have implications for the design and application of the representation of intonation in artificial speech recognition and production.

Aside from any potential practical applications, the ultimate goal of linguistics is to develop an accurate theory of language. Whether or not such a theory includes intonation being part of the lexicon is an empirical matter that must ultimately be determined through research. A good approach to determining this is further research that contrasts intonation with segmental particles, even beyond making one-to-one comparisons; if tonal and segmental particles are actually two forms of the same thing, then studying particles on their own can help us gain insight into the functions and meanings of tonal particles. Segmental particles are easier to study than intonation, so it makes good sense to study intonational functions and meanings indirectly through particles. Armstrong (2015: 1) “argue[d] that we should keep in mind the range of possible meanings of SFPs in other languages in order to refine our methodology in a way that allows us to make better predictions about the pragmatic division of labor among intonation contours.” Even though segmental particles can be polysemous, any given segmental particle is far easier to recognize from one occurrence to the next than is any given discourse tone. Based on the hypothesis that tonal and segmental particles are counterparts of each other, we can use segmental particles to verify whether or not intonational meanings can be defined independently of the discourse and can test the effectiveness of different approaches to defining them.

Using native-speaker intuition to conduct research on intonation is not unscientific. In fact, that is what all intonational research must necessarily stem from, and intuition should be used as a guide to help throughout.
Chomsky (1964: 29) said “a grammar that aims for descriptive adequacy is concerned to give a correct account of the linguistic intuition of the native speaker.” Studies like those from Maekawa (2004), Liu et al. (2013), Armstrong (2015), Prieto and Roseano (2016), and my own have all elicited the expressions of specific intonational meanings from native speakers. These kinds of studies can produce strong evidence for the matching of a tone to a meaning. Also related to tapping into native-speaker knowledge, Rietveld and Chen (2006) proposed methods for eliciting judgments on intonational meanings, and Gussenhoven (2006) proposed methods for eliciting judgments of discreteness for both the forms and meanings of tones. It is the refining and improvement of these kinds of native-speaker-informed studies that I believe will ultimately provide an accurate picture of the forms and meanings of intonation. Controlled studies like these differ from intonation research that analyzes naturally occurring data (e.g., Couper-Kuhlen and Selting 1996; Szczepek Reed 2011). This difference is analogous to studying fluid dynamics through controlled experiments in a laboratory on the one hand, and carefully observing and recording the movement of water in a river or along an ocean coastline on the other. There is nothing wrong with doing both, but I think controlled methods guided by intuition will ultimately prove the most effective for gaining a deeper understanding of this most difficult aspect of language.

References

Armstrong, M. E. (2015). Accounting for intonational form and function in Puerto Rican Spanish polar questions. Probus: International Journal of Romance Languages, 29(1), 1–40.
Bolinger, D. (1983). Intonation and gesture. American Speech, 58(2), 156–174.
Chomsky, N. (1964). Current issues in linguistic theory. Hague: Mouton.
Couper-Kuhlen, E., & Selting, M. (1996). Prosody in conversation: Interactional studies. Cambridge: Cambridge University Press.
Dryer, M. S. (2013). Polar questions. In M. S. Dryer & M. Haspelmath (Eds.), The World Atlas of language structures online. Leipzig: Max Planck Institute for Evolutionary Anthropology.
Féry, C. (2017). Intonation and prosodic structure. Cambridge: Cambridge University Press.
Gussenhoven, C. (2006). Experimental approaches to establishing discreteness of intonational contrasts. In S. Sudhoff, D. Lenertová, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, et al. (Eds.), Methods in empirical prosody research (pp. 321–334). Berlin: Walter de Gruyter.
Liu, F., Xu, Y., Prom-on, S., & Yu, A. C. L. (2013). Morpheme-like prosodic functions: Evidence from acoustic analysis and computational modeling. Journal of Speech Sciences, 3(1), 85–140.
Maekawa, K. (2004). Production and perception of “paralinguistic” information. Presented at the Speech Prosody, Nara, Japan.
Prieto, P., & Roseano, P. (2016). The encoding of epistemic operations in two Romance languages: Intonation and pragmatic markers. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Speech prosody 2016. Boston: Boston University.
Rietveld, T., & Chen, A. (2006). How to obtain and process perceptual judgements of intonational meaning. In S. Sudhoff, D. Lenertová, R. Meyer, S. Pappert, P. Augurzky, I. Mleinek, et al. (Eds.), Methods in empirical prosody research (pp. 283–319). Berlin: Walter de Gruyter.
Sandler, W. (2010). Prosody and syntax in sign languages. Transactions of the Philological Society, 108(3), 298–328.
Szczepek Reed, B. (2011). Analysing conversation: An introduction to prosody. London: Palgrave Macmillan.

Appendix

For those readers who need it, this appendix provides a brief background on some of the ideas and terminology used in current models of generative syntax. It is divided into two sections. The first section explains X-bar theory and presents the terminology and notations used for describing phrases and sentences based on this theory. The second section explains the Split-CP hypothesis and the cartographic approach. Of course, this is not a detailed introduction, and it does not present the arguments that lie behind the underlying assumptions of the theory; it is merely an attempt to enable readers who are not familiar with current models of generative syntax to better understand the arguments presented in Chap. 7. For readers interested in a comprehensive introduction, the one by Carnie (2013) is excellent.

X-Bar Theory

X-Bar Theory hypothesizes that all phrases in all languages have the same basic underlying structure. The X in the term X-Bar is a variable that represents a word class or, in some cases, a functional category that may or may not be phonologically overt. An XP can therefore be any category, such as a noun phrase (NP), a verb phrase (VP), a tense phrase (TP), a Focus phrase (FocP), a speech act phrase (SAP), and so on. All phrases are assumed to have the structure shown in (1):

(1)  [XP (YP) [X′ X0 (WP)]]

© Springer Nature Singapore Pte Ltd. 2020 J. C. Wakefield, Intonational Morphology, Prosody, Phonology and Phonetics, https://doi.org/10.1007/978-981-15-2265-9

The tree in (1) represents a phrase of the category XP. The head word of this phrase is X0, with the superscript zero representing that it is the head of the phrase occupying its lowest level. There is a level between X0 and XP, which is shown as X′ (pronounced “X-Bar”), and this is what gives the theory its name. YP and WP represent phrases of other categories. They are both shown in parentheses because phrases in these positions are optional in the sense that not all phrases include them.1 YP is located in the specifier position of the phrase, or SPEC for short. The phrase that occupies the SPEC position is something like the subject of the phrase. In a tense phrase (TP), which is a clause, this position is where the NP subject goes. In X-Bar Theory, all phrases include a SPEC position, though it is often empty. The WP in (1) is the complement of X0. In a VP, for example, this would be the NP complement (i.e., object) of V. The head of a clause is assumed to be its tense element, so all main and embedded clauses are shown as a TP.2 Here is an illustration of how the structure of (1) can be used to represent an English sentence:

(2)  [TP [NP That man] [T′ [T0 is] [VP stealing my sandwich]]]

The word categories shown as variables in (1) have changed to actual word categories in (2) as follows: XP→TP; YP→NP; WP→VP. Note that triangles are used for the NP and the VP. This is often done when the details of the phrase are not critical to whatever the linguist is discussing about the structure. This next diagram shows the same sentence with some details of the VP added:

(3)  [TP [NP That man] [T′ [T0 is] [VP [V′ [V0 stealing] [NP my sandwich]]]]]

1  It is also worth noting that the order shown is the typical order seen in languages like English, where subjects appear before the verb and objects appear after the verb (see the structure in (3)).
2  Some authors use IP instead of TP, but they are the same thing. IP refers to an inflectional phrase.

Note that the structure of the VP is exactly the same as that of the TP. It has a SPEC position, which in this case is empty, a V′ position, and a head position V0. It also has a complement to its head, which is the NP “my sandwich.” In Chap. 7, all the trees show phrases having this same structure, and like in (3), phrases stack on top of each other because each phrase includes a phrase within it that functions as the complement of its head word. This stacking can occur multiple times. Note that the NP complement of V0 could also be drawn out in detail using the same basic structure.3
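As a minimal sketch of the X-bar schema described above, the following Python models a phrase as an optional specifier, a head, and an optional complement, and renders it in labeled-bracket notation. The class names `Phrase` and `Triangle` and the bracket rendering are illustrative assumptions, not part of the theory itself; `Triangle` stands for a phrase whose internal details are left out, as a triangle does in a tree diagram.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Triangle:
    """A phrase whose internal structure is not spelled out (hypothetical helper)."""
    label: str   # e.g. "NP"
    words: str   # e.g. "that man"

    def brackets(self) -> str:
        return f"[{self.label} {self.words}]"


@dataclass
class Phrase:
    """One X-bar phrase: XP -> (SPEC) X'; X' -> X0 (complement)."""
    category: str                                        # "T", "V", "C", ...
    head: str = ""                                       # word in X0; "" if null
    spec: Optional[Union["Phrase", Triangle]] = None     # optional specifier
    comp: Optional[Union["Phrase", Triangle]] = None     # optional complement

    def brackets(self) -> str:
        x = self.category
        # The head is always present, even when it has no overt word.
        head = f"[{x}0 {self.head}]" if self.head else f"{x}0"
        bar_parts = [head] + ([self.comp.brackets()] if self.comp else [])
        bar = f"[{x}' " + " ".join(bar_parts) + "]"
        xp_parts = ([self.spec.brackets()] if self.spec else []) + [bar]
        return f"[{x}P " + " ".join(xp_parts) + "]"


# The VP has an empty SPEC, the head "stealing", and an NP complement.
vp = Phrase("V", head="stealing", comp=Triangle("NP", "my sandwich"))
# The TP has the subject NP in SPEC, "is" in T0, and the VP as complement.
tp = Phrase("T", head="is", spec=Triangle("NP", "that man"), comp=vp)

print(tp.brackets())
# prints: [TP [NP that man] [T' [T0 is] [VP [V' [V0 stealing] [NP my sandwich]]]]]
```

Because `comp` can itself be a `Phrase`, the stacking of phrases falls out of the data structure: each phrase simply contains the phrase that serves as the complement of its head.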

The Split-CP Hypothesis and the Cartographic Approach

It has long been assumed that all sentences, even main clauses, have a complementizer phrase (CP) above the sentence (i.e., the TP), as shown here:

(4)  [CP [C′ C0 TP]]

3  Most generative syntacticians assume that an NP is embedded inside a DP and would therefore show “my sandwich” as a DP rather than an NP. In this case, the determiner “my” heads the DP, and the NP “sandwich” is the complement of D.

While I do not explain the analyses and assumptions that motivate the way in which phrase structures are drawn (that requires a full textbook), I will explain some of the reasons why a CP is assumed to exist above main clause TPs, because doing so will allow me to explain some things about movement relevant to the discussion in Chap. 7. It may seem odd that main clauses are assumed to have a CP above them because a main clause does not allow the inclusion of an overt complementizer. For example, “∗That he will go home now” is not grammatical if used as a main clause (the ∗ marks the sentence as ungrammatical). The grammatical version of the sentence does not include the complementizer: “He will go home now.” It is nevertheless assumed that there is a CP above TP as shown in (4) and that the complementizer in C0 is phonologically null. One reason this is assumed is because if this were not the case, then there would be nowhere for the tense element to move up to when forming a polar question, or for a wh-phrase to move up to when forming a wh-question. Consider if the sentence in (3) were made into a polar question; the auxiliary “is,” which occupies T0, will move up above the subject, resulting in this word order: “Is that man stealing my sandwich?” This movement can be accounted for by the existence of a null complementizer that occupies the C0 position and has the feature [+Interrogative]. This null complementizer attracts the auxiliary to raise up and attach to it, giving us the correct word order. Now consider if we wanted to form a wh-question to question what that man is stealing. In this case, in addition to “is” raising up, the NP “my sandwich” would be changed to “what” and would raise up to the SPEC of CP, resulting in this word order: “What is that man stealing?” If there were no CP above TP, then neither the wh-phrase nor the auxiliary would have any syntactic position to occupy. 
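These movements of T0 to C0 and of NP to SPEC of CP are shown as follows, with __ marking each position that a moved element has vacated:

(5)  [CP [NP what] [C′ [C0 is] [TP [NP that man] [T′ [T0 __] [VP [V′ [V0 stealing] [NP __]]]]]]]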
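As a rough sketch of the two movements just described, the following Python treats a clause as a set of named positions listed in their left-to-right order and moves material between them. The position names, the flat dictionary layout, and the function names are assumptions made for illustration only; real analyses operate on tree structures rather than flat slots.

```python
from typing import Dict, List

# Assumed left-to-right order of positions in a CP-over-TP clause.
POSITIONS: List[str] = ["SPEC-CP", "C0", "SPEC-TP", "T0", "V0", "COMP-V"]


def declarative() -> Dict[str, str]:
    """The sentence from (3), with a null C0 and an empty SPEC of CP."""
    return {
        "SPEC-CP": "",
        "C0": "",
        "SPEC-TP": "that man",
        "T0": "is",
        "V0": "stealing",
        "COMP-V": "my sandwich",
    }


def t_to_c(clause: Dict[str, str]) -> Dict[str, str]:
    """Head movement: the contents of T0 raise to C0."""
    moved = dict(clause)
    moved["C0"], moved["T0"] = moved["T0"], ""
    return moved


def wh_move(clause: Dict[str, str], wh: str = "what") -> Dict[str, str]:
    """Phrasal movement: the object becomes a wh-phrase and raises to SPEC of CP."""
    moved = dict(clause)
    moved["COMP-V"] = ""
    moved["SPEC-CP"] = wh
    return moved


def spell_out(clause: Dict[str, str]) -> str:
    """Pronounce the filled positions in their linear order."""
    return " ".join(clause[p] for p in POSITIONS if clause[p])


polar = t_to_c(declarative())
wh_question = wh_move(polar)

print(spell_out(declarative()))  # that man is stealing my sandwich
print(spell_out(polar))          # is that man stealing my sandwich
print(spell_out(wh_question))    # what is that man stealing
```

The sketch also makes the motivation for CP concrete: without the `SPEC-CP` and `C0` slots, neither moved element would have a position to land in.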

Something to note about movement is that heads can move to other head positions, and phrases can move up to SPEC positions, but heads cannot move to a SPEC position, and phrases cannot move to a head position. Another thing to note about movement is that sometimes things can move semantically without moving phonologically. The semantic component of language is called Logical Form (LF), so when something is said to have moved in LF, this means there is no movement at the phonological level (PF). It is silent movement that affects meaning. An example of this is found in wh-in-situ languages, where the wh-words in wh-questions do not move up to the left periphery of the sentence as they do in English, but they are still assumed to move in LF for reasons of semantic interpretation. This is because, semantically, they obviously have scope over the whole sentence and therefore must be positioned at (or near) its highest level in the mind of the speaker, even though they are pronounced in a lower position.

Having described the concept of a silent CP above TP and having explained that this analysis is motivated by the observed leftward movement of auxiliaries and wh-phrases, it should be easy to understand the concept and motivation behind Rizzi’s (1997) Split-CP hypothesis. This hypothesis is based on Rizzi’s observations, mainly in Italian, of a number of phrases that move to the left periphery of the sentence in order to be placed into focus or to be topicalized. In order to accommodate the movement of multiple elements in a single sentence, it was proposed that there must be more than a single phrase above TP; otherwise, all these elements that raise up would have nowhere to go. Based on this, the original CP was split into a number of functional phrases, resulting in the following structure proposed by Rizzi (2001). Note that this structure no longer includes a phrase referred to as CP:

(6)  Force > (∗Top) > Inter > (∗Top) > Foc > (∗Top) > Fin > TP

The Split-CP hypothesis is part of what is referred to as the cartographic approach, which, as its name implies, is an attempt to map out the entire structure of sentences, so as to include phrases for all interpretable features. All of the items in (6) refer to phrases even though they do not include a “P,” and the left-to-right ordering represents hierarchical position. ForceP is directly above the first Topic phrase (TopP), which in turn is directly above the interrogative phrase (InterP), and so on. In generative syntax, illocutionary force is used in a relatively restricted sense to refer to clause type, so ForceP is a phrase that is headed by the grammatical element that types the clause. A note about notation is that the parentheses around the three TopPs indicate that those phrases are optional. The phrases that are not optional (i.e., all those shown without parentheses) are assumed to be present in all sentences. The ∗ that marks all the TopPs indicates that any number of TopPs can appear in those positions, so there could be two TopPs between ForceP and InterP, for example. If drawn as a tree diagram, the structure in (6) looks like (7). Note that only the first optional TopP is included, and the details of TP are not shown, so a triangle is used:

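(7)  [ForceP [Force′ Force0 [TopP [Top′ Top0 [InterP [Inter′ Inter0 [FocP [Foc′ Foc0 [FinP [Fin′ Fin0 TP]]]]]]]]]]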
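The notation in (6) can also be read as a pattern over sequences of phrase labels: the unparenthesized phrases must appear once each in the given order, and each (∗Top) slot admits zero or more TopPs. The following Python is a minimal sketch of that reading, assuming the hierarchy is written as a list of bare labels from highest to lowest; the function name `conforms` and the use of a regular expression are illustrative choices, not part of the cartographic proposal.

```python
import re
from typing import List

# (6) as a regular expression over space-separated labels.
# "( Top)*" encodes an optional, repeatable TopP slot.
TEMPLATE = re.compile(r"Force( Top)* Inter( Top)* Foc( Top)* Fin TP")


def conforms(labels: List[str]) -> bool:
    """Return True if the top-down sequence of phrases fits the hierarchy in (6)."""
    return TEMPLATE.fullmatch(" ".join(labels)) is not None


# Only the obligatory phrases: allowed.
print(conforms(["Force", "Inter", "Foc", "Fin", "TP"]))                       # True
# Two TopPs between ForceP and InterP, one between FocP and FinP: allowed.
print(conforms(["Force", "Top", "Top", "Inter", "Foc", "Top", "Fin", "TP"]))  # True
# InterP is obligatory, so leaving it out is not allowed.
print(conforms(["Force", "Foc", "Fin", "TP"]))                                # False
# No TopP slot exists above ForceP.
print(conforms(["Top", "Force", "Inter", "Foc", "Fin", "TP"]))                # False
```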

This structure will be discussed in Chap. 7, and most of the discussion will focus on InterP and on two speech act phrases (SAPs) that will be added above ForceP. The original idea that tense elements move up to the head of CP (i.e., that T0 moves to C0) is now modified to say that the contents of T0 move to the head of the interrogative phrase (i.e., T0 moves to Inter0). The two SAPs that will be added above ForceP are headed by discourse particles. When a discourse particle or tone appears in the sentence-final position, it is assumed that ForceP has raised up to the SPEC position of that SAP. This then results in a sentence-final position for the particle. This is shown in (29) of Chap. 7, which also includes a sentence-initial particle because there are two SAPs headed by discourse particles in that structure, and ForceP has raised to the SPEC position of the lower SAP. Hopefully, this brief appendix has provided sufficient background and sufficient knowledge of terminology to make the arguments in Chap. 7 readily accessible to all readers.

References

Carnie, A. (2013). Syntax: A generative introduction (3rd ed.). Malden, MA: Wiley-Blackwell.
Rizzi, L. (1997). The fine structure of the left periphery. In L. Haegeman (Ed.), Elements of grammar: Handbook in generative syntax (pp. 281–337). Dordrecht: Kluwer Academic.
Rizzi, L. (2001). On the position of int(errogative) in the left periphery of the clause. In G. Cinque & G. Salvi (Eds.), Current studies in Italian syntax (pp. 287–296). Amsterdam: Elsevier.
